Web Engineer, Trust & Safety
Epic Games
Bellevue, WA
Site Reliability at Epic
What we do
Our team's mission is to keep our games and platform up and running.
Post Incident Review
Systems always find interesting new ways to not work as we expect. We focus on learning from these production surprises and improving our systems and processes so they become more reliable over time. We work with a diverse set of development teams to help them understand incidents.
Production, Event and Launch Readiness
We run large-scale production events and work with many teams on readiness and operational excellence. We own the process and review for service and product launches and game events.
Development focused on Reliability
While we help with incidents and readiness, we also engineer tooling, services, and other systems and processes that improve the reliability of our systems.
What you'll do
As a Site Reliability Engineer, you will tackle problems that affect the reliability of our products as a whole. Part of this role is analyzing gaps and risk areas in our products and working with engineering to determine the best course of action. You will participate in post incident reviews, readiness programs, and engineering and development efforts. This role favors breadth over depth, with one exception: depth is expected in building and running reliable systems.
At Epic we embrace a Service Owner (You build it, you run it) mentality. In this role we are stewards of operational excellence and service owners for the tools, systems, and services that we build.
In this role, you will
• Write code and develop systems and services that help us with operational excellence. Most of our tools will require web interfaces and APIs.
• Contribute to services, tools, and code across the organization that advance our team goals.
• Help develop best practices across our organization, along with the tools that help us distribute them.
• Work with development teams to understand their systems and help them succeed with service ownership.
• Work on cloud-based services in AWS.
What we're looking for
• You have experience working cross-functionally or across a large number of teams in an organization.
• You have experience working with and building reliable services on AWS.
• You have a passion for reliability engineering.
• Strong preference for candidates who are already in, or are willing to relocate to, Cary, NC or Seattle, WA.
Epic Games deeply values diverse teams and an inclusive work culture, and we are proud to be an Equal Opportunity employer. Learn more about our Equal Employment Opportunity (EEO) Policy here.
Founded in 1991, Epic Games is a leading interactive entertainment company and provider of 3D engine technology. Epic operates Fortnite, one of the world’s largest games with over 350 million accounts and 2.5 billion friend connections. Epic also develops Unreal Engine, which powers the world’s leading games and is also adopted across industries such as film and television, architecture, automotive, manufacturing, and simulation.