Full Time Job

LiveOps SRE

Epic Games

London, United Kingdom 04-10-2024
  • Paid
  • Full Time
Job Description
LiveOps SRE

What We Do

The Epic LiveOps team provides the best possible experience for our players. We dive deep into the data to understand player needs, minimize disruption, and manage Epic's incident response process.

What You'll Do

You will be the voice of the customer in a wide variety of contexts across Epic's business. You will dive deep into incidents to make sure we provide our players the best possible experience, hold a relentlessly high bar for Epic's service quality, focus the attention of our business and tech teams on the right priorities, and operate Epic's incident management process. When other mechanisms fail, LiveOps is the backstop that ensures Epic operates in the best interest of our players' experience.

In this role, you will
• Respond to alerts and manage issues in the production environment
• Manage the development and operation of our Incident Management Tooling, ensuring robust tooling support for the incident process
• Produce specifications and determine the operational feasibility of our tooling
• Develop quality standards, documentation and testing for our tools codebase
• Maintain, improve, troubleshoot, debug and update our codebases
• Develop automated tooling features to drive incident management improvements and reduce the operational cost of the Incident process
• Work across the stack (backend, frontend, infrastructure, and operations) to test, deploy, and iterate based on stakeholder feedback

What we're looking for
• You thrive on ambiguity. You can understand a diverse set of product features, identify how an issue impacts a single customer, and quantify the business impact. You're capable of identifying larger trends across disparate issues and enabling product teams to solve the real underlying problems
• You have a strong technical foundation and know how to learn new things. Strong analysis and problem-solving skills are essential to do this role successfully (we live in Grafana, Tableau, and similar tools)
• You are a problem solver. Experience with AWS and other cloud infrastructure tools will make you comfortable in this role, and the ability to script and automate actions in languages like Python, Ruby, or Go is a bonus
• You have experience working cross-functionally or across a large number of teams in multiple organizations
• You have extensive experience working with and building reliable services on AWS or other major cloud infrastructure providers
• You have a passion for the reliability engineering space

Note to Recruitment Agencies: Epic does not accept any unsolicited resumes or approaches from any unauthorized third party (including recruitment or placement agencies), i.e., a third party with whom we do not have a negotiated and validly executed agreement. We will not pay any fees to any unauthorized third party. Further details on these matters can be found here.

Jobcode: Reference SBJ-rzvzz1-18-191-223-123-42 in your application.

Company Profile
Epic Games

Founded in 1991, Epic Games is a leading interactive entertainment company and provider of 3D engine technology. Epic operates Fortnite, one of the world’s largest games with over 350 million accounts and 2.5 billion friend connections. Epic also develops Unreal Engine, which powers the world’s leading games and is also adopted across industries such as film and television, architecture, automotive, manufacturing, and simulation.