What We Do
Our Observability team is looking for a Senior SRE to help us build and operate the infrastructure our teams rely on to keep our platforms, games and online services running.
Our Observability team works closely with teams across Epic to implement industry best practices and develop new monitoring capabilities.
What You'll Do
As an SRE on Observability, you will tackle problems that impact how we understand and operate our products at scale. Part of this role is advancing the state of the art for observability at Epic by building tooling that standardizes our systems and makes them easier to understand. You will also build and operate the systems that process and transport the large volumes of telemetry data generated by services at Epic.
In this role, you will focus on:
• Service Ownership - At Epic we embrace a Service Owner ("you build it, you run it") mentality. You will work with other members of the Observability team to operate the infrastructure our developers depend on to run their own services.
• Develop and Ship - You will work to modernize key portions of our observability infrastructure, building new data processing pipelines for telemetry data and writing software to automate processes and generate new insights.
• Collaborate - You will work with teams across Epic as an observability subject matter expert to provide guidance on observability best practices.
What We're Looking For
• Experience executing meaningful change in a fast-paced, interrupt-driven environment
• A self-starter who approaches challenges creatively and methodically, seeing them through to final resolution
• Experience working across teams in a collaborative environment
• Ability to adapt and be effective in new situations within a highly dynamic environment
• Experience working with large-scale systems in AWS
• Ability to write code for simple services and process automation
• Familiarity with application/service monitoring strategies and technologies, including projects such as OpenTelemetry, Prometheus, Grafana, Fluentd, New Relic, Datadog, Honeycomb and Sumo Logic
Jobcode: Reference SBJ-rjn8e1-18-208-132-74-42 in your application.