
Full Time Job

Senior Site Reliability Engineer

Netflix

Los Angeles, CA 09-22-2021
 
  • Paid
  • Full Time
  • Senior (5-10 years) Experience
Job Description
About Netflix

Team Description

The Critical Operations and Reliability Engineering team's goal is to drive customer joy by thoughtfully managing risk and minimizing impact across Netflix. We do this by engaging cross-functionally with other engineering teams, managing issues when they happen, and promoting reliability and resilience practices throughout the organization. Our team is seeking individuals with a broad set of technical skills and an impressive history of unique career and life experiences who will bring diverse views to our team. This role is rewarding for people who can collaborate in a complex environment.

Outcomes
- Increase our reliability through an automation-focused approach to solving problems
- Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks
- Form and maintain relationships with internal and external partners
- Develop deeper insights into the quality of experience for our customers

We Value
- Curiosity about how complex socio-technical systems successfully operate at scale when failure is inevitable
- The ability to develop alignment, cultivate relationships, and drive impact
- Collaboration, continuous improvement, and iteration as the path forward
- A desire to grow expertise, inform, and educate others
- Comfort with being uncomfortable in ambiguous situations

Our Work
- Incident escalation & on-call rotation
- Drive incidents to resolution by collaborating with multiple engineering teams
- Identify sources of instability in distributed systems and drive operational excellence
- Analyze complex systems from a reliability and resilience perspective
- Engage with product teams to diagnose and correct operational surprises
- Improve availability, reliability, and observability of Netflix services and reduce the burden of human toil with tooling and automation
- Communicate clearly and reliably with team members and customers

Nice to Have
- Involvement with incident management and response
- Development with Python, Go, Java, or JavaScript/Node.js
- Knowledge of cloud platforms like AWS and microservices architecture

Ability to travel when required, roughly 10-15% of the time, for business meetings and team offsites.

Be sure to review our culture page and long-term view to learn more about the unique Netflix culture and the opportunity to be part of our team. If any of these things sound interesting to you, please apply.

Jobcode: Reference SBJ-gm1po2-3-131-13-37-42 in your application.