Engineering Manager, Production Operations (Remote Job)
*This is a remote job opportunity*
The Engineering Manager, Production Operations for the Pluto TV Team is responsible for leading the team that runs our production infrastructure (and more), ensuring reliable, highly available, and performant systems that deliver content to our many viewers.
This is a critical role with a wide range of responsibilities, managing a team that needs to:
• Plan, design, build, monitor, and improve our production infrastructure and operational processes.
• Work with SRE, DevOps, and Observability teams to build and maintain all cloud infrastructure with a focus on reliability, efficiency, scalability, and cost.
• Work with development, engineering, and STE (Software Test Engineering) to prepare and support all production releases and upgrades.
• Participate in the lifecycle process for incident response, mitigation, escalation, analysis and reporting.
• Work with Engineering and Operations teams to resolve any active incidents and proactively mitigate future platform issues.
• Leverage our observability platforms to detect variance and outliers in our key systems, and work with engineering teams and the NOC team to proactively avoid incidents, improve MTBF, and reduce MTTR.
• Help review and improve existing processes and operational runbooks for the NOC team.
Qualities / Experience We're Seeking
We believe the right individual will have the following skills and experience in order to be successful in the role:
• Effective, clear communicator, able to articulate ideas across all levels of the organization.
• 4+ years of supervisory experience leading a production-facing Ops/DevOps/SRE team.
• 4+ years of experience with incident management or ITIL processes.
• 4+ years of experience with a visibility/observability platform like Datadog, SignalFx, Prometheus, or similar.
• 2+ years of experience with infrastructure-as-code or configuration management tools such as Terraform (preferred), CloudFormation, Ansible, Chef, or Puppet.
• 4+ years of DevOps experience with large-scale AWS services including EKS, EC2, VPC, S3, Lambda, and CloudWatch.
• The ideal candidate will have experience with Kubernetes and MongoDB.
ViacomCBS is an equal opportunity employer (EOE) including disability/vet.
Jobcode: Reference SBJ-gpm5oo-3-236-28-137-42 in your application.