Full Time Job

Engineering Manager, NOC

Pluto TV

New York, NY 04-26-2022
 
  • Paid
  • Full Time
  • Senior (5-10 years) Experience
Job Description
Summary

The Manager, NOC for the Pluto TV Team is responsible for leading the incident response to any issues that may impact our quality of service, whether related to infrastructure, code deploys, partner outages, or other related issues.

Responsibilities Include

This is a critical role with a wide range of responsibilities, including:
• Lead the lifecycle process for incident response, mitigation, escalation, analysis and reporting.
• Create the necessary processes and documentation for responding to early alerts and detecting common patterns that may escalate into larger issues.
• Work with Engineering and Operations teams to resolve any active incidents as well as proactively mitigate future platform issues.
• Leverage our observability platforms to detect variance/outliers in our key systems and work with engineering teams and the Production Operations team to proactively avoid incidents, increase MTBF, and reduce MTTR.
• Measure everything; establish and publish relevant site/service metrics and alerting (SLA/SLO).
• Review and improve existing processes and operational runbooks for the NOC team.
• Assist with FinOps (reporting, visibility and optimization of cloud costs/spend).

Qualities / Experience We're Seeking

We believe the right individual will have the following skills and experience in order to be successful in the role:
• Effective, clear communicator, able to articulate ideas across all levels of the organization.
• 6+ years of supervisory experience leading a NOC or Incident Response command center.
• Strong incident management or ITIL background with increasing responsibility.
• 2+ years of experience with infrastructure-as-code and configuration management tools such as Terraform (preferred), CloudFormation, Ansible, Chef, or Puppet.
• 2+ years of DevOps experience for large-scale AWS services including EKS, EC2, VPC, S3, Lambda, and CloudWatch.
• The ideal candidate will have experience with Vault and Elasticsearch/Kibana.
• Monitoring experience using Prometheus and Grafana is a plus.

ViacomCBS is an equal opportunity employer (EOE) including disability/vet.

Jobcode: Reference SBJ-g3xnk5-3-135-190-232-42 in your application.