The Platform Operations Manager will collaborate closely with the Director of DevOps, Lead Platform Operations Engineer, Lead SRE Engineer, Cyber Security Manager and other Technical Engineering Directors to ensure that the SportsEngine Platform is robust, scalable, supportable and secure. They will be responsible for:
• Proactively driving operational excellence, innovation and team performance through a progressive people and project management approach that includes:
• Managing and monitoring individual and team progress towards delivery and embedding a culture of ongoing learning, feedback, and knowledge sharing.
• Collaborating with colleagues and advising senior management on opportunities to improve, such as implementing new processes and adopting emerging technologies.
• Building capacity within and across the team through a strong focus on, and support of, Infrastructure as Code and Site Reliability Engineering principles.
• Strategic planning for platform growth and modernisation.
• Project management.
• Incident management, including leading incidents to resolution, post-incident root cause analysis (RCA), and planning and executing preventative work.
• Public Cloud spend planning.
The Platform Operations Manager will manage a team that is responsible for the following areas:
• Contributing to efforts that ensure the continuous and smooth operation of the SportsEngine platform while serving a large volume of traffic.
• Leveraging Amazon Web Services to build highly available services for the SportsEngine infrastructure platform, which is built on top of EKS, RDS and EC2.
• Developing Infrastructure as Code using Terraform.
• Helping to foster a culture of cooperation, coordination, and continuous learning within the Platform Operations Team and with other Product Development teams throughout SportsEngine.
• Working closely with the SportsEngine Cyber Security Team to maintain and improve the security of the SportsEngine Platform.
• Contributing to and using our GitHub Pull Request-centered development pipeline as we continuously deliver value to our customers.
• Using tools such as New Relic, Splunk and Datadog to monitor the health of the SportsEngine platform.
• Being an advocate for quality code and engineering practices that enable Continuous Delivery.
• Participating in a sustainable on-call schedule.
• 7+ years of experience in the field of Web Platform Operations, preferably in a Public Cloud environment
• 2+ years of Engineering Management experience:
- Experience of working at a senior level, contributing to managing a team of engineers, growing a team, and setting them up for success
- Experience of managing projects
• A strong foundation in modern infrastructure practices and the ability to deploy and operate maintainable, scalable and secure infrastructure.
• In-depth knowledge of the AWS Platform with the ability to architect Public Cloud solutions based on business requirements.
• A commitment to Infrastructure as Code as the sustainable way to run operations at scale.
• A team-oriented attitude and seemingly endless intellectual curiosity.
• Excellent verbal and written communication skills.
• Must submit an attestation disclosing your COVID-19 vaccination status and, if partially or fully vaccinated, submit your vaccination record no later than 7 days following commencement of employment.
• Must be fully vaccinated against COVID-19 at the commencement of employment or adhere to enhanced protocols in select work settings or where jurisdictionally mandated.
• Must be willing to adhere to all Company COVID-19 workplace safety policies and protocols.
You will have the edge over the competition if you can also demonstrate any of the following:
• People Management
- Experience as an Engineering Manager
- Helping to empower, mentor and grow engineers
• Incident Management
- Experience leading Production incidents to resolution and carrying out post-incident reviews, RCA generation etc.
• Solutions Engineering and Project Management
- Ability to take business requirements and architect a secure, scalable and supportable solution using AWS Services.
• AWS Experience
- Fiscal management of AWS
RI purchases or Savings Plans
- Experience in the following areas of AWS:
VPC – Subnets, Security Groups, NAT Gateways, Transit Gateways, ELB/ALB/NLB etc.
Managed data tiers – RDS, ElastiCache etc.
• Experience in production with
- Production experience of running services in Kubernetes
- Ability to take a VM based application and migrate to Kubernetes
- Experience with CI/CD pipelines, assisting developers in delivering changes on a daily cadence
- Experience with Travis CI, Jenkins, GitLab CI, GitHub Actions or similar technologies
- Ability to script automation in a language such as Ruby, Python or Go
• Infrastructure as Code
- Ability to author Terraform at a proficient level
- Ability to break out opinionated and standardized actions into reusable Terraform modules
NBCUniversal's policy is to provide equal employment opportunities to all applicants and employees without regard to race, color, religion, creed, gender, gender identity or expression, age, national origin or ancestry, citizenship, disability, sexual orientation, marital status, pregnancy, veteran status, membership in the uniformed services, genetic information, or any other basis protected by applicable law. NBCUniversal will consider for employment qualified applicants with criminal histories in a manner consistent with relevant legal requirements, including the City of Los Angeles Fair Chance Initiative For Hiring Ordinance, where applicable.
If you are a qualified individual with a disability or a disabled veteran, you have the right to request a reasonable accommodation if you are unable or limited in your ability to use or access nbcunicareers.com as a result of your disability. You can request reasonable accommodations in the US by calling 1-818-777-4107 and in the UK by calling +44 2036185726.
Jobcode: Reference SBJ-d807k9-3-238-90-95-42 in your application.