Full Time Job

Lead Platform Operations Engineer

NBCUniversal

Remote / Virtual 09-02-2022
 
  • Paid
  • Full Time
  • Senior (5-10 years) Experience
Job Description
Responsibilities

The Lead Platform Operations Engineer will be a key member of our Platform Operations Team. As the Lead Platform Operations Engineer, you'll be comfortable dealing with complex challenges and collaborating with Engineering and Product teams to find solutions. You'll take business requirements and work with the Platform Operations Team to ensure that scalability, security, supportability, and automation are built into systems architecture. You'll help build and support the core infrastructure of the SportsEngine Platform services and products through activities and key responsibilities that include:
• Contributing to efforts that ensure the continuous and smooth running of the SportsEngine platform while serving a large volume of traffic.
• Contributing to technical stewardship, including advocating for best practices and increasing adoption of automation and Infrastructure as Code.
• Working with Engineering and Product teams to take business requirements and architect robust, scalable and supportable infrastructure.
• Leveraging Amazon Web Services to build highly available services for the SportsEngine infrastructure platform, built on top of EKS, RDS, and EC2.
• Developing Infrastructure as Code using tools like Terraform.
• Helping to foster a culture of cooperation, coordination, and continuous learning within the Platform Operations Team and with other Product Development teams throughout SportsEngine.
• Working closely with the SportsEngine Cyber Security Team to maintain and improve the security of the SportsEngine Platform.
• Contributing to and using our GitHub Pull Request-centered development pipeline as we continuously deliver value to our customers.
• Using tools such as New Relic, Splunk, and Datadog to monitor the health of the SportsEngine platform.
• Being an advocate for quality code and engineering practices that enable Continuous Delivery.
• Participating in a sustainable on-call schedule.

Qualifications
• 5 or more years of experience in the field of Software Engineering, operating web applications in a Site Reliability Engineering, Web Operations, or Cloud Engineering capacity.
• Proven ability to architect robust, scalable, and supportable infrastructure.
• Experience mentoring/growing junior engineers and ensuring adoption of best practices across a team.
• Ability to deal with complex systems: debugging and understanding interactions, resolving incidents, and producing RCAs.
• A strong foundation in modern infrastructure practices and the ability to deploy and operate maintainable, scalable, secure infrastructure.
• Ability to write quality, modular, maintainable, secure, and testable infrastructure automation.
• A team-oriented attitude and seemingly endless intellectual curiosity.
• Excellent verbal and written communication skills.

Desired Characteristics
• Proven ability to architect systems from business requirements while considering the following:
- Scalability
- Security
- Supportability
- Documentation
- Financial management of Cloud Spend
• AWS Experience
- Experience in the following areas of AWS:
  - EC2
  - VPC – Subnets, Security Groups, NAT Gateways, Transit Gateways, ELB/ALB/NLB, etc.
  - IAM
  - S3
  - Managed data tiers – RDS, ElastiCache, etc.
- Nice to have:
  - Experience in production with EKS
  - Experience in production with OpsWorks
  - Experience in production with Lambda
  - Experience in production with DynamoDB
• Kubernetes
- Production experience of running services in Kubernetes
- Ability to take a VM based application and migrate to Kubernetes
• CI/CD
- Experience with CI/CD pipelines, assisting developers in delivering changes on a daily cadence
- Experience with Travis CI, Jenkins, GitLab CI, GitHub Actions, or similar technologies
• Automation
- Ability to script automation in a language such as Ruby, Python, or Go
• Infrastructure as Code: Terraform
- Ability to author Terraform at a proficient level
- Ability to break out opinionated, standardized actions into reusable Terraform modules
• Chef/Ansible

Interested candidates must:
• Submit a resume/CV through www.nbcunicareers.com to be considered.
• Participate in a rotational "on-call" schedule (24 hours a day / 7 days a week)

Successful candidates will be required to:
• Submit an attestation disclosing COVID-19 vaccination status and, if partially or fully vaccinated, submit their vaccination record no later than 7 days following commencement of employment.
• Be fully vaccinated against COVID-19 at the commencement of employment or adhere to enhanced protocols in select work settings or where jurisdictionally mandated.
• Be willing to adhere to all Company COVID-19 workplace safety policies and protocols.

Additional Information

NBCUniversal's policy is to provide equal employment opportunities to all applicants and employees without regard to race, color, religion, creed, gender, gender identity or expression, age, national origin or ancestry, citizenship, disability, sexual orientation, marital status, pregnancy, veteran status, membership in the uniformed services, genetic information, or any other basis protected by applicable law. NBCUniversal will consider for employment qualified applicants with criminal histories in a manner consistent with relevant legal requirements, including the City of Los Angeles Fair Chance Initiative For Hiring Ordinance, where applicable.

If you are a qualified individual with a disability or a disabled veteran, you have the right to request a reasonable accommodation if you are unable or limited in your ability to use or access nbcunicareers.com as a result of your disability. You can request reasonable accommodations in the US by calling 1-818-777-4107 and in the UK by calling +44 2036185726.

Jobcode: Reference SBJ-gm3x8v-18-227-190-93-42 in your application.