App Implementation Manager
New York, NY
Build the future of Streaming Media with us! We are the team that architects, codifies, owns and operates the scalable and resilient infrastructure that powers HBO's flagship streaming products, HBO Max and HBO Go. Our products are enjoyed by millions of customers around the world. We are software engineers with a passion for infrastructure, automation, tooling and security. Building reliability, performance and security into our streaming products is our mission. We are evangelists for infrastructure as code, continuous delivery and operational visibility, and we strongly believe that automation is the key to delivering reliable systems. We are looking for a senior engineer to help us expand our infrastructure as code efforts across our engineering organization and to be a core lead in helping us architect and deliver HBO Max on a global scale!
As a Senior Software Engineer on WarnerMedia's Cloud Application Engineering team, you will design, code, scale and support the global cloud infrastructure of HBO's flagship OTT streaming platforms, along with the core applications and tooling that support it. You will create robust, repeatable processes and automation to safeguard our customer experience and increase the efficiency of our applications and team. A passion for learning, collaboration and fun is a must!
• Architect, codify and build out the international infrastructure platform for HBO Max, expanding and adapting our existing platform to a global audience.
• Champion best practices and uphold a culture committed to quality, test-driven development and repeatable processes through automation and infrastructure as code, influencing not only our team but also our client development and API engineering teams.
• Develop and support core functionality and components to support traffic shaping/routing.
• Build security into our systems and infrastructure to avert disruption and maintain uptime.
• Design and develop tooling that ensures resiliency and redundancy for our infrastructure, with an eye towards reducing mean time to recovery in failure scenarios. Bring a passion for making production deployments and severity events boring and uneventful.
• Review project objectives and determine best technologies for implementation. Review and evaluate emerging technologies.
• Ensure clear, straightforward design and comprehensive documentation of code. Look for ways to continually improve the current codebase and test suite with each commit.
• Collaborate with other world-class software engineers across HBO to deliver ground-breaking content and features for the future of streaming media. Be a trusted resource across our software development teams for Cloud and IaC best practices.
• Share knowledge, mentor and grow more junior staff.
• Develop tooling and services that provide real-time insight into service and system health for a large distributed system.
• Deftly balance the architectural requirements of normal day-to-day operations against those of unprecedented streaming events such as the Game of Thrones finale and premieres.
• Strive for operational excellence by participating in an on-call rotation as well as contributing to our incident management and blameless post-mortem processes.
• 5+ years of experience building and operating large-scale, highly available applications in a cloud environment with broad exposure to AWS architecture, networking and cloud security practices
• Solid experience in designing, implementing and supporting container-based public cloud platforms with IaaS (AWS, Azure) and Kubernetes/Docker
• Solid Linux experience and experience with DevOps/Infrastructure as Code tooling such as Terraform and configuration management tools like Ansible
• Experience in systems engineering and operations, especially for systems that span multiple regions or datacenters and are designed for resiliency and scalability
• Experience in managing the container lifecycle in a service-oriented infrastructure
• Expert-level software development experience writing large, distributed applications/services in languages such as Node.js, Python, Go or Java
• Experience in monitoring and telemetry: Telegraf, Grafana, InfluxDB, and Prometheus
• Solid understanding of how the internet works and operates, particularly in client/server transactions with a keen knowledge of HTTP, DNS, REST, etc.
• Experience creating automated tests as part of the development lifecycle, and a passion for test-driven development
• Full working knowledge of Git version control
• A passion for learning, sharing knowledge, mentoring, and working in a team setting with engineers of varying levels of experience
The Nice to Haves
• Experience with Amazon EKS architecture and operations
• Familiarity with automated infrastructure testing using tools such as Serverspec, AWSpec, or Terratest
• Experience with observability tooling for log aggregation (Splunk/ELK), time series databases (Prometheus/Graphite) and distributed tracing
• Experience creating SLAs, SLOs, and SLIs for web-based services
The Perks
• Exclusive WarnerMedia events and advance screenings
• Paid time off every year to volunteer
• Access to well-being tools, resources, and freebies
• Access to in-house learning and development resources
• Part of the WarnerMedia family of powerhouse brands
WarnerMedia is a leading media and entertainment company that creates and distributes premium and popular content from a diverse array of talented storytellers and journalists to global audiences through its consumer brands including: HBO, HBO Now, HBO Max, Warner Bros., TNT, TBS, truTV, CNN, DC Entertainment, New Line, Cartoon Network, Adult Swim, Turner Classic Movies and others.