Senior Software Engineer
New York, NY
WarnerMedia seeks a Cloud Infrastructure Engineer for the Cloud Application Engineering department.
• Architect, codify and build out the international infrastructure platform for HBO Max, expanding and adapting our existing platform to a global audience.
• Champion best practices and uphold a culture committed to quality, test-driven development and repeatable processes through automation and infrastructure as code, influencing not only our team but also our client development and API engineering teams.
• Develop and maintain core functionality and components for traffic shaping and routing.
• Build security into our systems and infrastructure to avert disruption and maintain uptime.
• Design and develop tooling that ensures resiliency and redundancy for our infrastructure, with an eye towards reducing mean time to recovery in failure scenarios. Make production deployments and severity events boring and uneventful.
• Review and propose team objectives and determine best technologies for implementation. Review and evaluate emerging technologies.
• Collaborate with other world-class software engineers across HBO to deliver ground-breaking content and features for the future of streaming media. Be a trusted resource across our software development teams for cloud best practices.
• Share knowledge, mentor and grow more junior staff.
• Develop tooling and services that provide real-time insight into service and system health for a large distributed system.
• Deftly balance the architectural requirements of normal day-to-day operations against those of unprecedented streaming events such as the Game of Thrones finale and premieres.
• Strive for operational excellence by participating in an on-call rotation as well as contributing to our incident management and blameless post-mortem processes.
• 8+ years of experience architecting, implementing and operating large-scale, highly available applications in a cloud environment with broad exposure to AWS architecture, networking and cloud security practices
• Extensive experience with infrastructure as code and configuration management tools such as Terraform and Ansible, and with writing software to solve operations and reliability challenges.
• Experience in systems engineering and operations, especially for systems that span multiple regions or datacenters and are designed for resiliency and scalability
• Solid understanding of how the internet works, with deep knowledge of HTTP, DNS, TCP/IP, REST, etc.
• A passion for learning, sharing knowledge, mentoring, and working in a team setting with engineers of varying levels of experience
• Hold yourself and your team to high standards while maintaining friendly, respectful relationships
• Experience with Amazon EKS architecture and operations
• Experience with observability tooling such as log aggregation (Splunk/ELK), time-series databases (Prometheus/Graphite) and distributed tracing
• Experience creating SLAs, SLOs, and SLIs for web-based services
Jobcode: Reference SBJ-rv1qyw-35-170-64-36-42 in your application.