Sinclair Broadcast Group
Hunt Valley, MD
Sinclair Broadcast Group is looking for an exceptional Site Reliability Engineer to expand DevOps practices across the Sinclair Technology Partners team. The individual will continually improve our virtualized and cloud-based platforms, ensuring uninterrupted service for Sinclair customers and demonstrating force-multiplying expertise through automation.
The larger mission of the SRE DevOps team is to build foundational platform engineering, tooling, and automation allowing product teams to release and scale reliably and predictably. SREs advance the architecture and performance of software systems and train their peers in topics such as debugging distributed systems, building self-healing infrastructure, and ensuring availability to fuel the company's growth.
• Operate, monitor, and maintain high availability of software services for Sinclair products running in multi-region VMware and AWS environments
• Automate, scale, and manage the high availability of software services for Sinclair products running in multi-region VMware and AWS environments
• Work with multiple stakeholder teams to establish service level objectives and monitor to ensure the objectives are met
• Continually improve cloud operations automation and tooling to monitor and maintain enterprise cloud-based applications
• Troubleshoot infrastructure and application issues and work with development and operations teams to resolve them, escalating quickly when help is needed
• Identify and remediate potential points of failure in the infrastructure and applications
• Participate in performance, stress, and security testing
• Participate in blame-free root cause analysis meetings after a production-systems incident so that the team can learn from mistakes and improve our systems and runbooks
• Remain vigilant in securing our data and access policies, and adhere to best practices in securing our cloud and on-prem infrastructure
• Plan and perform security patches on our applications and underlying infrastructure
• Collaborate with a great team to maintain, monitor, and improve applications that deliver content for end-users
• Take pride in the quality of your code, the work it takes to make great software, and the value delivered to the end-user
Experience, Skills & Competencies:
• A bachelor's degree in Computer Science, Computer Information Systems, Management of Information Systems, or a related field
• 6+ years of relevant systems analysis/support/SRE/Operations experience
• 6+ years of relevant experience directly administering and supporting CentOS-based environments
• 2+ years of relevant experience administering and supporting Windows Server-based environments
• 4+ years of experience automating IT processes through Ansible/Terraform/Python/Shell Scripting
• 3+ years of experience with source control management tools such as Git
• Must have led or built something new at enterprise scale (more than 5 users)
• Must have supported developers in a CI/CD environment
• One or more current AWS certifications
• Understanding of AWS cloud services and how to leverage them for compute, storage, and managed services (more than just "EC2")
• Exposure to VMware environments and how to leverage them for compute, storage, and managed services
• Experienced with modern DevOps engineering and security practices and comfortable with diverse technical problem sets, across the entire technology stack, including the virtualized hardware
• Equipped with a proactive security mindset and a solid understanding of information security and privacy principles
• Used to keeping everything you do in source control and automating (scripting) any task you have to do more than once
• Comfortable operating in environments subject to regulatory, compliance, and risk-based security requirements
• Able to troubleshoot issues across the entire stack from UI -> API -> Application -> Database, including the operating system and the underlying (virtual) hardware
• Enthusiastic about cutting-edge technologies and fresh challenges that come with them
• Possesses a service- and customer-oriented mindset and a willingness to dig into the application rather than throw the problem over the wall
• Excellent verbal and written communication skills, with the ability to translate complex topics into simple, easy-to-understand language to educate stakeholders and executives on site reliability concepts and designs
• A solid cross section of the Experience, Skills, & Competencies
• Excited about monitoring technologies, the metrics they provide, and using that data to understand the performance characteristics and error modes of a cloud-based software stack
• Proficient as a developer, experienced writing code and solving problems with infrastructure as code
• Experienced maintaining and supporting feature-rich applications using modern software frameworks
• Comfortable working in a fast-paced environment with the ability to think quickly to solve engineering challenges
• Understanding of computer networking and how it applies in cloud environments
• Related technical experience in cybersecurity, preferably in a cloud environment
• Experience securing corporate networks, cloud networks, and VPNs
Please note: this position can be fully remote and can be based anywhere in the country
Sinclair Broadcast Group, Inc. is proud to be an Equal Opportunity Employer and Drug Free Workplace!
Jobcode: Reference SBJ-gm5vzm-3-238-173-209-42 in your application.