Sorry! Looks like this position has been filled by the employer and the listing was closed on 11/01/2021

Full Time Job

Senior Site Reliability Engineer

Vox Media

Remote / Virtual 09-30-2021

Paid
Full Time

Job Description

About Coral:

This is a dangerous time to be a journalist on the internet. Online comments are often filled with rumors, insults and threats, pushing away readers and reducing community engagement. It doesn't have to be this way.

The Coral team at Vox Media believes that healthy online conversation can exist, given the right systems and tools – and that a strong democracy depends on it. The Coral community platform now supports journalists on more than 180 news sites, helping them engage with their communities, share knowledge, empower discussions, and reduce the impact of trolls. And it doesn't share or sell anyone's data while doing it.

Coral users include The Washington Post, the Wall Street Journal, The Financial Times, New York Magazine, and the LA Times.

About the role:

Under the general supervision of Coral's SRE Engineering Manager, the Senior Site Reliability Engineer is responsible for the scaling, performance, availability and security of Coral's hosted client platform, websites, applications and services. The Senior SRE is also responsible for managing the tools and infrastructure that support them, and will have a primary role in leading and executing infrastructure initiatives from conception to production.

Our stack:
• Kubernetes, GKE, Google Cloud, Terraform, Docker
• MongoDB Atlas, Redis
• Node.js, Go, Python, GraphQL
• Our open source codebase: https://github.com/coralproject/talk

What you'll bring:
• Familiarity with our stack:
  • Kubernetes, GKE, Google Cloud, Terraform, Docker
  • MongoDB Atlas, Redis
  • Node.js, Go, Python, GraphQL
• Ability to right-size and plan capacity for high-availability, high-traffic SaaS infrastructure
• Experience owning, managing, scaling, optimizing and monitoring high-volume Kubernetes and cloud SaaS infrastructure
• Experience managing and utilizing GitOps and DevOps workflows
• Experience managing MongoDB, including familiarity with data exports, imports and ETL

What you'll do:
• Monitor and improve service stability and performance of Coral's hosted platform, websites, applications and services
• Implement and automate tools and processes to improve reliability and efficiency of Coral's hosted platform, websites, applications and services
• Participate in the on-call rotation and respond to service interruptions and to stability and performance alerts
• Develop custom tools or replace existing tools when necessary to facilitate or improve monitoring, automation, performance and stability
• Assist with the development and implementation of contingency and disaster recovery plans
• Assist in the development of capacity and budget planning and forecasts
• Build out customer-facing hosted infrastructure that meets technical requirements while ensuring reliability, availability, efficiency and cost-effectiveness
• Configure and operate Google Cloud, GKE, Kubernetes, and other cloud tools and services
• Utilize and develop GitOps workflows to update and maintain Kubernetes deployments in GKE
• Utilize Terraform to declare, provision, and maintain Google Cloud resources
• Enhance, monitor and troubleshoot storage and backup systems to ensure reliability, performance and durability of data
• Manage and assist customers through their integration process
• Troubleshoot customer issues and update or create documentation where necessary to correctly address questions and concerns
• Investigate and reproduce customer and internally reported bugs and issues
• Reproduce and document steps that lead to unexpected behavior, and recommend fixes to the dev team where appropriate
• Evaluate existing software, applications and systems on a regular basis to ensure that critical security and stability patches or upgrades are applied

Are you passionate about this opportunity, but worried that you don't have 100% of the experience we're looking for? We still want to hear from you!

Jobcode: Reference SBJ-d28jk5-216-73-217-144-42 in your application.

Company Profile

Vox Media

As the leading independent modern media company, Vox Media ignites conversations and influences culture. Across digital, podcasts, TV, streaming, live events, and print, we tell stories that affect our audience's daily lives and entertain as much as they inform.

https://www.voxmedia.com/pages/careers-jobs
