WarnerMedia seeks a Software Engineer II for the CNN department. We are looking for a Machine Learning Ops Engineer to help us significantly scale how we experiment, build, and collaborate on ML-based data products. We are currently optimizing our first ML platform to help accelerate our experimentation and model production, and we would love for you to join us in making a huge impact on the overall organization.
Here are some of the problems you'll be helping us solve:
The news cycle moves fast. What can we learn from readers' engagement with breaking news to recommend writing or video that deepens their understanding?
What's the best way for a machine learning group to work closely with journalists and editors to keep our audience engaged and informed?
We are growing to multiple full-stack machine learning implementation teams to build new features across our apps.
How should we balance enabling innovation and rapid prototyping and delivery in those teams with building a common set of platforms and tools to allow teams to move more quickly in the future?
• Partner with machine learning engineers on your team to productionize training, testing, and deployment of new models.
• Collaborate with other ML/Ops engineers to develop and improve core components, infrastructure and architecture of our ML Platform to train, deploy, and serve models at scale.
• Author, test, review, and optimize production-level code in Python and Golang while following best practices in version control and code integration.
• Use and build upon open-source cloud computing technologies.
• Collaborate with data scientists, machine learning engineers, product teams and other key stakeholders and drive ML projects from conception to completion.
Who You Are
• You can design and build real-time distributed systems for machine learning at scale.
• You are excited about working with machine learning teams to architect and implement tools and infrastructure to turn their ideas into shipped products and features.
• You're even more excited about finding ways to make all of it faster, easier to understand, more efficient, and more self-service.
• You enjoy keeping up with emerging tech, but get more satisfaction from building things with real user impact.
• You understand the constraints of working with a growing team and thrive in an environment that is fast-paced and sometimes scrappy.
• You understand that working with user data in recommendations or other systems comes with a host of privacy and security concerns, and you are both creative and principled in how you approach it.
• You can work independently when needed, and can contribute to multiple team projects simultaneously.
• You have a deep curiosity and are proactive in seeking innovative solutions to business problems.
• You believe in iterating quickly, and removing any and all obstacles that slow down turning ideas into deployable code.
• You are an efficient communicator with your team and collaborators.
• You know when and why to offer feedback, to get help, or to advocate for your ideas.
• You have collaborated with others on a team to design and build web-scale distributed systems.
Things You Should Know
• How to write robust code in Python, Golang or an equivalent modern programming language.
• How to wrangle data: working with databases (relational or columnar), writing SQL, working with data pipelines and other big data technologies, and handling large data sets.
• Experience with deploying services to a cloud platform.
• Experience with containerization (Docker) and container-orchestration systems such as Kubernetes.
• Experience with Terraform, AWS CloudFormation, or a similar IaC (infrastructure-as-code) tool.
Things You Might Know
• Prior experience collaborating with data scientists, machine learning engineers, and product stakeholders on machine learning products.
• Commonly used machine learning frameworks (like Keras, PyTorch or TensorFlow) and libraries (like scikit-learn).
• Machine learning tools such as SageMaker, MLflow, and Metaflow.
• Graph databases.
• How to communicate effectively with distributed, remote teams.
• How to take a product problem and choose the right ML approach to prototype, evaluate, and tweak a solution quickly before taking it to production scale.
• Common architectures and approaches to creating ML training and inference systems that operate on streams of real-time data rather than batch.
• If you don't know any of these, that's OK. You'll get the opportunity to learn them on the job once you join!
Jobcode: Reference SBJ-gkee8k-3-230-144-31-42 in your application.