Full Time Job

Sr. Software Engineer - Data & Feature Infrastructure, ML Platform

Netflix

Los Gatos, CA 07-25-2021
 
  • Paid
  • Full Time
  • Mid (2-5 years) Experience

Job Description

ML models can only be as good as the data we provide to them, which is why we continue to innovate on making feature engineering as simple, scalable, and reliable as possible. Would you like to build and scale the Data & Feature Infrastructure that powers ML and optimization use cases spanning the entire lifecycle of a piece of content, from pitch to play? This includes determining which pitches to greenlight, choosing release dates, prioritizing marketing budgets, and making personalized recommendations that help our members discover and enjoy watching that content.

The Opportunity

In this role, you will have the opportunity to build scalable fact and feature stores that ML practitioners can easily leverage to assemble high-quality training datasets for diverse ML use cases across Netflix. Unlocking access to these datasets will foster ML-driven innovation in new business areas where it otherwise wouldn't have been feasible. You will work with the rest of the ML Platform organization and enhance Metaflow to provide a cohesive end-user experience that greatly improves the productivity of ML practitioners. You will also gain intimate knowledge of Netflix personalization models, content demand and valuation models, and more, while working for a unique and pioneering company that is redefining how video content is consumed globally.

Here are some examples of the types of things you would work on:
• Design and manage fact stores that can be used to generate ML features as of an arbitrary time in the past, without any online/offline skew
• Standardize the generation of ML features, and build and scale feature stores that make it easy to discover and share features and to load them efficiently into training frameworks such as PyTorch and TensorFlow
• Increase ML practitioner productivity by making it easy to access and explore data for offline experimentation and productization
• Build and optimize libraries that can efficiently read large datasets from S3

To learn more, here are some talks/blog posts from the team:
• Metaflow: Building a Human-Centric infrastructure
• Distributed Time Travel for Offline Feature Generation
• 2018 Spark Summit presentations
• Netflix ML Platform Research website

Minimum Qualifications
• 4+ years of relevant experience building ML infrastructure
• Strong empathy and passion for providing a fantastic user experience to ML practitioners
• Experience with large-scale data processing frameworks and columnar data structures
• Experience working with and optimizing Python-based data pipelines
• Experience with cloud computing platforms such as AWS

Preferred Qualifications
• Experience working with Spark and Scala
• Experience working with container platforms such as Docker
• Experience working with Notebooks such as Jupyter or Polynote

Netflix is an equal opportunity employer and strives to build diverse teams from all walks of life. We offer a unique culture of freedom and responsibility with a clear long-term view of our business. We recommend reading about both to understand what working at Netflix is like.

Jobcode: Reference SBJ-gx3en1-18-119-139-50-42 in your application.