Manager - Site Reliability Engineer, Core
Los Gatos, CA US
Are you interested in using Machine Learning to transform the way we run and operate large cloud infrastructures? The Algorithm Engineering team is looking for a passionate Machine Learning expert to join us and lead the way in the research and development of new algorithms to optimize our Netflix-wide compute infrastructure and platform, helping all our engineering teams do more, with better and more reliable performance, and at lower cost.
In this role, you will conduct applied research by investigating, conceptualizing, designing, implementing, and validating new algorithms that optimize our Netflix-wide computation platform. This includes running offline experiments and building live tests to run on production systems, as well as being responsible for our production models. To be successful in this role, you need a strong machine learning background, solid software development skills, and a love of learning and collaborating with multidisciplinary teams. You will need to exhibit strong communication and leadership skills, an ability to set priorities, and an execution focus in a dynamic environment. We are just getting started in this area and have seen some promising early indicators, so this is an extraordinary time for you to join and have a large impact! For more details on some of our research in that space, you can read about our work on predictive isolation of containers.
To learn more about our research work, you can visit our research page here.
What we are looking for:
• 5+ years of research experience with a track record of delivering quality results
• Experience in successfully applying machine learning to real-world problems
• Expertise in machine learning spanning supervised and unsupervised learning methods
• Comfortable operating at all levels of the predictive stack, from data collection to modeling to low-latency online serving
• Strong mathematical skills with knowledge of statistical methods
• Strong software development experience in at least one statically typed language such as Go, C, C++, Java or Scala
• Great interpersonal skills
• PhD or MS in Computer Science or a related field
Preferred, but not required, additional areas of experience:
• Experience in applying Machine Learning to optimize infrastructure
• Cloud computing platforms and large web-scale distributed systems
• Optimization algorithms and numerical computation
• Experience working with large datasets, e.g. with Spark
• Experience working in fast-paced greenfield environments