The big data platform is the foundation for the product decisions that directly affect our customer experience. It is used across the company to build data models for recommendations and to analyze customers' streaming experience, content take-rate, and more. It is critical to the success of our business, as we rely heavily on analyzing data to revolutionize internet TV!
We are the data infrastructure team that builds an ecosystem of microservices and near real-time data ingestion pipelines that expose the big data warehouse and big data platform as a robust and highly available service to the rest of Netflix. We are constantly evolving our infrastructure. We are designing new microservices to take our architecture to the next level to keep pace with our increasing data and metadata needs.
Specifically, we are working on:
• an intelligent set of services that process operational data in real time from thousands of machines and distributed services, not only to detect, surface, and aggregate job symptoms for offline and real-time platforms, but also to accurately diagnose and remediate failures.
• a predictive SLA service that can intelligently detect which processing pipelines are keeping up or lagging behind by combining information such as SLAs, dependency graphs, and past performance with system load and performance.
• an insights infrastructure to provide visibility and observability into the big data platform's operability, resilience, and overall health.
• a data-driven data platform that collects all the data it needs, not only to interpret failures and quantify system performance and reliability, but also to predict job trends.
If this aligns well with your passion, we would love to hear from you!
This would be your dream job if you enjoy:
• Working with a massive amount of data (100+ PB) that is growing rapidly.
• Understanding and solving real business needs at a large scale by applying your software engineering and analytical problem-solving skills.
• Building a real-time operational insights platform that must scale to support our rapid growth in batch processing and real-time jobs, and helping to influence its roadmap.
• Architecting and building a robust, scalable, and highly available distributed infrastructure.
• Leading cross-functional initiatives and collaborating with engineers across teams.
• Sharing our experiences with the open source communities and contributing to Netflix OSS.
About you:
• You have 6+ years of experience in building large-scale distributed applications.
• You are proficient in designing and writing HTTP and RESTful APIs.
• You have designed, built, and operated scalable, resilient services.
• You are an expert in Java. Python expertise is a plus.
• You are obsessed with customer satisfaction.
• You have a BS/MS/Ph.D. in Computer Science or a related field.
• Most importantly, you are thrilled to help solve our big data challenges while revolutionizing internet TV!
To learn more about the team, see our recent talks at the re:Invent conference (available as slides or video) and at Strata, which describe what we do in the big data space. Lastly, please read our culture memo for a candid look at how we operate.
Jobcode: Reference SBJ-g33745-3-237-16-210-42 in your application.