
Full Time Job

Sr ML Ops Engineer

Lucasfilm

Nicasio, CA 1 day ago
  • Paid
  • Full Time
  • Senior (5-10 years) Experience
Job Description
The Skywalker Sound Development Group is seeking a highly skilled Sr ML Ops Engineer to build and maintain the infrastructure powering our machine learning and AI frameworks. This position is crucial in enabling seamless workflows for model training, retraining, and deployment, ensuring that cutting-edge AI solutions operate reliably at scale.
As a Sr ML Ops Engineer, you will act as the backbone of our AI/ML efforts, bridging the gap between data science, research, and production engineering. Your expertise in DevOps principles, model deployment strategies, and scalable infrastructure will support the development of transformative audio solutions for speech processing, style transfer, and source separation in media production workflows.
This role is considered Hybrid, which means the employee will work 2-3 days onsite at our Nicasio, CA office and occasionally from home.
What You'll Do:
Develop, deploy, and maintain scalable infrastructure for machine learning model training, retraining, and inference.
Design and optimize CI/CD pipelines specifically tailored for machine learning workflows, ensuring efficient delivery from research to production.
Implement robust monitoring and logging systems to track model performance and identify potential issues in production environments.
Collaborate with AI researchers and data scientists to ensure infrastructure aligns with project requirements and supports iterative experimentation.
Manage compute resources (cloud and on-premises) to enable large-scale distributed training and inference tasks.
Containerize machine learning models and applications using Docker and deploy them via Kubernetes or equivalent orchestration systems.
Automate deployment workflows for serving ML models using frameworks such as TorchServe, TensorFlow Serving, and FastAPI.
Implement model versioning, rollback strategies, and governance for maintaining production stability.
Optimize cost efficiency and performance of machine learning workflows in cloud environments such as AWS, GCP, or Azure.
Stay updated with emerging ML Ops tools and practices, integrating them into existing workflows to improve performance and reliability.
What We're Looking For:
Bachelor's degree in Computer Science, Engineering, or a related field. Master's degree preferred.
5+ years of experience in DevOps, Site Reliability Engineering, or a related role, with at least 2 years focusing on ML Ops.
Expertise in building and maintaining CI/CD pipelines for machine learning applications.
Strong proficiency with containerization (Docker) and orchestration tools (Kubernetes).
Proficiency in deploying machine learning models using frameworks such as TensorFlow Serving, TorchServe, or custom APIs.
Deep understanding of cloud infrastructure and services (AWS, GCP, or Azure) for ML workloads, including GPU and TPU utilization.
Experience managing large-scale distributed training workflows and optimizing resource allocation.
Familiarity with tools like MLflow, DVC, Weights & Biases, or similar for data and model tracking and versioning.
Solid understanding of security best practices for machine learning systems and sensitive data handling.
Strong scripting and programming skills in Python, Bash, or Go.
Preferred Qualifications:
Experience with data orchestration tools such as DataChain or Weights & Biases for managing ML workflows.
Hands-on experience with automated hyperparameter tuning and optimization frameworks.
Familiarity with model monitoring tools like Prometheus, Grafana, or custom solutions for model drift and data quality checks.
Experience integrating pre-trained foundational models and managing their deployment at scale.
Contributions to open-source ML Ops projects or relevant research publications.
The hiring range for this position in San Francisco, CA is $152,100 to $203,900 per year. The base pay actually offered will take into account internal equity and also may vary depending on the candidate's geographic region, job-related knowledge, skills, and experience among other factors. A bonus and/or long-term incentive units may be provided as part of the compensation package, in addition to the full range of medical, financial, and/or other benefits, dependent on the level and position offered.
About Lucasfilm:
Lucasfilm is a global leader in film, television and digital entertainment production. In addition to its motion-picture and television production, the company's activities include visual effects, audio post-production and cutting-edge digital animation, interactive entertainment software, and the management of the global merchandising activities for its entertainment properties including the legendary STAR WARS and INDIANA JONES franchises. Lucasfilm Ltd. is headquartered in northern California.

This position is with Lucasfilm Ent Co Ltd, LLC Payroll Svc, which is part of a business we call Lucasfilm.
Lucasfilm Ent Co Ltd, LLC Payroll Svc is an equal opportunity employer. Applicants will receive consideration for employment without regard to race, religion, color, sex, sexual orientation, gender, gender identity, gender expression, national origin, ancestry, age, marital status, military or veteran status, medical condition, genetic information or disability, or any other basis prohibited by federal, state or local law. Disney champions a business environment where ideas and decisions from all people help us grow, innovate, create the best stories and be relevant in a constantly evolving world.

Jobcode: Reference SBJ-4k5z8z-216-73-216-96-42 in your application.

Salary Details
Salary Range: $152,100 to $203,900 Per Year ($ USD)
Company Profile
Lucasfilm

Lucasfilm is among the world’s leading entertainment service companies, a pioneer in visual effects and sound across multiple mediums, and is home to the legendary Star Wars and Indiana Jones franchises.