• Support hundreds of deployments across 10+ EKS clusters and 2+ GKE clusters.
• Provide expert-level support to a streaming platform for broadcast radio and to the Data Team, leveraging AI tools like BigQuery, Vertex AI, Agentic AI, Code Assist, Gemini API, Jupyter Notebooks, etc.
• Work with developer teams to automate their release pipelines using GitLab and empower them to own the deployment lifecycle.
• Contribute to project planning and influence solution architecture design that satisfies business goals while maintaining Platform Engineering standards.
• Deliver 24/7 on-call rotational support of applications and infrastructure.
• Monitor system and application performance and troubleshoot/resolve escalated issues.
• Collaborate cross-functionally on the care and feeding of the existing Kubernetes architecture, such as controller, cluster, and add-on upgrades.
• Continually identify opportunities to optimize and improve all operational aspects of our technical solutions.
• Build S...
Qualifications
What Makes You A Good Fit:
• At least 3-5 years of experience in a DevOps or Platform Engineering role.
• Proficiency in pipeline automation with GitLab; experience with Helm automation, such as pre-deploy hooks, will be helpful.
• Practical experience with most core GCP services: BigQuery, Agentic AI, Gemini, Vertex AI, Jupyter Notebooks, Airflow, Google Transfer Services, Cloud Run, Cloud Storage, Cloud Spanner, etc.
• Proven experience deploying and supporting GCP services: BigQuery, Agentic AI, Gemini, Vertex AI, GKE, Kubeflow Pipelines, Networking, etc.
• Exposure to multiple cloud accounts that leverage VPC peering and Transit Gateway for systems that span accounts.
• Hands-on experience managing Kubernetes in a production setting.
• Demonstrated experience with GitLab, Bitbucket, or GitHub automation.
• Experience writing infrastructure as code in Terraform.
• Experience with logging, monitoring, and alerting solutions like Grafana, Datadog, ...
What Makes You Stand Out:
• At least 3-5 years of experience with GitLab pipelines and managing runners.
• Experience working with Agentic AI, Gemini, and BigQuery.
• Experience working with Confluent Cloud and managing clusters/Kafka topics.
• Experience automating pipelines for large language models (LLMs) in Google Cloud.
• Experience using Grafana Alloy in both EC2 and Kubernetes contexts.