Want to know how PlayStation defines SRE?
Future Technology Engineering
As part of Sony Interactive Entertainment, the Future Technology Group (FTG) is leading the revolution in cloud gaming as we push the boundaries of what's possible! Would you like to use your skills, time, and passion on meaningful projects that are building the future? We're looking for people who want to make a difference and who love working with creative, intelligent, and collaborative teammates. If that sounds like something you would love to do, we want to hear from you!
Staff Service Reliability Engineer
Our Service Reliability Engineering team plays a significant role in delivering on the promise of a great cloud gaming experience to our customers. We do this by influencing design and operational decisions towards the overall stability of the gaming service. Our SREs focus on three main things: overall ownership of production, production code quality, and deployments. The successful candidate will be self-directed and able to participate in the way we make decisions at different levels.
We expect our SREs to have opinions on the state of our service and to provide critical feedback during different phases of the operational lifecycle. We are engaged throughout the software development lifecycle, ensuring operational readiness and stability.
• Minimum of 9 years of working experience in a Software Development, SRE, DevOps, and/or Linux Systems Administration role.
• Strong interpersonal, written, and verbal communication skills.
• Experience with coaching and mentoring other team members.
• Available to be scheduled for Tier 3 on-call.
Skills & Knowledge
• Proficient as a Linux Production Systems Engineer, with experience managing large-scale Web Services infrastructure.
• Development experience in one or more of the following programming languages:
  • Python (preferred)
  • Bash, Go, Java, C++, or Rust
• In addition, experience with at least 3 of the following topics:
  • Distributed data storage at scale (Hadoop, Ceph)
  • NoSQL at scale (MongoDB, Redis, Cassandra)
  • Data aggregation technologies (Elasticsearch, Kafka)
  • Scaling and running traditional RDBMS (PostgreSQL, MySQL) with High Availability
  • Monitoring & Alerting (Prometheus, Grafana), and Incident Management toolsets
  • Kubernetes and/or AWS (deployment and management)
  • Software Distribution (package management and distribution at scale)
  • Configuration Management (Ansible, SaltStack, Puppet, Chef)
  • Software performance analysis and load testing (QA or SDET experience is a plus)
  • Capacity planning, disaster recovery, and risk assessment
  • Statistical analysis, data science, or machine learning
Responsibilities
• Own and drive ongoing improvements in Reliability and Scalability
• Work closely with SRE Management to define critical metrics and processes and to drive continuous improvement
• Influence the architecture and implementation of solutions within the division
• Mentor SRE staff and enable them for success
• Act as a voice to represent SRE in the wider organization
• Represent the operational scalability of solutions in the wider division
• Create and own projects from inception to implementation
• Design platform-wide solutions and provide technical leadership during their implementation
• Demonstrate a high level of organizational skills and initiative in the role
Equal Opportunity Statement:
Sony is an Equal Opportunity Employer. All persons will receive consideration for employment without regard to gender (including gender identity, gender expression and gender reassignment), race (including colour, nationality, ethnic or national origin), religion or belief, marital or civil partnership status, disability, age, sexual orientation, pregnancy or maternity, trade union membership or membership in any other legally protected category.
We strive to create an inclusive environment, empower employees and embrace diversity. We encourage everyone to respond.
Jobcode: Reference SBJ-re4807-3-238-180-255-42 in your application.