Software Engineer II, Google Kubernetes Engine
Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google’s needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full stack as we continue to push technology forward.
The GKE AI Scalability team is dedicated to engineering Google Kubernetes Engine (GKE) to manage the most extreme AI/ML workloads. We architect solutions for MegaClusters, pushing Kubernetes performance and scale to support our largest customers and their needs, up to several million accelerators. Our work directly enables exceptional AI research and deployment on Google Cloud.
In this role, you will build the foundation for the future of large-scale AI on Kubernetes. You will work on unique issues in distributed systems and performance engineering, and your contributions will directly impact Google Cloud's ability to support unprecedented AI workloads. This role offers the chance to shape a critical component of GKE for massive AI workloads.
Google is an engineering company at heart. We hire people with a broad set of technical skills who are ready to take on some of technology's greatest challenges and make an impact on users around the world. At Google, engineers not only revolutionize search, they routinely work on scalability and storage solutions, large-scale applications and entirely new platforms for developers around the world. From Google Ads to Chrome, Android to YouTube, social to local, Google engineers are changing the world one technological achievement after another.
Responsibilities:
- Write product or system development code. Review code developed by other developers and provide feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency).
- Contribute to existing documentation or educational content and adapt content based on product/program updates and user feedback. Triage product or system issues, then debug, track, and resolve them.
- Design, develop, and operate software and systems to enhance GKE's scalability for massive AI/ML workloads. Diagnose and resolve performance bottlenecks across the Kubernetes stack at scale.
- Collaborate with teams across Google Cloud to deliver highly reliable and performant large-scale cluster solutions. Participate in on-call rotations to ensure the stability of large-scale GKE clusters.
Minimum qualifications:
- Bachelor’s degree or equivalent practical experience.
- 1 year of experience with software development in one or more programming languages (e.g., Python, C, C++, Java, JavaScript).
- 1 year of experience with data structures or algorithms.
- 1 year of experience building and developing large-scale infrastructure or distributed systems.
Preferred qualifications:
- Master's degree in Computer Science or a related technical field.
- Experience with Go.
- Experience with Kubernetes or cloud products and highly scalable systems.