Software Engineer III, Managed AI/ML Infrastructure
Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google’s needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full stack as we continue to push technology forward.
In this role, you will rely on the computational power, network connectivity, and persistent storage delivered through our Virtual Machines (VMs). You will take part in building solutions for managing the life-cycle of a global population of Virtual Machines in Google Cloud, help shape the market, and have an opportunity to become a partner in conversations with customers.
Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.
Responsibilities:
- Implement and maintain highly reliable, large-scale distributed systems.
- Contribute to the analysis and design of solutions for managing the orchestration and life-cycle of millions of Virtual Machines.
- Execute high-visibility projects supporting Google's Tensor Processing Unit (TPU) and Graphics Processing Unit (GPU) fleet.
- Collaborate and integrate with many stakeholders from across the Google Cloud stack.
Minimum qualifications:
- Bachelor’s degree or equivalent practical experience.
- 2 years of experience with software development in one or more programming languages, or 1 year of experience with an advanced degree.
- 1 year of experience with one or more of the following: Speech/audio (e.g., technology duplicating and responding to the human voice), reinforcement learning (e.g., sequential decision making), ML infrastructure, or specialization in another ML field.
- 1 year of experience with ML infrastructure (e.g., model deployment, model evaluation, optimization, data processing, debugging).
Preferred qualifications:
- Master's degree or PhD in Computer Science or a related technical field.
- 2 years of experience with data structures and algorithms.
- Experience developing accessible technologies.
- Experience in building highly reliable systems.
- Experience in delivering cloud-based products.