Both public clouds and our hyperscale private cloud have evolved into complex infrastructures with millions of servers spanning numerous global data center regions. Leaving users to manage the complexity of deploying global services across regions incurs significant operational toil and often leads to suboptimal outcomes. Users must select regions, align global traffic routing with service deployments, and ensure disaster preparedness, all while optimizing cost. They must repeat these processes continually to adapt to workload changes. To eliminate these manual burdens, we introduce the Global Service Placer (GSP), which autonomously places services across regions based on user-defined latency SLOs, abstracting away region details such as geo-distribution and resource availability from users while optimizing efficiency. The generality and efficacy of GSP have been demonstrated by onboarding a diverse set of large, complex services. We provide a case study of a highly complex AI inference workload and show significant GPU savings.
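The abstract does not specify GSP's placement algorithm; as a hedged illustration of the core idea (choosing regions that satisfy a latency SLO while minimizing cost), the following is a minimal greedy set-cover sketch. All names (`place_service`, the region and user tables) are hypothetical, not part of GSP.

```python
# Hypothetical sketch, NOT GSP's actual algorithm: pick a cheap set of regions
# so that every user population has at least one replica within the latency SLO.
# regions: {region_name: relative_cost}
# users:   {population: {region_name: latency_ms}}

def place_service(regions, users, slo_ms):
    """Greedy weighted set cover over regions, constrained by the latency SLO."""
    uncovered = set(users)
    placement = []
    while uncovered:
        # Score each region by uncovered populations served per unit cost.
        def score(region):
            covered = sum(
                1 for u in uncovered
                if users[u].get(region, float("inf")) <= slo_ms
            )
            return covered / regions[region] if covered else 0.0

        best = max(regions, key=score)
        if score(best) == 0:
            raise ValueError(f"SLO unsatisfiable for: {sorted(uncovered)}")
        placement.append(best)
        uncovered -= {
            u for u in uncovered
            if users[u].get(best, float("inf")) <= slo_ms
        }
    return placement
```

A usage example: with three regions and a 50 ms SLO, the greedy loop adds regions until every population is covered, preferring cheap regions that cover many populations.

```python
regions = {"us-east": 3.0, "eu-west": 2.0, "ap-south": 2.5}
users = {
    "ny": {"us-east": 20, "eu-west": 80},
    "paris": {"eu-west": 15, "us-east": 85},
    "mumbai": {"ap-south": 25},
}
print(sorted(place_service(regions, users, slo_ms=50)))
```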