Ray + vLLM in Action: Lessons from Pinterest and DeepSeek Deployments
Join us for our next Ray Meetup, where we’ll explore batch inference at scale with Ray and vLLM. Learn how Pinterest scales batch inference using Ray, and get a first look at Anyscale’s latest tools for orchestrating large-scale LLM inference: Ray Serve and Data LLM. We’ll cover batch inference, prefill-decode disaggregation, data and expert parallelism (DP/EP), and custom request routing.
📆 Tuesday, June 10th, 2025
🕔 5:00pm
📌 55 Hawthorne St, San Francisco
Speakers:
Chia-Wei Chen, Software Engineer, ML Training Infra, Pinterest
Kourosh Hakhamaneshi, AI Lead, Anyscale
Agenda:
5:00pm: Doors open, check-in, networking
6:00pm: 📈 From Struggle to Scale: Lessons from Scaling Ray Batch Inference on Hundreds of Kubernetes Nodes, Chia-Wei Chen, Pinterest
Scaling Ray batch inference from a single node to hundreds exposed significant challenges and the practices needed to overcome them. Using an internal CLIP-based text and image embedding model as an example, we discuss key lessons learned, including establishing reliable checkpointing, robust node failure recovery, and autoscaling with KubeRay to ensure incremental progress. By fusing Ray Data operations and optimizing for data locality, we substantially reduced memory footprint and data transfer costs. Tuning memory configurations and upgrading Ray and PyTorch were also critical to achieving stability at scale. With these optimizations, we met our target throughput, transforming our workflow from struggling at 2% completion to delivering consistent, reliable processing at 300x the original scale.
6:30pm: 💥 Deploying DeepSeek Inference Stack with vLLM and Ray, Kourosh Hakhamaneshi, Anyscale
7:00 - 8:00pm: 🤝 networking & 🍕
About Anyscale
Anyscale, the company behind open-source Ray, offers a fully managed, enterprise-ready unified AI platform. With Anyscale, companies can build, deploy, and manage all their AI use cases, bringing transformational AI products to market faster.
Join the Ray Community
Take the Ray Community Pulse Survey
We are hiring! Check out the job openings