

Scaling LLM Inference with RayServe and Amazon EKS
Join us for a free, hands-on workshop on deploying and optimizing large language models (LLMs) for scalable, enterprise-grade inference. Participants will learn to orchestrate distributed LLM serving with RayServe and vLLM on Amazon EKS, enabling robust, flexible, and highly available deployments. The session demonstrates how to use AWS Trainium hardware within EKS to maximize throughput and cost efficiency, leveraging Kubernetes-native features for automated scaling, resource management, and seamless integration with AWS services. You'll gain hands-on experience monitoring inference workloads with observability tools such as the Ray Dashboard, Amazon CloudWatch, and neuron-tools.
AWS Solutions Architects will guide you through the process and be available to answer questions about your specific use cases. After the workshop, stick around to enjoy some refreshments and network with other attendees and AWS team members.
Please bring a laptop to participate.
Speakers:
Sindhura Palakodety, Senior Solutions Architect, AWS
Arnab Sinha, Senior Solutions Architect, AWS
Robin Singh, Product Lead, AI Platform, Anyscale