

Scaling LLM Inference with RayServe and Amazon EKS
Join us for a free, hands-on workshop on deploying and optimizing large language models (LLMs) for scalable, enterprise-grade inference. Participants will learn to orchestrate distributed LLM serving with RayServe and vLLM on Amazon EKS, enabling robust, flexible, and highly available deployments. The session demonstrates how to use AWS Trainium hardware within EKS to maximize throughput and cost efficiency, leveraging Kubernetes-native features for automated scaling, resource management, and seamless integration with AWS services. You'll gain hands-on experience monitoring inference workloads with observability tools such as the Ray Dashboard, Amazon CloudWatch, and neuron-tools.
AWS Solutions Architects will guide you through the process and be available to answer questions about your specific use cases. After the workshop, stick around to enjoy some refreshments and network with other attendees and AWS team members.
Please bring a laptop to participate.
Speakers:
Sindhura Palakodety, Senior Solutions Architect, AWS
Arnab Sinha, Senior Solutions Architect, AWS
Robin Singh, Product Lead, AI Platform, Anyscale