20 Going

Scaling LLM Inference with RayServe and Amazon EKS

Hosted by Jackie Moffett & AWS Builder Loft (formerly AWS GenAI Loft)
About Event

Join us for a free, hands-on workshop on deploying and optimizing large language models (LLMs) for scalable, enterprise-grade inference. Participants will learn to orchestrate distributed LLM serving with RayServe and vLLM on Amazon EKS, enabling robust, flexible, and highly available deployments. The session demonstrates how to use AWS Trainium hardware within EKS to maximize throughput and cost efficiency, leveraging Kubernetes-native features for automated scaling, resource management, and seamless integration with AWS services. You'll gain hands-on experience monitoring inference workloads with observability tools such as the Ray Dashboard, Amazon CloudWatch, and neuron-tools.

AWS Solution Architects will guide you through the process and be available to answer questions about your specific use cases. After the workshop, stick around to enjoy some refreshments and network with other attendees and AWS team members.

Please bring a laptop to participate.

Speakers:

Sindhura Palakodety, Senior Solutions Architect, AWS

Arnab Sinha, Senior Solutions Architect, AWS

Robin Singh, Product Lead, AI Platform, Anyscale

Location
AWS GenAI Loft
525 Market St, San Francisco, CA 94105, USA