Join us for a talk on everything you need to know to run async inference on AI models in production.

Async inference can increase GPU utilization, unlock batch processing workloads, and handle long-running tasks. Robust async infrastructure requires everything from building a priority queue to handling payloads securely. Samiksha will share how AI teams should build with async inference in production today.

All AI engineers, MLOps specialists, and technical leaders are invited to join us for the talk and Q&A. Refreshments will be served.

This event is part of #TechWeek, a week of events hosted by VCs and startups to bring together the tech ecosystem.

Production-Ready Async Inference #TechWeek
