

Finetuning to Inference: Serve Open Models with Tensorfuse & AWS
Join Tensorfuse and AWS at the GenAI Loft in San Francisco for a hands-on workshop on fine-tuning and deploying open-source LLMs like Llama, Qwen, and DeepSeek in your AWS VPC.
Unlike other events, this will be a co-build session where you will:
Use Tensorfuse to provision an EKS cluster in your AWS VPC optimized for running AI workloads.
Customize open-source LLMs by fine-tuning LoRA adapters
Deploy the fine-tuned model and run inference using vLLM (sketches of the fine-tuning and serving steps follow this list)
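To make the fine-tuning step concrete, here is a minimal LoRA sketch using Hugging Face transformers and peft. The base model name, dataset file, and hyperparameters are illustrative assumptions, not the workshop's exact configuration:

```python
# Minimal LoRA fine-tuning sketch (assumed model, data, and hyperparameters).
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base = "meta-llama/Llama-3.1-8B-Instruct"   # assumed base model
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token

model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # a common choice for Llama-style models
    task_type="CAUSAL_LM",
))

ds = load_dataset("json", data_files="train.jsonl")["train"]  # assumed local dataset
ds = ds.map(lambda ex: tok(ex["text"], truncation=True, max_length=1024),
            remove_columns=ds.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama-lora", per_device_train_batch_size=2,
                           num_train_epochs=1, learning_rate=2e-4, bf16=True),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
model.save_pretrained("llama-lora")         # writes only the adapter weights
```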
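And here is a minimal sketch of the serving step with vLLM's offline Python API, loading the adapter saved above (the paths and base model name carry over from the previous sketch and are assumptions):

```python
# Minimal vLLM inference sketch with a LoRA adapter (assumed paths).
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", enable_lora=True)
out = llm.generate(
    ["Summarize the benefits of LoRA fine-tuning in one sentence."],
    SamplingParams(temperature=0.2, max_tokens=128),
    lora_request=LoRARequest("my-adapter", 1, "llama-lora"),
)
print(out[0].outputs[0].text)
```

For an HTTP endpoint instead of offline batch inference, vLLM's OpenAI-compatible server takes equivalent flags, along the lines of `vllm serve <base-model> --enable-lora --lora-modules my-adapter=llama-lora`.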
Best Part: For this workshop, we will use readily available on-demand GPUs such as the A10G and L40S on AWS.
Agenda
2:00-2:30 p.m. | Check-in and brief about Tensorfuse
2:30-4:30 p.m. | Deploy-athon: We will configure the Tensorfuse runtime in your AWS account, then fine-tune and deploy Llama models
4:30-5:00 p.m. | Networking, Snacks, and Drinks!
Prerequisites
Everyone interested in customizing and self-hosting LLMs is welcome. However, you’ll get the most from this event if you:
Have an active AWS account
Have sufficient on-demand GPU quota for G and VT instances, ideally in us-east-1 or us-west-2 (a quota-check sketch follows this list)
Have permissions to configure resources in your AWS account
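If you want to verify your quota ahead of time, here is a minimal boto3 sketch. To our knowledge, L-DB2E81BA is the Service Quotas code for "Running On-Demand G and VT instances" (measured in vCPUs); treat the region list and interpretation as assumptions:

```python
# Minimal quota-check sketch; quota code and regions are assumptions to verify.
import boto3

for region in ("us-east-1", "us-west-2"):
    sq = boto3.client("service-quotas", region_name=region)
    quota = sq.get_service_quota(ServiceCode="ec2", QuotaCode="L-DB2E81BA")
    print(f"{region}: G/VT on-demand vCPU quota = {quota['Quota']['Value']:.0f}")
```

A single A10G instance (g5.xlarge) needs at least 4 vCPUs of this quota, so a value of 0 means you should request an increase before the workshop.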
Spots are limited (~100), so book your seat now.
Excited to host you all at the GenAI Loft in SF!