Hosting LLM Apps @ Scale with Haystack, Titan ML & Jina AI
⚠️ Sign up here to join virtually
Haystack is an open-source LLM framework by deepset. For this meetup, we are joined by Titan ML and Jina AI to talk about deploying LLMs and building useful applications at scale. Specifically, we will take a look at getting LLMs to generate structured data.
Schedule
18:00 - Doors open
18:00-19:00 - Networking
19:00-20:30 - Talks
1) Bending LLMs To Your Will (and your will is to create structured data), by Tuana Çelik from deepset
2) Challenges in serving self-hosted LLM applications at enterprise scale, by Fergus Finn from Titan ML
3) From Lab to Life: The Evolution of Jina Embeddings in Practical Applications, by Isabelle Mohr from Jina AI
Talks
1) Title: Bending LLMs To Your Will (and your will is to create structured data)
Speaker: Tuana Çelik, Developer Relations Lead, deepset
Description: In this talk, we will look at how Haystack and its pipeline architecture (which is essentially a graph) allow you to build highly customized LLM applications. As an example, we will build an application that generates structured data from text, conforming to a structure we define. Say you want to create a list of GitHub issues based on a conversation you had with colleagues. Wouldn’t it be great to have a system that extracts all of the relevant information from your meeting notes and creates GitHub issues that abide by your labels and issue structure? Let’s see how Haystack’s custom components and pipeline architecture let you do just that.
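The core pattern here — prompting an LLM and then validating its output against a fixed issue schema — can be sketched framework-agnostically. This is a minimal illustration, not Haystack's actual component API; the label set and field names are hypothetical:

```python
import json
from dataclasses import dataclass, field

# Hypothetical label set your repository allows
ALLOWED_LABELS = {"bug", "feature", "docs"}

@dataclass
class GitHubIssue:
    """The structure we require the LLM's output to abide by."""
    title: str
    body: str
    labels: list = field(default_factory=list)

def parse_issue(llm_output: str) -> GitHubIssue:
    """Validate raw LLM output: parse JSON and drop labels we don't allow."""
    data = json.loads(llm_output)
    labels = [l for l in data.get("labels", []) if l in ALLOWED_LABELS]
    return GitHubIssue(title=data["title"], body=data["body"], labels=labels)

# Example output an LLM might return when prompted with meeting notes
raw = '{"title": "Fix login timeout", "body": "Discussed in standup.", "labels": ["bug", "urgent"]}'
issue = parse_issue(raw)
```

In a Haystack pipeline, logic like `parse_issue` would live in a custom component wired after the generator, so downstream steps only ever see validated, structured data.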
2) Title: Challenges in serving self-hosted LLM applications at enterprise scale
Speaker: Fergus Finn, CTO, Titan ML
Description: In this talk, we will explore the challenges and solutions in effectively designing, serving, and scaling Large Language Model (LLM) applications.
We’ll focus on the practical challenges we experienced when designing our efficient Takeoff inference server, including how to manage workload with different batching strategies and performance enhancements like quantization and caching.
We’ll also discuss how to efficiently enforce LLM output formats, such as JSON and regex, which are vital for getting predictable and repeatable outputs in applications.
This session aims to provide actionable insights for professionals looking to optimize their LLM applications for large-scale operations.
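As a taste of the output-format problem the talk covers: the simplest (if least efficient) way to enforce a format is to validate the model's output against a pattern and retry on failure. Production inference servers like Takeoff do this far more efficiently at the token level, but this stdlib-only sketch (with a stub standing in for the LLM call) shows the idea:

```python
import re

# Required output format: a phone number like 555-0199 (illustrative pattern)
PHONE_RE = re.compile(r"\d{3}-\d{4}")

def generate_with_format(prompt, generate, pattern, max_retries=3):
    """Validate-and-retry: call `generate` until its output matches `pattern`.
    Constrained decoding instead masks invalid tokens during sampling, so a
    conforming output is produced on the first try."""
    for _ in range(max_retries):
        out = generate(prompt)
        if pattern.fullmatch(out):
            return out
    raise ValueError("no conforming output within retry budget")

# Stub generator standing in for a real LLM: fails once, then conforms
outputs = iter(["call me maybe", "555-0199"])
result = generate_with_format("Extract the phone number", lambda p: next(outputs), PHONE_RE)
```

The same validate-or-constrain trade-off applies to JSON schemas: rejection sampling wastes tokens on malformed outputs, which is why token-level enforcement matters at scale.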
3) Title: From Lab to Life: The Evolution of Jina Embeddings in Practical Applications
Speaker: Isabelle Mohr, Software Engineer, Jina AI
Description: We released Jina Embeddings V2 in November 2023, the first open-source long-context embedding models. This talk covers how we addressed technical challenges while building and training these models, and how they are used effectively in practical applications.