Unstructured Data in LLMs
Details
Topic: Connecting your unstructured data with LLMs
We are meeting for our first happy hour/discussion group about Unstructured Data and its future in machine learning and LLM apps!
What we’ll do:
We'll have some snacks and refreshments, a couple of talks, and then unstructured networking.
***
Schedule
6:00-6:55 - Doors open, food, and open networking
7:00-7:30 - Multi-Agent Systems with Mistral AI, Milvus and llama-agents - Stephen Batifol, Zilliz
7:30-8:00 - The Journey of Large Language Models at GetYourGuide - Meghana Satish, GetYourGuide
8:00-8:05 - Short break
8:05-8:30 - From CLIP to JinaCLIP: General Text-Image Representation Learning for Search and Multimodal RAG - Bo Wang, Jina AI
8:30-9:00 - Networking
Who should attend:
Anyone interested in talking and learning about Unstructured Data and LLM Apps.
When:
Sept 5th, 2024, 6:00 PM
Where:
HelloFresh
Sponsored by Zilliz & HelloFresh
***
Tech Talk 1: Multi-Agent Systems with Mistral AI, Milvus and llama-agents
Speaker: Stephen Batifol, Developer Advocate, Zilliz
Abstract: Agentic systems are on the rise, helping developers create intelligent, autonomous applications. LLMs are becoming increasingly capable of following diverse sets of instructions, making them well suited to managing these agents. This advancement opens up numerous possibilities for handling complex tasks with minimal human intervention across many domains. In this talk, we will see how to build agents using llama-agents and explore how combining different LLMs enables different kinds of actions: Mistral Nemo, a smaller and more cost-effective model, handles the simpler tasks, while Mistral Large orchestrates the different agents.
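For a flavor of what this looks like in code, here is a minimal sketch following the llama-agents quick-start pattern (as the library shipped in mid-2024): Mistral Large drives the orchestrator while a worker agent runs on Mistral Nemo. The order-status tool, service name, and query are hypothetical illustrations, not the speaker's actual demo.

```python
# Minimal llama-agents sketch (illustrative, not the speaker's code):
# Mistral Large orchestrates, Mistral Nemo powers a worker agent.
from llama_agents import (
    AgentService,
    AgentOrchestrator,
    ControlPlaneServer,
    LocalLauncher,
    SimpleMessageQueue,
)
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.mistralai import MistralAI


def lookup_order_status(order_id: str) -> str:
    """Hypothetical tool: return the status of an order."""
    return f"Order {order_id} has shipped."


# Worker agent on the smaller, cheaper Mistral Nemo model.
tool = FunctionTool.from_defaults(fn=lookup_order_status)
worker_agent = ReActAgent.from_tools(
    [tool], llm=MistralAI(model="open-mistral-nemo")
)

# Shared message queue, plus a control plane whose orchestrator
# uses Mistral Large to route tasks between agent services.
message_queue = SimpleMessageQueue()
control_plane = ControlPlaneServer(
    message_queue=message_queue,
    orchestrator=AgentOrchestrator(llm=MistralAI(model="mistral-large-latest")),
)
worker_service = AgentService(
    agent=worker_agent,
    message_queue=message_queue,
    description="Useful for looking up order status.",
    service_name="order_status_agent",
)

# Run everything in-process for local testing.
launcher = LocalLauncher([worker_service], control_plane, message_queue)
print(launcher.launch_single("What is the status of order 42?"))
```

The point of the split is cost and latency: the orchestrator only needs to decide which agent should act next, but that routing decision benefits from the stronger model, while the high-volume per-task work runs on the cheaper one.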
Tech Talk 2: The Journey of Large Language Models at GetYourGuide
Speaker: Meghana Satish, Machine Learning Engineer, GetYourGuide
Abstract: Integrating Large Language Models (LLMs) into our workflows at GetYourGuide has been quite the adventure. In this talk, I’ll share our experience with LLMs, focusing on the products we’ve built, the challenges we faced, and the impact on our business.
We’ll explore the exciting use cases, the technical hurdles such as integration and scaling, and our architectural decisions. Additionally, I’ll discuss our approach to dealing with hallucinations, a common downside of LLMs.
By sharing real examples from GetYourGuide, I’ll highlight what worked well and what didn’t, offering a handy guide for other organisations looking to tap into the power of LLMs.
Tech Talk 3: From CLIP to JinaCLIP: General Text-Image Representation Learning for Search and Multimodal RAG
Speaker: Bo Wang, Engineering Manager, Jina AI
Abstract: CLIP (Contrastive Language-Image Pretraining) is commonly used to train models that can connect images and text by representing them as vectors in the same embedding space. These models are crucial for tasks like multimodal information retrieval, where you need to search and match across both images and text.
However, when it comes to purely text-based tasks, CLIP models don’t perform as well as models that are specifically built for text. This causes inefficiencies because current systems often need to maintain separate models and embeddings for text-only and multimodal tasks, which adds complexity.
In this talk, Bo will explain the multi-task contrastive training scheme behind JinaCLIP, discuss the modality gap between different data types, and introduce JinaCLIP V2—our latest and most capable multilingual multimodal embedding model.
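For attendees new to CLIP-style models, here is a minimal sketch of what a shared text-image embedding space looks like in practice. It follows the usage documented on the jina-clip-v1 Hugging Face model card; the sample sentences and image path are placeholders, and the cosine-similarity helper is our own illustration rather than part of the model's API.

```python
# Sketch: a shared text-image embedding space with jina-clip-v1,
# following the usage pattern from its Hugging Face model card.
import numpy as np
from transformers import AutoModel

model = AutoModel.from_pretrained("jinaai/jina-clip-v1", trust_remote_code=True)

sentences = ["a photo of a cat", "a photo of a dog"]
images = ["cat.jpg"]  # placeholder; local paths or URLs both work

# One model embeds both modalities into the same vector space.
text_embeddings = model.encode_text(sentences)
image_embeddings = model.encode_image(images)


def cos_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity; safe even if embeddings are not pre-normalized."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Cross-modal search reduces to nearest-neighbor lookup in this space,
# which is exactly what a vector database like Milvus serves at scale.
for text, emb in zip(sentences, text_embeddings):
    print(text, cos_sim(emb, image_embeddings[0]))
```

Because text and images land in one space, a single index can answer both text-to-image and image-to-text queries; the talk's motivation is closing the remaining gap so that the same embeddings also hold up on text-only tasks.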
Check out the Zilliz blog, join our Discord, and visit the Milvus GitHub.
***
📸Important note: Please be advised that this event will be recorded and photographed. If you prefer not to be included in any recordings or photographs, please do not hesitate to let us know during the event. Your comfort and privacy are important to us.