DataTalks.Club events
DataTalks.Club is a global online community of people who love data.
43 Going

Open source data ingestion for RAGs with dlt

YouTube
About the event

In this hands-on workshop, we’ll learn how to build a data ingestion pipeline using dlt to load data from a REST API into LanceDB so you can have an always up-to-date RAG.

We’ll cover the following steps:

  • Extracting data from REST APIs

  • Loading and vectorizing it into LanceDB, which, unlike many other vector DBs, stores the original data alongside the embeddings

  • Keeping your data up to date with incremental loading
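
To make the last step concrete, here is a minimal sketch of the idea behind incremental loading, written in plain Python rather than with dlt's own API (dlt handles this bookkeeping for you). The sample records, the `updated_at` cursor field, and the helper names `fetch_since` and `incremental_load` are all assumptions made for illustration; a real pipeline would read from a REST API and write to LanceDB.

```python
from typing import Dict, List, Optional

# Hypothetical records standing in for what a REST API would return.
API_RECORDS: List[dict] = [
    {"id": 1, "text": "intro to dlt", "updated_at": "2024-07-01"},
    {"id": 2, "text": "what is RAG", "updated_at": "2024-07-03"},
]


def fetch_since(records: List[dict], cursor: Optional[str]) -> List[dict]:
    """Return only the records updated after the stored cursor value."""
    # ISO-formatted dates compare correctly as strings.
    return [r for r in records if cursor is None or r["updated_at"] > cursor]


def incremental_load(source: List[dict], table: Dict[int, dict], state: dict) -> int:
    """Load new or changed rows into `table` and advance the cursor in `state`."""
    new_rows = fetch_since(source, state.get("last_updated_at"))
    for row in new_rows:
        table[row["id"]] = row  # upsert: insert or overwrite by primary key
    if new_rows:
        state["last_updated_at"] = max(r["updated_at"] for r in new_rows)
    return len(new_rows)


table: Dict[int, dict] = {}   # stands in for the destination table
state: dict = {}              # stands in for persisted pipeline state

first_run = incremental_load(API_RECORDS, table, state)    # loads everything
second_run = incremental_load(API_RECORDS, table, state)   # nothing new to load

# The API later gains one new record and one updated record.
API_RECORDS.append({"id": 3, "text": "LanceDB basics", "updated_at": "2024-07-05"})
API_RECORDS[0] = {"id": 1, "text": "intro to dlt (revised)", "updated_at": "2024-07-06"}
third_run = incremental_load(API_RECORDS, table, state)    # loads only the changes
```

Only the rows changed since the previous run are fetched and written, which is what keeps the RAG's data current without reloading everything each time.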

By the end of this workshop, you’ll be able to write a portable, open source data pipeline for your RAG that you can deploy anywhere: in Python notebooks, on virtual machines, or with orchestrators like Airflow, Dagster, or Mage.

This event is sponsored by dltHub.

DataTalks.Club is the place to talk about data. Join our Slack community!
