Hosted By
448 Going

LlamaIndex Webinar: ColPali - Efficient Document Retrieval with Vision Language Models

Hosted by Jerry Liu
Zoom
Past Event
About Event

Enterprise RAG systems face a significant challenge when processing PDFs with complex layouts, tables, and figures. Conventional RAG pipelines typically convert these documents into text through parsing, OCR, or extraction before building lexical or semantic indexes. However, this approach often results in the loss of crucial visual information, potentially impacting retrieval performance.

We're excited to host the authors of "ColPali: Efficient Document Retrieval with Vision Language Models" (Faysse et al.). The key idea is to bypass traditional conversion processes by directly embedding page screenshots with VLMs. By working with image representations, ColPali offers a streamlined, end-to-end optimizable system that achieves state-of-the-art results on a newly proposed benchmark.
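For readers new to the idea, here is a minimal sketch of how retrieval over page-image embeddings might score pages. It assumes the ColBERT-style late-interaction scoring described in the ColPali paper, which this announcement does not detail: each query token vector is matched to its most similar page patch vector, and the matches are summed. The toy 2-d vectors stand in for real VLM outputs, and the function names `maxsim_score` and `rank_pages` are hypothetical, not part of any library.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for each query vector, take the best-matching
    page vector, then sum those best matches over the whole query."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(query_vecs: Sequence[Vector],
               pages: Dict[str, Sequence[Vector]]) -> List[str]:
    """Return page ids sorted from highest to lowest score."""
    scores = {pid: maxsim_score(query_vecs, vecs) for pid, vecs in pages.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (assumption): in a real system these would come from a VLM
# applied to the query text and to each page screenshot.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_1": [[0.9, 0.1], [0.2, 0.8]],
    "page_2": [[0.1, 0.1], [0.3, 0.2]],
}

print(rank_pages(query, pages))
```

Because scoring works on page embeddings directly, no OCR or text-extraction step appears anywhere in this sketch.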

We've been working in this direction with LlamaParse, and we're looking forward to a technical discussion of what a multimodal document RAG future looks like.

Hosted by Simon Suo, co-founder/CTO of LlamaIndex
