Optimizing RAG With LLMs: Exploring Chunking Techniques and Reranking for Enhanced Results
Large Language Models can answer questions about your own data, but they come with limitations, such as a bounded input token length and retrieval of the wrong context. Chunking breaks long documents into smaller sections; for example, you can chunk a PDF document by its headings or use a text splitter. Implementing reranking models is an effective strategy to enhance search quality. With these techniques, you can use LLMs in production and interact with your specific data.
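As a minimal sketch of heading-based chunking, here is what this could look like with the unstructured library (the filename and chunk preview are illustrative, not part of the session):

```python
# Hedged sketch: chunk a PDF by its headings with the unstructured library.
# "report.pdf" is a hypothetical input file.
from unstructured.partition.pdf import partition_pdf
from unstructured.chunking.title import chunk_by_title

elements = partition_pdf(filename="report.pdf")  # parse the PDF into elements
chunks = chunk_by_title(elements)                # group elements under each heading

for chunk in chunks:
    print(chunk.text[:80])                       # preview each chunk's first 80 characters
```

A character-based text splitter (for example, LangChain's RecursiveCharacterTextSplitter) is a common alternative when a document has no clean headings.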
Unstructured chunks the documents, and the resulting objects are stored in the Weaviate vector database. Weaviate orchestrates the inference for both the embeddings and the Cohere Rerank API. We then evaluate the results in Phoenix and spot the cases where retrieval failed.
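To illustrate the retrieve-then-rerank idea, here is a rough sketch. Note that the session has Weaviate orchestrate the Cohere call; this example calls the Cohere Rerank API directly on the client side instead, and the class name, property, endpoint, and model are all assumptions:

```python
# Hedged sketch: over-fetch candidates from Weaviate, then rerank with Cohere.
# Assumes a local Weaviate instance with a vectorizer module enabled and a
# "Document" class holding a "text" property; all names are illustrative.
import os

import cohere
import weaviate

client = weaviate.Client("http://localhost:8080")
co = cohere.Client(os.environ["COHERE_API_KEY"])

query = "How does chunking affect retrieval quality?"

# Over-fetch candidates with vector search ...
response = (
    client.query.get("Document", ["text"])
    .with_near_text({"concepts": [query]})
    .with_limit(20)
    .do()
)
candidates = [doc["text"] for doc in response["data"]["Get"]["Document"]]

# ... then let the reranker keep only the most relevant few.
reranked = co.rerank(
    query=query,
    documents=candidates,
    top_n=3,
    model="rerank-english-v2.0",
)
for result in reranked.results:
    print(f"{result.relevance_score:.3f}  {candidates[result.index][:60]}")
```

Over-fetching (here, 20 candidates for a final top 3) gives the reranker room to promote passages that the vector search under-scored.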
Join this session to learn more about:
Strategies for optimizing Retrieval Augmented Generation (RAG)
Using chunking techniques to streamline processing
Implementing ranking models to enhance search quality