Presented by

Meetups for people working in unstructured data. Speakers present on related topics such as vector databases, LLMs, and managing data at scale. The intended audience includes machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup is sponsored by Zilliz.

295 Went

Unstructured Data Meetup New York

Name: Unstructured Data Meetup New York
Start: 2024-10-23T17:30:00.000-04:00
End: 2024-10-23T20:30:00.000-04:00
Location: 159 W 25th St, New York, New York

Past Event

About Event

This is an in-person event! Registration is required to get in.

Topic: Connecting your unstructured data with Generative LLMs

What we’ll do:
Have some food and refreshments. Hear three exciting talks about unstructured data, vector databases and generative AI.

5:30 - 6:00 - Welcome/Networking/Registration
6:00 - 6:20 - Tim Spann, Principal DevRel, Zilliz
6:20 - 6:45 - Uri Goren, Argmax
7:00 - 7:30 - Lisa N Cao, Product Manager, Datastrato
7:30 - 8:00 - Naren, Unstract
8:00 - 8:30 - Networking

Intro Talk:

Hiring?
Need a Job?
Cool project?
Meetup Logistics
Trick-Or-Treat
Using Milvus as a Ghost Trap

Tech talk 1: Introduction to Vector Search
Uri Goren, CEO, Argmax
Deep learning has been a game-changer for modern AI, but deploying it in production environments poses significant challenges. Vector databases (VDBs) have become the go-to solution for real-time, embedding-based queries. In this talk, we’ll explore the problems VDBs address, the trade-offs between accuracy and performance, and what the future holds for this evolving technology.

Tech talk 2: Metadata Lakes for Next-Gen AI/ML
Lisa N Cao, Product Manager, Datastrato

As data catalogs evolve to meet the new and growing demands of high-velocity, unstructured data, they are taking a new shape as a flexible way to activate metadata for multiple uses. This talk discusses modern uses of metadata at the infrastructure level to enable AI in RAG pipelines. We will also discuss Apache Gravitino (incubating) and its open source-first approach to data cataloging across multi-cloud and geo-distributed architectures.

Tech talk 3: Unstructured Document Data Extraction at Scale with LLMs: Challenges and Solutions

Unstructured documents present a significant challenge for businesses, particularly those managing them at scale. Traditional Intelligent Document Processing (IDP) systems—let's call them IDP 1.0—rely heavily on machine learning and NLP techniques. These systems require extensive manual annotation, making them time-consuming and less effective as document complexity and variability increase.

The advent of Large Language Models (LLMs) is ushering in a new era: IDP 2.0. However, while LLMs offer significant advancements, they also come with their own set of challenges, particularly around accuracy and cost, which can become prohibitive at scale. In this talk, we will look at how Unstract, an open source IDP 2.0 platform purpose-built for structured document data extraction, solves these challenges. Processing over 5 million pages of unstructured documents per month, Unstract uses various techniques to extract structured data accurately and cost-efficiently, chief among them the use of vector databases.

Naren H - Co-founder/COO, Unstract

Naren H is the co-founder at Unstract, an open source startup building an LLM-powered platform that extracts data from unstructured documents, helping automate critical business processes. Before Unstract, Naren founded Mediavak, a digital marketing agency, and co-founded Social Animal and Tweeple Search, building tools that made social media analytics and content marketing a breeze. He holds a Master’s in Computer Science from the State University of New York at Buffalo. He has a knack for turning data chaos into order — occasionally, he even manages to keep his emails under control.

Speaker LinkedIn Profile: https://www.linkedin.com/in/naren87/

Who should attend:
Anyone interested in talking and learning about Unstructured Data and Generative AI Apps.

159 West 25th Street, 3rd Floor, Mohammad Ali Room

When:
October 23, 2024
5:30PM

Where:
This is an in-person event! Registration is required to get in. Registration will close 2 days before the event. Sponsored by Zilliz, maintainers of Milvus.
