South Bay Systems Meetup: Wrangling Data with DuckDB
The South Bay Systems meetup is back! This time we’re excited to have Hannes Mühleisen, co-founder of DuckDB, to dive into Wrangling Data with DuckDB.
This meetup is generously hosted by Databricks at their Mountain View office. Food and drinks will be provided!
Abstract:
Data wrangling is the thorny hedge that higher powers have placed in front of the enjoyable task of actually analyzing or visualizing data. Common struggles come from importing data from ill-mannered CSV files, the tedious task of orchestrating efficient data transformation, or the inevitable management of changes to tables. Data wrangling is rife with questionable ad-hoc solutions, which can sometimes even make things worse. The design rationale of DuckDB is to support the task of data wrangling by bringing the best of decades of data management research and best practices to the world of interactive data analysis in R or Python. For example, DuckDB has one of the world's most advanced CSV readers, native support for Parquet files and Arrow structures, an efficient parallel vectorized query processing engine, and support for efficient atomic updates to tables. All of this is wrapped up in a zero-dependency package available in a programming language near you for free.
Agenda
6:00 PM: Doors Open, Food and Socializing
6:30 - 7:30 PM: Talk
7:30 PM onward: Community Socializing!