Presented by

This webinar series happens every Friday from 1:30 - 2:30 PM PST. Each webinar will have speakers who will share their startups, projects, or research work in the Multimodal AI space.

Multimodal Weekly 80: Streaming Event Detection and Grounded Video Caption Generation

Multimodal Weekly

Zoom

Past Event

About Event

In the 80th session of Multimodal Weekly, we have two exciting presentations on streaming event detection and grounded video caption generation.

✅ Cristobal Eyzaguirre will present a novel task for multimodal video understanding: Streaming Detection of Queried Event Start (SDQES). The goal of SDQES is to identify the beginning of a complex event, as described by a natural language query, with high accuracy and low latency.

✅ Evangelos Kazakos will present a novel approach to video captioning and object grounding, in which the objects mentioned in the caption are grounded in the video via temporally dense bounding boxes.

Join the Multimodal Minds community to connect with the speakers!

Multimodal Weekly is organized by Twelve Labs, a startup building multimodal foundation models for video understanding. Learn more about Twelve Labs here: https://twelvelabs.io/
