Presented by
Multimodal Weekly

Multimodal Weekly 80: Streaming Event Detection and Grounded Video Caption Generation

Zoom
Past Event
About Event

In the 80th session of Multimodal Weekly, we have two exciting presentations on streaming event detection and grounded video caption generation.

✅ Cristobal Eyzaguirre will present a novel task for multimodal video understanding: Streaming Detection of Queried Event Start (SDQES). The goal of SDQES is to identify the beginning of a complex event, as described by a natural language query, with high accuracy and low latency.

✅ Evangelos Kazakos will present a novel approach for captioning and object grounding in video, where the objects in the caption are grounded in the video via temporally dense bounding boxes.

Join the Multimodal Minds community to connect with the speakers!

Multimodal Weekly is organized by Twelve Labs, a startup building multimodal foundation models for video understanding. Learn more about Twelve Labs here: https://twelvelabs.io/
