
Multimodal Weekly 52: HCI for How-to Videos, Feeling & Building Multimodal Intelligence, and Visually-Grounded Video QA

About Event

In the 52nd session of Multimodal Weekly, we welcome three exciting researchers working in Human-Computer Interaction for video understanding, large-scale multimodal models, and video question answering.

✅ Saelyne Yang, Ph.D. candidate at KAIST, will present her work on enhancing how people learn procedural tasks through how-to videos.

✅ Bo Li and Yuanhan Zhang, Ph.D. students at NTU Singapore, will introduce recent work from LMMs-Lab, including LLaVA-NeXT, LongVA, and LMMs-Eval.

✅ Junbin Xiao, Research Fellow at NUS Singapore, will present his work on visually-grounded video question answering.

Join the Multimodal Minds community to connect with the speakers!

Multimodal Weekly is organized by Twelve Labs, a startup building multimodal foundation models for video understanding. Learn more about Twelve Labs here: https://twelvelabs.io/
