

Lossfunk talks || Tokenize Everything (Images Edition) || Prashant Shishodia
How can we build a single AI model that natively understands and generates all modalities - images, audio, and video? I argue that discrete tokenization can make the problem much simpler. Focusing on images, we discuss whether images can be tokenized into discrete tokens, and whether those tokens are useful for both understanding and generation. We'll review the evolution from pixel/waveform-based models to the now-dominant two-stage latent generation approach. Key techniques such as VQ-VAE (discrete) and KL-regularized VAEs (continuous) will be compared, highlighting their impact on model efficiency and capabilities. We will critically examine the open challenges: balancing compression, perceptual quality, and modelability; controlling latent capacity; and the inflexibility of grid structures.
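To give a flavor of what "discrete tokenization" means here, below is a minimal NumPy sketch (not taken from the talk; all sizes are illustrative assumptions) of the VQ-VAE quantization step: each continuous latent vector from the encoder is replaced by the index of its nearest codebook entry, turning an image into a short sequence of discrete tokens.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a learned codebook of K entries, each of dimension D.
K, D = 512, 64
codebook = rng.normal(size=(K, D))

# Stand-in for an encoder output: an 8x8 grid of D-dim latent vectors,
# flattened to 64 vectors (one per spatial position).
latents = rng.normal(size=(8 * 8, D))

# Vector quantization: map each latent to the index of its nearest
# codebook entry under squared Euclidean distance.
dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
tokens = dists.argmin(axis=1)  # 64 discrete tokens, each in [0, K)

# Generation/decoding would start from the quantized vectors,
# recovered by a simple codebook lookup.
quantized = codebook[tokens]
print(tokens.shape, quantized.shape)
```

These token indices are what a single sequence model could then consume (for understanding) or predict autoregressively (for generation), alongside text tokens.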