

Why Deep Learning Works So Well (even with 100 data points!)
Hosted by Lossfunk & 3 others
Past Event
About Event
Deep neural networks defy the classical bias-variance tradeoff of ML theory. These models often have millions or billions of parameters, and yet they don't overfit the training data (as evidenced by a test loss that keeps decreasing). What's happening? In this talk, we will build intuition about how deep learning works from a generalization point of view. We will cover topics such as double descent, benign overfitting, and flat basins. We will also fit a 1.8-million-parameter model on just 100 data points and show that it doesn't overfit!
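For the curious, here is a minimal sketch of that kind of experiment (a toy PyTorch setup of my own, not the speaker's actual demo): an MLP with roughly 1.8 million parameters trained on 100 noisy points, printing train and test loss as it goes. The thing to watch is whether the test loss blows up once the train loss nears zero.

```python
# A toy overparameterization experiment (illustrative setup, not the talk's demo):
# fit a ~1.8M-parameter MLP on 100 noisy points and track held-out loss.
import torch
import torch.nn as nn

torch.manual_seed(0)

# 100 training points and 1000 test points from y = sin(3x) (+ noise on train)
x_train = torch.rand(100, 1) * 2 - 1
y_train = torch.sin(3 * x_train) + 0.1 * torch.randn_like(x_train)
x_test = torch.rand(1000, 1) * 2 - 1
y_test = torch.sin(3 * x_test)

# Hidden width 1340 gives ~1.8M parameters:
# 2*1340 + (1340^2 + 1340) + (1340 + 1) = 1,800,961
model = nn.Sequential(
    nn.Linear(1, 1340), nn.ReLU(),
    nn.Linear(1340, 1340), nn.ReLU(),
    nn.Linear(1340, 1),
)
print("params:", sum(p.numel() for p in model.parameters()))

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(5001):
    opt.zero_grad()
    train_loss = loss_fn(model(x_train), y_train)
    train_loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            test_loss = loss_fn(model(x_test), y_test).item()
        print(f"step {step}: train {train_loss.item():.4f}  test {test_loss:.4f}")
```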
(please request an invite first via the form below as we have limited space)
About the speaker: Paras Chopra
LinkedIn: https://www.linkedin.com/in/paraschopra/
X: https://x.com/paraschopra