Hosted By
8 Going
Sparse Autoencoders (Papers + Code)
Hosted by Campbell Hutcheson & Era Qian
About Event
We will read and discuss three papers on sparse autoencoders:
Toy Models of Superposition https://transformer-circuits.pub/2022/toy_model/toy_model.pdf
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning https://transformer-circuits.pub/2023/monosemantic-features
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet https://transformer-circuits.pub/2024/scaling-monosemanticity/
We will then explore some sparse autoencoders in a notebook. The notebook will be available on GitHub so you can follow along if you like.