Hosted By
4 Went

IIT Delhi AI Safety Reading Meetup #2

Hosted by Basil Labib
Registration
Past Event
About Event

We are excited to announce the second session of the IIT Delhi AI Safety Reading Meetup. At each session, we read and discuss a leading paper on AI safety and alignment research.

Eligibility

This event is open only to IIT Delhi students. Seating is limited, so we will prioritise people with prior experience in ML or related fields. If in doubt, feel free to apply and we'll reach out to you!

Session structure

The session is divided into two parts:

45 mins - silent reading of the chosen paper
45 mins - a vote-based selection of questions from the participants, followed by discussion

For this session, we have shortlisted three papers and will read the one that wins a majority vote.

1. Zoom In: An Introduction to Circuits by Olah et al. Link: https://distill.pub/2020/circuits/zoom-in/

2. Universal and Transferable Adversarial Attacks on Aligned Language Models by Zou et al. Link: https://arxiv.org/pdf/2307.15043

3. Supervising strong learners by amplifying weak experts by Christiano et al. Link: https://ar5iv.labs.arxiv.org/html/1810.08575


For any queries, please email basillabib01@gmail.com

See you there!

Location
Student Activity Center IIT Delhi