Cover Image for Moonshot Alignment Program Demo Day
Presented by
AI-Plans

Moonshot Alignment Program Demo Day

Virtual
About Event

The Moonshot Alignment Program ends with a public poster session and job fair. Teams will present their work in a virtual conference format on Gather Town. Each team has a space to display their results, answer questions, and defend their method. Senior researchers will review the posters and vote on standout projects.

Following the poster session is a job fair where research orgs, labs, and startups can host booths, meet researchers, and share open roles.

About the Moonshot Alignment Program:

Moonshot Alignment: 5 Week Intensive Research Program

A 5 week program for directly tackling the hard part of alignment.

Most alignment research focuses on subproblems. This fellowship tackles the core challenge directly: getting values into models, with strong empirical evidence that the methods work and scale. Previous research experience is recommended. Everyone is welcome to apply, but mentorship bandwidth limits how many applicants we can accept.

We guarantee personalized feedback to the first 300 applicants.

Different tracks have suggested requirements; however, many people can make valuable contributions even without the specific requirements for a track. For example, a data scientist with no neuroscience background could help the neuroscience track by extracting key information from brain scans. Someone with a strong background in interpretability, robustness, or evals could contribute to almost any team by helping it determine whether the alignment method it's trying is actually working.

Program Details

Duration: 5 weeks, 10 hours a week
Format: Teams of 3-5 researchers
Start: Kickoff call with Kabir Kumar, 2nd August
End: Poster Evening followed by Careers Fair

Research Tracks

1. Agent Foundations

Either solve mathematical problems in agent foundations or implement existing theoretical work such as Infra-Bayesianism.
Strong applicants to this track will:
- Be competent in Bayes nets, measure theory and propositional logic
- Be able to quickly learn new math

It would be helpful to know:
- Decision theory
- Computability/provability theory

Please apply even if you don't have all of these prerequisites. Assuming a basic math background (e.g., a Master's degree), if you're hardworking and willing to learn, you can likely contribute. Formal qualifications are not required.

2. Brain-Based AI Safety

Develop architectures based on how morals/values are encoded in the human brain.

Strong applicants to this track will:
- Have a background in neuroscience and machine learning
- Understand why it's difficult to determine precisely what the brain is doing

3. Improved Preference Optimization Methods

Create non-shallow, scalable-oversight methods that demonstrably embed values deeply into models. Methods must show generalization beyond the training distribution through interpretability-based evaluations.

4. Original/Other Methods

Novel approaches to the core alignment problem that don't fit other tracks.

Program Structure

Week 1: Form teams, define specific approaches, design falsifiable experiments
Weeks 2-3: Build implementations, run experiments, test for generalization and on larger models
Week 4: Critique the approach, red-team it, and document failure modes
Week 5: Write a research summary, prepare the poster, and get final feedback

What We Provide

  • GPU compute for experiments

  • Mentorship from senior alignment researchers

  • Collaboration infrastructure

  • Stipends for full-time participants

  • Presentation opportunity to AI lab representatives

Application Process

Stage 1 requirements:

  1. CV

  2. Your confidence level (1-10) that you can commit for 30 days

  3. Brief explanation of why you can commit

  4. (Optional) Additional relevant work/ideas/links

Final Events

Poster Evening

Teams present their work in Gather Town. Senior alignment researchers judge the projects, and attendees can visit your virtual booth in a conference-style format.

Careers Fair

Representatives from DeepMind, Anthropic, Redwood Research, Apart Research, MIRI, Conjecture, CAIS, FAR AI, Ought, and others will discuss opportunities at their organizations.

Expected Outcomes

  • Working implementations of value alignment techniques

  • Empirical evidence of generalization and scaling

  • Falsifiable predictions with test results

  • Open-source contributions

  • Direct connections to alignment organizations

Questions? Contact kabir@ai-plans.com
