AI Wednesdays: The One About Alignment
More about the Sharings
Isaac Lim (Data Scientist, GovTech) will share more on AI Practice's recent work on safety alignment of SEA-Lion-v2.1-Instruct against Singlish toxicity. The sharing will cover both technical and practical concepts around safety alignment, including preference alignment algorithms and scalable serverless training, along with some interesting experimental findings. (Technical Level: 300)
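If you'd like a feel for preference alignment ahead of the session, below is a minimal sketch of a DPO-style loss, one common preference alignment algorithm. This is an illustration only; the specific algorithms, data, and training setup Isaac covers may differ, and all function and variable names here are assumptions made for the sketch.

```python
# Illustrative sketch of a DPO-style preference-alignment loss.
# Assumes per-sequence log-probabilities have already been computed for the
# policy model and a frozen reference model, over "chosen" (preferred, e.g.
# non-toxic) and "rejected" (dispreferred, e.g. toxic) responses.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization loss over a batch of preference pairs."""
    # Implicit rewards: log-ratio of the policy vs. the reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the policy to rank the chosen response above the rejected one.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```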
Noam Youngerman and Jacob Zhitomirsky (fmr.ai, an early-stage AI introspection startup) will share their work on applying AI introspection techniques to alignment use cases. The talk will give a quick overview of central AI introspection results from the literature, then demonstrate a ‘Polygraph’ that can quantitatively measure subjective opinions held by open-weight models. Finally, they will discuss how these tools can be used to re-align models at both fine-tuning and inference time. (Technical Level: 200)
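For rough intuition about the kind of introspection the talk touches on, here is a toy sketch of a linear probe over an open-weight model's hidden states. To be clear, this is not fmr.ai's actual ‘Polygraph’; the model name, probe layer, and labelled statements are all assumptions made purely for illustration.

```python
# Toy linear probe over hidden states -- illustrative only, not fmr.ai's 'Polygraph'.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "gpt2"  # assumption: swap in any open-weight causal LM
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def last_token_hidden(text: str, layer: int = -1):
    """Return the hidden state of the final token at the chosen layer."""
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[layer][0, -1].float().cpu().numpy()

# Tiny labelled set: 1 = statements we expect the model to "endorse", 0 = not.
# These labels are hypothetical placeholders, not real probe training data.
statements = ["The sky is blue.", "Paris is the capital of France.",
              "Two plus two equals five.", "The sun orbits the Earth."]
labels = [1, 1, 0, 0]

X = [last_token_hidden(s) for s in statements]
probe = LogisticRegression(max_iter=1000).fit(X, labels)

# Score a new statement along the learned "endorsement" direction.
new_x = last_token_hidden("Water boils at 100 degrees Celsius at sea level.")
print("probe score:", probe.predict_proba([new_x])[0, 1])
```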
More about the Speakers
Isaac is a Data Scientist on AI Practice's Capability Development team. He mainly works on RAG, fine-tuning, and LLM safety. He was previously an ML Engineer at a Seattle-based fintech startup, working on municipal bond pricing.
Jacob Zhitomirsky is Chief Scientist at fmr.ai. He has nine years of research experience spanning cybersecurity, static code analysis, and ML introspection and explainability. Noam Youngerman is CTO at fmr.ai. In previous roles he headed the R&D Division of Planet Nine, leading research across cybersecurity, ML explainability, and reverse engineering of AI models.
🛠️ Event details are still cooking, and we’ll serve more up soon. Stay tuned! 🚀
Cowabunga, folks! 🤙💡