ALIGN Webinar #12 with Jesse Hoogland
In the twelfth episode of the ALIGN webinar series, we are pleased to host Jesse Hoogland, a prominent researcher in Singular Learning Theory (SLT) and its applications to AI alignment. He will introduce the foundational concepts of SLT, a theory pioneered by Japanese statistician Sumio Watanabe, and explain how it helps us understand large language models by examining the geometric structure of their loss landscapes, an approach known as developmental interpretability.
In this webinar, Jesse Hoogland will demonstrate how applying SLT to the training dynamics of transformers reveals distinct ‘phases’ of learning (analogous to the gas, liquid, and solid states in physics) that influence model behavior and development. This perspective not only deepens our understanding of model interpretability and developmental stages but also opens new avenues for rigorous evaluation methodologies and alignment strategies that strengthen AI safety.
Jesse Hoogland’s expertise in mathematical modeling and his innovative approach to SLT have made significant contributions to the field of AI alignment. His work is widely recognized, and he continues to play a pivotal role in advancing the theoretical foundations of alignment and reliable AI systems.
10:00–10:05 (JST): Housekeeping by ALIGN
10:05–10:50 (JST): Jesse Hoogland on Singular Learning Theory for AI Safety
10:50–10:55 (JST): Q&A and discussion with participants
10:55– (JST): Closing
The talk and slides will be in English, but participants are welcome to ask questions in Japanese.
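In the twelfth session of the ALIGN webinar series, we welcome Jesse Hoogland, who is active in the field of Singular Learning Theory (SLT) and works on its applications to AI alignment. He will explain the basic concepts of SLT, a theory proposed by the Japanese statistician Sumio Watanabe, and introduce developmental interpretability, a methodology for understanding the inner workings of large language models through the geometric structure of their loss landscapes during training.
In this webinar, Jesse Hoogland will show how applying SLT to the training dynamics of transformer models reveals "phases" of learning, likened to the gas, liquid, and solid states of physics, and how these phases influence model behavior and development. This perspective not only deepens our understanding of model interpretability and developmental stages but also opens the way to more rigorous evaluation methodologies and alignment strategies for improving AI safety.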
10:05–10:50: Jesse Hoogland on "Singular Learning Theory for AI Safety"
10:55– Closing