Preference Optimization for Large Language Models: A UPenn Lecture | Sponsored by Turing
You are invited to join a special virtual stream of Prof. Mayur Naik’s CIS 7000 course on Large Language Models at the University of Pennsylvania, sponsored by Turing.
Hanjun Dai, Staff Research Scientist and Research Manager at Google DeepMind, will present a high-level overview of reinforcement learning (RL) in LLM training. His talk will cover key topics such as policy and reward modeling, optimization challenges, and infrastructure limitations. Hanjun will also introduce alternative methods such as direct preference optimization (DPO) and best-of-n (BoN), share the latest research advancements, and compare various preference optimization techniques. The session will conclude with a discussion of open problems in this evolving field.
Speaker Bio:
Hanjun Dai holds a Ph.D. from Georgia Institute of Technology and focuses on efficient generative modeling for text, image, and structured data. His work has earned him several accolades, including the 2022 Google Research Tech Impact Award, and has been adopted in products such as Google Workspace, Gemini, and Cloud AI. He has also served as Area Chair for top conferences like AAAI, ICML, and NeurIPS.
This virtual stream opens the classroom experience to a broader audience beyond UPenn students. Anyone interested in AI, machine learning, or the cutting-edge techniques behind large language models will find this talk insightful and valuable.
Learn more about the course at CIS 7000 - Large Language Models.