TAI AAI #05 - Reinforcement Learning
Location: below, but please come to Floor 18.
Topic
Reinforcement Learning: building agents that learn from their own experience. Researchers from Tokyo will give talks on reinforcement learning with applications ranging from games to robotics.
Our Community
Tokyo AI (TAI) is a community composed of people based in Tokyo and working with, studying, or investing in AI. We are engineers, product managers, entrepreneurs, academics, and investors intending to build a strong “AI core” in Tokyo. Find more in our overview: https://bit.ly/tai_overview
Sponsor
Thanks to Indeed for the kind sponsorship of the venue and nourishment!
Schedule
17:30-18:00 Doors Open
18:00-18:15 Introduction
18:15-18:45 Creating a Racing Agent for Gran Turismo (Craig Sherstan)
18:45-19:15 An introduction to deploying RL for UAVs (Jun Jet Tai)
19:15-19:45 Break/Networking
19:45-20:15 Optimizing MuZero for Diverse Action Spaces (Silver Lucoris)
20:15-20:45 From Playing to Solving Go (Ting-Han Wei)
21:00 Finish
Speakers
Craig Sherstan (profile)
Title: Creating a Racing Agent for Gran Turismo
Abstract: In 2021, GT Sophy, an AI agent trained via reinforcement learning, outraced the top human drivers in the racing simulator Gran Turismo. Since then, we've continued to evolve our AI racer, and as of 2024, millions of players worldwide can now race against GT Sophy in Gran Turismo 7. In this talk, I'll dive into the development and training of GT Sophy, the challenges we faced, and how we've refined the agent to ensure it's fun for all players.
Bio: Craig studied reinforcement learning at the University of Alberta under the supervision of Patrick Pilarski and Rich Sutton. His research focused on the application of reinforcement learning techniques to the control of prosthetic limbs. Upon earning his PhD in 2020, he joined Sony AI and has worked on deploying RL agents in both video games and robots. He moved to Tokyo in 2022 with his family.
Jun Jet Tai (profile)
Title: An introduction to deploying RL for UAVs
Abstract: I will give a brief introduction to quadrotor system architectures, some physics, and how they're conventionally used with AI. This will focus on a top-down view from hardware to software, as well as common pitfalls. Then I'll frame it from the perspective of RL, and what reward function designs are typically used.
Bio: Jet is a reinforcement learning researcher specializing in UAV applications, especially in quadcopter-based tasks for infrastructure projects. He obtained his PhD from Coventry University in the UK and is currently interning for Sony AI under the GT Sophy team.
Silver Lucoris (profile)
Title: Optimizing MuZero for Diverse Action Spaces
Abstract: MuZero has emerged as a powerful algorithm for planning and decision-making without requiring a model of the environment's dynamics. In this work, we explore the design and application of MuZero in increasingly complex action spaces, ranging from small, focused tasks to full-scale game strategies. By leveraging Monte Carlo Tree Search (MCTS) for effective planning, we address key challenges in balancing action selection efficiency and scalability. We explain where MuZero fails to learn and how to design a version of MuZero that can learn depending on the game dynamics and action space. We apply our findings to a simulation of the game Teamfight Tactics, both in a simplified version of in-game positioning and in the full game itself.
Bio: I am a reinforcement learning hobbyist with a great interest in teaching agents how to play games in simulated environments. I lead the project to create the first-ever bot to play the game Teamfight Tactics at a human level.
Ting-Han Wei (profile)
Title: From Playing to Solving Go
Abstract: Go has been used as a benchmark for artificial intelligence since the 1970s. The most significant breakthrough came in 2016 when AlphaGo defeated World Champion Lee Sedol 4-1 in a series of matches. Since then, anyone with access to an internet connection and a laptop can explore or even train superhuman-level Go-playing programs. However, playing the game at a high level is distinctly different from solving the game. To "weakly solve" Go, we need to determine the theoretical outcome of the game, which is achieved via optimal play from both sides, and ensure that at each turn our agent selects the correct move that leads to that outcome. In this talk, I will briefly introduce key techniques from machine learning and heuristic search that are being applied to tackle the challenging problem of solving two reduced versions of Go.
Bio: Prof. Ting-Han Wei is a Professor in the School of Informatics at Kochi University of Technology, Japan. He earned his PhD in Computer Science from National Chiao Tung University, Taiwan, where his research focused on distributed computing and game analysis systems. Following this, he became a postdoctoral fellow at the University of Alberta, Canada, where he worked on Go analysis, aiming to completely solve small-board Go. In addition to these primary areas, Prof. Wei has co-supervised research projects exploring the intersection of reinforcement learning and heuristic search, with a particular emphasis on model-based methods and game AI. He has published papers in top-tier AI conferences, including NeurIPS, ICML, ICLR, AAAI, and IJCAI.
Organizers
Craig Sherstan: Research Scientist at Sony AI Tokyo. His current research is on the application of RL to create AI opponents for the video game Gran Turismo. Previously, he completed his PhD in Reinforcement Learning at the University of Alberta, Canada as part of the Bionic Limbs for Improved Natural Control Lab. Craig has past experience working with human-computer interfaces, robotics, and various software industries.
Kai Arulkumaran: Research Team Lead at Araya, working on brain-controlled robots as part of the JST Moonshot R&D program. Previously, he completed his PhD in Bioengineering at Imperial College London and had work experience at DeepMind, FAIR, Microsoft Research, Twitter Cortex, and NNAISENSE. His research areas are deep learning, reinforcement learning, evolutionary computation, and computational neuroscience.
Ilya Kulyatin: Fintech and AI entrepreneur with work and academic experience in the US, Netherlands, Singapore, UK, and Japan, with an MSc in Machine Learning from UCL.