Guaranteed Safe AI Seminars
Monthly seminars on topics related to Guaranteed Safe AI. https://www.horizonevents.info/guaranteedsafeaisem…
31 Going

Provable AI Safety seminars, April 2024

Zoom
Past Event
About Event

Synthesizing Gatekeepers for Safe Reinforcement Learning by Justice Sefas

"RL Gatekeepers"

There is no general mechanism to enforce hard boundary constraints on RL agents. Instead of merely discouraging certain actions through a negative reward signal or via soft constraints, we seek to build safe "gatekeepers" which block unsafe actions. In particular, we use model checking on an inverted pendulum and neural control barrier functions on a drone environment to create gatekeepers against which RL agents can learn to optimize their objectives safely.

Notes: This was a project I worked on during MATS. It demonstrates two methods, model checking and control barrier functions, for creating gatekeepers with which RL agents can be safely deployed.
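
To make the gatekeeper idea concrete, here is a minimal sketch that is not taken from the talk. It assumes a toy one-dimensional system with known deterministic dynamics, and it uses a simple one-step lookahead check in place of the model checking and neural control barrier functions the abstract describes. All names (`SAFE_LIMIT`, `step_model`, `gatekeeper`, `rollout`) are hypothetical.

```python
import random

SAFE_LIMIT = 3          # hypothetical bound: states with |x| > SAFE_LIMIT are unsafe
ACTIONS = (-1, 0, 1)    # move left, stay, move right


def step_model(x, action):
    """Known deterministic toy dynamics (an assumption of this sketch)."""
    return x + action


def is_safe(x):
    return abs(x) <= SAFE_LIMIT


def gatekeeper(x, proposed):
    """Pass the proposed action through if its successor state is safe;
    otherwise substitute the safe action closest to the proposal."""
    if is_safe(step_model(x, proposed)):
        return proposed
    # From any safe state, action 0 is always safe, so this list is non-empty.
    safe = [a for a in ACTIONS if is_safe(step_model(x, a))]
    return min(safe, key=lambda a: abs(a - proposed))


def random_policy(x, rng):
    """Stand-in for an RL agent that is still exploring."""
    return rng.choice(ACTIONS)


def rollout(policy, steps=1000, seed=0):
    """Run the policy behind the gatekeeper and record every visited state."""
    rng = random.Random(seed)
    x = 0
    visited = [x]
    blocked = 0
    for _ in range(steps):
        proposed = policy(x, rng)
        action = gatekeeper(x, proposed)
        blocked += action != proposed
        x = step_model(x, action)
        visited.append(x)
    return visited, blocked


visited, blocked = rollout(random_policy)
print("all states safe:", all(is_safe(x) for x in visited))
print("actions blocked:", blocked)
```

The agent never needs to know the constraint: the gatekeeper sits between policy and environment, so even a purely random policy stays inside the safe set. In realistic settings a one-step check is not enough, which is where model checking or barrier functions would come in.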

Verifying Global Properties of Neural Networks by Roman Soletskyi

"Train neural networks to be verifiable by construction"

Verifiable reinforcement learning methods produce mathematical proofs that an agent adheres to specific requirements, such as remaining within a safe area. The primary challenge is obtaining precise bounds on neural network outputs across large input regions. We introduce a straightforward benchmark, examine how the complexity of verification scales with current methods, and suggest new approaches to accelerate the process.

Notes: Neural network verification is one of the core computational obstacles in Davidad's plan. By lowering its complexity and understanding when it is feasible, we can cast the plan into the practical domain and begin experimenting with it in reduced world models.
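
As an illustration of what "bounds on neural network outputs across an input region" means, the sketch below uses interval bound propagation, a standard baseline technique. It is not necessarily one of the methods discussed in the talk. The tiny two-layer ReLU network and its weights are hypothetical, and the code is pure Python for self-containment.

```python
import random


def affine_bounds(W, b, lo, hi):
    """Sound interval bounds for y = W x + b, given lo <= x <= hi elementwise."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        l = h = bias
        for w, x_lo, x_hi in zip(row, lo, hi):
            if w >= 0:
                l += w * x_lo
                h += w * x_hi
            else:
                # A negative weight flips which input endpoint is extremal.
                l += w * x_hi
                h += w * x_lo
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi


def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to each endpoint."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


def ibp(layers, lo, hi):
    """Propagate an input box through (W, b) layers with ReLU in between."""
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_bounds(W, b, lo, hi)
        if i < len(layers) - 1:
            lo, hi = relu_bounds(lo, hi)
    return lo, hi


def forward(layers, x):
    """Ordinary forward pass, used here only to sanity-check the bounds."""
    for i, (W, b) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + bias for row, bias in zip(W, b)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


# Hypothetical network: 2 inputs -> 2 hidden ReLU units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, -1.0]),
    ([[1.0, -1.0]], [0.5]),
]
lo, hi = ibp(layers, [-1.0, -1.0], [1.0, 1.0])
print("output bounds:", lo, hi)

# Empirical check: sampled outputs must fall inside the computed interval.
rng = random.Random(0)
samples = [forward(layers, [rng.uniform(-1, 1), rng.uniform(-1, 1)])[0] for _ in range(1000)]
print("sample range:", min(samples), max(samples))
```

For this toy network the computed output interval is [-1.0, 2.5]. The bounds are guaranteed to contain every reachable output, but they are generally looser than the true range, and the looseness tends to grow with depth and with the size of the input region, which is one reason tighter verification methods are costly.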

The Provable AI Safety seminars

The monthly seminar series on Provable AI Safety brings together researchers and experts to discuss and advance the field of making artificial intelligence systems provably safe.

Monthly on the second Thursday.

Apply to speak: https://airtable.com/appYgVucDqD7WkTyy/pagvLGoo9WCTI9ohk/form

More info, and event archives: https://www.horizonevents.info/provableaisafetyseminars
