Provable AI Safety seminars, April 2024

Guaranteed Safe AI Seminars

Zoom

About Event

Synthesizing Gatekeepers for Safe Reinforcement Learning by Justice Sefas

"RL Gatekeepers"

There is no general mechanism to enforce hard boundary constraints on RL agents. Instead of merely discouraging certain actions through a negative reward signal or via soft constraints, we seek to build safe "gatekeepers" which block unsafe actions. In particular, we use model checking on an inverted pendulum and neural control barrier functions on a drone environment to create gatekeepers against which RL agents can learn to optimize their objectives safely.

Notes: This was a project I worked on during MATS. It demonstrates two methods, model checking and control barrier functions, for creating gatekeepers with which RL agents can be safely deployed.
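
As a rough illustration of the gatekeeper idea, here is a minimal sketch of an action filter that sits between a policy and its environment. It is not the speaker's method: the talk synthesizes gatekeepers with model checking and neural control barrier functions, whereas this toy uses a hand-written safety predicate and one-step lookahead over a known model. The 1D environment and all names (SAFE_LIMIT, step_model, is_safe, gatekeeper, reckless_policy) are assumptions made up for the example.

```python
import random

SAFE_LIMIT = 5          # assumed safe set: positions x with |x| <= SAFE_LIMIT
ACTIONS = (-1, 0, 1)    # assumed discrete action set


def step_model(x, action):
    """Assumed known, deterministic dynamics: the action shifts the position."""
    return x + action


def is_safe(x):
    """Hand-written safety predicate standing in for a synthesized certificate."""
    return abs(x) <= SAFE_LIMIT


def gatekeeper(x, proposed_action):
    """Pass the proposed action through if its successor state is safe.

    Otherwise substitute the safe action closest to the proposal, so the
    unsafe action is blocked outright rather than merely penalized.
    """
    if is_safe(step_model(x, proposed_action)):
        return proposed_action
    allowed = [a for a in ACTIONS if is_safe(step_model(x, a))]
    if not allowed:
        raise RuntimeError("no safe action available from state %r" % (x,))
    return min(allowed, key=lambda a: abs(a - proposed_action))


def reckless_policy(x, rng):
    """Drifts to the right; a stand-in for an untrained RL agent."""
    return rng.choice((1, 1, 0, -1))


def run_episode(policy, steps=200, seed=0):
    """Roll out the policy behind the gatekeeper and record visited states."""
    rng = random.Random(seed)
    x = 0
    visited = [x]
    blocked = 0
    for _ in range(steps):
        proposed = policy(x, rng)
        action = gatekeeper(x, proposed)
        blocked += action != proposed
        x = step_model(x, action)
        visited.append(x)
    return visited, blocked


if __name__ == "__main__":
    visited, blocked = run_episode(reckless_policy)
    print("all states safe:", all(is_safe(x) for x in visited))
    print("actions blocked:", blocked)
```

Because the filter only ever forwards actions whose successor is safe, every visited state stays inside the safe set no matter what the policy proposes. The hard part, which this sketch sidesteps, is obtaining a trustworthy is_safe check for realistic dynamics.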

Verifying Global Properties of Neural Networks by Roman Soletskyi

"Train neural networks to be verifiable by construction"

Verifiable reinforcement learning methods produce a mathematical proof that an agent adheres to specific requirements, such as remaining within a safe area. The primary challenge is obtaining precise bounds on neural network outputs across large input regions. We introduce a straightforward benchmark, examine how the complexity of verification scales with current methods, and suggest new approaches to accelerate the process.

Notes: Neural network verification is one of the core computational obstacles in Davidad's plan. By lowering its complexity and understanding when it is feasible, we can bring the plan into the practical domain and begin experimenting with it in reduced world models.
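
To make "bounds on neural network outputs across large input regions" concrete, here is a minimal sketch of interval bound propagation, one standard way to bound a ReLU network over a box of inputs. It is not presented as the approach from the talk. The tiny network in LAYERS is a made-up example that computes |x1 - x2|, and all function names are assumptions for illustration.

```python
def affine_bounds(lo, hi, weights, bias):
    """Propagate the box [lo, hi] through y = W x + b with interval arithmetic."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        l = h = b
        for w, xl, xh in zip(row, lo, hi):
            if w >= 0:
                l += w * xl
                h += w * xh
            else:
                # A negative weight swaps which end of the interval is extreme.
                l += w * xh
                h += w * xl
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi


def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


def network_bounds(lo, hi, layers):
    """Bound the outputs of a ReLU network (no activation after the last layer)."""
    for i, (weights, bias) in enumerate(layers):
        lo, hi = affine_bounds(lo, hi, weights, bias)
        if i < len(layers) - 1:
            lo, hi = relu_bounds(lo, hi)
    return lo, hi


def forward(x, layers):
    """Ordinary forward pass, used here only to sanity-check the bounds."""
    for i, (weights, bias) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, bias)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


# Hypothetical network: 2 inputs -> 2 hidden ReLU units -> 1 output,
# computing relu(x1 - x2) + relu(x2 - x1) = |x1 - x2|.
LAYERS = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]

if __name__ == "__main__":
    print(network_bounds([-1.0, -1.0], [1.0, 1.0], LAYERS))
    # Splitting the input box along x1 gives tighter bounds on each half.
    print(network_bounds([-1.0, -1.0], [0.0, 1.0], LAYERS))
    print(network_bounds([0.0, -1.0], [1.0, 1.0], LAYERS))
```

On the box [-1, 1] x [-1, 1] the true output range of this network is [0, 2], but the propagated bound is [0, 4]; splitting the box in half along x1 tightens the upper bound to 3 on each half. Bounds are sound but loose, and tightening them by splitting multiplies the number of regions to check, which is one way to see why verification cost grows quickly with the size of the input region.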

The Provable AI Safety seminars

The monthly seminar series on Provable AI Safety brings together researchers and experts to discuss and advance the field of making artificial intelligence systems provably safe.

Monthly on the second Thursday.

Apply to speak: https://airtable.com/appYgVucDqD7WkTyy/pagvLGoo9WCTI9ohk/form

More info, and event archives: https://www.horizonevents.info/provableaisafetyseminars

Presented by

Guaranteed Safe AI Seminars

Monthly seminars on topics related to Guaranteed Safe AI. https://www.horizonevents.info/guaranteedsafeaisem…

31 Went
