Aligning AI with Pluralistic Human Values
As AI systems grow more powerful and prevalent, it is ever more critical that they are designed to serve everyone, including people with diverse values and perspectives.
The first-order challenge is to explore the extent to which AI systems can model pluralistic human values.
The BuzzRobot guest, Taylor Sorensen from the University of Washington, will introduce ValuePrism, a large-scale dataset of 218k values connected to 31k human-written situations.
Taylor and his collaborators conducted a large-scale study with annotators across diverse social and demographic backgrounds to understand whose values the dataset represents.
With ValuePrism, they've built Kaleido, an open, lightweight, and structured language-based multi-task model that generates, explains, and assesses the relevance of human values within a specific context.
Human evaluators prefer the sets of values output by Kaleido over those produced by its teacher model, GPT-4, rating them as more accurate and broader in coverage.
Building on this, the researchers propose a roadmap to pluralistic alignment.