London AI4Code: "TALM: Tool Augmented Language Models" with Aaron Parisi (Google)
Transformer-based language models (LMs) demonstrate increasing performance with scale across a wide variety of tasks. Scale alone, however, cannot enable models to solve tasks that require access to ephemeral, changing, or private data that was unavailable at training time. Many useful tasks may also benefit from LMs being able to access APIs that read or modify state.
The lead author of this work, Aaron Parisi (Google), will present Tool Augmented Language Models (TALM), a framework for augmenting language models with arbitrary tools. TALM has two key ideas. First, it models tool use via a text-to-text interface. Second, it applies an iterative self-play technique to bootstrap high performance on tasks with few labelled tool-use examples. Taken together, this interface and technique make it possible to explore additional tools and tasks without expensive data-labelling efforts. TALM exhibits strong performance on both a knowledge-heavy QA task and a reasoning-oriented math task with simple tools. At a given model scale, TALM significantly outperforms non-augmented LMs, and it successfully performs out-of-distribution inferences on both the QA and math tasks, where non-augmented LMs fail. The results suggest that tool-augmented language models are a promising direction for enriching LMs' capabilities with less dependence on scale.
Paper: https://arxiv.org/abs/2205.12255
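To make the text-to-text tool interface concrete, here is a minimal sketch in Python. The delimiter strings ("|calculator", "|result", "|output"), the toy_lm stand-in, and the talm_generate helper are illustrative assumptions rather than the paper's exact tokens or implementation; see the paper for the actual format and training details.

```python
# Minimal sketch of a TALM-style text-to-text tool interface.
# The delimiters ("|calculator", "|result", "|output"), the toy LM,
# and the control flow are illustrative assumptions, not the paper's
# exact tokens or implementation.

def calculator(expression: str) -> str:
    """A simple demo tool: evaluate an arithmetic expression."""
    return str(eval(expression, {"__builtins__": {}}))  # demo use only

def toy_lm(prompt: str) -> str:
    """Stand-in for a real LM: first emits a tool call, then, once a
    tool result is present in its context, emits the final answer."""
    if "|result" not in prompt:
        return "|calculator 23 * 17 |result"
    result = prompt.rsplit("|result", 1)[1].strip()
    return f"The answer is {result}. |output"

TOOLS = {"calculator": calculator}

def talm_generate(task_text: str, lm=toy_lm) -> str:
    # 1. The LM emits a tool invocation as ordinary text, ending at
    #    the "|result" delimiter.
    sequence = task_text + lm(task_text)

    # 2. Parse the tool name and input out of the generated text.
    call = sequence.split("|result")[0].rsplit("|", 1)[1]
    tool_name, tool_input = call.strip().split(" ", 1)

    # 3. Run the tool and append its output to the text sequence.
    sequence += " " + TOOLS[tool_name](tool_input)

    # 4. The LM resumes generation conditioned on the tool result and
    #    produces the final answer, ending at the "|output" delimiter.
    return (sequence + " " + lm(sequence)).split("|output")[0].strip()

print(talm_generate("Q: What is 23 * 17? "))
# -> Q: What is 23 * 17? |calculator 23 * 17 |result 391 The answer is 391.
```

Roughly speaking, the paper's iterative self-play technique then bootstraps from traces like this one: the model attempts tasks with tool calls, and trajectories whose final answer matches the reference are added back to the fine-tuning set, growing the tool-use data without manual labelling.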
Aaron Parisi is a software developer at Google, currently in Google Brain, where he researches meta-reinforcement learning algorithms and language understanding models (aarontp@proton.me).