London AI4Code: "A Systematic Evaluation of Large Language Models of Code" with Vincent Hellendoorn

Hosted by Konstantina Dritsa & Gabriel Gordon-Hall

Zoom

About Event

Large language models (LMs) of code have recently shown tremendous promise in completing code and synthesizing code from natural language descriptions. However, the current state-of-the-art code LMs (e.g., Codex (Chen et al., 2021)) are not publicly available, leaving open many questions about their model and data design decisions. In this meetup, Vincent Hellendoorn will fill in some of these blanks through a systematic evaluation of the largest existing models: Codex, GPT-J, GPT-Neo, GPT-NeoX-20B, and CodeParrot, across various programming languages.

Paper: https://arxiv.org/abs/2202.13169

Vincent Hellendoorn is an Assistant Professor of computer science at CMU, where he builds intelligent tools for software engineering. His research spans three fronts: 1. investigating new applications of AI in SE to understand the community’s needs broadly and provide practical value (e.g., FSE’18, FSE’21 IRV, EMSE’21); 2. analyzing current models and trends to make sure the results are impactful (ICSE’19, FSE’21, CACM’22); 3. building new models based on these insights (ICLR’20, NeurIPS’21, ICLR’22), including PolyCoder, briefly the largest open-source model trained exclusively on source code.
Personal website: http://vhellendoorn.github.io/