Jane Pan will present her MSE talk "What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning" on Monday, April 24, 2023 at 2pm in CS 402.

The members of her committee are as follows:
Advisor: Danqi Chen; Reader: Karthik Narasimhan

Title: What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning 

Abstract: 
Large language models (LLMs) exploit in-context learning (ICL) to solve tasks with only a few demonstrations, but its mechanisms are not yet well understood. Some works suggest that LLMs only recall concepts already learned during pre-training, while others hint that ICL performs implicit learning over the demonstrations. We characterize two ways through which ICL leverages demonstrations: task recognition (TR) captures the extent to which LLMs can recognize a task through demonstrations (even without ground-truth labels) and apply their pre-trained priors; task learning (TL) is the ability to capture new input-label mappings unseen in pre-training. Using a wide range of classification datasets and two LLM families (GPT-3 and OPT), we design controlled experiments to disentangle the roles of TR and TL in ICL. We show that (1) models can achieve non-trivial performance with only TR, and TR does not further improve with larger models or more demonstrations; (2) LLMs acquire TL as the model size scales; and (3) the correlation between model size and TL performance strengthens with the number of demonstrations. Our findings reveal two distinct forces behind ICL, and we advocate distinguishing between them in future ICL research.
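To give a flavor of the kind of controlled experiment the abstract alludes to, here is a minimal, hypothetical sketch of how one might probe TR separately from TL: format the same demonstrations once with their gold labels and once with labels drawn at random from the label set. Randomized labels destroy the input-label mapping, so any accuracy a model retains on such prompts can only come from recognizing the task (TR), not from learning the mapping (TL). The prompt template, dataset, and function names below are illustrative assumptions, not the talk's actual setup.

```python
import random


def build_icl_prompt(demos, query, label_mode="gold", seed=0):
    """Format demonstrations plus a query into an in-context learning prompt.

    demos      : list of (text, label) pairs used as demonstrations.
    label_mode : "gold" keeps the true labels (permits both TR and TL);
                 "random" replaces each label with one drawn uniformly from
                 the label set, isolating task recognition (TR).
    The "Review:/Sentiment:" template is an illustrative assumption.
    """
    rng = random.Random(seed)
    label_set = sorted({y for _, y in demos})
    lines = []
    for text, label in demos:
        shown = label if label_mode == "gold" else rng.choice(label_set)
        lines.append(f"Review: {text}\nSentiment: {shown}")
    # The query is appended without a label; the model must complete it.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)


if __name__ == "__main__":
    demos = [("A joyful, moving film.", "positive"),
             ("Dull and far too long.", "negative")]
    gold = build_icl_prompt(demos, "An instant classic.", label_mode="gold")
    rand = build_icl_prompt(demos, "An instant classic.", label_mode="random")
    print(gold)
    print("---")
    print(rand)
```

Comparing model accuracy on the two prompt variants, across model sizes and numbers of demonstrations, is one simple way to separate the two forces the abstract describes.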