CS Colloquium Speaker

Tom Kwiatkowski, Google

Tuesday, December 3 - 1:30PM

Computer Science - Room 104

Host: Karthik Narasimhan

https://www.cs.princeton.edu/events/25905

New Challenges in Question Answering: Natural Questions and Going Beyond Word Matching

Abstract:

Recently, learned deep models have surpassed human performance on a number of question answering benchmarks such as SQuAD. However, these models resort to simple word matching and answer typing heuristics, and they are easily fooled. In this talk, I will present two lines of work that aim to take us beyond the status quo in question answering and push us towards more robust representations of semantics, pragmatics, and knowledge.

First, I will present Natural Questions (NQ), a new question answering benchmark from Google. NQ contains real user questions, which require an understanding of the questioner's intent. NQ also requires systems to read entire Wikipedia pages to decide whether they fully answer a question. This is much harder than finding an answer given the knowledge that one is present. I will convince you that the question 'when was the last time a hurricane hit Massachusetts?' is under-specified with many reasonable answers, and I will tell you how we developed robust evaluation metrics to deal with this ambiguity.

In the second part of the talk, I will present a complementary method of challenging today's question answering systems by removing access to evidence documents at inference time. Instead of building joint representations of questions and documents, we read the documents ahead of inference time and retrieve answers via fast maximum inner product search. I will show that this leads to large gains in accuracy and speed when finding answers in very large corpora. I will also present preliminary results showing how our methods can be used to aggregate information from multiple diverse documents.
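
As a rough illustration of the retrieval step described above, the sketch below runs a brute-force maximum inner product search over a handful of precomputed vectors. It is not the speaker's system: the function names, the toy vectors, and the exhaustive scan are all assumptions made for illustration, and a system operating on very large corpora would presumably use a learned encoder and a fast approximate index instead.

```python
import heapq


def inner_product(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mips_search(index, query, k=1):
    """Return the k (position, score) pairs with the largest inner product.

    `index` stands in for answer encodings computed ahead of inference time;
    `query` stands in for the encoded question. Both are hypothetical here.
    """
    scored = ((inner_product(vec, query), i) for i, vec in enumerate(index))
    return [(i, score) for score, i in heapq.nlargest(k, scored)]


# Toy, hand-made vectors (not real encodings).
index = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
query = [1.0, 3.0]
results = mips_search(index, query, k=2)
```

Because the candidate vectors are fixed before any question arrives, only the question has to be encoded at inference time; the remaining work is the search itself.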