Location: CS 402
Date: Oct 9, 2019
Time: 3-4pm

Title:
Maximal Mutual Information Predictive Coding for Natural Language Processing

Abstract:
Neural predictive coding is an enormously successful approach to unsupervised representation learning in natural language processing. In this approach, a large-scale neural language model is trained to predict a missing signal (e.g., the next word or the next sentence), and the trained model is then used to produce text representations for downstream tasks. While effective, this approach is computationally expensive and yields representations that are difficult to interpret. In this talk, I will present a new approach to neural predictive coding based on maximal mutual information (MMI). Instead of predicting the raw missing signal, we define a set of interpretable latent "codes" and directly predict the underlying code of the missing signal. The model is trained by maximizing the mutual information between the predicted codes. I will first present a simple and effective neural MMI predictive coding model that advances the state of the art in unsupervised part-of-speech tagging. I will then discuss ongoing work on generalizing MMI predictive coding to structured representations.
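
For concreteness, one way to write the training objective, as a sketch based on the description above rather than the exact formulation in the talk: if $Z$ and $Z'$ denote the discrete codes predicted from an observed signal and its missing counterpart, the model maximizes their mutual information

    \max_\theta \; I(Z; Z') = \sum_{z,\, z'} p_\theta(z, z') \log \frac{p_\theta(z, z')}{p_\theta(z)\, p_\theta(z')}

Because the codes range over a small finite set (e.g., part-of-speech-like tags), the learned representations are directly interpretable, unlike raw continuous embeddings.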

Bio:
Karl Stratos is an assistant professor in the Computer Science Department at Rutgers University. His research centers on statistical approaches to unsupervised learning in natural language processing. He received his PhD in computer science from Columbia University in 2016, where he was advised by Michael Collins and also worked closely with Daniel Hsu. After his PhD, he was a senior research scientist at Bloomberg LP (2016-2017) and a research assistant professor at the Toyota Technological Institute at Chicago (2017-2019).

--
Danqi Chen
http://www.cs.princeton.edu/~danqic/