[Ml-stat-talks] Wed: Shay Cohen on spectral inference

David Mimno mimno at CS.Princeton.EDU
Mon Oct 22 09:22:10 EDT 2012

If you asked a dozen machine learning researchers to name the most
exciting developments in the field, spectral inference would be on
most of their lists. This week we're excited to welcome Shay Cohen
from Columbia, whose research is at the front of this area.

The schedule for ML lunch for the rest of the semester can be found at


Shay Cohen, Columbia
CS402, Wed Oct 24, 12:30

Title: Consistent and Efficient Algorithms for Latent-Variable PCFGs

In the past few years, there has been an increased interest in the
machine learning community in spectral algorithms for estimating
models with latent variables. Examples include algorithms for
estimating mixtures of Gaussians or for estimating the parameters of a
hidden Markov model.

Until the introduction of spectral algorithms, the EM algorithm was
the mainstay for estimation with latent variables. Still, with EM
there is no guarantee of convergence to the global maximum of the
likelihood function, and therefore EM generally does not provide
consistent estimates for the model parameters. Spectral algorithms, on
the other hand, are often shown to be consistent.

In this talk, I will present a spectral algorithm for
latent-variable PCFGs, a model widely used for parsing in the NLP
community. This model augments the nonterminals of a PCFG with
latent states. These latent states refine the nonterminal categories in
order to capture subtle syntactic nuances in the data. This model has
been successfully implemented in state-of-the-art parsers such as the
Berkeley parser.

The resulting algorithm is considerably faster than EM, as it makes
only one pass over the data. Statistics are collected from the data in
this pass, and singular value decomposition is performed on matrices
containing these statistics. Our algorithm is also provably consistent
in the sense that, given enough samples, it will estimate
probabilities for test trees close to their true probabilities under
the latent-variable PCFG model.
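The "collect statistics in one pass, then run an SVD" pattern described above can be illustrated with a toy sketch. This is not the talk's actual algorithm for L-PCFGs; it is a minimal, invented example of the general spectral recipe, where an empirical cross-covariance matrix stands in for the model-specific statistics and its top singular vectors define a low-dimensional projection playing the role of the latent states.

```python
import numpy as np

# Minimal sketch of the spectral recipe, NOT the talk's L-PCFG algorithm.
# All names and the synthetic data below are invented for illustration.
rng = np.random.default_rng(0)

d, n, m = 8, 500, 3          # feature dim, sample count, latent dimension
X = rng.normal(size=(n, d))  # toy "inside" features, one row per sample
# correlated "outside" features (plus a little noise)
Y = X @ rng.normal(size=(d, d)) + 0.01 * rng.normal(size=(n, d))

# One pass over the data: accumulate a cross-covariance statistics matrix.
Sigma = X.T @ Y / n

# SVD of the statistics matrix; the top-m left singular vectors give a
# projection into an m-dimensional "latent" space.
U, s, Vt = np.linalg.svd(Sigma)
U_m = U[:, :m]               # d x m projection matrix

Z = X @ U_m                  # reduced representation of each sample
print(Z.shape)               # (500, 3)
```

Unlike EM, nothing here is iterative: the data is touched once to build `Sigma`, and a single SVD does the rest.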

If time permits, I will also present a method aimed at improving the
efficiency of parsing with latent-variable PCFGs. This method relies
on tensor decomposition of the latent-variable PCFG. This tensor
decomposition is approximate, and therefore the new parser is an
approximate parser as well. Still, the quality of approximation can be
theoretically guaranteed by inspecting how errors from the
approximation propagate in the parse trees.
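The idea that a truncated decomposition yields a controllable approximation error can also be sketched in miniature. The example below is an assumption-laden toy, not the talk's method: it builds a 3-way tensor of known CP rank, approximates it by truncating the SVD of its mode-1 unfolding, and measures the resulting relative error, which vanishes at the true rank and becomes nonzero below it.

```python
import numpy as np

# Toy illustration of approximation error under tensor truncation.
# This is NOT the L-PCFG tensor decomposition from the talk.
rng = np.random.default_rng(1)
d, r = 6, 2

# Build a 3-way tensor of CP-rank r from random factor vectors.
A = rng.normal(size=(d, r))
B = rng.normal(size=(d, r))
C = rng.normal(size=(d, r))
T = np.einsum('ir,jr,kr->ijk', A, B, C)

def mode1_truncate(T, k):
    """Approximate T by a rank-k SVD of its mode-1 unfolding."""
    M = T.reshape(T.shape[0], -1)            # mode-1 unfolding, d x (d*d)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    M_k = (U[:, :k] * s[:k]) @ Vt[:k]        # best rank-k approximation of M
    return M_k.reshape(T.shape)

def rel_err(k):
    return np.linalg.norm(T - mode1_truncate(T, k)) / np.linalg.norm(T)

print(rel_err(r))      # ~0: truncating at the true rank loses nothing
print(rel_err(r - 1))  # > 0: a coarser truncation introduces a measurable error
```

The point mirrored from the abstract: the approximation error is explicit and measurable, so one can reason about how it propagates into downstream predictions.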

Joint work with Michael Collins, Dean Foster, Karl Stratos and Lyle Ungar.
