[Ml-stat-talks] High-Dimensional Structure Learning of Graphical Models: Trees, Latent trees and Beyond

David Blei blei at CS.Princeton.EDU
Tue Feb 15 12:43:34 EST 2011

hi ml-stat-talks,

this talk is at 2:00PM.


On Tue, Feb 15, 2011 at 12:32 PM, David Blei <blei at cs.princeton.edu> wrote:

> hi all
> anima anandkumar from u.c. irvine is giving an interesting talk tomorrow
> about graphical models.  see details below.
> best
> dave
> ---
> February 16, 2011, 2:00PM
> EQuad Room B-205
> Title : High-Dimensional Structure Learning of Graphical Models: Trees,
> Latent trees and Beyond
> Speaker: Anima Anandkumar (U.C. Irvine)
> Abstract:
> Graphical models or Markov random fields provide a graph-based framework
> for capturing dependencies between
> random variables of a large-scale multivariate distribution. This
> interdisciplinary topic has found widespread application in a variety of
> areas including image processing, bioinformatics, combinatorial
> optimization and machine learning. Estimating the graph structure of
> the model using samples drawn from it forms an important task, since the
> structure reveals important relationships between the
> variables. However, structure estimation has several challenges: in general
> graphical models, it is NP-hard, the models are typically
> in the high-dimensional regime where the number of variables is much larger
> than the number of samples obtained, and there could
> be many latent variables which are unobserved. I will address these
> challenges in the talk and provide solutions for certain classes of
> models I will focus on latent tree models in the first part of the talk.
> These are tree graphical models where there are latent variables,
> but there is no knowledge of the number or the location of the latent
> variables. We have developed two novel algorithms which are
> consistent, computationally efficient and have low sample complexity. These
> algorithms are based on the presence of an additive
> metric on the tree, due to the properties of correlations on a tree model.
> The first algorithm uses these properties to check for sibling
> relationships between node pairs and builds the tree in a recursive
> fashion. The second algorithm initially builds a tree over the
> observed variables, and then adds hidden nodes in a step-by-step fashion by
> only operating on small subsets of variables. This
> leads to considerable computational savings compared to the first
> algorithm. We modify the second algorithm for experiments on real
> data by trading off number of added latent variables with the accuracy of
> resulting model fitting via the Bayesian Information Criterion
> (BIC). Experiments on the S&P 100 monthly returns data and on the
> occurrence of words in newsgroups reveal interesting
> relationships. In the second part, I will talk about recent results on
> learning graphical models on sparse Erdos-Renyi random graphs.
> These random graphs are relevant in social networks. Since these graphs are
> locally tree-like, it is a natural question if structure
> learning is feasible in these models, given that learning tree models is
> tractable. We provide a positive answer when the model is in
> the so-called uniqueness regime, where there is a decay of long-range
> correlations. The algorithm is based on a set of conditional
> mutual information tests and is shown to be consistent for structure
> estimation with almost order-optimal sample complexity. A
> simpler algorithm based on correlation thresholding is also consistent, but
> under more stringent conditions. To the best of our
> knowledge, this is the first work to show that the regime of consistent
> structure learning contains the regime of correlation decay.
> Finally, depending on the time availability, I will briefly mention related
> works on consistent estimation of high-dimensional forest
> distributions and the characterization of extremal tree structures with
> respect to error rates for structure learning.
> BIO:
> Anima Anandkumar received her B.Tech in Electrical Engineering from the
> Indian Institute of Technology (IIT) Madras in 2004 and her MS
> and PhD degrees in Electrical Engineering from Cornell University, Ithaca,
> NY in 2009. She was at the Stochastic Systems Group at MIT, Cambridge, MA as
> a post-doctoral researcher. She has been an assistant professor at EECS
> Dept. at U.C.Irvine since July 2010. She is the recipient of the 2009 Best
> Thesis Award by the ACM Sigmetrics Society, 2008 IEEE Signal Processing
> Society Young Author Best Paper Award, 2008 IBM Fran Allen PhD fellowship,
> and student paper award at 2006 IEEE ICASSP. Her research interests are in
> the area of statistical-signal processing, network theory and information
> theory.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.cs.princeton.edu/pipermail/ml-stat-talks/attachments/20110215/a80c178e/attachment.htm>

More information about the Ml-stat-talks mailing list