Andrej Risteski will present his FPO, "New techniques for learning and inference in Bayesian models" on Thursday,  October 26, 2017 at 1pm in CS 402.

The members of his committee are:
Readers: Sanjeev Arora (adviser), Rong Ge (Duke University), and Elad Hazan
Nonreaders: Sanjeev Arora, Elad Hazan, Bernard Chazelle, and Rong Ge (Duke University), and Mark Braverman

A copy of his thesis, is available in Room 310.  Everyone is invited to attend his talk. The talk abstract follows below.

Abstract:
A common theme in machine learning is succinct modeling of distributions over large domains. Probabilistic graphical models are one of the most expressive frameworks for doing this. The two major tasks involving graphical models are learning and inference. Learning is the task of calculating the best fit graphical model from raw data, while inference is the task of answering probabilistic queries for a known graphical model (for example what is the marginal distribution of one of the variables or what is the distribution of a subset of variables, after conditioning on the values of some other subset of variables). Learning can be thought of as finding a graphical model that explains the raw data, while the inference queries extract the knowledge the graphical model contains. This thesis introduces new provable techniques for performing both of these tasks, in the context of both latent-variable models – in which a portion of the variables in the graphical model are not observed, as well as fully-observable undirected graphical models (Markov Random Fields). Chapters 2 and 3 will focus on learning latent-variable models, while Chapter 4 will focus on inference in Markov Random Fields. In Chapter 2, I will contribute the first provable results for analyzing variational Bayes: a family of alternatingminimization style algorithms which is very popular in practice for learning latent-variable models. Despite it’s popularity with practitioners, the only theoretical guarantees prior to this work concerned convergence to local minima. We will prove that under reasonable assumptions, in the context of topic models, these algorithms will converge to the global minimum. Subsequently, in Chapter 3, we will use the method-of-moments along with new techniques in tensor decomposition and constrained matrix factorization to derive algorithms for learning noisy-OR networks – the textbook example of a probabilistic model for causal relationships. Importantly, these techniques were only applicable to linear latentvariable models – which noisy-OR is not. In Chapter 4, I will contribute a new understanding of a class of variational methods for calculating partition functions in Markov Random Fields. The key technical ingredient is a connection to convex programming hierarchies – a recent area of interest in combinatorial optimization, along with approximations of the entropy of a distribution based on low-order moment information.