Prem Gopalan will present his preFPO on Friday April 18 at 3PM in Room 402. The members of his committee are: David Blei, advisor; Rob Schapire and Jake Hofman (MSR), readers; Michael Freedman and John Storey (MOL), nonreaders. Everyone is invited to attend his talk. His abstract follows below.

----------------

Title: Scalable inference of discrete outcomes: networks, genotype and user consumption

Latent variable models are probabilistic models that can be used to extract hidden structure in real data. They are important in many fields such as genetics, social network analysis and collaborative filtering. Data analysis using these models is useful in making predictions, exploring the data and in making better models. Will inference algorithms be able to cope with the scale of modern data sets? If yes, what properties of the model, data and the algorithms help in achieving scalability?

In this talk, I will present advances in statistical models and scalable inference algorithms for identifying overlapping communities in networks, ancestral populations in human genetic variations, and latent structure in user consumption data. These algorithms lie in the framework of variational inference, an approach to approximate posterior inference that has been adapted to a variety of probabilistic models.

As a detailed example, I will present hierarchical Poisson matrix factorization models for recommendation, and a corresponding variational inference algorithm. The algorithm scales to more than 100 million observations on a single CPU and predicts better than prior methods. A simple extension to this model allows for cold-start recommendations. I will end with a novel Bayesian nonparametric variant of Poisson matrix factorization that eases the burden of searching for the best number of latent components.

This talk includes ongoing work.
participants (1)
-
Melissa M. Lawson