[Ml-stat-talks] Tamara Broderick, Wed, 2/12, 4:30pm, CS105
Robert Schapire
schapire at CS.Princeton.EDU
Mon Feb 10 17:24:52 EST 2014
Feature allocations, paintboxes, and probability functions
Tamara Broderick <http://www.stat.berkeley.edu/%7Etab/>, University of
California, Berkeley <http://berkeley.edu/index.html>
Wednesday, February 12, 4:30pm
Computer Science 105
Clustering involves placing entities into mutually exclusive categories.
We wish to relax the requirement of mutual exclusivity, allowing objects
to belong simultaneously to multiple classes, a formulation that we
refer to as "feature allocation." The first step is a theoretical one.
In the case of clustering, the class of probability distributions over
exchangeable partitions of a dataset has been characterized (via
exchangeable partition probability functions and the Kingman paintbox).
These characterizations support an elegant nonparametric Bayesian
framework for clustering in which the number of clusters is not assumed
to be known a priori. We establish an analogous characterization for
feature allocation; we define notions of "exchangeable feature
probability functions" and "feature paintboxes" that lead to a Bayesian
framework that does not require the number of features to be fixed a
priori. The second step is a computational one. Rather than appealing to
Markov chain Monte Carlo for Bayesian inference, we develop a method to
transform Bayesian methods for feature allocation (and other latent
structure problems) into optimization problems with objective functions
analogous to K-means in the clustering setting. These yield
approximations to Bayesian inference that are scalable to large
inference problems.
Tamara Broderick is a PhD candidate in the Department of Statistics at
the University of California, Berkeley. Her research in machine
learning focuses on the design and study of Bayesian nonparametric
models, with particular emphasis on feature allocation as a
generalization of clustering that relaxes the mutual exclusivity and
exhaustivity assumptions of clustering. While at Berkeley, she has
been a National Science Foundation Graduate Student Fellow and a
Berkeley Fellowship recipient. She graduated with an AB in Mathematics
from Princeton University in 2007, with the Phi Beta Kappa Prize for
highest average GPA in her graduating class and with Highest Honors in
Mathematics. She spent the next two years on a Marshall Scholarship at
the University of Cambridge, where she received a Master of Advanced
Study in Mathematics for completion of Part III of the Mathematical
Tripos (with Distinction) in 2008 and an MPhil by Research in Physics in
2009. She received a Master's in Computer Science from UC Berkeley in 2013.