[Ml-stat-talks] Tamara Broderick, Wed, 2/12, 4:30pm, CS105

Robert Schapire schapire at CS.Princeton.EDU
Mon Feb 10 17:24:52 EST 2014

  Feature allocations, paintboxes, and probability functions

Tamara Broderick <http://www.stat.berkeley.edu/%7Etab/>, University of 
California, Berkeley <http://berkeley.edu/index.html>
Wednesday, February 12, 4:30pm
Computer Science 105

Clustering involves placing entities into mutually exclusive categories. 
We wish to relax the requirement of mutual exclusivity, allowing objects 
to belong simultaneously to multiple classes, a formulation that we 
refer to as "feature allocation." The first step is a theoretical one. 
In the case of clustering, the class of probability distributions over 
exchangeable partitions of a dataset has been characterized (via 
exchangeable partition probability functions and the Kingman paintbox). 
These characterizations support an elegant nonparametric Bayesian 
framework for clustering in which the number of clusters is not assumed 
to be known a priori. We establish an analogous characterization for 
feature allocation; we define notions of "exchangeable feature 
probability functions" and "feature paintboxes" that lead to a Bayesian 
framework that does not require the number of features to be fixed a 
priori. The second step is a computational one. Rather than appealing to 
Markov chain Monte Carlo for Bayesian inference, we develop a method to 
transform Bayesian methods for feature allocation (and other latent 
structure problems) into optimization problems with objective functions 
analogous to K-means in the clustering setting. These yield 
approximations to Bayesian inference that are scalable to large 
inference problems.
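
As a rough illustration of the distinction drawn above (not taken from the talk itself), the sketch below represents assignments as a binary matrix with one row per data point and one column per class. The function names and toy data are hypothetical, chosen only for this example. A clustering requires every row to contain exactly one 1; a feature allocation, as described here, drops that requirement, so a row may contain several 1s or none.

```python
# Illustrative sketch only; names and toy data are assumptions, not from the talk.
# Rows are data points, columns are classes (clusters or features).

def is_feature_allocation(Z):
    """True if every entry is 0 or 1; a row may have zero, one, or many 1s."""
    return all(v in (0, 1) for row in Z for v in row)

def is_partition(Z):
    """True if Z is binary and every row has exactly one 1
    (mutually exclusive and exhaustive class membership)."""
    return is_feature_allocation(Z) and all(sum(row) == 1 for row in Z)

# Each point belongs to exactly one class: a clustering.
clustering = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]

# Points may share several features, or have none: a feature allocation.
features = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
]

print(is_partition(clustering), is_feature_allocation(clustering))
print(is_partition(features), is_feature_allocation(features))
```

Under this representation every clustering is also a valid feature allocation, but not the reverse, which is the sense in which feature allocation generalizes clustering.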

Tamara Broderick is a PhD candidate in the Department of Statistics at 
the University of California, Berkeley. Her research in machine
learning focuses on the design and study of Bayesian nonparametric 
models, with particular emphasis on feature allocation as a
generalization of clustering that relaxes the mutual exclusivity and 
exhaustivity assumptions of clustering. While at Berkeley, she has
been a National Science Foundation Graduate Student Fellow and a 
Berkeley Fellowship recipient. She graduated with an AB in Mathematics 
from Princeton University in 2007, with the Phi Beta Kappa Prize for 
highest average GPA in her graduating class and with Highest Honors in 
Mathematics. She spent the next two years on a Marshall Scholarship at 
the University of Cambridge, where she received a Master of Advanced 
Study in Mathematics for completion of Part III of the Mathematical 
Tripos (with Distinction) in 2008 and an MPhil by Research in Physics in 
2009. She received a Master's in Computer Science from UC Berkeley in 2013.
