<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: arial,helvetica,sans-serif; font-size: 12pt; color: #000000'><div style="color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;"><style>p { margin: 0; }</style><div style="font-family: arial,helvetica,sans-serif; font-size: 12pt; color: #000000"><b>Bayesian Nonparametric Models and "Big Data"</b>

<br>

<b><a href="http://www.cs.berkeley.edu/%7Ejpaisley/" target="_blank">John Paisley</a></b>, <a href="http://www.berkeley.edu/index.html" target="_blank">University of California, Berkeley</a>

<br>Monday, February 25, 2013, 4:30pm<br>Computer Science Small Auditorium, Room 105<br>

<br>

Bayesian nonparametrics is an area in machine learning in which models 

grow in size and complexity as data accrue. As such, they they are 

particularly relevant to the world of "Big Data", where it may be 

difficult or even counterproductive to fix the number of parameters a 

priori. A stumbling block for Bayesian nonparametrics has been that 

their algorithms for posterior inference generally show poor 

scalability. In this talk, we tackle this issue in the domain of 

large-scale text collections. Our model is a novel tree-structured model

 in which documents are represented by collections of paths in an 

infinite-dimensional tree. We develop a general and efficient 

variational inference strategy for learning such models based on 

stochastic optimization, and show that with this combination of modeling

 and inference approach, we are able to learn high-quality models using 

millions of documents.<br><br>John Paisley received the B.S.E. (2004), M.S. (2007) and Ph.D. (2010) in

 Electrical & Computer Engineering from Duke University, where his 

advisor was Lawrence Carin. He was a postdoctoral researcher with David 

Blei in the Computer Science Department at Princeton University, and 

currently with Michael Jordan in the Department of EECS at UC Berkeley. 

He works on developing Bayesian models for machine learning 

applications, particularly for dictionary learning and topic modeling.

</div><br></div><br></div></body></html>