[Ml-stat-talks] Fwd: [talks] Mehmet Basbug will present his FPO "Integrating Exponential Dispersion Models to Latent Structures" on Tuesday, January 10, 2017 at 1pm in CS401

Barbara Engelhardt bee at princeton.edu
Wed Jan 4 09:36:12 EST 2017

Talk of interest next week.

---------- Forwarded message ----------

*Department of Electrical Engineering*

*Final Public Oral Examination*


*Mehmet Basbug*

Integrating Exponential Dispersion Models to Latent Structures

Tuesday, January 10, 2017

Room 401CS

Computer Science Building

1:00 PM

All those interested are welcome to attend.

Advisers: Robert E. Schapire and Barbara E. Engelhardt

Non-technical Abstract:

"Latent variable models have two basic components: a latent structure
encoding a hypothesized complex pattern and an observation model capturing
the data distribution. With the advancements in machine learning and
increasing availability of resources, we are able to perform inference in
deeper and more sophisticated latent variable models. In most cases, these
models are designed with a particular application in mind; hence, they tend
to have restrictive observation models. The challenge, surfaced with the
increasing diversity of data sets, is to generalize these latent models to
work with different data types. We aim to address this problem by utilizing
exponential dispersion models (EDMs) and proposing mechanisms for
incorporating them into latent structures.

First, we show that each common EDM family can be expressed as a
divergence from its mean in the dual domain. In particular, we argue that
each EDM family induces a unique topology. For example, the Gaussian family
relates to the Euclidean topology. We parametrize classes of EDM families
in terms of the induced topology. We then propose an adaptive algorithm
(AdaCluster) for clustering heterogeneous data sets. AdaCluster can, for
instance, identify if the underlying distribution of a multi-modal positive
continuous attribute is gamma, Gaussian or inverse-Gaussian.

Next, we generalize a Bayesian non-negative matrix factorization model
(Poisson factorization) to various data types using EDMs. Poisson
factorization has been successfully used to uncover the activity patterns
in large scale problems like the Netflix recommendation problem. We extend
the original model to other domains such as genomics and finance.
Furthermore, our model decouples the preference and activity
patterns, effectively distinguishing how much someone is interested in
seeing a given movie from what rating she would give to the movie.

Lastly, we use the Poisson factorization and EDMs within the context of
missing data. We show that an arbitrary data-generating model with EDM
output (such as a Gaussian mixture model, probabilistic matrix
factorization, a Poisson mixture model or a linear regression model) can be
coupled with a Poisson factorization encoding the missing-data pattern. In
particular, we argue that the heteroscedastic impact of the missing-data
pattern on the dispersion of the observation variable can be captured with
the proposed model."

“Please click on the attached .ics file to add this event to your calendar.”

talks mailing list
talks at lists.cs.princeton.edu
Barbara E Engelhardt
Assistant Professor
Department of Computer Science
Center for Statistics and Machine Learning
Princeton University
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Mehmet Basbug - Final Public Oral Exam.ics
Type: text/calendar
Size: 4382 bytes
Desc: not available
URL: <http://lists.cs.princeton.edu/pipermail/ml-stat-talks/attachments/20170104/aa7c3ae2/attachment.ics>
