[talks] M Dudik preFPO

Melissa M Lawson mml at CS.Princeton.EDU
Wed Feb 7 13:58:01 EST 2007


Miro Dudik will present his preFPO on Friday, February 9 at 10 AM in Room 402. The
members of his committee are Rob Schapire, advisor; David Blei and Stephen Phillips
(AT&T), readers; Moses Charikar and Olga Troyanskaya, nonreaders. Everyone is invited to
attend his talk. His abstract follows below.
-----------------------------------------
Maximum entropy, generalized regularization, and modeling species habitats

The maximum entropy (maxent) approach, equivalent to maximum likelihood, is a widely used
method for estimating probability distributions.
However, when trained on small datasets, maxent is likely to overfit.
Therefore, many smoothing techniques have been proposed to mitigate overfitting. In my
dissertation, I propose a unified treatment for a large and general class of smoothing
techniques including L1 and L2 regularization. As a result, it is possible to prove
non-asymptotic performance guarantees and derive novel regularizations based on the structure
of the sample space. To obtain solutions for a large class of maxent problems, I propose
new algorithms derived from boosting and iterative scaling. Convergence of these
algorithms is proved using a novel method, which unifies previous approaches based on
information geometry and compactness.

As an application of maxent, I discuss an important problem in
ecology: modeling distributions of biological species. Regularized maxent fits this
problem well and offers several advantages over previous techniques. In particular, it
addresses the problem in a statistically sound manner and allows principled extensions to
situations where data are collected in a biased manner or where we have access to data on
many related species.
The utility of maxent is demonstrated on large real-world datasets.


