[talks] M Dudik preFPO

Melissa M Lawson mml at CS.Princeton.EDU
Wed Feb 7 13:58:01 EST 2007

Miro Dudik will present his preFPO on Friday February 9 at 10AM in Room 402.  The 
members of his committee are Rob Schapire, advisor; David Blei and Stephen Phillips,
readers; Moses Charikar and Olga Troyanskaya, nonreaders.  Everyone is invited to 
attend his talk.  His abstract follows below.

Maximum entropy, generalized regularization, and modeling species habitats

The maximum entropy (maxent) approach, equivalent to maximum likelihood, is a widely used
method for estimating probability distributions. However, when trained on small datasets,
maxent is likely to overfit. Many smoothing techniques have therefore been proposed to
mitigate overfitting. In my
dissertation, I propose a unified treatment for a large and general class of smoothing
techniques including L1 and L2 regularization. As a result, it is possible to prove
non-asymptotic performance guarantees and derive novel regularizations based on the
structure of the sample space. To obtain solutions for a large class of maxent problems, I propose
new algorithms derived from boosting and iterative scaling. Convergence of these
algorithms is proved using a novel method, which unifies previous approaches based on
information geometry and compactness.

As an application of maxent, I discuss an important problem in
ecology: modeling distributions of biological species. Regularized maxent fits this
problem well and offers several advantages over previous techniques. In particular, it
addresses the problem in a statistically sound manner and allows principled extensions to
situations where data is collected in a biased manner or where we have access to data on
many related species.
The utility of maxent is demonstrated on large real-world datasets.
