Young-suk Lee will present his research seminar/general exam on 
Tuesday, May 15 at 2 PM in Room 401 (note room!).  The members of 
his committee are:  Olga Troyanskaya (advisor), Mona Singh, 
and David Blei.  Everyone is invited to attend his talk, and those 
faculty wishing to remain for the oral exam that follows are welcome 
to do so.  His abstract and reading list follow below.

A microarray experiment measures the abundance of thousands of
transcripts in a given biological sample in order to quantify its
unique transcriptome.  Many research groups and institutions have used
microarrays and have been publishing the resulting data, along with
the associated sample descriptions, in the GEO (Gene Expression
Omnibus) database.
Human tissue samples are arguably the most valuable biological
samples, and so most human microarray studies have so far been limited
to a few experiments for certain tissue types.  GEO has made these
human microarray data publicly available, but the free-text sample
descriptions hinder large-scale tissue-specific microarray analysis.

We present a hierarchical multi-label tissue prediction algorithm that
returns a ranking of predicted tissue types for a given microarray.
This algorithm may be used to annotate the many human microarrays in
GEO whose tissue information is hidden or even absent.  We
propose that all tissue prediction algorithms must return a ranking
because most, if not all, biological samples consist of multiple
tissue types that have a hierarchical order.  Thus, even a single
accurate prediction may not completely describe the biological sample.  We
compare the performance of prediction algorithms with and without
hierarchical information, as well as that of our algorithm, which uses
Bayesian correction to combine multiple tissue classifiers.  In the
biological community, this algorithm may be used as a comprehensive
background or sanity check on new human microarray datasets, measuring
potential contamination and tissue composition of the biological sample.
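
To make the idea of a hierarchy-aware ranking concrete, here is a
minimal sketch in Python.  It is not the algorithm from the talk: the
abstract does not specify the Bayesian correction, so this sketch
substitutes a simple chain-rule product along the path to the root.
The tissue names, the hierarchy, and the classifier scores are all
made up for illustration, and each score is assumed to be the output
of a separate per-tissue classifier, read as P(tissue | parent tissue
present).

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy hierarchy: child -> parent (None marks a root).
PARENT: Dict[str, Optional[str]] = {
    "nervous system": None,
    "brain": "nervous system",
    "cerebellum": "brain",
    "blood": None,
    "leukocyte": "blood",
}

# Made-up per-tissue classifier outputs for one microarray, each
# assumed to mean P(tissue | parent tissue present).
LOCAL_SCORES: Dict[str, float] = {
    "nervous system": 0.9,
    "brain": 0.8,
    "cerebellum": 0.5,
    "blood": 0.2,
    "leukocyte": 0.9,
}


def path_probability(
    tissue: str,
    local: Dict[str, float],
    parent: Dict[str, Optional[str]],
) -> float:
    """Multiply the local scores from `tissue` up to its root."""
    prob = 1.0
    node: Optional[str] = tissue
    while node is not None:
        prob *= local[node]
        node = parent[node]
    return prob


def rank_tissues(
    local: Dict[str, float],
    parent: Dict[str, Optional[str]],
) -> List[Tuple[str, float]]:
    """Return (tissue, combined score) pairs, highest score first."""
    scored = [(t, path_probability(t, local, parent)) for t in parent]
    return sorted(scored, key=lambda item: item[1], reverse=True)


RANKING = rank_tissues(LOCAL_SCORES, PARENT)
for tissue, score in RANKING:
    print(f"{tissue}: {score:.2f}")
```

In this toy input, "leukocyte" has a high local score but ends up last
in the ranking because its parent, "blood", is unlikely.  A child can
never outrank its ancestor under this rule, which is one simple way to
keep a ranking consistent with the hierarchy.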

