[talks] Q Zhu general exam

Melissa Lawson mml at CS.Princeton.EDU
Thu Apr 14 11:21:46 EDT 2011

Qian Zhu will present his research seminar/general exam on Wednesday April 20 
at 10AM in Room 402.  The members of his committee are:  Olga Troyanskaya
(advisor), Moses Charikar, and Tom Funkhouser.  Everyone is invited to attend 
his talk and those faculty wishing to remain for the oral exam following are welcome 
to do so.  His abstract and reading list follow below.



Human gene expression data are growing at a remarkably fast pace. 
This large compendium represents a potential gold mine of biological 
knowledge for generating hypotheses about complex human diseases. 
Despite the rapid growth of data, we are still lagging behind in the 
development of analysis and search methods to examine the data 
effectively. To date, there is no search system that allows for the fast 
exploration of the entire human compendium, which has limited the 
ability of the biological researchers to examine the data totally and 

We propose an integrative search system that enables context-sensitive, 
query-driven search of genes and experiments in the human microarray 
compendium. The system not only searches for the expression of a set of 
query genes, but also has the ability to suggest, based on 
co-expression, other genes that might be functionally related to the 
query. With these genes, biologists can design further wet-lab 
experiments to discover novel biological relationships.

My talk will consist of two parts. In the first part, I will describe 
two major challenges that we face in the large-scale search of the human 
compendium. The first challenge is the diversity of gene expression 
across datasets: using a co-expression map, I will show how datasets 
exhibit differential co-expression patterns for genes in the same 
pathway. The second challenge, variability in gene coverage across 
microarray platforms, can create biases during the integration of 
signals across datasets. In the second part, I will describe our search 
algorithm that addresses the above challenges to deliver accurate and 
context-relevant search results. I will show how much improvement we can 
gain from knowing where to search (i.e., where the good datasets are). 
Finally, I will describe our initial PageRank-like iterative search 
algorithm that is designed to improve search context and accuracy.

Reading list:

Book chapters:

Chapters 3 (3.1 - 3.4), 4 from
Statistical analysis of gene expression microarray data
Terry Speed
Chapman and Hall/CRC; 1st edition (March 26, 2003)

Chapters 13, 14, 20 from
Artificial Intelligence: A Modern Approach
Stuart Russell, Peter Norvig
Prentice Hall; 2nd edition (2003)


Disease signatures are robust across tissues and experiments.
Dudley JT, Tibshirani R, Deshpande T, Butte AJ.
Mol Syst Biol. 2009;5:307.

Ontology-driven indexing of public datasets for translational 
Shah NH, Jonquet C, Chiang AP, Butte AJ, Chen R, Musen MA.
BMC Bioinformatics. 2009 Feb 5;10 Suppl 2:S1.

Systematic bioinformatic analysis of expression levels of 17,330 human 
genes across 9,783 samples from 175 types of healthy and pathological 
Kilpinen S, Autio R, Ojala K, Iljin K, Bucher E, Sara H, Pisto T, 
Saarela M, Skotheim RI, Björkman M, Mpindi JP, Haapa-Paananen S, Vainio 
P, Edgren H, Wolf M, Astola J, Nees M, Hautaniemi S, Kallioniemi O.
Genome Biol. 2008;9(9):R139.

Exploring the functional landscape of gene expression: directed search 
of large microarray compendia.
Hibbs MA, Hess DC, Myers CL, Huttenhower C, Li K, Troyanskaya OG.
Bioinformatics. 2007 Oct 15;23(20):2692-9.

Cross-platform analysis of cancer microarray data improves gene 
expression based classification of phenotypes.
Warnat P, Eils R, Brors B.
BMC Bioinformatics. 2005 Nov 4;6:265.

Putting microarrays in a context: integrated analysis of diverse 
biological data.
Troyanskaya OG.
Brief Bioinform. 2005 Mar;6(1):34-43.

Large-scale meta-analysis of cancer microarray data identifies common 
transcriptional profiles of neoplastic transformation and progression.
Rhodes DR, Yu J, Shanker K, Deshpande N, Varambally R, Ghosh D, Barrette 
T, Pandey A, Chinnaiyan AM.
Proc Natl Acad Sci U S A. 2004 Jun 22;101(25):9309-14.

Topic-sensitive PageRank: a context-sensitive ranking algorithm for web 
Haveliwala TH.
IEEE Transactions on Knowledge and Data Engineering. 2003 Jul/Aug;15(4):784-796.

Transitive functional annotation by shortest-path analysis of gene 
expression data.
Zhou X, Kao MC, Wong WH.
Proc Natl Acad Sci U S A. 2002 Oct 1;99(20):12783-8.
