Qian Zhu will present his research seminar/general exam on Wednesday, April 20 at 10 AM in Room 402. The members of his committee are Olga Troyanskaya (advisor), Moses Charikar, and Tom Funkhouser. Everyone is invited to attend his talk, and those faculty wishing to remain for the oral exam that follows are welcome to do so. His abstract and reading list follow below.

------------------------------------

Abstract:

Human gene expression data are growing at a rapid pace. This large compendium represents a potential gold mine of biological knowledge for generating hypotheses about complex human diseases. Despite the rapid growth of data, the development of analysis and search methods to examine the data effectively is still lagging behind. To date, there is no search system that allows fast exploration of the entire human compendium, which has limited the ability of biological researchers to examine the data comprehensively and effectively.

We propose an integrative search system that enables context-sensitive, query-driven search of genes and experiments in the human microarray compendium. The system not only searches for the expression of a set of query genes, but can also suggest, based on co-expression, other genes that might be functionally related to the query. With these genes, biologists can design further wet-lab experiments to discover novel biological relationships.

My talk will consist of two parts. In the first part, I will describe two major challenges that we face in the large-scale search of the human compendium. The first challenge is the diversity of gene expression across datasets: using a co-expression map, I will show how datasets exhibit differential co-expression patterns for genes in the same pathway. The second challenge, the variability in gene coverage across microarray platforms, can create biases during the integration of signals across datasets.
In the second part, I will describe our search algorithm, which addresses the above challenges to deliver accurate and context-relevant search results. I will show how much improvement we can gain from knowing where to search (i.e., where the good datasets are). Finally, I will describe our initial PageRank-like iterative search algorithm, which is designed to improve search context and accuracy.

Reading list:

Book chapters:

Chapters 3 (3.1 - 3.4) and 4 from
Statistical analysis of gene expression microarray data
Terry Speed
Chapman and Hall/CRC; 1st edition (March 26, 2003)

Chapters 13, 14, and 20 from
Artificial Intelligence: A Modern Approach
Stuart Russell, Peter Norvig
Prentice Hall; 2nd edition (2003)

Papers:

Disease signatures are robust across tissues and experiments.
Dudley JT, Tibshirani R, Deshpande T, Butte AJ.
Mol Syst Biol. 2009;5:307.

Ontology-driven indexing of public datasets for translational bioinformatics.
Shah NH, Jonquet C, Chiang AP, Butte AJ, Chen R, Musen MA.
BMC Bioinformatics. 2009 Feb 5;10 Suppl 2:S1.

Systematic bioinformatic analysis of expression levels of 17,330 human genes across 9,783 samples from 175 types of healthy and pathological tissues.
Kilpinen S, Autio R, Ojala K, Iljin K, Bucher E, Sara H, Pisto T, Saarela M, Skotheim RI, Björkman M, Mpindi JP, Haapa-Paananen S, Vainio P, Edgren H, Wolf M, Astola J, Nees M, Hautaniemi S, Kallioniemi O.
Genome Biol. 2008;9(9):R139.

Exploring the functional landscape of gene expression: directed search of large microarray compendia.
Hibbs MA, Hess DC, Myers CL, Huttenhower C, Li K, Troyanskaya OG.
Bioinformatics. 2007 Oct 15;23(20):2692-9.

Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes.
Warnat P, Eils R, Brors B.
BMC Bioinformatics. 2005 Nov 4;6:265.

Putting microarrays in a context: integrated analysis of diverse biological data.
Troyanskaya OG.
Brief Bioinform. 2005 Mar;6(1):34-43.
Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression.
Rhodes DR, Yu J, Shanker K, Deshpande N, Varambally R, Ghosh D, Barrette T, Pandey A, Chinnaiyan AM.
Proc Natl Acad Sci U S A. 2004 Jun 22;101(25):9309-14.

Topic-sensitive PageRank: a context-sensitive ranking algorithm for web search.
Haveliwala TH.
IEEE Transactions on Knowledge and Data Engineering. 2003 Jul/Aug;15(4):784-796.

Transitive functional annotation by shortest-path analysis of gene expression data.
Zhou X, Kao MC, Wong WH.
Proc Natl Acad Sci U S A. 2002 Oct 1;99(20):12783-8.
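
For readers unfamiliar with query-driven co-expression search, the general idea in the abstract (favor the datasets where the query genes are co-expressed, then rank other genes by their co-expression with the query) can be illustrated with a minimal sketch. This is not the algorithm presented in the talk: the function name, the weighting rule (mean pairwise Pearson correlation among query genes, floored at zero), and the per-gene normalization used to cope with differing platform coverage are all assumptions made for illustration.

```python
import numpy as np

def coexpression_search(datasets, query):
    """Rank non-query genes by dataset-weighted co-expression with the query.

    datasets: list of (gene_names, matrix) pairs; matrix has shape
              (n_genes, n_samples). Gene coverage may differ per dataset.
    query:    set of query gene names.
    Hypothetical illustration only, not the method from the talk.
    """
    score_sum, weight_sum = {}, {}
    for genes, matrix in datasets:
        idx = [i for i, g in enumerate(genes) if g in query]
        n = len(idx)
        if n < 2:
            continue  # too few query genes covered to judge this dataset
        corr = np.corrcoef(matrix)  # gene-by-gene Pearson correlation
        q = corr[np.ix_(idx, idx)]
        # Assumed dataset weight: mean off-diagonal correlation among
        # the query genes, floored at zero.
        weight = max(0.0, (q.sum() - np.trace(q)) / (n * (n - 1)))
        if weight == 0.0:
            continue  # query genes are not co-expressed here
        to_query = corr[:, idx].mean(axis=1)
        for i, g in enumerate(genes):
            if g in query:
                continue
            score_sum[g] = score_sum.get(g, 0.0) + weight * to_query[i]
            weight_sum[g] = weight_sum.get(g, 0.0) + weight
    # Normalize by the weights of the datasets that actually cover each
    # gene, so genes missing from some platforms are not penalized.
    scores = {g: score_sum[g] / weight_sum[g] for g in score_sum}
    return sorted(scores, key=scores.get, reverse=True)
```

Calling `coexpression_search(datasets, {"A", "B"})` returns the remaining gene names ordered from most to least co-expressed with the query; datasets in which the query genes are uncorrelated or anti-correlated contribute nothing to the ranking.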
Melissa Lawson