Pawel Przytycki will present his research seminar/general exam on Wednesday, May 22,
at 10 AM in Room 402.  The members of his committee are:  Mona Singh (advisor), Olga
Troyanskaya, and Andrea LaPaugh.  Everyone is welcome to attend his talk and those
faculty wishing to remain for the oral exam following are welcome to do so.  His abstract
and reading list follow below.
----------------------------

ABSTRACT

Even though the first human genome was sequenced more than a decade ago, it is still largely unknown which specific genetic variations contribute to the diversity of human life. Only recently has whole-genome sequencing become affordable enough to use at a scale large enough to examine precisely the genetic differences between people. Furthermore, exome sequencing, which sequences only the 2% of the human genome that codes for proteins, has emerged as a cheap way to find minute differences between people’s genetic codes. I propose a method called Nexome, designed for network-driven analysis of exome sequencing data. Nexome is a degree-aware gene prioritization algorithm that uses a random walker method to diffuse variation information mapped to genes over a protein-protein interaction network. Unlike many previous algorithms, which rely on a pre-defined set of seed genes, Nexome is designed to work with dense non-binary data. Degree correction is derived from diffusion of an uninformed prior. Nexome confirms that there is modularity in cancer variation data when it is mapped to the protein interaction network. Furthermore, this method is capable of discovering known cancer genes that are not frequently mutated. Finally, when applied to specific cancer types, Nexome reveals that there is a modular structure to genes that are uniquely enriched in these cancers.
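
For readers unfamiliar with this family of methods, the diffusion and degree-correction idea described in the abstract can be sketched roughly as follows. This is only an illustration and not the Nexome implementation: the random walk with restart formulation, the restart probability of 0.3, the ratio-based correction, and the function names diffuse and degree_corrected_scores are all assumptions made for the example.

```python
import numpy as np

def diffuse(adj, prior, restart=0.3, tol=1e-10, max_iter=1000):
    """Random walk with restart (an assumed stand-in for the diffusion step).

    Iterates p <- (1 - restart) * W @ p + restart * p0 until convergence,
    where W is the column-normalised adjacency matrix of the network.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    # Divide each column by its sum; isolated nodes keep an all-zero column.
    W = adj / np.where(col_sums == 0, 1.0, col_sums)
    p0 = np.asarray(prior, dtype=float)
    p0 = p0 / p0.sum()
    p = p0.copy()
    for _ in range(max_iter):
        nxt = (1 - restart) * (W @ p) + restart * p0
        if np.abs(nxt - p).sum() < tol:
            return nxt
        p = nxt
    return p

def degree_corrected_scores(adj, gene_scores, restart=0.3):
    """Diffuse per-gene variation scores, then divide by the result of
    diffusing a uniform (uninformed) prior, so that a gene is not ranked
    highly merely because it has many interaction partners."""
    observed = diffuse(adj, gene_scores, restart)
    background = diffuse(adj, np.ones(len(gene_scores)), restart)
    return observed / background

# Toy network: gene 0 is a hub connected to genes 1, 2 and 3.
star = np.array([[0, 1, 1, 1],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0]])
variation = [1, 1, 1, 5]  # hypothetical dense, non-binary per-gene scores
raw = diffuse(star, variation)
corrected = degree_corrected_scores(star, variation)
```

On this toy network the uncorrected diffusion ranks the hub gene first, while the corrected scores rank gene 3 first, since it is the gene that actually carries the excess variation.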

BOOK

An Introduction to Bioinformatics Algorithms - Pevzner and Jones

PAPERS

Erten, S., Bebek, G., Ewing, R. M., & Koyutürk, M. (2011). DADA: Degree-Aware Algorithms for Network-Based Disease Gene Prioritization. BioData Mining, 4(1), 19. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/21699738

Fu, W., et al. (2013). Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature, 493, 216–220. Retrieved from http://www.nature.com/nature/journal/v493/n7431/full/nature11690.html

Kim, Y.-A., & Przytycka, T. M. (2012). Bridging the Gap between Genotype and Phenotype via Network Approaches. Frontiers in Genetics, 3, 00227. Retrieved from http://www.frontiersin.org/Journal/Abstract.aspx?s=1187&name=statistical_genetics_and_methodology&ART_DOI=10.3389/fgene.2012.00227

Lee, I., Blom, U. M., Wang, P. I., Shim, J. E., & Marcotte, E. M. (2011). Prioritizing candidate disease genes by network-based boosting of genome-wide association data. Genome Research, 21(7), 1109–1121. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/21536720

Navlakha, S., & Kingsford, C. (2010). The power of protein interaction networks for associating genes with diseases. Bioinformatics, 26(8), 1057–1063. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2853684&tool=pmcentrez&rendertype=abstract

O’Roak, B. J., Vives, L., Girirajan, S., Karakoc, E., Krumm, N., Coe, B. P., Levy, R., et al. (2012). Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature, 485(7397), 1-7. Nature Publishing Group. Retrieved from http://www.nature.com/doifinder/10.1038/nature10989

The Cancer Genome Atlas Research Network. (2012). Comprehensive genomic characterization of squamous cell lung cancers. Nature, 489, 519–525. Retrieved from http://www.nature.com/nature/journal/v489/n7417/full/nature11404.html

Rossin, E. J., et al. (2011). Proteins Encoded in Genomic Regions Associated with Immune-Mediated Disease Physically Interact and Suggest Underlying Biology. PLoS Genetics, 7(1). Retrieved from http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1001273

Wang, P. I., & Marcotte, E. M. (2010). It’s the machine that matters: Predicting gene function and phenotype from protein networks. Journal of Proteomics, 73(11), 2277–2289. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/20637909

1000 Genomes Project Consortium. (2010). A map of human genome variation from population-scale sequencing. Nature, 467, 1061–1073. Retrieved from http://www.nature.com/nature/journal/v467/n7319/full/nature09534.html