ABSTRACT
Although the first human genome was sequenced more than a decade ago, the specific genetic variations that contribute to the diversity of human life remain largely unknown. Only recently has whole-genome sequencing become affordable enough to apply at the scale needed to examine the genetic differences between people precisely. Exome sequencing, which sequences only the 2% of the human genome that codes for proteins, has emerged as an inexpensive way to find minute differences between people’s genetic codes. I propose Nexome, a method for network-driven analysis of exome sequencing data. Nexome is a degree-aware gene prioritization algorithm that uses a random walk to diffuse variation information mapped to genes over a protein-protein interaction network. Unlike many previous algorithms, which rely on a predefined set of seed genes, Nexome is designed to work with dense, non-binary data. Its degree correction is derived from the diffusion of an uninformed prior. Nexome confirms that cancer variation data are modular when mapped to the protein interaction network. The method is also capable of discovering known cancer genes that are not frequently mutated. Finally, when applied to specific cancer types, Nexome reveals a modular structure among the genes that are uniquely enriched in those cancers.
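The diffusion and degree correction described in the abstract can be illustrated with a minimal sketch. This is not the Nexome implementation: it assumes a random walk with restart on a column-normalized adjacency matrix, and it assumes the degree correction is the ratio of each gene's diffused score to the score it receives when a uniform prior is diffused over the same network. The function names, the restart probability of 0.3, and the toy five-gene network are all hypothetical.

```python
import numpy as np

def random_walk_with_restart(adj, prior, restart=0.3, tol=1e-10, max_iter=1000):
    """Diffuse `prior` over a network by iterating p <- (1 - r) W p + r p0.

    `adj` is a symmetric adjacency matrix; `prior` holds non-negative,
    possibly non-binary per-gene scores with a positive sum.
    The restart probability is an assumed value, not taken from the source.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0        # isolated nodes: avoid division by zero
    W = adj / col_sums                   # column-normalized transition matrix
    p0 = np.asarray(prior, dtype=float)
    p0 = p0 / p0.sum()                   # normalize the prior to a distribution
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

def degree_corrected_scores(adj, gene_scores, restart=0.3):
    """Assumed correction: diffused score divided by the diffused uniform prior."""
    observed = random_walk_with_restart(adj, gene_scores, restart)
    background = random_walk_with_restart(adj, np.ones(len(gene_scores)), restart)
    return observed / background

# Hypothetical toy network: gene 0 is a hub linked to genes 1, 2 and 3;
# gene 3 is also linked to gene 4.
adj = np.array([
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
])
scores = np.array([0.1, 0.0, 0.0, 0.4, 1.0])   # dense, non-binary variation scores

raw = random_walk_with_restart(adj, scores)
background = random_walk_with_restart(adj, np.ones(5))
corrected = degree_corrected_scores(adj, scores)
ranking = np.argsort(-corrected)               # genes ordered from highest to lowest score
```

Under these assumptions, a gene whose corrected score is above 1 receives more diffused signal than its network position alone would explain, which is one way to keep highly connected hubs from dominating the ranking.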
BOOK
An Introduction to Bioinformatics Algorithms - Pevzner and Jones
PAPERS
Erten, S., Bebek, G., Ewing, R. M., & Koyutürk, M. (2011). DADA: Degree-Aware Algorithms for Network-Based Disease Gene Prioritization. BioData Mining, 4(1), 19. Retrieved from
http://www.ncbi.nlm.nih.gov/pubmed/21699738
Lee, I., Blom, U. M., Wang, P. I., Shim, J. E., & Marcotte, E. M. (2011). Prioritizing candidate disease genes by network-based boosting of genome-wide association data. Genome Research, 21(7), 1109–1121. Retrieved from
http://www.ncbi.nlm.nih.gov/pubmed/21536720
O’Roak, B. J., Vives, L., Girirajan, S., Karakoc, E., Krumm, N., Coe, B. P., Levy, R., et al. (2012). Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature, 485(7397), 1-7. Nature Publishing Group. Retrieved from
http://www.nature.com/doifinder/10.1038/nature10989
The Cancer Genome Atlas Research Network. (2012). Comprehensive genomic characterization of squamous cell lung cancers. Nature, 489(7417), 519–525. Retrieved from
http://www.nature.com/nature/journal/v489/n7417/full/nature11404.html
Wang, P. I., & Marcotte, E. M. (2010). It’s the machine that matters: Predicting gene function and phenotype from protein networks. Journal of Proteomics, 73(11), 2277–2289. Retrieved from
http://www.ncbi.nlm.nih.gov/pubmed/20637909