<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: arial,helvetica,sans-serif; font-size: 12pt; color: #000000'><br><div style="color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;"><div style="font-family: arial,helvetica,sans-serif; font-size: 12pt; color: #000000"><div style="color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;"><style>p { margin: 0; }</style><div style="font-family: arial,helvetica,sans-serif; font-size: 12pt; color: #000000"><div style="color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;"><style>p { margin: 0; }</style><div style="font-family: arial,helvetica,sans-serif; font-size: 12pt; color: #000000"><b>Unraveling the heterogeneity and dynamics of regulatory elements in the human genome</b>
<br>
<b><a href="https://sites.google.com/site/anshulkundaje/" target="_blank">Anshul Kundaje</a></b>, <a href="http://www.mit.edu/" target="_blank">Massachusetts Institute of Technology</a>
<br>Tuesday, March 12, 2013, 4:30pm<br>Computer Science 105<br>
<br>
In 2003, the Human Genome Project marked a major scientific milestone by
releasing the first consensus DNA sequence of the human genome. The
ENCODE Project (Encyclopedia of DNA Elements) was launched to pick up
where the Human Genome Project left off, with the ambitious goal of
systematically deciphering the potential function of every base (letter)
in the genome. ENCODE has generated the largest collection of
functional genomic data in humans to date, measuring the activity of
thousands of molecular components in a variety of normal and diseased
cellular contexts. In this talk, I will describe novel
computational and machine learning approaches that I developed for
integrative analysis of massive compendia of diverse biological data
such as ENCODE to unravel the functional heterogeneity and variation of
regulatory elements in the human genome and their implications in human
disease.<br><br><p>
I will begin with a gentle introduction to the diversity and scale of
ENCODE data and a brief overview of robust statistical methods that we
developed for automated detection of DNA binding sites of hundreds of
regulatory proteins from noisy experimental data. Regulatory proteins
can perform multiple functions by interacting with and co-binding DNA
with different combinations of other regulatory proteins. I developed a
novel discriminative machine learning formulation based on regularized
rule-based ensembles that was able to sort through the combinatorial
complexity of possible regulatory interactions and learn statistically
significant item-sets of co-binding events at an unprecedented level of
detail. I found extensive evidence that regulatory proteins can switch
partners at different sets of genomic domains, both within a single cell
type and across different cell types, thereby affecting structural and
chemical properties of DNA and regulating different functional categories
of target genes. Using regulatory elements discovered from ENCODE data, we
were also able to provide putative functional interpretations for up to
81% of all publicly available sequence variants (mutations) identified
in large-scale disease studies and generate new hypotheses by
integrating multiple sources of data. <br></p><p><br></p><p>
Finally, I will present a brief overview of my recent efforts on using
multivariate Hidden Markov models to analyze the dynamics of various
chemical modifications to DNA across three key axes of variation:
across multiple species, across different cell types in a single species
(human), and across multiple human individuals for the same cell type.
Our results indicate a remarkable universality of chemical modifications
defining hidden regulatory states across the animal kingdom, alongside
dramatic differences in the variation and functional impact of these
regulatory elements between cell types and individuals. <br></p><p><br></p>
Together, these efforts take us one step closer to learning
comprehensive models of gene regulation in humans in order to improve
our systems-level understanding of cellular processes and complex
diseases.
</div><br></div></div></div></div></div><br></div></body></html>