Matt Hibbs will present his preFPO on Friday, August 10 at 10AM in
Carl Icahn Labs Room 200 (CIL 200). The members of his committee are:
Olga Troyanskaya, advisor; Kai Li and Tom Funkhouser, readers; David
Botstein (MOL/Genomics) and Leonid Kruglyak (EEB/Genomics),
nonreaders. Everyone is invited to attend his talk. His title and
abstract follow below.
-------------------------------------
Title:
Analysis and Visualization of Large-Scale Gene Expression Microarray Compendia
Over the past decade, gene expression microarray data has
become one of the most important tools available for biologists to understand
molecular processes and mechanisms on the whole-genome scale. Microarray data
provides a window into the inner workings of the transcriptional process that is
vital for cellular maintenance, development, biological regulation, and disease
progression. While an exponentially increasing amount of microarray data is
being generated for a wide variety of organisms, there is a severe lack of
methods designed to utilize the vast amount of data currently available. In my
work, I explore several techniques to meaningfully harness large-scale
collections of microarray data both to provide biologists with a greater ability
to explore data repositories, and to computationally utilize these repositories
to discover novel biology.
First, I will discuss techniques for visualization-based
analysis of microarray data on the scale of individual datasets. These
techniques include incorporating statistical measures into visualization schemes
and utilizing alternative views of data to gain a broader picture. Second, I
will focus on novel methods that allow users to simultaneously view multiple
datasets with the goal of providing a larger context within which to understand
individual datasets. These techniques include developing multi-dataset
visualization methods as well as utilizing new technologies such as very large
format display devices. Third, effective search and analysis techniques are
required to guide researchers and enable their effective use of large-scale
repositories. I will present a user-driven search algorithm designed both to
quickly locate relevant datasets in a collection and then to identify novel
players related to the user’s query. This technique is useful as an independent
search/exploration method, can be incorporated into visualization systems, and
can be used to predict novel functions for genes. I will discuss how we have
successfully used this approach to discover novel biology, including directing a
large-scale experimental investigation of S. cerevisiae mitochondrial organization.
The combination of visualization-based analysis methods and
exploratory algorithms such as those presented is vital to future systems
biology research. As data collections continue to grow and as new forms of data
are generated, it will become increasingly important to develop methods and
techniques that will allow experts to intelligently sift through the available
information to make new discoveries.