Allison Chaney will present her FPO, "Computational Methods for Exploring Human Behavior", on Thursday, 9/8/2016 at 3:00pm in CS 402.
The members of her committee are David Blei (adviser); readers: David Blei and Barbara Engelhardt; examiners: David Blei, Elad Hazan, and Brandon Stewart (Sociology). A copy of her thesis is available in Room 310. Everyone is invited to attend her talk. The abstract follows below:
Researchers and analysts from many diverse fields are interested in unstructured observations of human behavior; this variety of data is constantly increasing in quantity. In this dissertation, we describe a suite of computational methods to assist investigators in interpreting, organizing, and exploring this data. We develop two Bayesian latent variable models for human-centered applications; specifically, we rely on additive Poisson models, which allow behavior to be associated with various sources of influence. Given observed data, we estimate the posterior distributions of these models with scalable variational inference algorithms. These models and inference algorithms are validated on real-world data.
Developing statistical models and corresponding inference algorithms addresses only part of the needs of investigators. Non-technical researchers faced with analyzing large quantities of human behavior data are not able to use the results of inference algorithms without tools to translate estimated posterior distributions into accessible visualizations, browsers, or navigators. We present visualization based on an underlying statistical model as a first-class research problem, and provide principles to guide the construction of these systems. We demonstrate these principles with exploratory tools for two latent variable models. By considering the interplay between developing statistical models and tools for visualization, we are able to develop computational methods that provide for the full needs of investigators interested in exploring human behavior.
Qian Zhu will present his FPO, "Detecting gene similarities using large-scale content-based search systems", on Friday, 9/9/2016 at 10:00am in CS 402. The members of his committee are Olga Troyanskaya (adviser); readers: Kai Li and Vessela Kristensen (University of Oslo); examiners: Mona Singh and Andrea LaPaugh. A copy of his thesis is available in Room 310. The abstract follows below:
The accumulation of public gene expression datasets offers numerous opportunities for researchers to utilize these data to characterize gene functions, understand pathway actions, and formulate hypotheses about the molecular basis of human diseases. Yet exploring this extremely large gene expression data collection has been challenging, due to a lack of effective tools for reusing existing datasets and exploring them for targeted analyses. An important challenge is discovering robust gene signatures of biological processes and diseases, which depends on the ability to detect similar genes that share gene expression patterns across a large set of conditions. This thesis discusses query-based systems that are intended for large-scale integration and exploration of gene similarities. It also discusses their key biological applications.
In the first part, I present SEEK, a search system and a novel algorithm for searching similar (or coexpressed) genes around a multigene query of interest. The search algorithm combines coexpressed genes using a sensitive dataset weighting algorithm for effective weighting of coexpression results. Notably, through the robust search of thousands of human datasets, the retrieval of functionally co-annotated genes always improves with the inclusion of more datasets, showing the promise of the large compendia.
In the second part, I extend the work of SEEK to the expression compendia of 5 commonly studied model organisms. The new system, ModSEEK, enables accurate searches in a wider experimental variety, and has been extensively evaluated.
In the third part, I propose a novel framework for integrating and comparing coexpression context across a pair of organisms. I leverage both comparative genomics orthology data and functional genomics coexpression data, in an unsupervised framework to identify pairs of genes in an orthologous group that are similarly highly coexpressed to an orthologous query in two organisms. I show that such functionally similar pairs of genes can be used to improve the performance of single-organism gene retrieval searches. In the final part, I demonstrate how coexpressed genes can be used to identify important transcription factors and dysregulated processes underlying breast cancer subtypes. This part highlights the promise of coexpressed genes in providing an understanding of cancer dysregulations.
participants (1)
-
Nicki Gotsis