Ruth Dannenfelser will present her Pre FPO "From gene expression data to biomedical abstracts: data-driven approaches and systems to interrogate complex disease" on Tuesday, July 16, 2019 at 9am in CS 402.

The members of her committee are: Olga Troyanskaya (adviser), Mona Singh, Kai Li, Yibin Kang (Mol Bio), Wendell Lim (UCSF).

Everyone is invited to attend her talk. The talk title and abstract follow below:

Title: From gene expression data to biomedical abstracts: data-driven approaches and systems to interrogate complex disease

Large-scale genomic studies now give more predictive power than ever, allowing us to profile the composition of tissues, study cellular functions, and understand organismal traits at an unprecedented level of detail. This is particularly important for studying heterogeneous diseases, such as cancer, where small patient-specific differences play critical roles in disease development and progression. Here we build on the wealth of publicly available data to examine the interplay between cancer and the immune system and then develop two query-based visualization systems that enable interactive data exploration for the biomedical community at large.

The first part of my talk will present two perspectives on cancer and the immune system, starting with a semi-supervised approach for immune cell type quantification. Using derived immune markers, we examine lymphocyte infiltration in breast cancer and find that estrogen receptor activity and genomic complexity are the key factors driving variation in lymphocytic infiltrate across individuals. More broadly, we leverage public gene expression data to further the development of targeted immunotherapeutics for solid tumors. Engineered T cell therapies have shown great promise for hematological cancers but have found only limited success in targeting solid tumors due to off-target effects. Working closely with experimental collaborators, we are developing a method to prioritize safer pairs of antigen targets that will help engineered T cells home in on specific tumor targets while minimizing damage to normal tissues.

The second part will cover how we can extract unbiased signals from large collections of biomedical data in the form of scientific abstracts and large repositories of transcriptomics data. First, we show how we can obtain informative tissue-disease-gene relationships from abstracts and integrate them into a web system that presents different snapshots of curated interactions and adds tissue and disease annotations to gene lists from experimental assays (e.g., GWAS, differentially expressed genes, drug screens). Second, we extend SEEK, a gene expression search engine that simultaneously returns coexpressed genes and relevant datasets where the genes are likely coregulated. Our extension enables significant performance improvements, expands the search space across the major model organisms, and provides a new cross-organism exploration interface to facilitate translational research.