Matthew Myers will present his Pre FPO "Inferring tumor heterogeneity from DNA sequencing data" on Thursday, January 27, 2022 at 3pm via Zoom.

Zoom link: https://princeton.zoom.us/j/94807976020?pwd=d21JY2VLZElXbjlCZXJnbjRwL0svUT09

Committee members: Ben Raphael (adviser), Mona Singh, and Olga Troyanskaya
Readers: Yuri Pritykin and Quaid Morris (Sloan Kettering Institute)

All are welcome to attend.

Title: Inferring tumor heterogeneity from DNA sequencing data

Abstract: Cancer is an evolutionary process where cells acquire somatic mutations over time at various genomic scales, from single-position changes (single-nucleotide variations or SNVs), to changes in the number of copies (copy number aberrations or CNAs) of larger regions of the genome, to duplication of the entire genome (whole-genome duplication or WGD). As a result of this process, each tumor is heterogeneous -- it consists of a mixture of different populations of cells, or clones, each characterized by a distinct set of mutations. Understanding these clones and their evolution is critical to treating many cancers. In this talk, I will present three computational methods for inferring tumor heterogeneity from DNA sequencing data. First, I will present CALDER, which uses SNVs from longitudinal bulk sequencing samples to infer a phylogenetic tree which defines the tumor clones and their evolutionary relationships. Unlike prior methods, CALDER applies constraints derived from the sequential ordering of samples to produce more plausible trees. Next, I will present HATCHet2, which infers CNAs from one or more bulk DNA sequencing samples. HATCHet2 combines several innovations including reference-based phasing and location-aware clustering with the novel factorization approach of HATCHet. Finally, I will present SBMClone, which uses SNVs identified from ultra-low-coverage single-cell DNA sequencing data to group tumor cells. SBMClone uses a stochastic block model to distinguish tumor cells in data that was too sparse for previous methods.