Alfredo Velasco will present his talk "Identifying Cancer-Driver Mutations using Machine Learning" on Tuesday, April 25, 2023 at 2pm in FCHEM 181

Alfredo Velasco will present his MSE talk "Identifying Cancer-Driver Mutations using Machine Learning" on Tuesday, April 25, 2023 at 2pm in FCHEM 181.

The members of his committee are as follows: Mona Singh (adviser) and Yuri Pritykin (reader).

All are welcome to attend. Talk title and abstract follow below.

Title: "Identifying Cancer-Driver Mutations using Machine Learning"

Abstract: Cancer is a cellular disease caused by somatic alterations that increase the growth rate of a cell. Many of these alterations are mutations that occur within protein-coding regions of the genome. Cancerous cells also contain mutations that don't contribute to disease progression. Thus, there is a need for methods that can differentiate between mutations that cause cancer (called driver mutations) and mutations that don't (called passenger mutations). This project focuses on different methods for generating a set of features that can be fed into a machine learning (ML) model. One of these methods uses protein language models, which produce high-dimensional representations of amino acids that capture the context in which they appear within protein sequences. Another, more familiar method is to use the physicochemical properties of the amino acids, such as hydrophobicity, or whether they are evolutionarily conserved. We also look at a variety of ML models, including random forests, gradient boosting, logistic regression, Gaussian naive Bayes, and decision trees, to determine which model gives the best prediction. The ultimate goal of this project is to determine which combination of feature set and ML model gives the best prediction performance and can properly distinguish driver mutations from passenger mutations.