Yida Wang will present his research seminar/general exam on Thursday, May 19 at 2 PM in Room 302 (note room!). The members of his committee are: Kai Li (advisor), Moses Charikar, and Rob Schapire.

Everyone is invited to attend his talk, and those faculty wishing to remain for the oral exam following are welcome to do so. His abstract and reading list follow below.

----------------

Full Correlation Computation and Analysis of Large-Scale fMRI

Abstract:

Human brain imaging such as functional magnetic resonance imaging (fMRI) has had a transformational impact on neuroscience. However, due to past technology limitations, researchers have constrained their analyses by making assumptions that are biased or false, leading to missed opportunities for scientific discovery and, in the worst case, incorrect inferences. In this project, we leverage recent advances in large-scale computing to analyze, for the first time, the full correlation matrix of fMRI data. The success of our approach will result in a quantum increase in the power of neuroimaging analysis, extracting 7 orders of magnitude more information from the data than previous approaches and opening up qualitatively new opportunities for neuroscience research.

To take full advantage of multicore processors and parallel computing, we have developed a parallel data analysis tool for a cluster of computers. This tool achieves high utilization of the CPU cores on a cluster. On a system with 528 CPU cores, the tool can perform a full-correlation study of a one-hour fMRI dataset in 2.5 days, more than an order of magnitude improvement over a native parallel approach. We have also studied how to analyze and visualize massive amounts of full-correlation result data (petabytes) and built corresponding tools for the data analysis pipeline.

In this talk, I will describe our approach and report the current status and future plans.

This work is advised by Prof.
Moses Charikar and Prof. Kai Li in the Princeton Computer Science Department, and Prof. Nicholas Turk-Browne and Prof. Jonathan Cohen at the Princeton Neuroscience Institute.

Reading List:

Book:

J. Hennessy and D. Patterson. Computer Architecture: A Quantitative Approach, 4th Edition.

Papers:

K. Goto and R. Van De Geijn. Anatomy of high-performance matrix multiplication. ACM Trans. Math. Softw. 34, 3, 2008.

J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Proc. 6th Symp. Operating System Design and Implementation (OSDI), Usenix Assoc., 2004, pp. 137-150.

F. Pereira, T. M. Mitchell and M. Botvinick. Machine learning classifiers and fMRI: A tutorial overview. Neuroimage, 2008.

K. A. Norman, S. M. Polyn, G. J. Detre and J. V. Haxby. Beyond mind-reading: multi-voxel pattern analysis of fMRI data. Trends Cogn. Sci. 10, 424-430, 2006.

R. J. Bayardo, Y. Ma and R. Srikant. Scaling up all pairs similarity search. In WWW, 2007.

R. Vernica, M. J. Carey and C. Li. Efficient Parallel Set-Similarity Joins Using MapReduce. In SIGMOD, 2010.

J. Cheverud and G. Marroig. Comparing covariance matrices: Random skewers method compared to the common principal components model. Genetics and Molecular Biology, 2007.