Nick Johnson Pre-FPO
Nick Johnson will present his preFPO on Thursday April 17 at 9AM in Room 402. The members of his committee are: David August, advisor; David Wentzlaff and Jae Lee (SKKU), readers; David Walker and Andrew Appel, nonreaders. Everyone is invited to attend his talk. His abstract follows below. --------- Title: Static Dependence Analysis in an Infrastructure for Automatic Parallelization. Now that parallel architectures are common, software must exploit parallel functional units to fully utilize hardware resources and achieve efficient execution. Although developers may restructure their applications for explicit parallelism, that process requires developers to reason about low-level details of the execution environment to avoid concurrency bugs and achieve parallel performance. A favorable alternative is automatic thread extraction, since it relieves the developer of the arduous task of (re-)developing applications with each new architecture. This dissertation presents compiler middle-ware to support automatic thread extraction---analyses and transformations that allow the compiler to deliver parallel performance from sequentially-specified input programs. Specifically, this thesis contributes, 1. Collaborative Dependence Analysis: Compilers use static analysis to determine facts about the input code. These facts restrict compiler transformations to ensure correctness. The collaborative dependence analysis framework combines simple analysis algorithms into an ensemble algorithm. Such ensembles are structured for collaboration: the ensemble algorithm disproves dependence queries which no member analysis algorithm disproves alone. Preliminary results demonstrate that this framework has a large impact on even speculative thread extraction systems: for some benchmarks, the strength of analysis is the difference between a 28x speedup and an 11x slowdown (geomean) measured on 50 cores. 2. Scaling Analysis to Parallelize Larger Scopes: As scopes grow large, the burden of analysis becomes prohibitive. Observing that parallelization depends on the strongly-connected components of the PDG (the DAG_SCC), this dissertation presents a faster algorithm to compute the DAG_SCC. This algorithm greedily identifies dependence edges which cannot affect parallelization. It reduces analysis time by skipping such unimportant dependence edges, instead spending analysis effort on the remaining, important queries. Overall, this faster algorithm reduces analysis time by half while maintaining precision. The dissertation presents an end-to-end integration of these contributions. This infrastructure has proved robust and flexible; many works from the Liberty Research Group have built upon this infrastructure.
participants (1)
-
Melissa M. Lawson