Ziyang Xu will present his FPO "A Fast and Extensible Memory Profiling Framework" on September 27, 2024 at 10am in Friend 125.
The members of his committee are as follows:

Examiners: David August (adviser), Aarti Gupta, Amit Levy
Readers: Zachary Kincaid and Simone Campanoni (Northwestern University)

Abstract:

Recent advancements in automatic parallelization offer a promising way to fully leverage modern processors, yet these techniques remain underutilized. To understand this gap, this dissertation presents a field study of the computational usage and needs. The study reveals that although many researchers have access to abundant parallel resources, they often lack the necessary tools or expertise to utilize them effectively, highlighting the need for practical, user-friendly, and efficient solutions.

This dissertation explores the practicality of memory profiling, a critical technique for automatic parallelization. Memory profiling captures programs' dynamic memory behavior, assisting in debugging, tuning, and enabling advanced compiler optimizations. Building practical memory profilers often requires extensive compiler expertise, adeptness in program optimization, and significant implementation efforts, leaving a gap in the availability of fast and efficient profilers.

To bridge this gap, this dissertation presents PROMPT, a pioneering framework for streamlined development of fast memory profilers. With it, developers only need to specify profiling events and define the core profiling logic, bypassing the complexities of custom instrumentation and intricate memory profiling components and optimizations. By integrating dynamic binary instrumentation alongside LLVM-IR and source instrumentation, PROMPT ensures comprehensive memory access coverage while reducing reliance on binary instrumentation.

Two state-of-the-art memory profilers were ported to PROMPT with all features preserved. By focusing on the core profiling logic, the code was reduced by more than 65% and the profiling speed was improved by 5.3x and 7.1x respectively. To further underscore PROMPT's impact, a tailored memory profiling workflow was constructed for a sophisticated compiler optimization client. In just 570 lines of code, this redesigned workflow satisfies the client's memory profiling needs while achieving more than 90% reduction in profiling time and improved robustness compared to the original profilers.
Nicki Mahler