Matt Zoufaly will present his research seminar/general exam on Tuesday, October 11 at 8 AM in Room 402. The members of his committee are David August (advisor), Margaret Martonosi, and JP Singh. Everyone is invited to attend his talk, and those faculty wishing to remain for the oral exam following are welcome to do so. His abstract and reading list follow below.

Title: A Framework for Runtime Parallelization

Abstract:
Performance no longer scales with the abundance of parallel resources provided by multicore devices. Prior work in manual parallelization as well as automatic static parallelization has shown promise, but a return to the era in which processor improvements alone could be relied upon to consistently deliver improved application performance remains more desirable. To help address this problem, I present the HyPar dynamic parallelization framework for multicore systems. HyPar is a hybrid of lightweight hardware extensions and an efficient run-time parallelizer designed to provide the illusion of a single fast processor core for sequential binaries. In addition to a hot loop detector, each core in a HyPar system includes a memory characterizer. These memory characterizers guide memory speculation during parallelization, and they provide for distributed memory misspeculation detection during parallel execution. An instance of the HyPar framework, configured to use a custom memory characterizer and to perform the speculative DOALL transformation, yields a geomean speedup of 2.4x for 14 unmodified sequential x86 binaries on a commodity machine with four cores using simulated hardware components.

Reading List:

1.1 Textbooks
• A. W. Appel. Modern Compiler Implementation in ML. Cambridge University Press, 1998
• J. L. Hennessy and D. A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, San Francisco, CA, 1996

1.2 Binary Parallelization Papers
• B. Hertzberg and K. Olukotun. Runtime automatic speculative parallelization. In 2011 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pages 64–73, April 2011
• A. Kotha, K. Anand, M. Smithson, G. Yellareddy, and R. Barua. Automatic parallelization in a binary rewriter. In Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO ’43, pages 547–557, Washington, DC, USA, 2010. IEEE Computer Society
• C. Wang, Y. Wu, E. Borin, S. Hu, W. Liu, D. Sager, T.-F. Ngai, and J. Fang. Dynamic parallelization of single-threaded binary programs using speculative slicing. In ICS ’09, pages 158–168, 2009
• E. Yardimci and M. Franz. Dynamic parallelization and mapping of binary executables on hierarchical platforms. In CF ’06: Proceedings of the 3rd ACM International Conference on Computing Frontiers, pages 127–138, New York, NY, USA, 2006. ACM

1.3 Speculative Parallelization
• L. Rauchwerger and D. Padua. The LRPD test: speculative run-time parallelization of loops with privatization and reduction parallelization. ACM SIGPLAN Notices, 30(6):218–232, 1995
• J. G. Steffan, C. B. Colohan, A. Zhai, and T. C. Mowry. A scalable approach to thread-level speculation. In Proceedings of the 27th International Symposium on Computer Architecture, pages 1–12, June 2000
• H. Zhong, M. Mehrara, S. Lieberman, and S. Mahlke. Uncovering hidden loop level parallelism in sequential applications. In HPCA ’08: Proceedings of the 14th International Symposium on High-Performance Computer Architecture, 2008

1.4 Dynamic Optimizations
• V. Bala, E. Duesterwald, and S. Banerjia. Dynamo: A transparent dynamic optimization system. In Proceedings of the ACM SIGPLAN ’00 Conference on Programming Language Design and Implementation, pages 1–12, June 2000
• A. Cristal, O. Santana, F. Cazorla, M. Galluzzi, T. Ramirez, M. Pericas, and M. Valero. Kilo-instruction processors: overcoming the memory wall. IEEE Micro, 25(3):48–57, May-June 2005
• M. C. Merten, A. R. Trick, R. D. Barnes, E. M. Nystrom, C. N. George, J. C. Gyllenhaal, and W. W. Hwu. An architectural framework for runtime optimization. To appear in IEEE Transactions on Computers Special Issue on Dynamic Optimization, 2001

1.5 Memory Profiling
• D. M. Gallagher, W. Y. Chen, S. A. Mahlke, J. C. Gyllenhaal, and W. W. Hwu. Dynamic memory disambiguation using the memory conflict buffer. In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 183–193, October 1994
• M. Herlihy and J. E. B. Moss. Transactional memory: Architectural support for lock-free data structures. In Proceedings of the 20th Annual International Symposium on Computer Architecture (ISCA), 1993

Melissa M. Lawson