[talks] M Zoufaly general exam

Melissa M. Lawson mml at CS.Princeton.EDU
Tue Oct 4 16:16:49 EDT 2011

Matt Zoufaly will present his research seminar/general exam on 
Tuesday October 11 at 8AM in Room 402.  The members of his 
committee are:  David August, advisor; Margaret Martonosi, 
JP Singh.  Everyone is invited to attend his talk, and those 
faculty wishing to remain for the oral exam following are 
welcome to do so.  His abstract and reading list follow.

Title:  A Framework for Runtime Parallelization

Performance no longer scales with the abundance of parallel resources provided by
multicore devices. Prior work in manual parallelization as well as automatic static parallelization
has shown promise, but a return to the era where processor improvements
alone could be relied upon to consistently deliver improved application performance
remains more desirable. To help address this problem, I present the HyPar dynamic
parallelization framework for multicore systems. HyPar is a hybrid of lightweight hardware
extensions and an efficient run-time parallelizer designed to provide the illusion
of a single fast processor core for sequential binaries. In addition to a hot loop detector,
each core in a HyPar system includes a memory characterizer. These memory
characterizers guide memory speculation during parallelization, and they provide for
distributed memory misspeculation detection during parallel execution. An instance of
the HyPar framework, configured to use a custom memory characterizer and to perform
the speculative DOALL transformation, yields a geomean speedup of 2.4x for
14 unmodified sequential x86 binaries on a commodity machine with four cores using
simulated hardware components.

Reading List:

1.1 Textbooks
• A. W. Appel. Modern Compiler Implementation in ML. Cambridge University
Press, 1998
• J. L. Hennessy and D. A. Patterson. Computer Architecture: A Quantitative
Approach. Morgan Kaufmann, San Francisco, CA, 1996

1.2 Binary Parallelization Papers
• B. Hertzberg and K. Olukotun. Runtime automatic speculative parallelization.
In Code Generation and Optimization (CGO), 2011 9th Annual IEEE/ACM International
Symposium on, pages 64–73, April 2011
• A. Kotha, K. Anand, M. Smithson, G. Yellareddy, and R. Barua. Automatic
parallelization in a binary rewriter. In Proceedings of the 2010 43rd Annual
IEEE/ACM International Symposium on Microarchitecture, MICRO ’43, pages
547–557, Washington, DC, USA, 2010. IEEE Computer Society
• C. Wang, Y. Wu, E. Borin, S. Hu, W. Liu, D. Sager, T.-F. Ngai, and J. Fang.
Dynamic parallelization of single-threaded binary programs using speculative
slicing. In ICS’09, pages 158–168, 2009
• E. Yardimci and M. Franz. Dynamic parallelization and mapping of binary executables
on hierarchical platforms. In CF ’06: Proceedings of the 3rd ACM
International Conference on Computing Frontiers, pages 127–138, New York,
NY, USA, 2006. ACM

1.3 Speculative Parallelization
• L. Rauchwerger and D. Padua. The LRPD test: speculative run-time parallelization
of loops with privatization and reduction parallelization. ACM SIGPLAN
Notices, 30(6):218–232, 1995
• J. G. Steffan, C. B. Colohan, A. Zhai, and T. C. Mowry. A scalable approach to
thread-level speculation. In Proceedings of the 27th International Symposium on
Computer Architecture, pages 1–12, June 2000
• H. Zhong, M. Mehrara, S. Lieberman, and S. Mahlke. Uncovering hidden loop
level parallelism in sequential applications. In HPCA ’08: Proceedings of the
14th International Symposium on High-Performance Computer Architecture, 2008

1.4 Dynamic Optimizations
• V. Bala, E. Duesterwald, and S. Banerjia. Dynamo: A transparent dynamic optimization
system. In Proceedings of the ACM SIGPLAN ’00 Conference on
Programming Language Design and Implementation, pages 1–12, June 2000
• A. Cristal, O. Santana, F. Cazorla, M. Galluzzi, T. Ramirez, M. Pericas, and
M. Valero. Kilo-instruction processors: overcoming the memory wall. Micro,
IEEE, 25(3):48–57, May-June 2005
• M. C. Merten, A. R. Trick, R. D. Barnes, E. M. Nystrom, C. N. George, J. C.
Gyllenhaal, and W. W. Hwu. An architectural framework for runtime optimization.
To appear in IEEE Transactions on Computers Special Issue on Dynamic
Optimization, 2001

1.5 Memory Profiling
• D. M. Gallagher, W. Y. Chen, S. A. Mahlke, J. C. Gyllenhaal, and W. W. Hwu.
Dynamic memory disambiguation using the memory conflict buffer. In Proceedings
of 6th International Conference on Architectural Support for Programming
Languages and Operating Systems, pages 183–193, October 1994
• M. Herlihy and J. E. B. Moss. Transactional memory: Architectural support
for lock-free data structures. In Proceedings of the 20th Annual International 
Symposium on Computer Architecture (ISCA), 1993
