Jialu Huang will present her research seminar/general exam on Tuesday May 12 at 2PM in room 402. The members of her committee are: David August, advisor, Brian Kernighan, and Andrew Appel. Everyone is invited to attend her talk, and those faculty wishing to remain for the oral exam following are welcome to do so. Her abstract and reading list follow below. ============================================ General abstract: While multi-core systems have become prevalent, using them efficiently remains an unsolved problem. DOALL is a highly efficient parallelization technique, however, cross-iteration dependences limit DOALL's applicability. Decoupled software pipelining (DSWP) partitions the code into stages with dependences flowing unidirectionally from earlier stages to later stages. DSWP is more generally applicable than DOALL, but its scalability is limited by the number of stages and by the workload in each stage. Parallel stage decoupled software pipelining (PS-DSWP) addresses this problem by enabling DOALL parallelization to some stages of a DSWP pipeline in loops where DOALL wasn't originally applicable. Inspired by the idea that DSWP can make DOALL more applicable, this work proposes other parallelization techniques can become more broadly applicable through the use of DSWP. LOCALWRITE, which requires each thread to only update the data belonging only to itself, if applied alone, suffers from redundant computations. Introducing LOCALWRITE into a DSWP framework, removes this limitation, by having a separate DOALL stage produce the iteration state to the LOCALWRITE stages. Experiments show combining all three parallelizations (DOALL, LOCALWRITE, DSWP) yields performance superior to applying any of these techniques alone. Reading List: Books: [1] Modern Compiler Implementation in ML, by A. W. Appel, Cambridge University Press, 1998. Chapter 1~13, 17~21. [2] Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman, Addison Wesley, 2001. Chapter 9, 10. Papers: [1] Raman, E., Ottoni, G., Raman, A., Bridges, M. J., and August, D. I. 2008. Parallel-stage decoupled software pipelining. In Proceedings of the Sixth Annual IEEE/ACM international Symposium on Code Generation and Optimization (Boston, MA, USA, April 05 - 09, 2008). CGO '08. ACM, New York, NY, 114-123. DOI= http://doi.acm.org/10.1145/1356058.1356074 [2] Guilherme Ottoni, Ram Rangan, Adam Stoler, and David I. August. Automatic Thread Extraction with Decoupled Software Pipelining. In Proceedings of the 38th IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2005. [3] Matthew J. Bridges, Neil Vachharajani, Yun Zhang, Thomas Jablin, David I. August, "Revisiting the Sequential Programming Model for the Multicore Era," IEEE Micro, vol. 28, no. 1, pp. 12-20, Jan./Feb. 2008, doi:10.1109/MM.2008.13. [4] Vachharajani, N., Rangan, R., Raman, E., Bridges, M. J., Ottoni, G., and August, D. I. 2007. Speculative Decoupled Software Pipelining. In Proceedings of the 16th international Conference on Parallel Architecture and Compilation Techniques (September 15 - 19, 2007). PACT. IEEE Computer Society, Washington, DC, 49-59. DOI= http://dx.doi.org/10.1109/PACT.2007.66. [5] Singh, D. E., Martin, M. J., and Rivera, F. F. 2005. Runtime characterisation of irregular accesses applied to parallelisation of irregular reductions. Int. J. Comput. Sci. Eng. 1, 1 (Feb. 2005), 1-14. DOI= http://dx.doi.org/10.1504/IJCSE.2005.008906. [6] Gutiérrez, E., Plata, O. G., and Zapata, E. L. 1999. On Automatic Parallelization of Irregular Reductions on Scalable Shared Memory Systems. In Proceedings of the 5th international Euro-Par Conference on Parallel Processing (August 31 - September 03, 1999). P. Amestoy, P. Berger, M. J. Daydé, I. S. Duff, V. Frayssé, L. Giraud, and D. Ruiz, Eds. Lecture Notes In Computer Science, vol. 1685. Springer-Verlag, London, 422-429. [7] Eladio Gutiérrez, Oscar Plata, Emilio L. Zapata, "Improving Parallel Irregular Reductions Using Partial Array Expansion," sc,pp.56, ACM/IEEE SC 2001 Conference (SC 2001), 2001. [8] Saltz, J. H., Mirchandaney, R., and Crowley, K. 1989. The doconsider loop. In Proceedings of the 3rd international Conference on Supercomputing (Crete, Greece, June 05 - 09, 1989). ICS '89. ACM, New York, NY, 29-40. DOI= http://doi.acm.org/10.1145/318789.318794. [9] Y. Lin and D. Padua. On the Automatic Parallelization of Sparse and Irregular Fortran Programs. Scientific Programming, 7(1999), pp. 231-246. ISSN 1058-9244, IOS Press. [10] Y. Lin and D. Padua. Analysis of Irregular Single-indexed Arrays and Its Applications in Compiler Optimizations. In Proceedings of the 9th International Conference on Compiler Construction. Berlin, Germany, March 2000.
participants (1)
-
Melissa Lawson