[talks] J Huang general exam

Tue May 5 09:28:55 EDT 2009

Jialu Huang will present her research seminar/general exam on Tuesday May 12 at 2PM in 
room 402.   The members of her committee are:  David August, advisor, Brian Kernighan, 
and Andrew Appel.  Everyone is invited to attend her talk, and those faculty wishing to
remain 
for the oral exam following are welcome to do so.  Her abstract and reading list follow
below.

============================================
General abstract:

While multi-core systems have become prevalent, using them efficiently
remains an unsolved problem. DOALL is a highly efficient parallelization
technique, however, cross-iteration dependences limit DOALL's
applicability. Decoupled software pipelining (DSWP) partitions the code
into stages with dependences flowing unidirectionally from earlier
stages to later stages. DSWP is more generally applicable than DOALL,
but its scalability is limited by the number of stages and by the
workload in each stage. Parallel stage decoupled software pipelining
(PS-DSWP) addresses this problem by enabling DOALL parallelization to
some stages of a DSWP pipeline in loops where DOALL wasn't originally
applicable.  Inspired by the idea that DSWP can make DOALL more
applicable, this work proposes other parallelization techniques can
become more broadly applicable through the use of DSWP. LOCALWRITE,
which requires each thread to only update the data belonging only to
itself, if applied alone, suffers from redundant computations.
Introducing LOCALWRITE into a DSWP framework, removes this limitation,
by having a separate DOALL stage produce the iteration state to the
LOCALWRITE stages. Experiments show combining all three parallelizations
(DOALL, LOCALWRITE, DSWP) yields performance superior to applying any of
these techniques alone.

Reading List:

Books:
[1] Modern Compiler Implementation in ML, by A. W. Appel, Cambridge
University Press, 1998. Chapter 1~13, 17~21.
[2] Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Ravi
Sethi, and Jeffrey D. Ullman, Addison Wesley, 2001. Chapter 9, 10.

Papers:
[1] Raman, E., Ottoni, G., Raman, A., Bridges, M. J., and August, D. I.
2008. Parallel-stage decoupled software pipelining. In Proceedings of
the Sixth Annual IEEE/ACM international Symposium on Code Generation and
Optimization (Boston, MA, USA, April 05 - 09, 2008). CGO '08. ACM, New
York, NY, 114-123. DOI= http://doi.acm.org/10.1145/1356058.1356074
[2]  Guilherme Ottoni, Ram Rangan, Adam Stoler, and David I. August.
Automatic Thread Extraction with Decoupled Software Pipelining. In
Proceedings of the 38th IEEE/ACM International Symposium on
Microarchitecture (MICRO), November 2005.
[3] Matthew J. Bridges, Neil Vachharajani, Yun Zhang, Thomas Jablin,
David I. August, "Revisiting the Sequential Programming Model for the
Multicore Era," IEEE Micro, vol. 28, no. 1, pp. 12-20, Jan./Feb. 2008,
doi:10.1109/MM.2008.13.
[4] Vachharajani, N., Rangan, R., Raman, E., Bridges, M. J., Ottoni, G.,
and August, D. I. 2007. Speculative Decoupled Software Pipelining. In
Proceedings of the 16th international Conference on Parallel
Architecture and Compilation Techniques (September 15 - 19, 2007). PACT.
IEEE Computer Society, Washington, DC, 49-59. DOI=
http://dx.doi.org/10.1109/PACT.2007.66.
[5] Singh, D. E., Martin, M. J., and Rivera, F. F. 2005. Runtime
characterisation of irregular accesses applied to parallelisation of
irregular reductions. Int. J. Comput. Sci. Eng. 1, 1 (Feb. 2005), 1-14.
DOI= http://dx.doi.org/10.1504/IJCSE.2005.008906.
[6] Gutiérrez, E., Plata, O. G., and Zapata, E. L. 1999. On Automatic
Parallelization of Irregular Reductions on Scalable Shared Memory
Systems. In Proceedings of the 5th international Euro-Par Conference on
Parallel Processing (August 31 - September 03, 1999). P. Amestoy, P.
Berger, M. J. Daydé, I. S. Duff, V. Frayssé, L. Giraud, and D. Ruiz,
Eds. Lecture Notes In Computer Science, vol. 1685. Springer-Verlag,
London, 422-429.
[7] Eladio Gutiérrez, Oscar Plata, Emilio L. Zapata, "Improving Parallel
Irregular Reductions Using Partial Array Expansion," sc,pp.56, ACM/IEEE
SC 2001 Conference (SC 2001), 2001.
[8] Saltz, J. H., Mirchandaney, R., and Crowley, K. 1989. The doconsider
loop. In Proceedings of the 3rd international Conference on
Supercomputing (Crete, Greece, June 05 - 09, 1989). ICS '89. ACM, New
York, NY, 29-40. DOI= http://doi.acm.org/10.1145/318789.318794.
[9] Y. Lin and D. Padua. On the Automatic Parallelization of Sparse and
Irregular Fortran Programs. Scientific Programming, 7(1999), pp.
231-246. ISSN 1058-9244, IOS Press.
[10] Y. Lin and D. Padua. Analysis of Irregular Single-indexed Arrays
and Its Applications in Compiler Optimizations. In Proceedings of the
9th International Conference on Compiler Construction. Berlin, Germany,
March 2000.