[talks] H Kim general exam

Wed Dec 31 09:44:13 EST 2008

Hanjun Kim will present his research seminar/general exam on Tuesday 
January 6 at 2PM in Room 402.  The members of his committee are: 
David August (advisor), Margaret Martonosi, and Doug Clark.  Everyone is 
invited to attend his talk, and those faculty wishing to remain for the oral 
exam following are welcome to do so.  His abstract and reading list follow below. 
---
Software Multithreaded Transactional Memory

Abstract

    As chip multiprocessors have become ubiquitous, programs must be parallelized to
effectively use computing resources. Recently proposed techniques, such as those by
Bridges et al., by Thies et al., and by Tian et al., have demonstrated significant
performance improvements by partitioning loops into long-running threads organized into a
pipeline. Various studies have shown that speculation is the key to unlocking the full
potential of these techniques for general purpose applications. Speculative pipelined
multi-threading needs multi-threaded transactional memory that supports atomic units of
code across multiple threads, but most transactional memory proposals cannot be used
because they only support single-threaded atomic units.
Multi-threaded transactions (MTX) address this problem, but they require expensive
hardware support as currently proposed. Tian et al. offer CorD, a software technique that
runs on existing hardware. However, CorD does not support loops that speculatively write
pointer addresses, and it supports only certain patterns of parallelism; both limitations
restrict its utility.
    This work proposes a software MTX (SMTX) system that combines the applicability of
hardware MTX with CorD's performance on real hardware. Across a set of sequential
applications, SMTX yields a geomean speedup of 353% on a real dual 4-core CMP (8 cores in
total) machine running speculatively parallelized applications.

Reading List

Books:
  [1] Modern Compiler Implementation in ML, by A. W. Appel, Cambridge University Press,
1998
  [2] Computer Architecture: A Quantitative Approach 4th Edition, by J. L. Hennessy & D.
A. Patterson, Morgan Kaufmann, 2006

Papers:
  [3] K. Agrawal, J. Fineman, and J. Sukha. Nested parallelism in transactional memory. In
Proceedings of the Second ACM SIGPLAN Workshop on Transactional Computing, August 2007.
  [4] M. J. Bridges, N. Vachharajani, Y. Zhang, T. Jablin, and D. I.
August. Revisiting the sequential programming model for multi-core. In Proceedings of the
40th Annual ACM/IEEE International Symposium on Microarchitecture, December 2007.
  [5] J. Giacomoni, T. Moseley, and M. Vachharajani. FastForward for efficient pipeline
parallelism: a cache-optimized concurrent lock-free queue. In PPoPP '08: Proceedings of
the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008.
  [6] L. Hammond, B. D. Carlstrom, V. Wong, M. Chen, C. Kozyrakis, and K. Olukotun.
Transactional coherence and consistency: Simplifying parallel hardware and software. IEEE
Micro, 24(6), Nov-Dec 2004.
  [7] M. Herlihy and J. E. B. Moss. Transactional memory:
Architectural support for lock-free data structures. In Proceedings of the 20th Annual
International Symposium on Computer Architecture, 1993.
  [8] L. Lamport. Specifying concurrent program modules. ACM Trans.
Program. Lang. Syst., 5(2):190-222, 1983.
  [9] S. Papadimitriou and T. C. Mowry. Exploring thread-level speculation in software:
The effects of memory access tracking granularity. Technical report, 2001.
 [10] E. Raman, G. Ottoni, A. Raman, M. Bridges, and D. I. August.
Parallel-Stage Decoupled Software Pipelining. In Proceedings of the
2008 International Symposium on Code Generation and Optimization, April 2008.
 [11] R. Rangan, N. Vachharajani, M. Vachharajani, and D. I. August.
Decoupled software pipelining with the synchronization array. In Proceedings of the 13th
International Conference on Parallel Architectures and Compilation Techniques, September
2004.
 [12] J. G. Steffan, C. B. Colohan, A. Zhai, and T. C. Mowry. A scalable approach to
thread-level speculation. In Proceedings of the 27th International Symposium on Computer
Architecture, June 2000.
 [13] H. Sutter. The free lunch is over: A fundamental turn toward concurrency in software.
Dr. Dobb's Journal, 30(3), 2005.
 [14] W. Thies, V. Chandrasekhar, and S. Amarasinghe. A practical approach to exploiting
coarse-grained pipeline parallelism in C programs. In Proceedings of the 40th Annual
ACM/IEEE International Symposium on Microarchitecture, 2007.
 [15] C. Tian, M. Feng, V. Nagarajan, and R. Gupta. Copy or discard execution model for
speculative parallelization on multicores. In Proceedings of the 41st International
Symposium on Microarchitecture, November 2008.
 [16] N. Vachharajani, R. Rangan, E. Raman, M. J. Bridges, G. Ottoni, and D. I. August.
Speculative decoupled software pipelining. In Proceedings of the 16th International
Conference on Parallel Architectures and Compilation Techniques, September 2007.