Neil Vachharajani will present his preFPO on Friday, September 7 at 1:30pm in Room 402. The members of his committee are: David August, advisor; Sharad Malik (ELE) and Scott Mahlke (U Michigan), readers; Andrew Appel and Li-Shiuan Peh (ELE), nonreaders. Everyone is invited to attend his talk. His abstract follows below.

-----------------------------------

INTELLIGENT PIPELINED MULTITHREADING SPECULATION

In recent years, microprocessor manufacturers have shifted their focus from single-core to multicore processors. To avoid burdening programmers with the responsibility of parallelizing their applications, some researchers have advocated automatic thread extraction. Within the scientific computing domain, automatic parallelization techniques have been successful, but in the general-purpose computing domain few, if any, techniques have achieved comparable success.

Despite this, recent progress hints at mechanisms to unlock parallelism from general-purpose applications. In particular, two promising proposals exist in the literature. The first, a group of techniques loosely classified as thread-level speculation (TLS), attempts to adapt techniques successful in the scientific domain, such as DOALL and DOACROSS parallelization, to the general-purpose domain by using speculation to overcome complex control flow and data access patterns not easily analyzed statically. The second, a non-speculative technique called Decoupled Software Pipelining (DSWP), partitions loops into long-running, fine-grained threads organized into a pipeline (pipelined multithreading, or PMT). DSWP effectively extends the reach of conventional software pipelining to codes with complex control flow and variable-latency operations.

Unfortunately, both techniques suffer from key limitations.
TLS techniques either suffer from overspeculation, in an attempt to speculatively transform a loop into a DOALL loop, or realize little parallelism in practice because DOACROSS parallelization puts core-to-core communication latency on the critical path. DSWP avoids these pitfalls with its pipeline organization and decoupled execution using inter-core communication queues. However, its non-speculative nature and the restrictions needed to ensure a pipeline organization prevent DSWP from achieving balanced parallelism on many key application loops.

This dissertation advances automatic parallelization of general-purpose applications with two key contributions. First, we propose extending pipelined multithreaded execution with intelligent speculation. Rather than speculating all loop-carried dependences to transform loops into DOALL loops, we propose speculating only the key predictable dependences that inhibit balanced, pipelined execution. We demonstrate that this technique is effective with an automatic compiler implementation of Speculative DSWP.

Second, to support decoupled speculative execution, this dissertation explores extending a multicore architecture's memory subsystem with versioning. The proposed memory systems resemble those present in TLS architectures, but provide efficient execution in the presence of large transactions, many simultaneous outstanding transactions, and eager data forwarding between uncommitted transactions. In addition to supporting the usage patterns exhibited by speculative pipelined multithreading, the proposed memory system facilitates existing and future speculative threading techniques.
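
For readers unfamiliar with pipelined multithreading, the non-speculative pipeline organization mentioned in the abstract can be pictured roughly as follows. This is only an illustrative sketch and is not taken from the dissertation: the function names (`sequential`, `pipelined`), the squared-sum workload, and the queue capacity are all hypothetical, and a Python `queue.Queue` stands in for the inter-core communication queues. It shows the structure of the transformation (one loop split into two decoupled stages with a one-directional dependence), not a speedup, since Python threads do not run CPU-bound code in parallel.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker passed between stages


def sequential(values):
    """Original single-threaded loop: fetch each item, then do work on it."""
    total = 0
    for v in values:
        total += v * v
    return total


def pipelined(values, capacity=8):
    """Same loop split into two stages connected by a bounded queue.

    Stage 1 only produces items; stage 2 only consumes them. Data flows
    in one direction, so stage 1 never waits on a result from stage 2.
    """
    q = queue.Queue(maxsize=capacity)  # stands in for an inter-core queue
    result = []

    def stage1():
        # Traversal part of the loop: walk the input and forward each item.
        for v in values:
            q.put(v)
        q.put(SENTINEL)

    def stage2():
        # Work part of the loop: consume items until the stream ends.
        total = 0
        while True:
            v = q.get()
            if v is SENTINEL:
                break
            total += v * v
        result.append(total)

    threads = [threading.Thread(target=stage1), threading.Thread(target=stage2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]
```

Calling `pipelined(range(100))` should return the same value as `sequential(range(100))`. The bounded queue lets the first stage run ahead of the second by up to `capacity` items, which is the decoupling the abstract refers to.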
Melissa M Lawson