[parsec-users] Fluid animate

Jim Dempsey jim at quickthreadprogramming.com
Thu Jun 10 14:07:32 EDT 2010


Scaling is fairly good until core saturation
Then different slope for HT
Then precipitous drop at oversubscription
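
As a rough check on those three regimes, here is a minimal Python sketch that
computes the average speedup gained per added thread, using runtimes taken
from the table in the original post quoted below. The function name
gain_per_thread and the choice of 1, 8, 16 and 17 threads as regime
boundaries are assumptions made for illustration only.

```python
# Runtimes (seconds in ROI) at the thread counts that bound each regime,
# copied from the table in the quoted post.
t1, t8, t16, t17 = 92.494, 18.428, 14.398, 41.042


def gain_per_thread(t_start, n_start, t_end, n_end, t_base=t1):
    """Average speedup gained per added thread between two thread counts.

    Speedup is measured against the single-thread runtime t_base.
    """
    return (t_base / t_end - t_base / t_start) / (n_end - n_start)


cores_slope = gain_per_thread(t1, 1, t8, 8)          # filling physical cores
ht_slope = gain_per_thread(t8, 8, t16, 16)           # filling HT siblings
oversub_slope = gain_per_thread(t16, 16, t17, 17)    # first oversubscribed thread

print(f"cores 1-8:    {cores_slope:+.3f}x per thread")
print(f"HT 9-16:      {ht_slope:+.3f}x per thread")
print(f"oversub 17:   {oversub_slope:+.3f}x per thread")
```

On these numbers the slope is roughly 0.57x per added core, roughly 0.18x per
added HT thread, and strongly negative for the first oversubscribed thread.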

 
Not sure why 7 threads showed a dip. Something else may have been running on
the system.
9-13 threads are fairly flat. This is probably contention between HT siblings
for the SSE floating point units.
14 and 16 are odd in showing that much improvement. This might be due to
favorable skew between the portions of code in HT pairs that are SSE floating
point and those that are integer instruction streams (array indexing and
instruction flow control being integer).
 
See response to Major for additional information.
 
Jim
 
 

  _____  

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Christian
Bienia
Sent: Thursday, June 10, 2010 12:28 PM
To: 'PARSEC Users'
Subject: Re: [parsec-users] Fluid animate



Hey Jim,

 

Interesting numbers! Do you have an idea why the program doesn't achieve
higher speedups? It should scale fairly well with such a big input. Did you
use any form of thread pinning?

 

- Chris

 

 

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Jim
Dempsey
Sent: Thursday, June 10, 2010 12:35 PM
To: 'PARSEC Users'
Subject: [parsec-users] Fluid animate

 

Chris and others:

 

I got some time on a Dell R610 with dual Intel Xeon 5570 processors.

The readers of this mailing list might find the results of interest.

 

Results from running fluidanimate using in_500K.fluid with 100 iterations

Runtimes using QuickThread threading toolkit:

 

Threads  Total time spent in ROI  Speedup

 1             92.494s            1.0000x
 2             48.265s            1.9164x
 3             35.771s            2.5857x
 4             28.770s            3.2149x
 5             23.912s            3.8681x
 6             21.912s            4.2212x
 7             20.918s            4.4217x
 8             18.428s            5.0192x
 9             18.897s            4.8946x  * note 1
10             18.396s            5.0279x
11             18.002s            5.1380x
12             17.991s            5.1411x
13             17.946s            5.1540x
14             16.071s            5.7553x
15             16.057s            5.7604x
16             14.398s            6.4241x
17             41.042s            2.2536x  * note 2
18            553.489s            0.1671x  * note 3
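
The speedup column is the 1-thread runtime divided by the n-thread runtime.
A minimal Python sketch that reproduces it from the runtimes above, and adds
a per-thread efficiency figure (speedup divided by thread count), might look
like this. The names roi_seconds, speedup and efficiency are chosen here for
illustration; the efficiency column is an addition, not part of the original
measurements.

```python
# Runtimes (seconds in ROI) copied from the table above, keyed by thread count.
roi_seconds = {
    1: 92.494, 2: 48.265, 3: 35.771, 4: 28.770, 5: 23.912, 6: 21.912,
    7: 20.918, 8: 18.428, 9: 18.897, 10: 18.396, 11: 18.002, 12: 17.991,
    13: 17.946, 14: 16.071, 15: 16.057, 16: 14.398, 17: 41.042, 18: 553.489,
}


def speedup(times, n):
    """Speedup of the n-thread run relative to the 1-thread run."""
    return times[1] / times[n]


def efficiency(times, n):
    """Speedup per thread; 1.0 would be perfect linear scaling."""
    return speedup(times, n) / n


for n in sorted(roi_seconds):
    print(f"{n:2d}  {roi_seconds[n]:8.3f}s  "
          f"{speedup(roi_seconds, n):7.4f}x  {efficiency(roi_seconds, n):6.1%}")
```

Note that beyond 8 threads the efficiency figure divides by hardware threads
rather than physical cores, so it is expected to fall even when HT helps.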

 

Each processor has 4 cores with HyperThreading

Total of 8 cores and 16 hardware threads

fluidanimate is a floating point and memory access intensive application.

 

Note 1:

On this configuration, QuickThread distributes work to cores first, then
backfills to HyperThread siblings second.

The result is a fairly steady slope from 1 thread to 8 threads (the full set
of cores), then a shallower slope as the HT threads are filled in.
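
The fill order described in Note 1 can be illustrated with a small Python
sketch. This is not QuickThread's API: the function placement, its
parameters, and the flat 0-7 core numbering are hypothetical, and how
threads are spread across the two sockets is not stated here, so the sketch
ignores sockets entirely.

```python
def placement(thread_index, physical_cores=8, ht_per_core=2):
    """Map a worker index to (core, HT sibling slot), cores first.

    Hypothetical illustration of a 'cores first, then HT siblings' fill
    order on a machine with 8 cores and 2 hardware threads per core.
    """
    if thread_index < 0:
        raise ValueError("thread_index must be non-negative")
    if thread_index >= physical_cores * ht_per_core:
        # A 17th thread has no free hardware thread: oversubscription.
        raise ValueError("more threads than hardware threads")
    core = thread_index % physical_cores       # workers 0-7 each get a core
    sibling = thread_index // physical_cores   # workers 8-15 share a core
    return core, sibling


for i in range(16):
    core, sibling = placement(i)
    print(f"worker {i:2d} -> core {core}, HT slot {sibling}")
```

Under this order the first 8 workers each have a core to themselves, and
workers 9 through 16 (indices 8 to 15) each share a core with an earlier
worker, which is where the shallower slope begins.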

 

Note 2:

At 17 threads we are oversubscribed (more software threads than the 16
hardware threads). Note the adverse effect on cache.

 

Note 3:

At 18 threads, the adverse effect on cache appears to grow much faster than
linearly: the run takes about 13.5 times as long as at 17 threads.

Additional run data would provide some insight as would profiling.

 

The above results were from one set of test runs on a remote system.

In other words, I could not verify that no other activity was present on the
system.

 

Jim Dempsey


 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/jpeg
Size: 29741 bytes
Desc: not available
URL: <http://lists.cs.princeton.edu/pipermail/parsec-users/attachments/20100610/cc88a590/attachment-0001.jpe>

