[parsec-users] Fluid animate

Jim Dempsey jim at quickthreadprogramming.com
Thu Jun 10 14:35:15 EDT 2010


I forgot to mention in my reply to you that affinity pinning is involved. (I
did mention this in reply to Major)
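 
For anyone wanting to reproduce the pinning, a minimal sketch of what per-thread
affinity pinning looks like on Linux follows. This is only an illustration of the
technique, not the QuickThread implementation, and pthread_setaffinity_np is a
GNU/Linux extension:

// Illustration only: pin the calling thread to logical CPU `cpu`.
// pthread_setaffinity_np is a GNU extension (Linux).
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>
#include <sched.h>
#include <cstdio>

static bool pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // Returns 0 on success, an error number otherwise.
    int rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (rc != 0) {
        std::fprintf(stderr, "pin_to_cpu(%d) failed: %d\n", cpu, rc);
        return false;
    }
    return true;
}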
 
This system (2 x Xeon 5570) has HyperThreading
 
It has 8 cores and 16 hardware threads, consisting of:
16 integer units
8 floating point units (FPU and/or SSE float/double)
 
fluidanimate's computational choke point is in determining the particle
separations, essentially the square root taken for each pairwise separation.
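 
For context, that hot spot is essentially a pairwise distance computation of the
following shape (a simplified sketch, not the actual fluidanimate source; Vec3 is
a stand-in type):

#include <cmath>

struct Vec3 { float x, y, z; };

// Simplified sketch of the per-pair separation that dominates the runtime.
inline float separation(const Vec3 &a, const Vec3 &b)
{
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    // This square root is the floating point operation referred to above.
    return std::sqrt(dx*dx + dy*dy + dz*dz);
}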
 
On this system, the application is computationally capped at 8x the single-core
rate for performing the particle separations.
Overlap of operations can occur when thread execution is skewed such that the
integer units of one HT sibling are executing while the other HT sibling is
executing floating point operations.
 
However, the application will also experience memory bandwidth limitations.
 
Could you post run times for fluidanimate with the 500K input on a similarly
equipped system?
(Please include system information with the data.)
 
Jim
 
 
 
 

  _____  

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Christian
Bienia
Sent: Thursday, June 10, 2010 12:28 PM
To: 'PARSEC Users'
Subject: Re: [parsec-users] Fluid animate



Hey Jim,

 

Interesting numbers! Do you have an idea why the program doesn't achieve
higher speedups? It should scale fairly well with such a big input. Did you
use any form of thread pinning?

 

- Chris

 

 

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Jim
Dempsey
Sent: Thursday, June 10, 2010 12:35 PM
To: 'PARSEC Users'
Subject: [parsec-users] Fluid animate

 

Chris and others:

 

I got some time on a Dell R610 with dual Intel Xeon 5570 processors.

The readers of this mailing list might find it of interest.

 

Results from running fluidanimate using in_500K.fluid with 100 iterations

Runtimes using QuickThread threading toolkit:

 

Threads

1  Total time spent in ROI:         92.494s  1.0000x
2  Total time spent in ROI:         48.265s  1.9164x
3  Total time spent in ROI:         35.771s  2.5857x
4  Total time spent in ROI:         28.770s  3.2149x
5  Total time spent in ROI:         23.912s  3.8681x
6  Total time spent in ROI:         21.912s  4.2212x
7  Total time spent in ROI:         20.918s  4.4217x
8  Total time spent in ROI:         18.428s  5.0192x
9  Total time spent in ROI:         18.897s  4.8946x * note 1
10 Total time spent in ROI:         18.396s  5.0279x
11 Total time spent in ROI:         18.002s  5.1380x
12 Total time spent in ROI:         17.991s  5.1411x
13 Total time spent in ROI:         17.946s  5.1540x
14 Total time spent in ROI:         16.071s  5.7553x
15 Total time spent in ROI:         16.057s  5.7604x
16 Total time spent in ROI:         14.398s  6.4241x
17 Total time spent in ROI:         41.042s  2.2536x ** note 2
18 Total time spent in ROI:        553.489s  0.1671x ** note 3
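
(Speedup here is the single-thread ROI time divided by the N-thread ROI time,
e.g. 92.494s / 14.398s ≈ 6.42x at 16 threads.)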

 

Each processor has 4 cores with HyperThreading

Total of 8 cores and 16 hardware threads

fluidanimate is a floating-point and memory-access intensive application.

 

Note 1:

On this configuration, QuickThread distributes work to cores first, then
back-fills to the HyperThread siblings second.

The result is a fairly steady slope from 1 thread to 8 threads (the full set of
cores), then a shallower slope as the HT threads are filled in.
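
As a rough illustration of that placement policy (not the actual QuickThread
scheduler; the logical-CPU numbering below, where CPUs 2k and 2k+1 are HT
siblings on core k, is an assumption that varies by system and should really be
queried from the OS):

// Hypothetical thread-index -> logical-CPU mapping for a
// "cores first, HT siblings second" policy on an 8-core / 16-thread box.
// ASSUMPTION: logical CPUs 2k and 2k+1 are HT siblings on core k.
int cpu_for_thread(int thread_index, int num_cores)   // num_cores = 8 here
{
    if (thread_index < num_cores)
        return 2 * thread_index;                // one thread per physical core first
    return 2 * (thread_index - num_cores) + 1;  // then back-fill the HT siblings
}

Thread i would then pin itself to cpu_for_thread(i, 8) using something like
pthread_setaffinity_np.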

 

Note 2:

At 17 threads we have oversubscription (more software threads than the 16
hardware threads). Note the adverse effect on cache.

 

Note 3:

At 18 threads, the adverse effect on cache appears to be exponential.

Additional run data would provide some insight, as would profiling.

 

The above results were from one set of test runs on a remote system.

IOW, I could not verify that no other activity was present on the system.

 

Jim Dempsey


 
