[parsec-users] fluidanimate Cells2 and cnumPars2 bug?

Jim Dempsey jim at quickthreadprogramming.com
Fri Feb 26 15:24:53 EST 2010


Chris,
 
Got your code, and have it running on my Windows system (currently as x32
program). Had to make some minor changes to keep Windows happy (e.g.
_aligned_malloc vs the posix equivalent).
 
Good work on adding the visualization.
 
When running the visualization mode, I notice the program doesn't exit. I
guess that is OK. When I set code in to exit (thread) after the specified
number of iterations, the system hangs. The glutMainLoop has no exit from
the for(;;) loop. Does this exit by way of throw()?
 
When running the 35K simulation (simsmall) in pthreads, I notice that the
particles stratify. IOW looks like layer cake or deck of cards (fluffed up).
This is view through the side of the "deck". My guess is the particles may
be sticking to the walls of the cells due to an appearance of "colder"
temperature there. 
 
On close inspection of display, I notice that particles in the top corners
seem to evaporate. Do they come back somewhere else?
 
Jim
 

  _____  

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Christian
Bienia
Sent: Thursday, February 25, 2010 9:23 PM
To: 'PARSEC Users'
Subject: Re: [parsec-users] fluidanimate Cells2 and cnumPars2 bug?



Hi Jim,

 

Now I get what you mean. Sounds good, just want to point out that the
current version of the program avoids the redundant computations of the
density by comparing the memory address of the particle data:

 

if(&neigh.p[iparNeigh] < &cell.p[ipar])   ç This eliminates redundant
computation

{

  float distSq = (cell.p[ipar] - neigh.p[iparNeigh]).GetLengthSq();

  if(distSq < hSq)

  {

    float t = hSq - distSq;

    float tc = t*t*t;

    cell.density[ipar] += tc;

    neigh.density[iparNeigh] += tc;

  }

}

 

Best,

Chris

 

 

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Jim
Dempsey
Sent: Thursday, February 25, 2010 9:18 PM
To: 'PARSEC Users'
Subject: Re: [parsec-users] fluidanimate Cells2 and cnumPars2 bug?

 

See inclosed comments

 

  _____  

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Christian
Bienia
Sent: Thursday, February 25, 2010 4:17 PM
To: 'PARSEC Users'
Subject: Re: [parsec-users] fluidanimate Cells2 and cnumPars2 bug?

Hi Jim,

 

That sounds exciting. I’d be interested in hearing more about your insights.
We’re always looking for ways to improve the benchmarks. Here are a few
quick things I’d like to point out:

 

·         Intel used a simple linked list of particles in an early version
of the program, but the performance was disappointing. I don’t have any
numbers on that though.
[JD] The Particles in my implimentation is an array of Particles. These
Particle objects do have a link pointer, the purpose of which is to form a
small collection of the particles within a Cell. The Cell containing the
head pointer. In your revised method (current PARSEC) each cell maintained a
table of particles [16] and when particles migrated they were (supposed to
be) copied from one cell to another (but weren't, were reset). Although the
copying phase was not performed in the code. The lack of the copy may have
produced a misleading assumption that the revised method (current PARSEC)
gave superior performance.

·         Please keep in mind that increasing the kernel radius h is not
necessarily desirable. Setting h to infinity would give you perfectly
realistic fluid behavior, but for a program like fluidanimate which computes
fluid movements for an animation all we need is fluid that behaves “good
enough” from the point of view of a human observer. You will get diminishing
returns as you increase h.
[JD] I addressed this in my prior email. The purpose of the (current PARSEC)
cell is to reduce the number of candidate cells yet produce some level of
fidelity. The design of the current PARSEC cell is such that the majority of
the particles in the cell will interact (those within distance of h of each
other). The cell size could exceed the size of the interaction radius h.
Example cell size = 2h would contain 4x the number of particles. Yet the
calculations can continue to test if(distSq < hSq) and produce the same
answer at the expense of taking the false path around the if statement that
qualifies the within h distance. The benefit of using larger cell size (but
not larger h) comes with the reduction of  the number of cells. The expense
is additional calculations. The cost savings is in the reduction of
shuffleing particles from cell to cell. There should be a sweet spot for
cell size.

 

        Notes, the current PARSEC calculations are redundantly recalculating
AND redundantly accumulating density and acceleration. e.g. assume Cell X
has 26 neighbors Xn1, Xn2,... Xn26. Assume for this example Cell X is
processed first in ComputeForces. When Cell X computation is complete Cell X
will contain the accumulated densities and forces amongst the particles
within Cell X and Cell Xn1, Xn2, ... Xn26. AND cells Xn1, Xn2, ... Xn26 have
accumulated the forces amongst themselves as well as with Cell X. Next
ComputeForces advances to the next cell in the system. At some point in time
the cell formerly known as Xn1 will be processed, call it Y. Yn26 could
indeed  be the former cell X. Now cell X and Y interact again.

 

      The GetNeighborCells must be modified, or some other technique
employed, such that the same pair of neighboring cells are only processed
once.

 

·         In any case the computational complexity of the program grows
significantly as you increase h, which will limit the kernel radius in
practice. Increasing h causes a cubic increase of the volume in which
particle interactions are considered, and computing particle interactions is
a quadratic in the number of particles because you need all pair-wise
interactions.
[JD] see prior discussion. 

·         The cell sizes are chosen such that they match exactly the kernel
radius, which means you can get a list of all particles that might interact
with a particle in the current cell simply by looking up all the particles
in neighboring cells. This is the acceleration structure I refer to in the
technical report, its only purpose is to allow us to cut away uninteresting
particles as fast as possible, so you probably want to keep that
relationship between the cell size and the kernel radius to avoid additional
checks of the distances between particles. If you find a better way to do
that then please let me know.
[JD] I will begin with the cell size = h and then experiment with increasing
the size.

 

It should be noted that this is a relatively simple design but could serve
as the basis for expansion. For example, adding charge to the particle (and
mass and radius, etc...). When adding charge then you will likely require a
different h. Possibly this can be accomidated in a get neighbors to take an
argument as to a multiplier of h (using same cell size). e.g. current is
within 1 h of current cell. New feature within n * h of current cell.

 

Jim

 

 

Best,

Chris

 

 

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Jim
Dempsey
Sent: Thursday, February 25, 2010 4:18 PM
To: 'PARSEC Users'
Subject: Re: [parsec-users] fluidanimate Cells2 and cnumPars2 bug?

 

Chris,

 

I am in the process of writing my own implementation of fluidanimate for use
with QuickThread. I am taking generous liberties with the design. My current
design (work in progress) is having each Cell use a simple linked list of
Particles (new data object) as opposed to an array of particles. This linked
list needs management only when a particle migrates from one cell to
another. To reduce the frequency of migration I can play with the magnitude
of h (length of cell cube dimension). However in doing so, increase of h
will increase the number of computations in the density calculation (larger
number of failures on if(distSq < hSq). Some trade-off analysis will have to
be run.

 

The GetNeighborCells has been replaced with a new struct () who's ctor
builds a table of neighboring particles (not cells). For this table of
neighboring particles I too settled on a linked list of arrays for the
particle neighbors (1024 particles / node with the first node on the stack
in the base object). I have been toying with the idea that in addition to
the Cell p (position as x, y, z) that the integer quantized position ix, iy,
iz (currently in units of h relative to base of domain) can be carried in
the Cell object as well. With this addition I can increase the h of the cell
but quantize the ix, iy, iz to what was in old h (kernelRadiansMultiplier /
restParticlesPerMeter). IOW a unit with a high probability of adjacent
positions interacting. Then with this installed, GetNeighboringParticles
could collect only the current cells particles plus the particles in the
adjacent cells at the shared wall(corner). The integer test can be performed
without doing the distance test. GetNeighboringParticles will effectively
collect the current cell's particles and the particles in the adjacent cells
that are 1 quantized unit away from the cell. Using this subset, the
quantized positions can be used to quickly determine if the particles are
close enough to interact, if so then the floating point determination can be
made, and if at that point interaction is warranted, the values will be
updated. With the floating point calculations as fast as they are now, I
will have to run some test to see if the time required to avoid code exceeds
the time required to execute the code.

 

I also can eliminate locks with appropriate non-locked thread progress
detection. I've done this before with an n-Body interaction. The
fluidanimate is similar execpt for the Cell concept use to produce subsets
of particles that have a high probability of interaction. In the n-Body, all
bodies interact.

 

I will look at the website for docs on ferret.

 

Jim Dempsey

 

 

 

  _____  

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Christian
Bienia
Sent: Thursday, February 25, 2010 2:31 PM
To: 'PARSEC Users'
Subject: Re: [parsec-users] fluidanimate Cells2 and cnumPars2 bug?

Hi Jim,

 

On fluidanimate:

 

We chose to implement cells as a “linked list of arrays”. That way it’s
about as flexible as a list but still nearly as efficient as an array
because the cost of pointer chasing is amortized over many particles. We
also wrote our own specialized memory management system that is
thread-private to cut down on allocation / freeing cost and nearly
completely eliminate additional synchronization in the system libraries. We
would prefer not to publish that before the next official PARSEC release.

 

On ferret:

 

Great job porting it to Windows x64! That sounds like it was a fight. You
can read the high-level description I wrote up in the technical report that
we publish on the PARSEC website. That should give you at least a rough idea
on what the program does. The ferret documentation you’re referring to must
be the one in the doc/manual subdirectory of the source? I gotta say I never
worked with it because the authors of the program sit in my office. We can
certainly add some pdf / html files. What exactly is it that you think we
should add?

 

- Chris

 

 

 

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Jim
Dempsey
Sent: Wednesday, February 24, 2010 4:56 PM
To: 'PARSEC Users'
Subject: Re: [parsec-users] fluidanimate Cells2 and cnumPars2 bug?

 

Hi Chris,

 

Thanks for the rapid response. I was aware of the hard coded limit of 16
particles per cell could be a potential problem. The initialization code
with evenly distributed particles, would result in ~8 particles per cell (on
the 500K sim with 4 cores). So unless presure or density exceeds 2x of
initial value in any given cell there would have been no problem. For a
relatively static gas it might have been well enough behaved to not
experience an overflow.

 

One solution would be to use expandable vectors but you might want to hand
code this as opposed to using std::vector (or tbb's concurrent vector) since
use each of those vectors might preclude use of SSE vector instructions on
those vectors.

 

A second solution would be adjustable cells. i.e. when a given cell requires
more than its current max number of particles the size of the cell changes
particle capacity. As long as the particle capacity changes were infrequent
the extra work work required to resize the particle capacity might not be
all that bad.

 

Using the expandable vectors would (could) be problematic if, for example,
you have a high pressure wave propigate throught the cavity. In this
situation you would require excessive amounts of memory. It would seem like
the way to resolve this would be to permit the vectors to reduce in size as
well as expand. You would like to avoid paging if at all possible.

 

When might we expect the revised version? Could you post the revised
"serial" version of the code as alpha or beta and then post the others as
they mature? Each version is a single file and requires no change in build
procedure (other than for addition of visualization)

 

Off topic

 

I was porting ferret to Windows x64. The port was rather difficult in that I
also had to port GSL, LSH and Magick (plus Pthreads). The program runs, but
I haven't a clue as to what it is doing. The manual to ferret is not on the
system as a .PDF or HTML file, however it can be built using texi2dvi.
Unfortunately on Windows TeX and texi2dvi are not present. Could these be
included in the tools folder?

 

Jim Dempsey

 

 

  _____  

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Christian
Bienia
Sent: Wednesday, February 24, 2010 1:50 PM
To: 'PARSEC Users'
Subject: Re: [parsec-users] fluidanimate Cells2 and cnumPars2 bug?

Hi Jim,

 

There is indeed a bug in fluidanimate that makes the program recompute the
first frame over and over again, as you and Milo pointed out correctly.
However, the issue is a little more complicated than simply swapping the
pointers to the cell arrays because if the fluid state is simulated for more
frames then the number of particles per cell can exceed the hard-coded limit
of 16 particles per cell that are currently supported by fluidanimate. (That
is the reason for the crash that Milo reported.)

 

We fixed the issues and have a significantly improved version of the program
that will be released with the next version of PARSEC. The new version also
includes some more features such as the possibility to visualize the fluid.

 

Best,

Chris

 

 

 

From: parsec-users-bounces at lists.cs.princeton.edu
[mailto:parsec-users-bounces at lists.cs.princeton.edu] On Behalf Of Jim
Dempsey
Sent: Wednesday, February 24, 2010 1:38 PM
To: 'PARSEC Users'
Subject: [parsec-users] fluidanimate Cells2 and cnumPars2 bug?

 

I have been looking at fluidanimate with an eye towards reducing the number
of (or eliminating) critical sections. In performing this investigation I've
noticed two issues. These issues had been reported earlier by Milo Martin
(https://lists.cs.princeton.edu/pipermail/parsec-users/2008-August/000193.ht
ml)

 

He mentioned his fix wasn't correct, but he may have subsequently corrected
the problem. 

 

The Cells2 and cnumPars2 are built at file load time and never changed
afterwards. 

 

As Milo pointed out, RebuildGrid is essentially performing a Reset Grid to
initial values. IOW no advancement is occurring (particles are "vibrating"
so to speak). 

 

>From my cursory look at the code, it seems like the author's intentions were
for Cells2 and cnumPars2 to contain the prior state  (i.e. read-in initial
state and from integration step 2 onwards, the post AdvanceParticles state),
and then to be used by RebuildGrid for preparing the grid for the next
integration step. 

 

To fix this (which I haven't done as of this writing) either

 

  AdvanceParticles has to advance Cells into Cells2 (and copy new cnumPars
to cnumPars2)

or

  AdvanceFrame needs to add a new function CopyCellsAndcnumPars() following
AdvanceParticles()

 

Not only were the initial states (p, v, hv) not updated but also the
proximity pairs counts in cnumPars2 was not updated so the local particle
density could not change.

 

After I examine this a bit further, I will try to report back my findings.
(bug/noBug, if bug - bug fix)

 

It might be an interesting diagnostic to display the particles during
integration using a relatively small set of particles. Coding errors
(assuming this is an error) would quite easily be identified. Although this
might not catch all errors in coding the fluid dynamic formulas.

 

Jim Dempsey

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.cs.princeton.edu/pipermail/parsec-users/attachments/20100226/4a1d90bb/attachment-0001.html>


More information about the parsec-users mailing list