[parsec-users] freqmine: segmentation fault on simlarge input

Joseph Greathouse jlgreath at umich.edu
Wed May 16 22:43:03 EDT 2012


Hi Amittai,

It turns out that my earlier statement that this works on RHEL 6.2 with 
GCC 4.4.6 was incorrect. I recompiled from a fresh download and ran 
into the same problem.

I traced it down to the loop vectorization optimization, I think. Try 
recompiling freqmine, but add '-fno-tree-vectorize' to the CXXFLAGS 
variable in gcc.bldconf.
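
For what it's worth, here is a rough sketch of what the change looks 
like (the exact path and the existing flags in your gcc.bldconf will 
likely differ, so treat everything except -fno-tree-vectorize as a 
placeholder):

    # e.g. in config/gcc.bldconf -- append the flag to the existing set
    export CXXFLAGS="${CXXFLAGS} -O3 -fno-tree-vectorize"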

I don't know whether this is a GCC regression or a problem with freqmine 
that tree vectorization brings out. Turning this off is not a good 
permanent solution (especially if you're looking to get performance 
numbers for a paper), but it's better than a program that crashes for 
now. :)

-Joe

On 5/16/2012 10:07 PM, Amittai Aviram wrote:
> Hi, Joe!
>
> OK, I noticed that part of the code, too, and wondered whether it could cause a data race.  However, I just tried your suggestion of annotating every call to release_node_array_before_mining and release_node_array_after_mining with "#pragma omp critical"--and, of course, commented the current "#pragma omp critical" lines out--but, alas, I am still getting the same segmentation fault, in exactly the same place.
>
> Amittai
>
> On May 16, 2012, at 9:39 PM, Joseph Greathouse wrote:
>
>> Hi Amittai,
>>
>> Sorry about the confusion-- I was basing my message to you off the bug report I sent in a few years ago, so I didn't even bother looking into the bodies of those two functions again.
>>
>> The data races are actually in the for loops immediately above those existing OMP critical regions. This is what I wrote in my email that never hit the public mailing list:
>>
>>
>> I believe it's possible to schedule the threads in such a way that, for example, in the for loop in release_node_array_before_mining:
>> Thread0 reads thread_begin_status[3] and finds that it is greater than Thread0.current. Then Thread3 writes a lower value into thread_begin_status[3] (sequence values proceed from high to low, so this will always be the case with this thread ordering), and Thread0 loads that value into Thread0.current and proceeds into the critical region to delete things.  This makes it possible to have current lower than it normally would be, which I believe is an unintended data race.
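>>
>> To illustrate (this is a made-up loop shape, not the actual freqmine 
>> code; it just shows the check-then-reload pattern that makes that 
>> interleaving possible):
>>
>>     /* hypothetical: the array element is re-read after the check */
>>     if (thread_begin_status[i] > current)
>>         current = thread_begin_status[i];  /* another thread may have
>>                                               written a lower value
>>                                               between check and load */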
>>
>>
>> So, yeah, the entirety of these functions should be OMP critical, not just the freebuf part. The quick way to do this is to extend the critical region so that it starts before each call and ends immediately after it.
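>>
>> As a rough sketch (the argument lists below are placeholders, not the 
>> real signatures), each call site in FP_growth_first() would end up 
>> looking something like this:
>>
>>     /* before: only the freebuf part inside the callee is protected,
>>        so the for loop over thread_begin_status[] can still race */
>>     release_node_array_before_mining(/* ... */);
>>
>>     /* after: the whole call, loop included, runs in one critical
>>        region (with the critical region inside the function body
>>        removed so the two are not nested) */
>>     #pragma omp critical
>>     {
>>         release_node_array_before_mining(/* ... */);
>>     }
>>
>> and likewise for release_node_array_after_mining().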
>>
>> -Joe
>>
>> On 5/16/2012 9:10 PM, Amittai Aviram wrote:
>>> Hi, Joe!  Thank you very much for your comments.  One clarification, please--
>>>
>>> On May 16, 2012, at 9:01 PM, Joseph Greathouse wrote:
>>>
>>>> It's also worth noting that this function contains a data race that has not been publicly patched yet. The uses of release_node_array_before_mining() and release_node_array_after_mining() within FP_growth_first() should actually be OMP critical regions.
>>>>
>>>> You could try adding "#pragma omp critical" before each of the calls to those functions. I never sat down and figured out the possible errors that this data race could cause (I don't think it would cause your crashes), but it's worth testing.
>>>
>>> In my source (PARSEC 2.1), there is an "omp critical" region _inside_ each of release_node_array_before_mining and release_node_array_after_mining.  Does your source not have those pragmas?  Or do you mean that the critical region should have extended further out by starting before each call and ending after it?
>>>
>>> Thanks!
>>>
>>> Amittai

