[parsec-users] freqmine: segmentation fault on simlarge input

Joseph Greathouse jlgreath at umich.edu
Mon May 28 16:44:52 EDT 2012

Hi Justin,

I'm sorry to say that I don't have a copy of ICC on hand to test this. 
One thing to note is that the locations of the segfaults you and Amittai 
are experiencing are different, though the underlying problem may be the 
same.

The only things I can recommend are:

1) Make sure that the previously mentioned data race is fixed.
2) Try compiling freqmine at -O0, then -O1, and finally -O2 in order to 
see if any compiler optimizations are causing your problem.
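
For item 1, here is a minimal sketch of what "fixed" could look like: the whole call sits inside the critical region, not just the part that frees memory. This is not the PARSEC source. SketchState, release_before_mining_sketch, run_sketch, released_pos, and the named critical region are hypothetical stand-ins; only thread_begin_status and the high-to-low sequence order come from the description quoted further down.

```cpp
#include <cassert>
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#else
// Fallback so the sketch also builds without -fopenmp (runs single-threaded).
static int omp_get_thread_num() { return 0; }
#endif

// Hypothetical stand-ins for freqmine's shared state; NOT the PARSEC source.
struct SketchState {
    std::vector<int> thread_begin_status;  // last sequence each thread started
    int released_pos;                      // lowest position "released" so far
    int calls;                             // number of guarded calls made
};

// Simplified stand-in for release_node_array_before_mining(): a scan over the
// shared per-thread status array followed by an update of shared state.
static void release_before_mining_sketch(SketchState &st, int sequence, int thread) {
    st.thread_begin_status[thread] = sequence;
    int current = sequence;
    const int n = static_cast<int>(st.thread_begin_status.size());
    for (int i = 0; i < n; ++i) {
        // Two reads of the same slot. Unprotected, another thread could write
        // between them; inside the caller's critical region it cannot.
        if (st.thread_begin_status[i] > current)
            current = st.thread_begin_status[i];
    }
    if (current < st.released_pos)
        st.released_pos = current;  // stands in for the freebuf step
    ++st.calls;
}

SketchState run_sketch(int nthreads, int nitems) {
    SketchState st{std::vector<int>(nthreads, -1), nitems, 0};
    #pragma omp parallel for schedule(dynamic) num_threads(nthreads)
    for (int i = 0; i < nitems; ++i) {
        int sequence = nitems - 1 - i;  // sequence values run from high to low
        int thread = omp_get_thread_num();
        assert(thread < nthreads);
        // The entire call is serialized, scan included.
        #pragma omp critical(release_nodes_sketch)
        release_before_mining_sketch(st, sequence, thread);
    }
    return st;
}
```

Calling run_sketch(4, 100) should report 100 guarded calls however the threads are scheduled. If the existing critical region inside the function is kept, the outer one needs a different name, since nesting two critical regions with the same name is not allowed in OpenMP.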


On 5/24/2012 4:47 PM, Justin Zhang wrote:
> Hi Joe,
> Thanks so much for the information. I am actually using ICC and I find
> that ICC does not recognize flag "-fno-tree-vectorize". I am using ICC
> 12.1 on Debian 2.6.32-5-amd64 on a 12 core Intel 64bit machine. I am
> also not exactly sure if the segfault I got is exactly the same as
> Amittai got. It narrows to the following code section in the  'void
> FP_tree::scan2_DB(int workingthread)' function in fp_tree.cpp:
>                         ...
>                         if (child->leftchild == NULL) {     // it segfaulted here
>                             current_new_data_num += k;
>                             temp->rightsibling = child->leftchild;
>                             child->leftchild=temp;
>                         } else {
>                             temp->rightsibling = child->leftchild->rightsibling;
>                             child->leftchild->rightsibling = temp;
>                             current_new_data_num += has + current_hot_node_depth;
>                         }
>                         ...
> I wonder if you have any clue how to temporarily fix this problem?
> Thanks a lot!
> Justin
> Quoting "Joseph Greathouse"<jlgreath at umich.edu>:
>> Another note: This occurs, at minimum, with "-O1 -fstrict-aliasing
>> -ftree-vectorize".
>> I know very little about the internals of GCC, nor do I really want
>> to try to cut down this program into a reproducible testcase to post
>> onto the GCC mailing list or IRC channel. :)
>> A better solution might be to try different versions of GCC, since
>> this problem appears to be relatively new (or maybe won't happen in
>> 4.5, etc.)
>> -Joe
>> On 5/16/2012 10:43 PM, Joseph Greathouse wrote:
>>> Hi Amittai,
>>> It turns out that my earlier statement that this works for RHEL 6.2, GCC
>>> version 4.4.6 was incorrect. I recompiled from a fresh download and ran
>>> into the same problem.
>>> I traced it down to the loop vectorization optimization, I think. Try
>>> recompiling freqmine, but add '-fno-tree-vectorize' to the
>>> CXXFLAGS variable in gcc.bldconf.
>>> I don't know whether this is a GCC regression or a problem with freqmine
>>> that tree vectorization brings out. Turning this off is not a good
>>> permanent solution (especially if you're looking to get performance
>>> numbers for a paper), but it's better than a program that crashes for
>>> now. :)
>>> -Joe
>>> On 5/16/2012 10:07 PM, Amittai Aviram wrote:
>>>> Hi, Joe!
>>>> OK, I noticed that part of the code, too, and wondered whether it
>>>> could cause a data race. However, I just tried your suggestion of
>>>> annotating every call to release_node_array_before_mining and
>>>> release_node_array_after_mining with "#pragma omp critical"--and, of course,
>>>> commented the current "#pragma omp critical" lines out--but, alas, I
>>>> am still getting the same segmentation fault, in exactly the same place.
>>>> Amittai
>>>> On May 16, 2012, at 9:39 PM, Joseph Greathouse wrote:
>>>>> Hi Amittai,
>>>>> Sorry about the confusion-- I was basing my message to you off the
>>>>> bug report I sent in a few years ago, so I didn't even bother looking
>>>>> into the bodies of those two functions again.
>>>>> The data races are actually in the for loops immediately above those
>>>>> existing OMP critical regions. This is what I wrote in my email that
>>>>> never hit the public mailing list:
>>>>> I believe it's possible to schedule the threads in such a way that,
>>>>> for example, in the for loop in release_node_array_before_mining:
>>>>> Thread0 reads thread_begin_status[3] and finds that it is greater
>>>>> than Thread0.current. Then Thread3 writes a lower value into
>>>>> thread_begin_status[3] (sequence values proceed from high to low, so
>>>>> this will always be the case with this thread ordering), and Thread0
>>>>> loads that value into Thread0.current and proceeds into the critical
>>>>> region to delete things. This makes it possible to have current lower
>>>>> than it normally would be, which I believe is an unintended data race.
>>>>> So, yeah, the entirety of these functions should be OMP critical, not
>>>>> just the freebuf part. The quick way to do this is to extend the
>>>>> entire critical region to start before the call and end immediately
>>>>> afterwards.
>>>>> -Joe
>>>>> On 5/16/2012 9:10 PM, Amittai Aviram wrote:
>>>>>> Hi, Joe! Thank you very much for your comments. One clarification,
>>>>>> please--
>>>>>> On May 16, 2012, at 9:01 PM, Joseph Greathouse wrote:
>>>>>>> It's also worth noting that this function contains a data
>>>>>>> race that has not been publicly patched yet. The uses of
>>>>>>> release_node_array_before_mining() and
>>>>>>> release_node_array_after_mining() within FP_growth_first() should
>>>>>>> actually be OMP Critical regions.
>>>>>>> You could try adding "#pragma omp critical" before each of the
>>>>>>> calls to those functions. I never sat down and figured out the
>>>>>>> possible errors that this data race could cause (I don't think it
>>>>>>> would cause your crashes), but it's worth testing.
>>>>>> In my source (PARSEC 2.1), there is an "omp critical" region
>>>>>> _inside_ each of release_node_array_before_mining and
>>>>>> release_node_array_after_mining. Does your source not have those
>>>>>> pragmas? Or do you mean that the critical region should have
>>>>>> extended further out by starting before each call and ending after it?
>>>>>> Thanks!
>>>>>> Amittai
