[parsec-users] Freqmine Question Help

Raghav Mohan rmohan2 at wisc.edu
Mon Aug 13 10:39:49 EDT 2012

Hi Joseph,

The reason I am using GCC 4.7.0 is that this machine is connected to an AFS server, and g++ defaults to GCC 4.7.0 there. I ran it with 4.4.6 and got similar results. To answer your other question: no one else is logged on and no other processes are running. In any case, I figured out what is going on (at least I think). Usually for these benchmarks, I keep the input files on the physical disk and the code on the AFS server (so that it gets backed up). So far this has made sense and has been producing results; however, something in freqmine causes this not to be the case. This morning I reran the benchmark with the code placed on the local disk alongside the inputs, and here are my results:

Command: ./freqmine ../inputs/kosarak_990k.dat 790
Output:
NUMTHREADS: 1 the data preparation cost 0.593927 seconds, the FPgrowth cost 19.738423 seconds
NUMTHREADS: 2 the data preparation cost 0.592534 seconds, the FPgrowth cost 10.312950 seconds
NUMTHREADS: 3 the data preparation cost 0.644112 seconds, the FPgrowth cost 6.767818 seconds
NUMTHREADS: 4 the data preparation cost 0.592829 seconds, the FPgrowth cost 5.265522 seconds
NUMTHREADS: 5 the data preparation cost 0.592956 seconds, the FPgrowth cost 4.249213 seconds
NUMTHREADS: 6 the data preparation cost 0.592839 seconds, the FPgrowth cost 3.516205 seconds
NUMTHREADS: 7 the data preparation cost 0.593371 seconds, the FPgrowth cost 3.316069 seconds
NUMTHREADS: 8 the data preparation cost 0.594291 seconds, the FPgrowth cost 3.211393 seconds
NUMTHREADS: 9 the data preparation cost 0.594084 seconds, the FPgrowth cost 3.947599 seconds
NUMTHREADS: 10 the data preparation cost 0.633032 seconds, the FPgrowth cost 3.865774 seconds
NUMTHREADS: 11 the data preparation cost 0.635615 seconds, the FPgrowth cost 3.524606 seconds
NUMTHREADS: 12 the data preparation cost 0.593882 seconds, the FPgrowth cost 3.150823 seconds
NUMTHREADS: 13 the data preparation cost 0.595541 seconds, the FPgrowth cost 3.502538 seconds
NUMTHREADS: 14 the data preparation cost 0.604269 seconds, the FPgrowth cost 3.467640 seconds
NUMTHREADS: 15 the data preparation cost 0.627852 seconds, the FPgrowth cost 3.477972 seconds
NUMTHREADS: 16 the data preparation cost 0.603754 seconds, the FPgrowth cost 3.629891 seconds
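
As a rough check on the scaling, the speedup relative to the single-threaded run can be computed from the FPgrowth times above. This is only a minimal Python sketch: the timings are copied from the output, while the names (fpgrowth_seconds, speedups) are made up for illustration and are not part of freqmine or PARSEC.

```python
# FPgrowth times in seconds, copied from the output above (threads -> seconds).
fpgrowth_seconds = {
    1: 19.738423, 2: 10.312950, 3: 6.767818, 4: 5.265522,
    5: 4.249213, 6: 3.516205, 7: 3.316069, 8: 3.211393,
    9: 3.947599, 10: 3.865774, 11: 3.524606, 12: 3.150823,
    13: 3.502538, 14: 3.467640, 15: 3.477972, 16: 3.629891,
}


def speedups(times):
    """Return {threads: T(1) / T(threads)} relative to the single-threaded run."""
    baseline = times[1]
    return {n: baseline / t for n, t in times.items()}


speedup = speedups(fpgrowth_seconds)
# Thread count with the largest speedup (illustrative helper variable).
best_threads = max(speedup, key=speedup.get)

for n in sorted(speedup):
    # Parallel efficiency: speedup divided by the number of threads.
    efficiency = speedup[n] / n
    print(f"{n:2d} threads: speedup {speedup[n]:.2f}x, efficiency {efficiency:.0%}")
print(f"best speedup at {best_threads} threads")
```

This covers only the FPgrowth phase; the data preparation cost stays at roughly 0.6 seconds regardless of the thread count.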

Now we see the desired scaling. I wonder whether some AFS process (sending acknowledgements, etc.) is interrupting the last kernel of freqmine, causing it to hang. I will investigate this further.
As for the single-thread FPgrowth cost: that is still slower than expected, since you reported ~12 seconds versus my ~20 seconds. This is again for the simlarge input. I will investigate this further as well. Feel free to comment on either of these aspects.
Thank you for all your help and prompt responses.
