[parsec-users] Freqmine Question Help

Joseph Greathouse jlgreath at umich.edu
Wed Aug 8 16:02:46 EDT 2012


On 8/8/2012 3:04 PM, Raghav Mohan wrote:
> Hi,
>
> I am trying to parallelize the Freqmine benchmark (PARSEC v2.1) with my own parallel library instead of OpenMP. I ran the benchmark and compared the sequential version against the OpenMP version. I would expect the OpenMP time to be drastically lower; instead, it keeps increasing with the number of threads (essentially reverse speedup). I am running freqmine on a hyperthreaded Intel Xeon E5620 machine with 8 cores, giving 16 hardware threads. Here are the sample results:
>
>
> Command:
> ./freqmine kosarak_250k.dat 220 out.txt
>
>
>
> Sequential Version Result :
> the data preparation cost 0.163102 seconds, the FPgrowth cost 2.720993 seconds
>
>
> OMP Version Result (16 threads):
> the data preparation cost 0.191582 seconds, the FPgrowth cost 9.168250 seconds
>
>
>
>
> As one can see, the FPgrowth cost for the threaded version is roughly 3.4 times that of the sequential version. This behavior is replicated for all inputs.
>
>
> I apologize if I am missing something or misinterpreting the results and this is the expected behavior. I read the manual and could not find any information on this.
> Any help is greatly appreciated.
>
>
> Thank you.

Hi Raghav,

I agree with Yungang: those numbers look strange. I've attached 
outputs from a few freqmine runs on a Xeon E5520 machine (Nehalem-based 
rather than Westmere-based like yours, but otherwise it also has 8 
physical cores and 16 virtual cores). This is running on RHEL 5.8, 
compiled with GCC 4.1.2 (Red Hat patch 52).

As you can see, adding more threads gives a steady decrease in runtime.
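To put numbers on that, here is a small sketch that divides the serial FPgrowth time by each OpenMP FPgrowth time from the webdocs_250k.dat runs attached below. The helper name `speedup` is made up for illustration, and the baseline is the separate gcc-serial build rather than the OpenMP build run with one thread.

```shell
#!/usr/bin/env bash
# FPgrowth cost in seconds from the gcc-serial run (baseline).
serial=935.675228

# Print serial_time / parallel_time, rounded to two decimals.
# "speedup" is a hypothetical helper, not part of PARSEC.
speedup() {
    awk -v s="$serial" -v p="$1" 'BEGIN { printf "%.2f\n", s / p }'
}

# FPgrowth cost in seconds from the gcc-openmp runs.
echo "4 threads:  $(speedup 215.969161)x"
echo "8 threads:  $(speedup 116.869059)x"
echo "16 threads: $(speedup 92.685972)x"
```

With these timings the ratios come out to about 4.33x, 8.01x, and 10.10x for 4, 8, and 16 threads.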

What OS and compiler are you using? What environment variables are set?

Also, you showed the FPgrowth output for the serial version and your 
16-thread version. Could you show us the outputs of the 2-, 4-, and 
8-threaded versions as well?
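One possible way to collect those runs in one go is sketched here, assuming a bash shell and an OpenMP build of freqmine. The function name `sweep_threads` is hypothetical; the only real interface it relies on is the standard `OMP_NUM_THREADS` environment variable.

```shell
#!/usr/bin/env bash
# Run the given command once per thread count, setting OMP_NUM_THREADS
# for that invocation only. "sweep_threads" is an illustrative name.
# Usage: sweep_threads "<space-separated counts>" <command> [args...]
sweep_threads() {
    local counts="$1"
    shift
    local n
    for n in $counts; do
        echo "== OMP_NUM_THREADS=$n =="
        # The assignment prefix exports the variable to this command only.
        OMP_NUM_THREADS="$n" "$@"
    done
}
```

For the command from the original message, this could be called as `sweep_threads "2 4 8" ./freqmine kosarak_250k.dat 220 out.txt`, which prints one labelled block of freqmine output per thread count.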

-Joe

-----------------------------------------------

bash-3.2$ cd ../inst/amd64-linux.gcc-serial/bin/
bash-3.2$ time ./freqmine ../../../inputs/webdocs_250k.dat 11000
...
the data preparation cost 4.136187 seconds, the FPgrowth cost 935.675228 seconds

real    15m39.923s
user    15m38.940s
sys     0m0.729s

bash-3.2$ cd ../../amd64-linux.gcc-openmp/bin/
bash-3.2$ OMP_NUM_THREADS=4
bash-3.2$ export OMP_NUM_THREADS
bash-3.2$ time ./freqmine ../../../inputs/webdocs_250k.dat 11000
...
the data preparation cost 4.151570 seconds, the FPgrowth cost 215.969161 seconds

real    3m40.163s
user    14m26.022s
sys     0m0.891s

bash-3.2$ OMP_NUM_THREADS=8
bash-3.2$ export OMP_NUM_THREADS
bash-3.2$ time ./freqmine ../../../inputs/webdocs_250k.dat 11000
...
the data preparation cost 4.094214 seconds, the FPgrowth cost 116.869059 seconds

real    2m0.972s
user    15m21.003s
sys     0m1.030s

bash-3.2$ OMP_NUM_THREADS=16
bash-3.2$ export OMP_NUM_THREADS
bash-3.2$ time ./freqmine ../../../inputs/webdocs_250k.dat 11000
...
the data preparation cost 4.145387 seconds, the FPgrowth cost 92.685972 seconds

real    1m36.841s
user    21m38.168s
sys     0m1.801s

