[parsec-users] Parsec 2.0, M5 Simulations, Linux Idle loop.

Rick Strong rstrong at cs.ucsd.edu
Wed Sep 16 17:46:26 EDT 2009


Major Bhadauria wrote:
> Blackscholes does not have much interaction between threads, so it 
> seems unlikely there's thread contention or threads stuck at barriers. 
> I'm not familiar enough with your simulator to know if all the threads 
> are running on all available cores, or if you're mapping all the 
> threads to run on just two cores while the other cores are sleeping.
The simulator boots the Linux operating system, with the Linux 
scheduler handling thread placement. Thus, if you have a methodology 
for running the benchmarks on a vanilla Linux kernel, I can duplicate 
it. The only problem is that the amount of time I can simulate the 
benchmarks is limited, so I would rather have the threads tell me when 
they have each been scheduled to a different core and then take a 
checkpoint. Is there a way to get a new ROI after the threads have all 
been scheduled to each available core?
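
As a rough illustration of what I have in mind, here is a minimal 
pthreads sketch in which each worker reports the core it is running on 
via sched_getcpu() and, once every thread sits on a distinct core, 
prints a message where a checkpoint annotation could go. The 
m5_checkpoint() name in the comment is purely hypothetical; it stands 
in for whatever hook the simulator actually provides, and a real run 
would retry or report periodically rather than checking only once.

/* Sketch: each worker reports its CPU; once every thread is on a
 * distinct core, fire a (hypothetical) checkpoint hook.
 * sched_getcpu() is glibc/Linux-specific. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

#define NTHREADS 8

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int cpu_of[NTHREADS];
static int reported;

static void checkpoint_if_spread_out(void)
{
    /* Count how many distinct CPUs have been reported so far. */
    int distinct = 0;
    for (int i = 0; i < NTHREADS; i++) {
        int seen = 0;
        for (int j = 0; j < i; j++)
            if (cpu_of[j] == cpu_of[i])
                seen = 1;
        if (!seen)
            distinct++;
    }
    if (reported == NTHREADS && distinct == NTHREADS) {
        printf("all %d threads on distinct cores; checkpoint here\n",
               NTHREADS);
        /* m5_checkpoint(0, 0);   <-- hypothetical simulator hook */
    }
}

static void *worker(void *arg)
{
    int id = (int)(long)arg;

    pthread_mutex_lock(&lock);
    cpu_of[id] = sched_getcpu();   /* core this thread is currently on */
    reported++;
    checkpoint_if_spread_out();
    pthread_mutex_unlock(&lock);

    /* ... the real benchmark work would follow here ... */
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return 0;
}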

In addition, I have compiled PARSEC 2.0 with the option -c gcc-hooks 
and have not set any of the OS scheduling affinity variables, under the 
assumption that the benchmarks will attempt to use all of the cores. 
Should I be setting affinity? I do see performance improvements for 2, 
4, 8, 16, and 32 cores, which indicates that the cores are being used.
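
If it turns out affinity does matter, a minimal sketch of pinning 
threads by hand might look like the following. It uses the 
GNU/Linux-specific pthread_setaffinity_np(), and the thread-to-core 
mapping is just an assumption for illustration, not anything PARSEC 
does on its own.

/* Sketch: pin a thread to a single core so the Linux scheduler's load
 * balancing is taken out of the picture.  GNU/Linux-specific. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static void pin_to_core(pthread_t thread, int core)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    if (pthread_setaffinity_np(thread, sizeof(set), &set) != 0)
        fprintf(stderr, "could not pin thread to core %d\n", core);
}

/* Usage sketch: right after pthread_create(&t[i], ...), call
 * pin_to_core(t[i], i % num_cores); */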
>
> Do you have the same behavior with larger input sets? simsmall is 
> really tiny; you should try the largest size that can finish in a 
> reasonable amount of time, perhaps simmedium (for the larger 32-core 
> simulation)?
I have tried a simulation of 1e9 total instructions with simlarge, 
which takes around 10 hours to finish and corresponds to roughly 
25-275 ms of simulated execution time. I did not see much difference in 
performance for this short execution. One possible explanation is that 
I am still seeing startup effects of the ROI.
>
> It's unclear what the IPC graph shows; is it the IPC for all the 
> procs combined? I'm assuming all the instructions are useful insns, 
> since at sync points the kernel puts the cores to sleep?
The IPC is the summed IPC for all cores in the system. It is possible 
that not all instructions are useful; whether a parallel thread 
releases the processor while waiting for more work depends on the 
parallelization model used. If a parallel thread goes to sleep and the 
idle loop has no other threads to schedule, the Linux kernel executes a 
quiesce instruction, which puts the core to sleep.
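
To make the distinction concrete, here is a small sketch of the two 
waiting styles I mean; the names and structure are illustrative only 
and not taken from any particular PARSEC kernel. A worker that blocks 
in pthread_cond_wait() is descheduled, so the core can fall into the 
idle loop and quiesce, whereas a spin-wait keeps retiring instructions 
that inflate IPC without doing useful work.

/* Illustrative only: two ways a worker might wait for more work.
 * Blocking lets the core go idle (and quiesce); spinning keeps it busy. */
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
static int work_ready;                 /* protected by m */
static atomic_int work_ready_spin;     /* for the spinning variant */

/* Variant 1: blocking wait -- the thread sleeps, the core may quiesce. */
static void wait_blocking(void)
{
    pthread_mutex_lock(&m);
    while (!work_ready)
        pthread_cond_wait(&cv, &m);    /* descheduled until signaled */
    work_ready = 0;
    pthread_mutex_unlock(&m);
}

/* Variant 2: spin wait -- the core keeps retiring "useless" insns. */
static void wait_spinning(void)
{
    while (!atomic_load(&work_ready_spin))
        ;                              /* burns cycles, inflates IPC */
    atomic_store(&work_ready_spin, 0);
}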
>
> Rick Strong wrote:
>> I have attached the pictures this time. Hopefully, they make it to 
>> the mailing list.
>>
>> -Rick
>>
>> Rick Strong wrote:
>>> Dear all,
>>>
>>> I am a current Ph.D. student at UCSD studying computer architecture 
>>> for multicore systems and its interaction with the OS. My goal for 
>>> the last half of the year has been to run PARSEC 2.0 on the M5 
>>> simulator for the Alpha ISA, targeting many-core architectures.
>>>
>>> I have most of the benchmarks compiled and ready to go, but I find 
>>> that IPC is smaller than I would expect. The attached figure shows 
>>> IPC for 2, 4, 8, 16, and 32 cores for a hypothetical 22nm process 
>>> technology running at 3.5 GHz on an out-of-order processor modeling 
>>> the Alpha EV6. The IPC seems fine for 2 cores, but as more cores are 
>>> added an alarming amount of time is spent in the idle loop of the 
>>> Linux kernel, which puts the processor to sleep through a quiesce 
>>> instruction; you can see the amount of time spent sleeping in 
>>> profile_quiesce.png, which was also attached (this stat is gathered 
>>> in a gprof-like manner). The input set used was simsmall, and I 
>>> started simulation measurement at the beginning of the Region of 
>>> Interest.
>>>
>>> There are many things that could be going wrong, but the problem 
>>> seems to be related to a lack of work available to be scheduled on 
>>> the idle cores. Some possible causes include:
>>> (1) The Linux scheduler has not load-balanced the parallel 
>>> application, leaving some cores unscheduled.
>>> (2) The threads are stalling on a barrier and the core has nothing 
>>> left to do.
>>> (3) Poor startup performance. I see this occur when I simulate the 
>>> benchmarks with simsmall on an x86 Nehalem architecture, where the 
>>> 8 virtual CPUs never reach 100% utilization.
>>>
>>> This brings me to the following questions for the PARSEC team; I am 
>>> hoping your experience and expert knowledge can direct my 
>>> instrumentation more effectively.
>>>
>>> (1) Have you noticed that Linux scheduler load balancing takes a 
>>> significant fraction of the total execution time with simsmall?
>>>
>>> (2) Is there an easy way to determine whether a PARSEC benchmark is 
>>> indeed scheduled and running on all available cores?
>>>
>>> (3) Does simsmall contain enough work to saturate core utilization, 
>>> or is it just too small? If it is too small, which input size would 
>>> you recommend?
>>>
>>> (4) Are there known reasons why the PARSEC suite would not play 
>>> nicely with the Alpha architecture running a Linux kernel, for the 
>>> benchmarks compiled using pthreads (I am purposely leaving out 
>>> OpenMP)?
>>>
>>> (5) Is there a way to easily test the barrier stall hypothesis?
>>>
>>> Thanks in advance,
>>> -Richard Strong