Problem with parallel processing on an 8-core system

Hi all,

I am quite new to the concept of parallel processing. So, you may find my queries too trivial.

I am running a huge simulation to collect data for analysis, varying plenty of parameters on top of several thousand runs of a single simulation, so it takes a lot of time. I tried the code below, but for some reason it doesn't work. Note that the X_DATA matrix is generated repeatedly as the other parameters vary, and after each X_DATA is generated I save it to another matrix.

********************************************
 X_DATA=zeros(run,1,'int8');
matlabpool open 4
parfor p = 1:run;
    %     [t,x]=sbiogetnamedstate(simDataObj);
    x_length=size((simDataObj(p,1).Data()),1);
    
    X_DATA(p,1) = simDataObj(p,1).Data(x_length,2);
end
**********************************************
The machine I am using for my simulation has 8 cores and 16 GB of RAM. CPU utilization always stays at about 13-14%. Currently the simulation takes around 30 hours, which I feel can be reduced.

Could you suggest any way forward?

Thanking you,

Md. Shahriar Karim
7/7/2009 5:10:18 PM
comp.soft-sys.matlab

Hi,

FYI.  In R2008a, we began throwing warnings that sbiogetnamedstate will be deprecated in future versions.  Which version of MATLAB are you running?

I'm having a little trouble following:
1. Why do you think it's not working: because it's taking so long, or because it throws an error?
2. Where are the calls to sbiosimulate being made, inside or outside the parfor?
3. If they're being made outside the parfor, could they be made inside (to minimize network traffic)?  If you can move them inside, keep in mind that the result won't be accessible after the parfor, since a variable assigned for the first time inside the parfor loop is temporary.
4. How long do you think it should take?  How long does it take to run in a regular for loop?
5. Is simDataObj.Data an int8?  Otherwise you'll truncate the data.
6. Although you're running 1000 runs, I don't really see where the time goes in this loop.  So, if the array of simDataObj is really being assigned before the parfor, then most of the time is going into network traffic, copying an instance of simDataObj over to each worker.  Can you flesh out the code a bit more?
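
A minimal sketch of point 3, under stated assumptions: the two lines that build `traj` are a hypothetical stand-in for one simulation run (they are not a SimBiology call), and `nRuns` is kept small on purpose. The time course is created inside the parfor as a temporary variable, and only one double per run is written to the sliced output `lastValue`, so little data has to travel between client and workers.

```matlab
nRuns = 8;                        % hypothetical; the thread mentions thousands of runs
lastValue = zeros(nRuns, 1);      % double (the default class), so nothing is truncated

parfor p = 1:nRuns
    % Stand-in for one simulation run (assumption, not a SimBiology call):
    % column 1 is time, column 2 is the state of interest.
    t = (1:50)';
    traj = [t, p * t];            % temporary variable, exists only inside the loop
    lastValue(p) = traj(end, 2);  % sliced output: one scalar per run comes back
end

disp(lastValue')
```

After the loop, `traj` is not available in the client workspace; only `lastValue` is. If no worker pool is available, the parfor iterations simply run serially on the client.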

Raymond
7/8/2009 4:07:02 AM
Dear Raymond,

Thanks for your reply. Please see my answers below. I would be happy to get some suggestions in this regard.


1. Why do you think it's not working, because it's taking so long or because it throws an error?

-MATLAB showed "not responding" twice; in the other cases I just terminated it after waiting for about a day.

2. Where are the calls to sbiosimulate being made, in or out of the parfor?

-They are outside the PARFOR at the moment.

Using sbiosimulate, we generate the ensemble data "SimDataObj". This SimDataObj comprises the data of 6000 runs, each with multiple points. However, we are only interested in the last data point of each run. Once the "SimDataObj" has been generated, we run a for loop (PARFOR) to extract the last data point of each of the 6000 runs and save it using the run number as the index. That is, FINAL DATA(1) stores the last data point of run 1, FINAL DATA(2) stores the last data point of run 2, and so on.


3. If they're being made outside of the parfor, could they be made inside (to minimize network traffic)? If you can move it inside, keep in mind it won't be accessible after the parfor since it's a temporary variable if assigned for the first time in the for loop.

-Could you explain a little bit? I can't follow this...


4. How long do you think it should take? How long does it take to run in a regular for loop? 

-There's no hard and fast time frame for it. I am only trying to optimize it.

5. Is simDataObj.Data an int8? Otherwise you'll truncate the data.
-No, the data is of type double.

6. The problem I see is that, although you're running 1000 runs, I don't really see the time hit. So, if the array of simDataObj is really being assigned before the parfor, then you're spending all the network traffic copying over an instance of a simDataObj to each Worker. Can you flesh out the code a bit more.

-The problem is that the SimDataObj is generated by the system. I don't write code for this. If I could somehow access the SimDataObj, that would SIMPLY be the best! We run the for loop thousands of times (we use around 6000 runs for a better simulation, so that is 6000 loop iterations just to extract the desired values). Moreover, we have 3 more varying parameters on top of each set of 6000 runs!
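
For reference, the extraction step described above can be sketched as follows. This is only an illustration under assumptions: `simData` is a hypothetical struct array standing in for the real SimData objects, with a `Data` field holding [time, state] rows, and the numbers are made up. It also shows what happens when double values are stored in an int8 array, as in the original preallocation.

```matlab
% Hypothetical stand-in for the simulation results: a struct array whose
% Data field holds [time, state] rows, with a different length per run.
nRuns = 5;
simData = struct('Data', cell(nRuns, 1));
for p = 1:nRuns
    n = 10 + p;                                   % runs differ in length
    simData(p).Data = [(1:n)', 4.5 * p * (1:n)'];
end

% Last value of column 2 for every run; "end" avoids computing the length first.
lastValue = arrayfun(@(s) s.Data(end, 2), simData);

% Storing the same values as int8 rounds them and saturates at 127.
asInt8 = int8(lastValue);

disp([lastValue, double(asInt8)])
```

Since this step reads only one element per run from data that is already in memory, it may be cheap compared with the simulation itself, so timing it separately (for example with tic and toc) would show whether it is really the part worth parallelizing.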
7/9/2009 8:41:03 PM