Hello, I'm trying to implement a parallel process, and I'm not sure how to set it up. I was wondering if someone could help me please? My code resembles:

    for i = 1:N  % N approx 10^6
        x_new = my_func(x_old);
        A(:,i) = x_new;
        x_old = x_new;
    end

where x_old and x_new are Mx1 vectors and A is an MxN matrix. From one iteration to the next, the only dependence is x_new(i) on x_old(i). I'd like to parallelise this by splitting the elements of x_new and x_old across more than one core. Is there a way I can, say:

    matlabpool open 5
    % <use each core's local copy of x_new, x_old, A>
    for i = 1:N  % N approx 10^6
        x_new = my_func(x_old);  % now x_new and x_old are M/5 x 1 vectors
        A(:,i) = x_new;
        x_old = x_new;
    end
    % <retrieve local copies>

Even if someone just tells me what the commands are, I'd be grateful, since that'll be enough for me to look them up. Thanks! PLH.


1/11/2011 10:23:05 AM

"PLH " <paulhalkyard@googlemail.com> wrote in message news:ighb28$hkg$1@fred.mathworks.com...
*snip*
> Even if someone just tells me what the commands are, I'd be grateful
> since that'll be enough for me to look them up.

I think the command you're looking for is PARFOR.

--
Steve Lord
slord@mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlab.wikia.com/wiki/FAQ
To contact Technical Support use the Contact Us link on http://www.mathworks.com
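For reference, a minimal PARFOR sketch, under the assumption that each loop iteration is fully independent of the others (col_func is a hypothetical stand-in for per-iteration work; it is not from the original post):

```matlab
matlabpool open 5          % start a pool of 5 workers (pre-R2013b syntax)
parfor i = 1:N
    % Each iteration must be computable without results from any other
    % iteration; A is a "sliced" output variable here.
    A(:, i) = col_func(i);
end
matlabpool close
```

PARFOR silently distributes the iterations across the pool, which is why any iteration-to-iteration dependence rules it out.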


1/11/2011 2:46:15 PM

Thanks for your reply. I thought parfor doesn't work if the iterations are dependent on each other, as they are here. The i-th entry of x_old or x_new requires the (i-1)th entry. It's the entries of the vectors x_new and x_old that are independent in my code. The only way I can see to use parfor here is to un-vectorise my code and loop over the elements of x_new and x_old, but that doesn't feel like the right thing to do. Could you advise me further please?


1/11/2011 3:09:05 PM

Sorry, just realised a typo: The i-th ITERATION depends on the (i-1)-th ITERATION. The ENTRIES are independent.


1/11/2011 3:18:05 PM

"PLH " <paulhalkyard@googlemail.com> wrote in message news:ighrqh$38v$1@fred.mathworks.com...
> I thought parfor doesn't work if the iterations are dependent on each
> other, as they are here.
*snip*

Yes, I missed the "x_old = x_new" in your code. In this case, I think you're probably stuck, as this tight order dependence is likely to interfere with parallelization. You _can't_ start work on the second iteration until the computations of the first iteration are complete, and so on down the line. This seems pretty serial to me.

If you're attempting to run this code in parallel to improve the execution time, I'd say you should instead identify the bottlenecks and see if there are other ways to improve those sections of your code. Run a smallish example in the Profiler (see HELP PROFILE) to identify the bottlenecks, then focus on improving the efficiency of those sections of the code.

--
Steve Lord
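A minimal sketch of the profiling workflow suggested above (run_small_case is a hypothetical driver, e.g. the original loop with N = 10^3 instead of 10^6):

```matlab
profile on
run_small_case();    % reduced-size version of the real computation
profile viewer       % open the HTML report; sort by self-time per line
profile off
```

The per-line self-time usually points straight at the expression worth optimising, which is cheaper to act on than parallelising the whole loop.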


1/11/2011 3:39:57 PM

Thanks for your time with this. I've been doing some more reading on the MathWorks site, and I came across SPMD, but I'm struggling to understand how to implement it. I tried:

    M = 500; N = 1000;

    x_old = rand(1,M);
    x_new = x_old;
    A = zeros(M,N);

    A = codistributed(A);
    x_old = codistributed(x_old);
    x_new = codistributed(x_new);

    matlabpool open 5
    spmd
        for j = 1:N
            x_new = rand*x_old - rand;
            A(:,j) = x_new;
            x_old = x_new;
        end
    end

I got a long error message (one from each core, I think):

    ??? Error using ==> spmd_feval at 8
    Error detected on lab(s) 1 2 3 4 5
    Error in ==> spmdtest at 17
        for j=1:N
    Caused by:
        Attempted to access startA(2); index out of bounds because numel(startA)=1.
        Error stack: subsref.m at 99, subsasgn.m at 139
        (the same message is repeated for each of the five labs)

Do I need to bury the for loop in a function and call that within spmd so that the counting index j is interpreted properly? Thanks.


1/11/2011 4:03:05 PM

"PLH " <paulhalkyard@googlemail.com> wrote in message news:ighuvp$qpg$1@fred.mathworks.com...
> I've been doing some more reading on Mathworks, and I came across SPMD,
> but I'm struggling to understand how to implement it. I tried:
*snip*

From your description, it sounds like what you're doing is pretty thoroughly serial in nature; I don't think you're going to be able to parallelize it. [It's like baking a cake -- no matter how much you may want to perform the "mix the ingredients" and "bake the cake" steps in parallel, it's not going to turn out well if you do. Unless (maybe) you're working with oven-safe mixing equipment.]

*snip*
>     x_new = rand*x_old - rand;
>     A(:,j) = x_new;
>     x_old = x_new;

This looks like a filtering operation. If this is your actual application, or a close approximation to it, take a look at the FILTER function and see if it's appropriate for your application.

--
Steve Lord
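To illustrate the FILTER suggestion: if the coefficients were fixed constants c and d (rather than fresh rand draws each step, as in the test code above), the recursion x(j) = c*x(j-1) + d is a first-order IIR filter and can be generated for all j in one vectorised call. A sketch, with c, d, x0 and N as assumed example values:

```matlab
c = 0.9; d = -0.3;                 % assumed fixed coefficients
x0 = 0.5; N = 10;                  % assumed initial state and length

u = d * ones(1, N);                % constant driving term
x = filter(1, [1 -c], u, c*x0);    % x(j) = c*x(j-1) + d, with x(0) = x0

% Check against the explicit loop:
x_loop = zeros(1, N);
x_prev = x0;
for j = 1:N
    x_prev = c*x_prev + d;
    x_loop(j) = x_prev;
end
% max(abs(x - x_loop)) should be at round-off level
```

The fourth argument to FILTER is the initial delay state; seeding it with c*x0 makes the filtered sequence start from x(1) = c*x0 + d, matching the loop.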


1/11/2011 6:46:04 PM

To my thinking, it's definitely a problem that should parallelise, although I may not have explained myself properly enough to make that clear to anyone else.

I've written a classical integrator for an ensemble of particles: my code calculates the trajectories of an ensemble of particles using a 4th-order Runge-Kutta method. Crucially, the particles are non-interacting. The i-th particle (whose position is represented by the i-th entry of my x_new and x_old vectors) is completely independent of every other particle. I believe this is a "single-program multiple data" problem, where the different initial conditions of the particles are the data, and the program common to all of them is the classical integration scheme. I've only referred to x_new and x_old but, as you've probably guessed from this, there's also a v_old and a v_new. Those details, however, aren't important to my problem.

I suppose I could batch-process my code and mash together the results at the end. But it seems, from what I've read at least, that spmd should be able to deal with my problem. So, although the processing of each entry is most definitely serial, I believe I should be able to process the ensemble, i.e. the different entries of the x vectors, in a parallel fashion. But I have no experience using spmd, and I'm struggling to get it to work. Maybe this is because I've implemented it badly, or perhaps it's because I've misunderstood its purpose.

If you could help with this at all, I'd be very grateful. Thanks, PLH.
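The ensemble split described above can be sketched with spmd and per-lab local parts. This is only a sketch under stated assumptions: my_func stands in for one integration step applied elementwise to a vector of independent particles, x_dist is a distributed row vector of M initial positions, and a worker pool is already open:

```matlab
x_dist = distributed(rand(1, M));   % initial positions, spread over the labs

spmd
    x_local = getLocalPart(x_dist);       % this lab's subset of particles
    A_local = zeros(numel(x_local), N);   % this lab's block of trajectories
    for i = 1:N
        x_local = my_func(x_local);       % independent per-particle update
        A_local(:, i) = x_local;
    end
end
% Back on the client, A_local is a Composite; A_local{k} retrieves lab k's
% block, and the full trajectory matrix is the vertical concatenation of
% the blocks (in lab order).
```

Each lab integrates its own particles serially in time, but the labs run concurrently over disjoint subsets of the ensemble, which is exactly the "serial in i, parallel across entries" structure described above.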


1/11/2011 7:10:21 PM

"PLH " <paulhalkyard@googlemail.com> writes:
> I've been doing some more reading on Mathworks, and I came across
> SPMD, but I'm struggling to understand how to implement it. I tried:
>
>     M = 500; N = 1000;
>     x_old = rand(1,M); x_new = x_old;
>     A = zeros(M,N);
>
>     A = codistributed(A);
>     x_old = codistributed(x_old);
>     x_new = codistributed(x_new);
>
>     matlabpool open 5
>     spmd
>         for j = 1:N
>             x_new = rand*x_old - rand;
>             A(:,j) = x_new;
>             x_old = x_new;
>         end
>     end
>
> I got a long error message (one from each core, I think):

The problem here is basically that you're creating *codistributed* arrays outside the SPMD block and then passing them in. You should be creating *distributed* arrays there, like so:

    A = distributed(A);
    x_old = distributed(x_old);
    x_new = distributed(x_new);

Inside the SPMD block, these appear as codistributed arrays, i.e.:

    spmd
        class(A)   % returns 'codistributed' on each worker
    end

With that change, your code gives no errors. (Whether it's going to do what you want is another matter...)

Cheers, Edric.


1/12/2011 9:39:35 AM

Thanks for the help, Edric! This is just a test example, so it does what I intended it to do. But I did some other tests, and I found significant slowdown using spmd. For those who are interested: I had implicitly assumed that one worker processing a 1 x pq vector would be slower than q workers each processing a 1 x p vector when p is large. Actually, that's not true, and the scaling with respect to p is such that using one worker is faster (presumably the limit is set by some memory limitation). Essentially, I assumed parallelising would be faster than vectorising, which is incorrect for the problem I'm interested in. Anyway, thanks for the help, everyone.
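For anyone wanting to reproduce this kind of comparison, a rough sketch of the experiment described above (p, q, the step count, and the elementwise update are all stand-in values, and a pool of q workers is assumed open for the second half):

```matlab
p = 1e6; q = 5; nsteps = 100;

% One worker, fully vectorised over the whole 1-by-(p*q) ensemble:
x = rand(1, p*q);
tic
for j = 1:nsteps
    x = 0.9*x - 0.1;             % stand-in for the per-step update
end
t_vectorised = toc

% q workers, each updating its own 1-by-p chunk inside spmd:
xd = distributed(rand(1, p*q));
tic
spmd
    xl = getLocalPart(xd);       % this lab's 1-by-p chunk
    for j = 1:nsteps
        xl = 0.9*xl - 0.1;
    end
end
t_spmd = toc
```

Comparing t_vectorised and t_spmd for growing p shows whether worker overhead or memory bandwidth dominates, which is what decides between vectorising and parallelising here.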

