parallel computing problem

The program works well on a local jobmanager. When doing it using a
jobmanager on a remote cluster computer, the error looks like:

Error using ==>parallel_function>make_general_channel/channel_general
at 843  Undefined function handle.

Error in ==> parallel_function>distributed_execution at 752
[tags, out] = P.getCompleteIntervals(chunkSize);

Error in ==> parallel_function at 564
R = distributed_execution(...


Could anybody tell me what's wrong?
Thanks.
8/5/2008 6:44:51 AM
comp.soft-sys.matlab

YAN <lingyan.sheng@gmail.com> writes:

> The program works well on a local jobmanager. When doing it using a
> jobmanager on a remote cluster computer, the error looks like:
> 
> Error using ==>parallel_function>make_general_channel/channel_general
> at 843  Undefined function handle.

Looks like you may not have all the functions available on the cluster that
you're using on the MATLAB client. I *think* that particular error message may
occur if the M-file containing the parfor loop isn't available on the workers'
path. You could try

pctRunOnAll which myFunctionCallingParfor

to see if the labs can find myFunctionCallingParfor.

We try to keep the MATLAB path in synch between client and workers, but this can
be defeated if the filesystems aren't uniform. For example, if you have an
M-file containing a parfor loop under C:\ (on Windows), then the remote machines
cannot see that. 
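To run that check for several functions at once, something like the following sketch might help. It assumes a matlabpool is already open; the function names in the list are placeholders and should be replaced with the functions your own code calls.

```matlab
% Placeholder names (assumptions): replace with the functions your code uses.
names = {'myFunctionCallingParfor', 'foo'};

for k = 1:numel(names)
    % pctRunOnAll runs the command on the client and on every lab,
    % so each one reports where (or whether) it finds the function.
    pctRunOnAll(['which ' names{k}]);
end
```

Any lab that reports "not found" for a name does not have that file on its path.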

Cheers,

Edric.
eellis (488)
8/5/2008 8:40:17 AM

Thanks, Edric!!!
I ran pctRunOnAll, and the output was "myFunctionCallingParfor not found".
But that does not seem to be due to non-uniform filesystems: the client is on
one of the clusters. I tried some simple test programs, and they
work.
8/5/2008 6:24:03 PM
Yan,

A function that you are invoking inside the loop body is not
available to the workers on the remote cluster. Let's say
your loop is:

parfor i = 1:n
    y(i) = foo(x(i));
end

You would then get this error message if the workers don't
have foo.m on their path. You can verify this by doing:

parfor i = 1:1
    which foo
end

If you simply need to add the directory with foo.m to the
path on the workers, you can use pctRunOnAll (in R2008a onwards)
or dctRunOnAll if you are using R2007b:

pctRunOnAll addpath /my/mfile/directory/

Alternatively, you can specify your M-file directory in the
PathDependencies of your cluster configuration. Both of
those options rely on the workers being able to read the
directory where you store your M-files.

If you don't have your M-files in a directory that is
readable by the workers (e.g. in C:\Work), put the M-files
(or the directory containing them) into the FileDependencies
of your cluster configuration. That ensures the files or
directories will be copied out to the cluster and added to
the path of the workers when you start the matlabpool.
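After changing the path or the dependencies, a quick way to confirm the result is to collect what the workers see rather than reading printed output. The sketch below is only an illustration: the names in the list are placeholders (sin stands in for a function that exists everywhere, myMissingHelper for one that is assumed not to exist), and with so few iterations only one or a few workers actually get checked.

```matlab
% Placeholder names (assumptions): replace with the functions your loop calls.
names = {'sin', 'myMissingHelper'};
n = numel(names);
where = cell(1, n);

parfor k = 1:n
    % which returns an empty string if the function is not on the path
    % of the worker that runs this iteration.
    where{k} = which(names{k});
end

% Names that at least one worker could not resolve.
missing = names(cellfun(@isempty, where));
disp(missing);
```

If missing comes back empty, the workers that ran the loop could find every function in the list.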

Best,

Narfi
8/17/2008 3:03:02 AM