I'm getting the following errors:
rm_l_1_29492: p4_error: net_recv read: probable EOF on socket: 1
rm_l_2_28222: p4_error: net_recv read: probable EOF on socket: 1
rm_l_3_27390: p4_error: net_recv read: probable EOF on socket: 1
rm_l_4_27782: p4_error: net_recv read: probable EOF on socket: 1
To trace which part is causing the error, I removed all the
SEND/RECEIVE calls, but the program still hangs. Here's part of it:
In master PE:
col = 0; row = 0;
for(PE = 1; PE < numtasks; PE++) {
    tileSize = TIFFReadTile(tif, bufRGB, col, row, 0, 0);
    printf("%u\n", tileSize);
    tileSize = TIFFReadTile(tif, bufRGB+TIFFTileSize(tif), col, row, 0, 1);
    printf("%u\n", tileSize);
    tileSize = TIFFReadTile(tif, bufRGB+(TIFFTileSize(tif)*2), col, row, 0, 2);
    printf("%u\n", tileSize);
    col += tileWidth;
    if(col == imageWidth) {
        col = 0;
        row += tileLength;
    }
    /*rc = MPI_Send(bufRGB, tileSize*3, MPI_UNSIGNED_CHAR, PE, 0, MPI_COMM_WORLD);
    if(rc != MPI_SUCCESS)
        exit(-1);*/
}
In slaves I commented out everything, so what could be the problem?
Posted by Pushkar, 10/6/2003 10:56:02 PM

In article <3F81F302.7020306@gri.msstate.edu>,
Pushkar Pradhan <pushkar@gri.msstate.edu> wrote:
>In slaves I commented out everything, so what could be the problem?
If the slaves exit without calling MPI_Finalize(), what do you expect
the master process to think is going on?
-- greg
Posted by lindahl, 10/6/2003 11:31:01 PM

No, I do call MPI_Finalize() at the end in all the PEs.
Here's some more info on my problem:
Strangely, if I do the SENDs in the master PE and no RECV in the slaves,
the program completes; however, if I put the RECVs in the slaves, the
program hangs!
That's strange. The program shouldn't complete if there are no
RECVs corresponding to the SENDs, right?
If you want, I can post my code.
Greg Lindahl wrote:
> In article <3F81F302.7020306@gri.msstate.edu>,
> Pushkar Pradhan <pushkar@gri.msstate.edu> wrote:
>
>
>>In slaves I commented out everything, so what could be the problem?
>
>
> If the slaves exit without calling MPI_Finalize(), what do you expect
> the master process to think is going on?
>
> -- greg
Posted by Pushkar, 10/6/2003 11:59:15 PM

Pushkar Pradhan <pushkar@gri.msstate.edu> wrote in message news:<3F8201D3.7070603@gri.msstate.edu>...
> No I do call MPI_Finalize() at the end in all the PEs.
> Here's some more info on my problem:
>
> Strangely if I do the SENDs in the master PE and no RECV in the slaves
> the program completes, however if I put the RECVs in the slaves the
> program hangs!
> That's strange the program shouldn't complete if there are no
> corresponding RECVs to SENDs, right?
>
> If you want I can post my code?
>
Please post it, so we can test it ourselves.
--
Xebax
Posted by christy, 10/7/2003 8:03:15 AM

Pushkar Pradhan wrote:
>
> No I do call MPI_Finalize() at the end in all the PEs.
> Here's some more info on my problem:
>
> Strangely if I do the SENDs in the master PE and no RECV in the slaves
> the program completes, however if I put the RECVs in the slaves the
> program hangs!
> That's strange the program shouldn't complete if there are no
> corresponding RECVs to SENDs, right?
>
> If you want I can post my code?
An MPI_Recv _must_ wait until the data is there, so
the program will hang if there is no matching send.
An MPI_Send can be implemented in several ways.
The MPI standard explicitly says that after
MPI_Send returns you can safely reuse the send
buffer, but you are not allowed to assume that
the data has arrived anywhere.
In your case I guess that the slaves hang in the
RECV, the master terminates, and the slaves see that
the socket is closed without any data read so far.
Posted by Georg, 10/7/2003 9:44:20 AM

Pushkar Pradhan wrote:
>
> But shouldn't the MPI_Send block until a corresponding MPI_Recv is executed?
No. The return of an MPI_Send means _only_ that
you can reuse the send buffer. Nothing more, nothing
less.
The MPI standard (RTFM!) has MPI_Ssend, which has the semantics
you want. But be warned: do not rely on it being
properly implemented. In general, a proper implementation
of MPI_Ssend needs an acknowledging control message
from the receiver back to the sender. Not every implementation
is willing to pay this price, especially for short
messages.
> I mean if I'm trying to send to rank=1, it shouldn't complete the call
> until rank=1 executes MPI_Recv?
This assumption is, in general, wrong.
> I've used similar logic in the past without any problem.
It may have worked by chance.
>
> I'm sending the main file only, since there are a couple of other files.
> Maybe you guys can take a look at the code and tell what's wrong.
I have some questions about the code, but I guess the
communication pattern is OK, i.e. sends and receives are paired.
The questions:
1) Is the calculation of numtiles correct?
2) Why not let the slaves read their part
from the file for themselves?
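Both questions come down to tile arithmetic, so here is a minimal sketch of what I mean. It assumes tiles are numbered left to right, then top to bottom, and that partial tiles at the right and bottom edges count as tiles. The names count_tiles and tile_origin are hypothetical, not taken from the posted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of tiles needed to cover the image.
   Rounds up so that partial tiles at the right/bottom edges are counted.
   Returns 0 if either tile dimension is 0. */
uint32_t count_tiles(uint32_t image_width, uint32_t image_length,
                     uint32_t tile_width, uint32_t tile_length)
{
    if (tile_width == 0 || tile_length == 0)
        return 0;
    uint32_t across = (image_width + tile_width - 1) / tile_width;
    uint32_t down = (image_length + tile_length - 1) / tile_length;
    return across * down;
}

/* Hypothetical helper: pixel origin (col, row) of tile number `index`.
   Returns 0 on success, -1 if index is past the last tile. */
int tile_origin(uint32_t index, uint32_t image_width, uint32_t image_length,
                uint32_t tile_width, uint32_t tile_length,
                uint32_t *col, uint32_t *row)
{
    assert(col != NULL && row != NULL);
    if (index >= count_tiles(image_width, image_length, tile_width, tile_length))
        return -1;
    /* Safe to divide: a nonzero tile count implies across > 0. */
    uint32_t across = (image_width + tile_width - 1) / tile_width;
    *col = (index % across) * tile_width;
    *row = (index / across) * tile_length;
    return 0;
}
```

A slave with rank r could then call tile_origin(r - 1, ...) and pass the resulting col and row to TIFFReadTile on its own file handle, assuming every node can open the file. Note that this does not rely on a test like col == imageWidth, which never becomes true when the image width is not a multiple of the tile width.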
The pattern as I see it:
    MPI_Init...
    if (0 == rank) {
        send them all a chunk
    } else {
        receive a single chunk
    }
    MPI_Reduce to get the time.
    MPI_Finalize
And that fails? I can't see why.
Posted by Georg, 10/8/2003 8:51:17 AM