program hangs with probable EOF on socket

I'm getting the following errors:
rm_l_1_29492:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_2_28222:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_3_27390:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_4_27782:  p4_error: net_recv read:  probable EOF on socket: 1

To trace which part is causing the error, I removed all the
SEND/RECEIVE calls, but the program still hangs. Here's part of it:
In master PE:
     col = 0; row = 0;
     for(PE = 1; PE < numtasks; PE++) {
       tileSize = TIFFReadTile(tif, bufRGB, col, row, 0, 0);
       printf("%u\n", tileSize);
       tileSize = TIFFReadTile(tif, bufRGB+TIFFTileSize(tif), col, row, 0, 1);
       printf("%u\n", tileSize);
       tileSize = TIFFReadTile(tif, bufRGB+(TIFFTileSize(tif)*2), col, row, 0, 2);
       printf("%u\n", tileSize);

       col += tileWidth;
       if(col == imageWidth) {
         col = 0;
         row += tileLength;
       }

       /*rc = MPI_Send(bufRGB, tileSize*3, MPI_UNSIGNED_CHAR, PE, 0, MPI_COMM_WORLD);
       if(rc != MPI_SUCCESS)
       exit(-1);*/
     }

In slaves I commented out everything, so what could be the problem?
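
A side note on the loop above, separate from the MPI question: the `col == imageWidth` test only wraps to the next row when the image width is an exact multiple of the tile width. Below is a minimal sketch of a more defensive advance that uses `>=` instead. The names `TileOrigin` and `next_tile_origin` are hypothetical and do not come from the posted code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: pixel coordinates of a tile's upper-left corner. */
typedef struct {
    uint32_t col;
    uint32_t row;
} TileOrigin;

/* Advance to the next tile in row-major order. Wraps to the next tile row
 * when the next column would start at or beyond the image width, so it also
 * works when the width is not a multiple of the tile width. */
static TileOrigin next_tile_origin(TileOrigin cur, uint32_t tile_width,
                                   uint32_t tile_length, uint32_t image_width)
{
    assert(tile_width > 0 && tile_length > 0);
    cur.col += tile_width;
    if (cur.col >= image_width) {
        cur.col = 0;
        cur.row += tile_length;
    }
    return cur;
}
```

The sketch does not check whether `row` has run past the bottom of the image; the caller would still need to stop once all tiles have been handed out.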

Reply Pushkar 10/6/2003 10:56:02 PM

In article <3F81F302.7020306@gri.msstate.edu>,
Pushkar Pradhan  <pushkar@gri.msstate.edu> wrote:

>In slaves I commented out everything, so what could be the problem?

If the slaves exit without calling MPI_Finalize(), what do you expect
the master process to think is going on?

-- greg




Reply lindahl 10/6/2003 11:31:01 PM


No I do call MPI_Finalize() at the end in all the PEs.
Here's some more info on my problem:

Strangely if I do the SENDs in the master PE and no RECV in the slaves 
the program completes, however if I put the RECVs in the slaves the 
program hangs!
That's strange the program shouldn't complete if there are no 
corresponding RECVs to SENDs, right?

If you want I can post my code?

Greg Lindahl wrote:
> In article <3F81F302.7020306@gri.msstate.edu>,
> Pushkar Pradhan  <pushkar@gri.msstate.edu> wrote:
> 
> 
>>In slaves I commented out everything, so what could be the problem?
> 
> 
> If the slaves exit without calling MPI_Finalize(), what do you expect
> the master process to think is going on?
> 
> -- greg
> 
> 
> 
> 

Reply Pushkar 10/6/2003 11:59:15 PM

Pushkar Pradhan <pushkar@gri.msstate.edu> wrote in message news:<3F8201D3.7070603@gri.msstate.edu>...
> No I do call MPI_Finalize() at the end in all the PEs.
> Here's some more info on my problem:
> 
> Strangely if I do the SENDs in the master PE and no RECV in the slaves 
> the program completes, however if I put the RECVs in the slaves the 
> program hangs!
> That's strange the program shouldn't complete if there are no 
> corresponding RECVs to SENDs, right?
> 
> If you want I can post my code?
> 

You can post it, so we will be able to test it by ourselves.

-- 
Xebax
Reply christy 10/7/2003 8:03:15 AM

Pushkar Pradhan wrote:
> 
> No I do call MPI_Finalize() at the end in all the PEs.
> Here's some more info on my problem:
> 
> Strangely if I do the SENDs in the master PE and no RECV in the slaves
> the program completes, however if I put the RECVs in the slaves the
> program hangs!
> That's strange the program shouldn't complete if there are no
> corresponding RECVs to SENDs, right?
> 
> If you want I can post my code?

An MPI_Recv _must_ wait until the data is there, so
the program will hang if there is no matching send.

An MPI_Send can be implemented in several ways.
The MPI standard explicitly says that after
MPI_Send returns you can safely reuse the send
buffer, but you are not allowed to assume that
the data has arrived anywhere.

In your case I guess that the slaves hang in the
RECV, the master terminates and the slaves see that
the socket is closed without any data read so far.

Reply Georg 10/7/2003 9:44:20 AM

Pushkar Pradhan wrote:
> 
> But shouldn't the MPI_Send block until a corresponding MPI_Recv is executed?

No. The return of an MPI_Send means _only_ that
you can reuse the send buffer. Nothing more, nothing
less.

The MPI standard (RTFM!) has MPI_Ssend, which has the semantics
you want. But be warned not to rely on its
proper implementation. In general, a proper implementation
of MPI_Ssend needs an acknowledging control message
from the receiver back to the sender. Not every implementation
is willing to pay this price, especially for short
messages.

> I mean if I'm trying to send to rank=1, it shouldn't complete the call
> until rank=1 executes MPI_Recv?

This assumption is, in general, wrong.

> I've used similar logic in the past without any problem.

It may have worked by chance.

> 
> I'm sending the main file only, since there are a couple of other files.
> Maybe you guys can take a look at the code and tell what's wrong.

I have some questions about the code, but I guess the
communication pattern is OK, i.e. sends and receives are paired.

The questions:
1) Is the calculation of numtiles correct?
2) Why not let the slaves read their part 
   from the file for themselves?
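
On question 1: the posted fragment does not show how numtiles is computed, so the following is only a sketch of the usual ceiling-division count, assuming partial tiles at the right and bottom edges count as whole tiles. `count_tiles` and its parameter names are hypothetical; libtiff also provides `TIFFNumberOfTiles(tif)`, which may be the simpler choice.

```c
#include <assert.h>
#include <stdint.h>

/* Number of tiles needed to cover one image plane. A partial tile at the
 * right or bottom edge counts as one full tile. */
static uint32_t count_tiles(uint32_t image_width, uint32_t image_length,
                            uint32_t tile_width, uint32_t tile_length)
{
    assert(tile_width > 0 && tile_length > 0);
    /* Ceiling division written as quotient plus "is there a remainder",
     * which avoids overflow in image_width + tile_width - 1. */
    uint32_t across = image_width / tile_width
                    + (image_width % tile_width != 0);
    uint32_t down = image_length / tile_length
                  + (image_length % tile_length != 0);
    return across * down;
}
```

If the master hands out one tile per slave, as in the posted loop, it may be worth checking that numtasks - 1 does not exceed this count.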

The pattern as I see it:
  MPI_Init...
  if (0 == rank) {
    send them all a chunk
  } else {
    receive a single chunk
  }
  MPI_Reduce to get the time.
  MPI_Finalize

And that fails? I can't see why.
Reply Georg 10/8/2003 8:51:17 AM
