Hello! I am learning MPI, and my first project is to parallelize a Monte
Carlo integration I have already written in Fortran. I am having
trouble computing my standard deviation. I need the sum of my calculations
(to compute an average) AND I need to retain each individual calculation
in some manner. Initially, because I couldn't debug my program when I
was using both MPI_RECV and MPI_REDUCE, I used only an MPI_RECV statement
that stored each received value in an array. Once the master node had
received everything, a do loop summed the array.
That method is too restrictive, because I need to avoid
defining an array that may eventually be too small.
My output file reads:
p1_10028: p4_error: net_recv read: probable EOF on socket: 1
p2_9496: p4_error: net_recv read: probable EOF on socket: 1
p3_14385: p4_error: net_recv read: probable EOF on socket: 1
p5_9529: p4_error: net_recv read: probable EOF on socket: 1
p4_10061: p4_error: net_recv read: probable EOF on socket: 1
p6_14418: p4_error: net_recv read: probable EOF on socket: 1
p7_10094: p4_error: net_recv read: probable EOF on socket: 1
Killing MPICH slave process, PID 11233
Killing MPICH slave process, PID 11234
So, my code follows in a summarized form. Any suggestions would be
greatly appreciated.
Program monteCarloIntegration
  -Use (lots of things)
  Implicit None
  Include 'mpif.h'
  -Define types
  Call MPI_INIT(ierr)
  Call MPI_COMM_RANK(MPI_COMM_WORLD,MyID,ierr)
  Call MPI_COMM_SIZE(MPI_COMM_WORLD,NumProcs,ierr)
  -If master node, then read info
  -Broadcast everything just read by master to all other processes using MPI_BCAST
1 -Do calculations
2 Call MPI_SEND(averagePoints,1,MPI_DOUBLE_PRECISION,0,k,MPI_COMM_WORLD)
3 If (MyID==0) Then
    Call MPI_REDUCE(averagePoints,sumBatch,1,MPI_DOUBLE_PRECISION,MPI_SUM,0,MPI_COMM_WORLD,ierr)
4   -Do calculations using sumBatch
5   Do i=1,NumProcs
      Call MPI_RECV(batch,1,MPI_DOUBLE_PRECISION,MPI_ANY_SOURCE,MPI_ANY_TAG,MPI_COMM_WORLD,stat,ierr)
  End If
  Call MPI_FINALIZE(ierr)
End Program
Thanks for reading! Cat

Posted by cat3686 on 7/24/2006 2:31:22 PM

Hi Cat,
I'm not sure if I understood everything you have written, but here are
a few pointers/questions that may help:
katalyst wrote:
> -Broadcast everything just read by master to all other processes using
> MPI_BCAST
All processes in the communicator (MPI_COMM_WORLD in your case) must
call MPI_BCAST.
> 2 Call MPI_SEND(averagePoints,1,MPI_DOUBLE_PRECISION,0,k,MPI_COMM_WORLD)
Each send must have a matching receive on the receiving node. Note that
rank 0 also executes this MPI_SEND, so it sends to itself; a blocking
send to yourself can hang unless the matching receive has already been
posted.
>
> 3 If (MyID==0) Then
> Call
> MPI_REDUCE(averagePoints,sumBatch,1,MPI_DOUBLE_PRECISION,MPI_SUM,0,MPI_COMM_WORLD,ierr)
This call must be made by all processes in MPI_COMM_WORLD, not only by
rank 0: MPI_REDUCE is a collective operation.
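
A minimal sketch of that point, not the original program: the reduce sits outside any rank test, so every process takes part, and only the use of the result is restricted to rank 0. The placeholder value assigned to averagePoints and the program name are assumptions made for illustration, and the point-to-point calls are left out entirely. Note also that the Fortran binding of MPI_SEND takes a trailing ierr argument, which the quoted call lacks.

```fortran
program reduce_on_all_ranks
  implicit none
  include 'mpif.h'
  integer :: ierr, myID, numProcs
  double precision :: averagePoints, sumBatch

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myID, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, numProcs, ierr)

  ! Placeholder for the real Monte Carlo batch result (hypothetical value).
  averagePoints = dble(myID + 1)

  ! Collective call: every rank, including rank 0, executes it.
  call MPI_REDUCE(averagePoints, sumBatch, 1, MPI_DOUBLE_PRECISION, &
                  MPI_SUM, 0, MPI_COMM_WORLD, ierr)

  ! sumBatch holds the result only on the root, so only the root uses it.
  if (myID == 0) then
     print *, 'sum  =', sumBatch
     print *, 'mean =', sumBatch / dble(numProcs)
  end if

  call MPI_FINALIZE(ierr)
end program reduce_on_all_ranks
```

Each rank's own averagePoints is left untouched by the reduce, so the local value is still available afterwards.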
>
> 4-Do calculations using sumBatch
>
> 5 Do i=1,NumProcs
> Call MPI_RECV(batch,1,MPI_DOUBLE_PRECISION,MPI_ANY_SOURCE,MPI_ANY_TAG,MPI_COMM_WORLD,stat,ierr)
Are these the receives that go with the sends at the beginning? If so,
then I don't think they will ever be reached, as the root node will be
waiting at the MPI_REDUCE before it. Also, with each call the value at
"batch" will be overwritten.
However, if I understand correctly, you want the root node to obtain
each individual value as well as their sum. In that case I think
using MPI_GATHER and then summing and averaging on the root node will
be simpler. (If you want every process to have a copy of all the
processes' data, use MPI_ALLGATHER instead.)
In general, if each process has a double precision variable "work", and
you want all copies of it, plus their average, on process 0, then
something like the following, executed on all nodes, should work (not tested!):
! receiveArray is DOUBLE PRECISION, ALLOCATABLE; one slot per process
ALLOCATE(receiveArray(numProcs))
work = resultOfSomeWork()
rootNode = 0
sendCount = 1
receiveCount = 1      ! number of items received from EACH process
CALL MPI_GATHER(work, sendCount, MPI_DOUBLE_PRECISION, receiveArray, &
                receiveCount, MPI_DOUBLE_PRECISION, rootNode, MPI_COMM_WORLD, err)
IF (myId == rootNode) THEN
   ! Work out the average of the entries that are in receiveArray
   average = SUM(receiveArray) / numProcs
END IF
Note that receiveArray is ignored on all but the root node.
Hope some of this helps,
Nym.

Posted by Nym on 7/24/2006 3:08:23 PM

In article <1153753702.977370.21700@i42g2000cwa.googlegroups.com>,
Nym <neverwillreply@gmail.com> wrote:
[ snip ]
>> 3 If (MyID==0) Then
>> Call MPI_REDUCE(averagePoints,sumBatch,1,MPI_DOUBLE_PRECISION,MPI_SUM,0,MPI_COMM_WORLD,ierr)
>This call must be made by all processes in MPI_COMM_WORLD
Agreed that not doing this is the likeliest source of the OP's
problems.
[ snip ]
>However, if I understand what you are trying to do correctly, then for
>the root node to obtain each individual value, and then to obtain a
>sum, then I think using an MPI_GATHER and then just averaging on the
>root node will be a simpler way. (If you want all the processes to have
>a copy of all the processes data, then use MPI_ALLGATHER.)
But if all you really want is an average, isn't it simpler to just
use MPI_Reduce to get a sum over all processes and then divide by
number of processes (obtainable with MPI_Comm_size)?
The OP's concerns seemed to be about wiping out each process's local
copy of its own data, which wouldn't happen with MPI_Reduce, and with
needing an array to hold values from all processes, which also
wouldn't be needed ....
Maybe I'm misunderstanding something about the OP's concerns and
your solution, though ....
[ snip ]
--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.

Posted by blmblm on 7/24/2006 4:59:19 PM

In article <4ikcj7F3u772U1@individual.net>,
blmblm@myrealbox.com <blmblm@myrealbox.com> wrote:
>In article <1153753702.977370.21700@i42g2000cwa.googlegroups.com>,
>Nym <neverwillreply@gmail.com> wrote:
[ snip ]
>>However, if I understand what you are trying to do correctly, then for
>>the root node to obtain each individual value, and then to obtain a
>>sum, then I think using an MPI_GATHER and then just averaging on the
>>root node will be a simpler way. (If you want all the processes to have
>>a copy of all the processes data, then use MPI_ALLGATHER.)
>
>But if all you really want is an average, isn't it simpler to just
>use MPI_Reduce to get a sum over all processes and then divide by
>number of processes (obtainable with MPI_Comm_size)?
>
>The OP's concerns seemed to be about wiping out each process's local
>copy of its own data, which wouldn't happen with MPI_Reduce, and with
>needing an array to hold values from all processes, which also
>wouldn't be needed ....
>
>Maybe I'm misunderstanding something about the OP's concerns and
>your solution, though ....
I *am* misunderstanding something -- namely that the OP wants to
compute standard deviation too, not just an average. Sorry about
that. This makes the solution using MPI_GATHER more sensible.
Another way to solve the OP's problem (possibly unnecessary
but FWIW):

Use MPI_ALLREDUCE to compute the sum. The "ALL" variant means all
processes have this value.

In each process, compute the average [1] and then use it to compute
this process's contribution to the sum for the standard deviation.

Use MPI_REDUCE [2] to combine the different processes' contributions
to the sum.

In process 0 [2], finish computing standard deviation.

[1] Duplicated work, true, but if processes are running on separate
processors it shouldn't have any effect on total runtime.

[2] Or use MPI_ALLREDUCE and do the "finish computing" in all processes,
if it's needed by all.
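
The steps above could be sketched roughly as follows. This is an untested illustration, assuming one double precision value x per process (the placeholder assignment and all variable names are hypothetical) and the divide-by-N form of the standard deviation.

```fortran
program two_pass_stddev
  implicit none
  include 'mpif.h'
  integer :: ierr, myID, numProcs
  double precision :: x, total, mean, localSq, sumSq, stdDev

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myID, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, numProcs, ierr)

  ! Placeholder for this process's result (hypothetical value).
  x = dble(myID + 1)

  ! Pass 1: every process receives the global sum.
  call MPI_ALLREDUCE(x, total, 1, MPI_DOUBLE_PRECISION, MPI_SUM, &
                     MPI_COMM_WORLD, ierr)
  mean = total / dble(numProcs)

  ! This process's contribution to the sum of squared deviations.
  localSq = (x - mean)**2

  ! Pass 2: combine the contributions on process 0.
  call MPI_REDUCE(localSq, sumSq, 1, MPI_DOUBLE_PRECISION, MPI_SUM, &
                  0, MPI_COMM_WORLD, ierr)

  ! Only process 0 holds sumSq, so it finishes the calculation.
  if (myID == 0) then
     stdDev = sqrt(sumSq / dble(numProcs))
     print *, 'mean =', mean, ' std dev =', stdDev
  end if

  call MPI_FINALIZE(ierr)
end program two_pass_stddev
```

No array sized by the number of processes is needed, at the cost of two collective calls.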
--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.

Posted by blmblm on 7/25/2006 7:25:26 PM

blmblm@myrealbox.com wrote:
> I *am* misunderstanding something -- namely that the OP wants to
> compute standard deviation too, not just an average. Sorry about
> that. This makes the solution using MPI_GATHER more sensible.
It seems to me that MPI_REDUCE is still the best option. Recall that
the standard deviation is
std_dev = sqrt(sum((x_i - x_average)^2)/N)
The square in the middle can be multiplied out, to obtain:
(x_i - x_avg)^2 = x_i^2 - 2 * x_i * x_avg + x_avg^2
And, distributing the summation, we get
std_dev = sqrt(( sum(x_i^2) - 2 * x_avg * sum(x_i) + N * x_avg^2 )/N)
Finally, noting that x_avg = sum(x_i)/N, this can be simplified to
std_dev = sqrt( sum(x_i^2)/N - (sum(x_i)/N)^2 )
Thus, it is relatively trivial to calculate the standard deviation from
the sum of the x_i values and the sum of their squares -- neither of
which require knowledge of the overall average before the contributions
from each process are calculated.
Incidentally, correlations between two variables can be calculated in a
very similar manner.
- Brooks
--
The "bmoses-nospam" address is valid; no unmunging needed.

Posted by Brooks on 7/27/2006 2:00:27 AM

Hi,
Brooks Moses wrote:
> Thus, it is relatively trivial to calculate the standard deviation from
> the sum of the x_i values and the sum of their squares -- neither of
> which require knowledge of the overall average before the contributions
> from each process are calculated.
This would then mean that 2 MPI_REDUCE calls would be needed, compared
to the one for the MPI_GATHER method. Would 2 MPI_REDUCE calls be
faster than one MPI_GATHER?
Nym.

Posted by Nym on 7/27/2006 6:59:45 AM

> This would then mean that 2 MPI_REDUCE calls would be needed, compared
> to the one for the MPI_GATHER method. Would 2 MPI_REDUCE calls be
> faster than one MPI_GATHER?
One call to MPI_REDUCE with MPI_SUM on an array with 2 values
(double my_contribution[2] = { x_me, x_me * x_me }) should suffice to
sum the values and their squares at once.
MPI_REDUCE can be executed in parallel whereas an MPI_GATHER is
(1) serialized at the root node and (2) transfers a lot more data. So,
with an increasing number of processors, I think MPI_REDUCE is the most
efficient way.
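
In Fortran, the single-reduce idea might look something like the sketch below. It is untested and makes several assumptions: one double precision value x per process, hypothetical variable names and placeholder data, and the divide-by-N form std_dev = sqrt(sum(x_i^2)/N - (sum(x_i)/N)^2).

```fortran
program one_reduce_stddev
  implicit none
  include 'mpif.h'
  integer :: ierr, myID, numProcs
  double precision :: x, mean, variance
  double precision :: contribution(2), totals(2)

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myID, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, numProcs, ierr)

  ! Placeholder for this process's result (hypothetical value).
  x = dble(myID + 1)

  ! Pack the value and its square so one reduce sums both.
  contribution(1) = x
  contribution(2) = x * x

  ! MPI_SUM is applied element by element across the 2-element arrays.
  call MPI_REDUCE(contribution, totals, 2, MPI_DOUBLE_PRECISION, &
                  MPI_SUM, 0, MPI_COMM_WORLD, ierr)

  ! totals is defined only on the root.
  if (myID == 0) then
     mean = totals(1) / dble(numProcs)
     variance = totals(2) / dble(numProcs) - mean**2
     ! Rounding can push a tiny variance below zero, so clamp before sqrt.
     print *, 'mean =', mean, ' std dev =', sqrt(max(variance, 0.0d0))
  end if

  call MPI_FINALIZE(ierr)
end program one_reduce_stddev
```

One caveat with this form: when the spread is very small compared with the mean, subtracting two nearly equal numbers loses precision, which is why the sketch clamps the variance at zero.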
Michael

Posted by Michael on 7/27/2006 8:21:08 AM

Hi,
Michael Hofmann wrote:
> One call to MPI_REDUCE with MPI_SUM on an array with 2 values (double my_contribution[2] = { x_me, x_me * x_me }) should suffice to sum the values and their squares at once.
I never considered doing it this way: I forgot you can use MPI_REDUCE
with an array of values. Yes, I see now how this would be better than
an MPI_GATHER.
Thank you,
Nym.

Posted by Nym on 7/27/2006 9:16:37 AM