It seems my parallel program has a memory corruption problem. It runs as
a batch job, and the current environment does not allow me to debug it
interactively. So I can only analyze the core file using dbx, but the
memory appears to have been corrupted elsewhere, so the call stack in
the core file does not help.
Any ideas?
Thanks!
Posted by lanchenn (28), 9/7/2005 9:00:36 PM

On Wed, 7 Sep 2005, Patricia wrote:
> It seems my parallel program has a memory corruption problem. It runs as
> a batch job, and the current environment does not allow me to debug it
> interactively. So I can only analyze the core file using dbx, but the
> memory appears to have been corrupted elsewhere, so the call stack in
> the core file does not help.
Check out libumem and/or the watchmalloc library (see my book, Solaris
Systems Programming, for more info, and their respective man pages).
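For a batch job, both libraries can be enabled from the submission script
without relinking. What follows is only a minimal sketch, assuming a
Bourne-shell wrapper: the function names and ./my_parallel_app are made up
for illustration, and the environment variables are the ones described in
umem_debug(3MALLOC) and watchmalloc(3MALLOC), so check the man pages on
your release before relying on them.

```sh
#!/bin/sh
# Sketch only: run a command with libumem's debugging allocator preloaded.
# LD_PRELOAD, UMEM_DEBUG and UMEM_LOGGING are standard Solaris settings;
# the function name with_umem is hypothetical.
with_umem() {
    env LD_PRELOAD=libumem.so.1 \
        UMEM_DEBUG=default \
        UMEM_LOGGING=transaction \
        "$@"
}

# Alternative: watchmalloc uses watchpoints to trap stray heap writes.
# Expect it to be much slower. The function name is again hypothetical.
with_watchmalloc() {
    env LD_PRELOAD=watchmalloc.so.1 \
        MALLOC_DEBUG=WATCH \
        "$@"
}

# In the batch script (./my_parallel_app is a hypothetical program name):
#   with_umem ./my_parallel_app
```

With UMEM_DEBUG set, libumem aborts the process when it detects a corrupted
buffer, typically at the point the buffer is freed, so the resulting core
file tends to point closer to the culprit. That core can then be examined
with mdb's ::umem_status and ::umem_verify dcmds. If the job is started
through a parallel launcher, the variables may need to be passed on to each
process, and how to do that is launcher-specific.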
--
Rich Teer, SCNA, SCSA, OpenSolaris CAB member
President,
Rite Online Inc.
Voice: +1 (250) 979-1638
URL: http://www.rite-group.com/rich
Posted by Rich, 9/7/2005 9:46:03 PM

Use Purify from IBM (formerly from Rational Software, which IBM bought).
It is now sold under the Rational brand at IBM.
This is what it does:
http://bmrc.berkeley.edu/purify/docs/html/installing_and_gettingstarted/2-purify.html
Posted by david, 9/8/2005 1:24:49 AM

Patricia wrote:
> It seems my parallel program has a memory corruption problem. It runs as
> a batch job, and the current environment does not allow me to debug it
> interactively. So I can only analyze the core file using dbx, but the
> memory appears to have been corrupted elsewhere, so the call stack in
> the core file does not help.
>
> Any ideas?
>
> Thanks!
Check out the AppCrash dtrace program at:
http://developers.sun.com/solaris/articles/app_crash/app_crash.html
--
----------------------------------
Randy Jones
E-Mail: randy@jones.tri.net
----------------------------------
Posted by Randy, 9/8/2005 2:13:52 AM

> Use Purify from IBM (used to be Rational Software but IBM bought them).
> It is now under the Rational brand at IBM.
I have to submit the job through a batch file. It runs on multiple
processors, and I can't debug it interactively. Is Purify still useful
for me?
Posted by Patricia, 9/8/2005 10:31:28 AM

Yep, Purify creates a new binary that you then run,
hence "I've purified the binary!"
Posted by david, 9/8/2005 11:24:51 PM

I have used mpatrol to find problems.

<david@smooth1.co.uk> wrote in message
news:1126221891.029644.263870@g47g2000cwa.googlegroups.com...
> Yep, Purify creates a new binary that you then run,
> hence "I've purified the binary!"
Posted by Fran, 9/9/2005 1:40:35 PM

Rich Teer <rich.teer@rite-group.com> wrote:
> On Wed, 7 Sep 2005, Patricia wrote:
>> It seems my parallel program has a memory corruption problem. It runs as
>> a batch job, and the current environment does not allow me to debug it
>> interactively. So I can only analyze the core file using dbx, but the
>> memory appears to have been corrupted elsewhere, so the call stack in
>> the core file does not help.
> Check out libumem and/or the watchmalloc library (see my book, Solaris
> Systems Programming, for more info, and their respective man pages).
Actually your book seems to mention libumem only in a few lines on page 100,
and except for the diagram on page 105, the discussion of watchmalloc
is paraphrased from the Solaris manpages.
(The manpages of course are freely available in a variety of
searchable media ;-)
--
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net
Posted by Thomas, 9/11/2005 12:33:14 AM

How about bcheck?
Posted by Patricia, 9/14/2005 11:43:33 AM