We have a problem with mpfr failing 20 tests in 32-bit mode on a Sun
T5240, which has a T2+ processor. (The machine was donated by Sun to the
Sage project.)
Without access to the Solaris 10 source code, it's impossible to know
precisely how the library function memset() is coded. It does not help that I
don't have a clue about SPARC assembly code, so I can't follow the
technical arguments.
But a gcc developer believes this is a bug in Solaris rather than in
gcc.
Someone looking at the OpenSolaris code
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/sparc_hwcap1/common/gen/memset.s#88
believes the memset() function in OpenSolaris for sun4v is ok, but it
is unknown to us whether the code in Solaris 10 is the same or not.
If you have some knowledge of SPARC assembly code, and hopefully access to
the Solaris 10 sources, you might want to look at:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757
Unfortunately, the problem is not as clearly stated as I'd like, but
comments 11 and 17 are perhaps the most relevant to date.
--
I respectfully request that this message is not archived by companies as
unscrupulous as 'Experts Exchange' . In case you are unaware,
'Experts Exchange' take questions posted on the web and try to find
idiots stupid enough to pay for the answers, which were posted freely
by others. They are leeches.
Posted by Dave, 7/17/2009 9:30:55 AM

On Jul 17, 2:30 am, Dave <f...@coo.com> wrote:
> Without access to the Solaris 10 source code, it's impossible to know
> precisely how the library function memset() is coded.
Actually it should be possible. You could try something like:
dbx ./someprogramusingmemset
dis -a memset
At that point it should print the implementation.
(I don't currently have access to a Solaris machine to check whether the
syntax is exactly this.)
> Someone looking at the OpenSolaris code
>
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/lib...
>
> believes the memset() function in OpenSolaris for sun4v is ok, but it
> unknown to us whether the code in Solaris 10 is the same or not.
The sun4v implementation in opensolaris has /sun4v/ in its path. The
URL above is for the implementation that I think is strange.
Posted by Marc, 7/17/2009 6:32:11 PM

On 2009-07-17 19:32:11 +0100, Marc <marc.glisse@gmail.com> said:
> On Jul 17, 2:30 am, Dave <f...@coo.com> wrote:
>> Without access to the Solaris 10 source code, it's impossible to know
>> precisely how the library function memset() is coded.
>
> Actually it should be. You could try something like:
> dbx ./someprogramusingmemset
> dis -a memset
> here it should print the implementation
It prints "function address expected with -a" but "dis memset/100"
(etc) prints enough to see what's going on.
> (I don't currently have access to a solaris to check if the syntax is
> exactly this one)
>
>> Someone looking at the OpenSolaris code
>>
>> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/lib...
>>
>> believes the memset() function in OpenSolaris for sun4v is ok, but it
>> unknown to us whether the code in Solaris 10 is the same or not.
>
> The sun4v implementation in opensolaris has /sun4v/ in its path. The
> URL above is for the implementation that I think is strange.
The implementation on a T2000 (10 update 5) is a little different from
the one at opensolaris.org in the case where n=0:
(32-bit version)
0x000311b4: mov %o0, %o5
0x000311b8: cmp %o2, 7
0x000311bc: blu 0x00031204 ! 0x31204
[...]
0x00031204: deccc %o2
0x00031208: inc %o5
0x0003120c: bgeu,a 0x00031204 ! 0x31204
0x00031210: stb %o1, [%o5 - 1]
0x00031214: retl
(64-bit version)
0x000000000003a780: mov %o0, %o5
0x000000000003a784: cmp %o2, 7
0x000000000003a788: bcs,pn %xcc,0x000000000003a828 ! 0x3a828
[...]
0x000000000003a828: deccc %o2
0x000000000003a82c: inc %o5
0x000000000003a830: bcc,a,pt %xcc,0x000000000003a828 ! 0x3a828
0x000000000003a834: stb %o1, [%o5 - 1]
0x000000000003a838: retl
ldd confirms this is coming from
/platform/SUNW,Sun-Fire-T200/lib/libc_psr.so.1 and
/platform/SUNW,Sun-Fire-T200/lib/sparcv9/libc_psr.so.1 respectively.
Code from
<http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc_psr/sun4v/common/memset.s>:
77 mov %o0, %o5 ! copy sp1 before using it
78 cmp %o2, 7 ! if small counts, just write bytes
79 blu,pn %ncc, .wrchar
[...]
232 .wrchar:
233 ! Set the remaining bytes, if any
234 cmp %o2, 0
235 be %ncc, .exit
236 nop
237
238 7:
239 deccc %o2
240 stb %o1, [%o5]
241 bgu,pt %ncc, 7b
242 inc %o5
243
244 .exit:
245 retl ! %o0 was preserved
246 nop
Note lines 232 to 236 for the special handling of n=0 are absent from
Solaris 10 u5.
--
Chris
Posted by Chris, 7/18/2009 7:02:58 AM

Chris Ridd wrote:
> It prints "function address expected with -a" but "dis memset/100"
> (etc) prints enough to see what's going on.
Thank you for posting the result.
> The implementation on a T2000 (10 update 5) is a little different from
> the one at opensolaris.org in the case where n=0:
The implementation you are showing looks a lot like:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/sparc/gen/memset.s
(and sparcv9 for the 64 bit version)
> (32-bit version)
> 0x000311b4: mov %o0, %o5
> 0x000311b8: cmp %o2, 7
> 0x000311bc: blu 0x00031204 ! 0x31204
> [...]
(for sparc you usually don't want to cut the instruction just after a
branch, as it gets executed in many cases)
> 0x00031204: deccc %o2
> 0x00031208: inc %o5
> 0x0003120c: bgeu,a 0x00031204 ! 0x31204
> 0x00031210: stb %o1, [%o5 - 1]
> 0x00031214: retl
>
> (64-bit version)
> 0x000000000003a780: mov %o0, %o5
> 0x000000000003a784: cmp %o2, 7
> 0x000000000003a788: bcs,pn %xcc,0x000000000003a828 ! 0x3a828
> [...]
> 0x000000000003a828: deccc %o2
> 0x000000000003a82c: inc %o5
> 0x000000000003a830: bcc,a,pt %xcc,0x000000000003a828 ! 0x3a828
> 0x000000000003a834: stb %o1, [%o5 - 1]
> 0x000000000003a838: retl
>
> ldd confirms this is coming from
> /platform/SUNW,Sun-Fire-T200/lib/libc_psr.so.1 and
> /platform/SUNW,Sun-Fire-T200/lib/sparcv9/libc_psr.so.1 respectively.
>
> Code from
> <http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc_psr/sun4v/common/memset.s>:
>
>
>
> 77 mov %o0, %o5 ! copy sp1 before using it
> 78 cmp %o2, 7 ! if small counts, just write bytes
> 79 blu,pn %ncc, .wrchar
> [...]
> 232 .wrchar:
> 233 ! Set the remaining bytes, if any
> 234 cmp %o2, 0
> 235 be %ncc, .exit
> 236 nop
> 237
> 238 7:
> 239 deccc %o2
> 240 stb %o1, [%o5]
> 241 bgu,pt %ncc, 7b
> 242 inc %o5
> 243
> 244 .exit:
> 245 retl ! %o0 was preserved
> 246 nop
>
>
> Note lines 232 to 236 for the special handling of n=0 are absent from
> Solaris 10 u5.
Well, the 2 versions are something like:
while(--n>=0) stuff;
and
if(n==0) return;
do { stuff; } while(--n>0)
They both look fine to me. I guess finding the real problem will still
take a bit of work...
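For readers less used to SPARC, the two loop shapes above can be written out in C. This is only a minimal sketch under my own reading of the disassembly, not the Solaris source: the function names are invented, and because size_t is unsigned the first form is written as `n-- != 0` (stop when the decrement borrows) rather than the literal `--n >= 0`.

```c
#include <stddef.h>

/* Form 1 (shape of the Solaris 10 u5 disassembly): test the count
 * before every store, so n == 0 falls straight through. */
void fill_test_first(unsigned char *p, unsigned char c, size_t n)
{
    while (n-- != 0)   /* stop when the decrement would borrow */
        *p++ = c;      /* store happens only while the loop continues */
}

/* Form 2 (shape of the OpenSolaris sun4v source): explicit n == 0
 * check up front, then a do/while that always stores at least once. */
void fill_check_then_loop(unsigned char *p, unsigned char c, size_t n)
{
    if (n == 0)
        return;
    do {
        *p++ = c;
    } while (--n > 0);
}
```

Called with the same arguments, both functions should leave the buffer in the same state, including when n is 0, which is the point being made above.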
Posted by Marc, 7/18/2009 9:01:33 AM

On 2009-07-18 10:01:33 +0100, Marc <marc.glisse@gmail.com> said:
> Chris Ridd wrote:
>
>> It prints "function address expected with -a" but "dis memset/100"
>> (etc) prints enough to see what's going on.
>
> Thank you for posting the result.
>
>> The implementation on a T2000 (10 update 5) is a little different from
>> the one at opensolaris.org in the case where n=0:
>
> The implementation you are showing looks a lot like:
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/sparc/gen/memset.s
> (and sparcv9 for the 64 bit version)
>
>> (32-bit version)
>> 0x000311b4: mov %o0, %o5
>> 0x000311b8: cmp %o2, 7
>> 0x000311bc: blu 0x00031204 ! 0x31204
>> [...]
>
> (for sparc you usually don't want to cut the instruction just after a
> branch, as it gets executed in many cases)
Noted. The next instruction was:
0x000311c0: btst 3, %o5
>
>> 0x00031204: deccc %o2
>> 0x00031208: inc %o5
>> 0x0003120c: bgeu,a 0x00031204 ! 0x31204
>> 0x00031210: stb %o1, [%o5 - 1]
>> 0x00031214: retl
>>
>> (64-bit version)
>> 0x000000000003a780: mov %o0, %o5
>> 0x000000000003a784: cmp %o2, 7
>> 0x000000000003a788: bcs,pn %xcc,0x000000000003a828 ! 0x3a828
The next instruction was:
0x000000000003a78c: and %o1, 255, %o1
>> [...]
>> 0x000000000003a828: deccc %o2
>> 0x000000000003a82c: inc %o5
>> 0x000000000003a830: bcc,a,pt %xcc,0x000000000003a828 ! 0x3a828
>> 0x000000000003a834: stb %o1, [%o5 - 1]
>> 0x000000000003a838: retl
>>
>> ldd confirms this is coming from
>> /platform/SUNW,Sun-Fire-T200/lib/libc_psr.so.1 and
>> /platform/SUNW,Sun-Fire-T200/lib/sparcv9/libc_psr.so.1 respectively.
>>
>> Code from
>> <http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc_psr/sun4v/common/memset.s>:
>>
If you look in the source history for that file it references a couple of
bugs suggesting Niagara 2-specific optimizations are included. My
T2000's only a Niagara 1 - someone with a Niagara 2 ought to repeat the
disassembly. Dave's using a Niagara 2, I think.
> Well, the 2 versions are something like:
>
> while(--n>=0) stuff;
Are you sure? It looks more like it'll always do the stb, but maybe
deccc does something I'm not seeing. (Any good references to sparc
assembly? The wikibook at <http://en.wikibooks.org/wiki/SPARC_Assembly>
is a bit "sparse"...)
--
Chris
Posted by Chris, 7/18/2009 9:21:51 AM

Marc wrote:
> On Jul 17, 2:30 am, Dave <f...@coo.com> wrote:
>> Without access to the Solaris 10 source code, it's impossible to know
>> precisely how the library function memset() is coded.
>
> Actually it should be. You could try something like:
> dbx ./someprogramusingmemset
> dis -a memset
> here it should print the implementation
> (I don't currently have access to a solaris to check if the syntax is
> exactly this one)
>
>> Someone looking at the OpenSolaris code
>>
>> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/lib...
>>
>> believes the memset() function in OpenSolaris for sun4v is ok, but it
>> unknown to us whether the code in Solaris 10 is the same or not.
>
> The sun4v implementation in opensolaris has /sun4v/ in its path. The
> URL above is for the implementation that I think is strange.
I'd be interested if any others could compile this bit of code (check2.c
below), which is a modification of a program that Marc wrote and
published on the gcc bug database, after this was initially opened as
a bug against gcc 4.4.0.
It dumps core on the Sun T5240, but runs fine on my Blade 2000. Sun's
compiler will not compile it though.
kirkby@t2:[~] $ gcc -m32 check2.c
kirkby@t2:[~] $ ldd a.out
libc.so.1 => /lib/libc.so.1
libm.so.2 => /lib/libm.so.2
/platform/SUNW,T5240/lib/libc_psr.so.1
kirkby@t2:[~] $ ./a.out
n=0
n=1
Abort (core dumped)
kirkby@t2:[~] $ cat check2.c
#include <stdio.h>
typedef __SIZE_TYPE__ size_t;
extern void *memset (void *, const void *, size_t);
extern void abort (void);
volatile size_t i = 0x80000000U, j = 0x80000000U;
char buf[16];
int main (void)
{
int n;
for (n=0 ; n <=10;++n) {
printf("n=%d\n",n);
if (sizeof (size_t) != 4)
return 0;
buf[n] = 6;
memset (buf+n, 0, i + j);
if (buf[n] != 6)
abort ();
}
return 0;
}
Posted by Dave, 7/18/2009 3:50:57 PM

On 2009-07-18 16:50:57 +0100, Dave <foo@coo.com> said:
> It dumps core on the Sun T5240, but runs fine on my Blade 2000. Sun's
> compiler will not compile it though.
It depends on some GNUisms. Try replacing:
> typedef __SIZE_TYPE__ size_t;
> extern void *memset (void *, const void *, size_t);
> extern void abort (void);
with:
#include <string.h>
#include <stdlib.h>
--
Chris
Posted by Chris, 7/18/2009 4:35:43 PM

Chris Ridd wrote:
> On 2009-07-18 16:50:57 +0100, Dave <foo@coo.com> said:
>
>> It dumps core on the Sun T5240, but runs fine on my Blade 2000. Sun's
>> compiler will not compile it though.
>
> It depends on some GNUisms. Try replacing:
>
>> typedef __SIZE_TYPE__ size_t;
>> extern void *memset (void *, const void *, size_t);
>> extern void abort (void);
>
> with:
>
> #include <string.h>
> #include <stdlib.h>
>
Thanks, those changes allow it to build, but it still dumps core.
kirkby@t2:[~] $ cat check4.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
volatile size_t i = 0x80000000U, j = 0x80000000U;
char buf[16];
int main (void)
{
int n;
for (n=0 ; n <=10;++n) {
printf("n=%d\n",n);
if (sizeof (size_t) != 4)
return 0;
buf[n] = 6;
memset (buf+n, 0, i + j);
if (buf[n] != 6)
abort ();
}
return 0;
}
kirkby@t2:[~] $ /opt/SUNWspro/bin/cc check4.c
kirkby@t2:[~] $ ./a.out
n=0
n=1
Abort (core dumped)
I can't say I fully understand this, though. I'd be interested in
what you think, having read
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757#c17
about whether this does show a problem with the memset function.
Binaries built with both Sun's and GNU's compilers dump core, but looking
at the code, that does not surprise me. It looks to me as if the code tries
to set a large chunk of unowned memory to 0.
It also looks to me like the second and third arguments to memset are not in
the right order, but I might be mistaken.
Posted by Dave, 7/18/2009 5:48:20 PM

Chris Ridd wrote:
> you look in the source history for that file it references a couple of
> bugs suggesting Niagara 2-specific optimizations are included. My
> T2000's only a Niagara 1 - someone with a Niagara 2 ought to repeat the
> disassembly. Dave's using a Niagara 2, I think.
Indeed.
>> Well, the 2 versions are something like:
>>
>> while(--n>=0) stuff;
>
> Are you sure? It looks more like it'll always do the stb, but maybe
> deccc does something I'm not seeing.
Actually it is the branch delay slot that you are missing. When a branch
is taken, the next instruction is executed before the actual jump. When
the branch is not taken, the next instruction is executed only if the
annul bit (represented by ",a" in asm) is not set.
> (Any good references to sparc
> assembly? The wikibook at <http://en.wikibooks.org/wiki/SPARC_Assembly>
> is bit "sparse"...)
"UltraSPARC Architecture 2007" has enough information to understand and
write SPARC asm, as long as you don't want to tune it to a specific
processor (it is missing the extra instructions present on newer Fujitsu
processors).
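The delay-slot rule can be modelled step by step in C for the four-instruction loop in the disassembly (deccc, inc, bgeu,a, stb). This is a hypothetical model for illustration only: the function name and the `borrow` variable are invented, and the carry flag is reduced to "the decrement borrowed".

```c
#include <stddef.h>
#include <stdint.h>

/* Models:  deccc %o2 ; inc %o5 ; bgeu,a loop ; stb %o1, [%o5 - 1]
 * Returns how many bytes were stored. */
size_t model_annulled_loop(unsigned char *o5, unsigned char o1, uint32_t o2)
{
    size_t stores = 0;

    for (;;) {
        int borrow = (o2 == 0);  /* carry set when 0 - 1 borrows */
        o2 -= 1;                 /* deccc %o2 */
        o5 += 1;                 /* inc %o5 */
        if (borrow)              /* bgeu not taken: ",a" annuls the stb */
            break;
        o5[-1] = o1;             /* branch taken: delay slot executes */
        stores++;
    }
    return stores;
}
```

With a count of 0 the first decrement borrows, the branch is not taken, and the annulled store is skipped, so under this model nothing is written.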
Posted by Marc, 7/18/2009 5:50:40 PM

Dave wrote:
>>> It dumps core on the Sun T5240, but runs fine on my Blade 2000. Sun's
>>> compiler will not compile it though.
>>
>> It depends on some GNUisms. Try replacing:
>>
>>> typedef __SIZE_TYPE__ size_t;
>>> extern void *memset (void *, const void *, size_t);
>>> extern void abort (void);
>>
>> with:
>>
>> #include <string.h>
>> #include <stdlib.h>
>>
>
> Thanks those changes allow it to build, but it still dumps core.
Seems normal if the bug is in memset.
> kirkby@t2:[~] $ cat check4.c
> #include <stdio.h>
> #include <string.h>
> #include <stdlib.h>
>
> volatile size_t i = 0x80000000U, j = 0x80000000U;
> char buf[16];
>
> int main (void)
> {
> int n;
>
> for (n=0 ; n <=10;++n) {
> printf("n=%d\n",n);
You might want to write to stderr (or explicitly flush), otherwise some
output may be buffered and never printed.
> if (sizeof (size_t) != 4)
> return 0;
> buf[n] = 6;
> memset (buf+n, 0, i + j);
> if (buf[n] != 6)
> abort ();
> }
> return 0;
> }
>
>
> kirkby@t2:[~] $ /opt/SUNWspro/bin/cc check4.c
> kirkby@t2:[~] $ ./a.out
> n=0
> n=1
> Abort (core dumped)
>
> I can't say I am fully understanding this though. I'd be interested in
> what you think having read:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757#c17
>
> about whether this does show a problem with the memset function.
>
> Both Sun's and GNU's compiler cause a core-dump, but looking at the
> code, it does not surprise me. It looks to me that the code tries to set
> a large chunk of unowned memory to 0.
>
> It looks to me like the second and third arguments to memset are not in
> the right order, but I might be mistaken.
i and j are both equal to 2^31, so (i+j) is 2^32, which wraps around to 0
in a 32-bit size_t. If memset behaves differently for an input of 2^32
than for an input of 0, it is a bug in memset.
Could you try the dbx "dis memset/180" thing mentioned in this thread?
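The arithmetic can be checked with a small C sketch. The helper names are invented for illustration. The 64-bit variant is only there to show where a value of 2^32 would still be visible; whether that is what actually reaches memset() on the T5240 is not established in this thread.

```c
#include <stdint.h>

/* The sum as a 32-bit size_t sees it: reduced modulo 2^32. */
uint32_t sum_as_32bit(uint32_t i, uint32_t j)
{
    return (uint32_t)(i + j);
}

/* The same sum carried out in 64 bits: the carry survives in bit 32. */
uint64_t sum_as_64bit(uint32_t i, uint32_t j)
{
    return (uint64_t)i + (uint64_t)j;
}
```

For i = j = 0x80000000 the first helper gives 0 and the second gives 0x100000000, which is the "2^32 versus 0" distinction referred to above.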
Posted by Marc, 7/18/2009 6:21:26 PM

Marc wrote:
> You might want to write to stderr (or explicitly flush), otherwise some
> output may be buffered and never printed.
Ah, no, forget that part, abort handles it, sorry.
Posted by Marc, 7/18/2009 6:30:15 PM

Marc wrote:
> Dave wrote:
>
>>>> It dumps core on the Sun T5240, but runs fine on my Blade 2000. Sun's
>>>> compiler will not compile it though.
>>> It depends on some GNUisms. Try replacing:
>>>
>>>> typedef __SIZE_TYPE__ size_t;
>>>> extern void *memset (void *, const void *, size_t);
>>>> extern void abort (void);
>>> with:
>>>
>>> #include <string.h>
>>> #include <stdlib.h>
>>>
>> Thanks those changes allow it to build, but it still dumps core.
>
> Seems normal if the bug is in memset.
>
>> kirkby@t2:[~] $ cat check4.c
>> #include <stdio.h>
>> #include <string.h>
>> #include <stdlib.h>
>>
>> volatile size_t i = 0x80000000U, j = 0x80000000U;
>> char buf[16];
>>
>> int main (void)
>> {
>> int n;
>>
>> for (n=0 ; n <=10;++n) {
>> printf("n=%d\n",n);
>
> You might want to write to stderr (or explicitly flush), otherwise some
> output may be buffered and never printed.
>
>> if (sizeof (size_t) != 4)
>> return 0;
>> buf[n] = 6;
>> memset (buf+n, 0, i + j);
>> if (buf[n] != 6)
>> abort ();
>> }
>> return 0;
>> }
>>
>>
>> kirkby@t2:[~] $ /opt/SUNWspro/bin/cc check4.c
>> kirkby@t2:[~] $ ./a.out
>> n=0
>> n=1
>> Abort (core dumped)
>>
>> I can't say I am fully understanding this though. I'd be interested in
>> what you think having read:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40757#c17
>>
>> about whether this does show a problem with the memset function.
>>
>> Both Sun's and GNU's compiler cause a core-dump, but looking at the
>> code, it does not surprise me. It looks to me that the code tries to set
>> a large chunk of unowned memory to 0.
>>
>> It looks to me like the second and third arguments to memset are not in
>> the right order, but I might be mistaken.
>
> i and j are equal to 2^31. (i+j) is 2^32, that is 0. If memset behaves
> differently for an input of 2^32 than an input of 0, it is a bug in
> memset.
>
> Could you try the dbx "dis memset/180" thing mentioned in this thread?
I see!!
I could not understand the logic of his code. Now I do. He's trying to
set 0 bytes of memory to 0, so it should not change the values in the
buffer.
That does indeed look like a memset bug to me now. I have added a bit more
information below and disabled optimisation (at least I think -O0 will
disable the optimiser).
kirkby@t2:[~] $ uname -a
SunOS t2 5.10 Generic_141414-02 sun4v sparc SUNW,T5240
kirkby@t2:[~] $ cat /etc/release
Solaris 10 5/09 s10s_u7wos_08 SPARC
Copyright 2009 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 30 March 2009
kirkby@t2:[~] $ cat check4.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
volatile size_t i = 0x80000000U, j = 0x80000000U;
char buf[16];
int main (void)
{
int n;
for (n=0 ; n <=10;++n) {
printf("n=%d\n",n);
if (sizeof (size_t) != 4)
return 0;
buf[n] = 6;
memset (buf+n, 0, i + j);
if (buf[n] != 6)
abort ();
}
return 0;
}
kirkby@t2:[~] $ /opt/SUNWspro/bin/cc -O0 check4.c
kirkby@t2:[~] $ ./a.out
n=0
n=1
Abort (core dumped)
kirkby@t2:[~] $ ldd a.out
libc.so.1 => /lib/libc.so.1
libm.so.2 => /lib/libm.so.2
/platform/SUNW,T5240/lib/libc_psr.so.1
kirkby@t2:[~] $ /opt/SUNWspro/bin/dbx a.out
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.6' in
your .dbxrc
Reading a.out
Reading ld.so.1
Reading libc.so.1
(dbx) dis memset/100
dbx: warning: unknown language, 'c' assumed
0x00031494: mov %o0, %o5
0x00031498: cmp %o2, 7
0x0003149c: blu 0x000314e4 ! 0x314e4
0x000314a0: btst 3, %o5
0x000314a4: be 0x000314bc ! 0x314bc
0x000314a8: andn %o2, 3, %o3
0x000314ac: dec %o2
0x000314b0: stb %o1, [%o5]
0x000314b4: ba 0x000314a0 ! 0x314a0
0x000314b8: inc %o5
0x000314bc: and %o1, 255, %o1
0x000314c0: sll %o1, 8, %o4
0x000314c4: bset %o4, %o1
0x000314c8: sll %o1, 16, %o4
0x000314cc: bset %o4, %o1
0x000314d0: st %o1, [%o5]
0x000314d4: deccc 4, %o3
0x000314d8: bne 0x000314d0 ! 0x314d0
0x000314dc: inc 4, %o5
0x000314e0: and %o2, 3, %o2
0x000314e4: deccc %o2
0x000314e8: inc %o5
0x000314ec: bgeu,a 0x000314e4 ! 0x314e4
0x000314f0: stb %o1, [%o5 - 1]
0x000314f4: retl
0x000314f8: nop
0x000314fc: clr [%o0]
0x00031500: st %sp, [%o0 + 4]
0x00031504: add %o7, 8, %o1
0x00031508: st %o1, [%o0 + 8]
0x0003150c: st %fp, [%o0 + 12]
0x00031510: st %i7, [%o0 + 16]
0x00031514: retl
0x00031518: clr %o0
0x0003151c: ta %icc,0x00000003
0x00031520: ld [%o0 + 4], %o2
0x00031524: ldd [%o2], %l0
0x00031528: ldd [%o2 + 8], %l2
0x0003152c: ldd [%o2 + 16], %l4
0x00031530: ldd [%o2 + 24], %l6
0x00031534: ldd [%o2 + 32], %i0
0x00031538: ldd [%o2 + 40], %i2
0x0003153c: ldd [%o2 + 48], %i4
0x00031540: ld [%o0 + 12], %fp
0x00031544: mov %o2, %sp
0x00031548: ld [%o0 + 16], %i7
0x0003154c: ld [%o0 + 8], %o3
0x00031550: tst %o1
0x00031554: bne 0x00031560 ! 0x31560
0x00031558: sub %o3, 8, %o7
0x0003155c: mov 1, %o1
0x00031560: retl
0x00031564: mov %o1, %o0
0x00031568: save %sp, -544, %sp
0x0003156c: ld [%i0 + 12], %i3
0x00031570: add %fp, -448, %o0
0x00031574: clr %o1
0x00031578: ld [%i0 + 16], %l0
0x0003157c: mov 6, %i5
0x00031580: call 0x00131468 ! 0x131468
0x00031584: mov 448, %o2
0x00031588: st %i5, [%fp - 448]
0x0003158c: call 0x000c89a0 ! 0xc89a0
0x00031590: add %fp, -392, %o0
0x00031594: ld [%i0 + 64], %i2
0x00031598: st %i2, [%fp - 424]
0x0003159c: ld [%i0 + 68], %l7
0x000315a0: st %l7, [%fp - 420]
0x000315a4: ld [%i0 + 72], %l6
0x000315a8: st %l6, [%fp - 416]
0x000315ac: ld [%i0 + 20], %l5
0x000315b0: st %l5, [%fp - 444]
0x000315b4: ld [%i0 + 8], %l4
0x000315b8: st %l4, [%fp - 404]
0x000315bc: add %l4, 4, %l3
0x000315c0: st %l3, [%fp - 400]
0x000315c4: ld [%i0 + 4], %l2
0x000315c8: st %l2, [%fp - 340]
0x000315cc: ld [%i0], %l1
0x000315d0: btst 1, %l1
0x000315d4: be 0x00031608 ! 0x31608
0x000315d8: cmp %i1, 0
0x000315dc: ld [%fp - 448], %o3
0x000315e0: or %o3, 1, %o2
0x000315e4: st %o2, [%fp - 448]
0x000315e8: ld [%i0 + 48], %o1
0x000315ec: st %o1, [%fp - 440]
0x000315f0: ld [%i0 + 52], %o0
0x000315f4: st %o0, [%fp - 436]
0x000315f8: ld [%i0 + 56], %g1
0x000315fc: st %g1, [%fp - 432]
0x00031600: ld [%i0 + 60], %i4
0x00031604: st %i4, [%fp - 428]
0x00031608: be 0x00031618 ! 0x31618
0x0003160c: mov 1, %o4
0x00031610: ba 0x0003161c ! 0x3161c
0x00031614: st %i1, [%fp - 364]
0x00031618: st %o4, [%fp - 364]
0x0003161c: ld [%i0 + 4], %o5
0x00031620: cmp %o5, 0
(dbx)
Posted by Dave, 7/18/2009 7:19:53 PM

Dave wrote:
> /platform/SUNW,T5240/lib/libc_psr.so.1
> kirkby@t2:[~] $ /opt/SUNWspro/bin/dbx a.out
> For information about new features see `help changes'
> To remove this message, put `dbxenv suppress_startup_message 7.6' in
> your .dbxrc
> Reading a.out
> Reading ld.so.1
> Reading libc.so.1
> (dbx) dis memset/100
> dbx: warning: unknown language, 'c' assumed
> 0x00031494: mov %o0, %o5
> 0x00031498: cmp %o2, 7
> 0x0003149c: blu 0x000314e4 ! 0x314e4
> 0x000314a0: btst 3, %o5
Hmm, looks like it is printing the generic libc.so.1 implementation. My
fault, I thought it would work. Could you try instead:
/usr/ccs/bin/dis -F memset /platform/SUNW,T5240/lib/libc_psr.so.1
Thanks for running those tests; hopefully this is the last one I need to ask for.
Posted by Marc, 7/18/2009 7:34:00 PM

Marc wrote:
> Dave wrote:
>
>> /platform/SUNW,T5240/lib/libc_psr.so.1
>> kirkby@t2:[~] $ /opt/SUNWspro/bin/dbx a.out
>> For information about new features see `help changes'
>> To remove this message, put `dbxenv suppress_startup_message 7.6' in
>> your .dbxrc
>> Reading a.out
>> Reading ld.so.1
>> Reading libc.so.1
>> (dbx) dis memset/100
>> dbx: warning: unknown language, 'c' assumed
>> 0x00031494: mov %o0, %o5
>> 0x00031498: cmp %o2, 7
>> 0x0003149c: blu 0x000314e4 ! 0x314e4
>> 0x000314a0: btst 3, %o5
>
> Hmm, looks like it is printing the generic libc.so.1 implementation. My
> fault, I thought it would work. Could you try instead:
> /usr/ccs/bin/dis -F memset /platform/SUNW,T5240/lib/libc_psr.so.1
>
> Thanks for running those tests, hopefully this is the last I am asking.
Perhaps you will want some more tests, as the output says memset is not
found. Don't feel guilty - this has caused a huge debate on the Sage
developers list, with the MPFR guys saying they refuse to work around
compiler bugs. There is a fix, which someone developed, and which I've
implemented only on sun4v.
kirkby@t2:[~] $ /usr/ccs/bin/dis -F memset /platform/SUNW,T5240/lib/libc_psr.so.1 a.out
disassembly for /platform/SUNW,T5240/lib/libc_psr.so.1
memset()
memset: 9a 10 00 08 mov %o0, %o5
memset+0x4: 80 a2 a0 07 cmp %o2, 0x7
memset+0x8: 0a 40 00 6c bcs,pn %icc, +0x1b0
<memset+0x1b8>
memset+0xc: 92 0a 60 ff and %o1, 0xff, %o1
memset+0x10: 97 2a 60 08 sll %o1, 0x8, %o3
memset+0x14: 92 12 40 0b or %o1, %o3, %o1
memset+0x18: 97 2a 60 10 sll %o1, 0x10, %o3
memset+0x1c: 80 a2 a0 20 cmp %o2, 0x20
memset+0x20: 0a 40 00 5a bcs,pn %icc, +0x168
<memset+0x188>
memset+0x24: 92 12 40 0b or %o1, %o3, %o1
memset+0x28: 97 2a 70 20 sllx %o1, 0x20, %o3
memset+0x2c: 92 12 40 0b or %o1, %o3, %o1
memset+0x30: 96 8b 60 07 andcc %o5, 0x7, %o3
memset+0x34: 02 48 00 07 be,pt %icc, +0x1c
<memset+0x50>
memset+0x38: 96 22 e0 08 sub %o3, 0x8, %o3
memset+0x3c: 94 02 80 0b add %o2, %o3, %o2
memset+0x40: d2 2b 40 00 stb %o1, [%o5]
memset+0x44: 96 82 e0 01 addcc %o3, 0x1, %o3
memset+0x48: 06 4f ff fe bl,pt %icc, -0x8
<memset+0x40>
memset+0x4c: 9a 03 60 01 add %o5, 0x1, %o5
memset+0x50: 87 80 20 e2 wr %g0, 0xe2, %asi
memset+0x54: 80 a2 a0 40 cmp %o2, 0x40
memset+0x58: 0a 40 00 41 bcs,pn %icc, +0x104
<memset+0x15c>
memset+0x5c: 96 10 00 0a mov %o2, %o3
memset+0x60: 96 8b 60 3f andcc %o5, 0x3f, %o3
memset+0x64: 02 48 00 07 be,pt %icc, +0x1c
<memset+0x80>
memset+0x68: 96 22 e0 40 sub %o3, 0x40, %o3
memset+0x6c: 94 02 80 0b add %o2, %o3, %o2
memset+0x70: d2 73 40 00 stx %o1, [%o5]
memset+0x74: 96 82 e0 08 addcc %o3, 0x8, %o3
memset+0x78: 06 4f ff fe bl,pt %icc, -0x8
<memset+0x70>
memset+0x7c: 9a 03 60 08 add %o5, 0x8, %o5
memset+0x80: 96 0a a0 3f and %o2, 0x3f, %o3
memset+0x84: 98 2a a0 3f andn %o2, 0x3f, %o4
memset+0x88: 80 a3 21 00 cmp %o4, 0x100
memset+0x8c: 0a 40 00 26 bcs,pn %icc, +0x98
<memset+0x124>
memset+0x90: 01 00 00 00 nop
memset+0x94: d2 f3 60 00 stxa %o1, [%o5 + 0x0] %asi
memset+0x98: d2 f3 60 40 stxa %o1, [%o5 + 0x40] %asi
memset+0x9c: d2 f3 60 80 stxa %o1, [%o5 + 0x80] %asi
memset+0xa0: d2 f3 60 c0 stxa %o1, [%o5 + 0xc0] %asi
memset+0xa4: d2 f3 60 08 stxa %o1, [%o5 + 0x8] %asi
memset+0xa8: d2 f3 60 10 stxa %o1, [%o5 + 0x10] %asi
memset+0xac: d2 f3 60 18 stxa %o1, [%o5 + 0x18] %asi
memset+0xb0: d2 f3 60 20 stxa %o1, [%o5 + 0x20] %asi
memset+0xb4: d2 f3 60 28 stxa %o1, [%o5 + 0x28] %asi
memset+0xb8: d2 f3 60 30 stxa %o1, [%o5 + 0x30] %asi
memset+0xbc: d2 f3 60 38 stxa %o1, [%o5 + 0x38] %asi
memset+0xc0: d2 f3 60 48 stxa %o1, [%o5 + 0x48] %asi
memset+0xc4: d2 f3 60 50 stxa %o1, [%o5 + 0x50] %asi
memset+0xc8: d2 f3 60 58 stxa %o1, [%o5 + 0x58] %asi
memset+0xcc: d2 f3 60 60 stxa %o1, [%o5 + 0x60] %asi
memset+0xd0: d2 f3 60 68 stxa %o1, [%o5 + 0x68] %asi
memset+0xd4: d2 f3 60 70 stxa %o1, [%o5 + 0x70] %asi
memset+0xd8: d2 f3 60 78 stxa %o1, [%o5 + 0x78] %asi
memset+0xdc: d2 f3 60 88 stxa %o1, [%o5 + 0x88] %asi
memset+0xe0: d2 f3 60 90 stxa %o1, [%o5 + 0x90] %asi
memset+0xe4: d2 f3 60 98 stxa %o1, [%o5 + 0x98] %asi
memset+0xe8: d2 f3 60 a0 stxa %o1, [%o5 + 0xa0] %asi
memset+0xec: d2 f3 60 a8 stxa %o1, [%o5 + 0xa8] %asi
memset+0xf0: d2 f3 60 b0 stxa %o1, [%o5 + 0xb0] %asi
memset+0xf4: d2 f3 60 b8 stxa %o1, [%o5 + 0xb8] %asi
memset+0xf8: d2 f3 60 c8 stxa %o1, [%o5 + 0xc8] %asi
memset+0xfc: d2 f3 60 d0 stxa %o1, [%o5 + 0xd0] %asi
memset+0x100: d2 f3 60 d8 stxa %o1, [%o5 + 0xd8] %asi
memset+0x104: d2 f3 60 e0 stxa %o1, [%o5 + 0xe0] %asi
memset+0x108: d2 f3 60 e8 stxa %o1, [%o5 + 0xe8] %asi
memset+0x10c: d2 f3 60 f0 stxa %o1, [%o5 + 0xf0] %asi
memset+0x110: d2 f3 60 f8 stxa %o1, [%o5 + 0xf8] %asi
memset+0x114: 98 23 21 00 sub %o4, 0x100, %o4
memset+0x118: 80 a3 21 00 cmp %o4, 0x100
memset+0x11c: 18 4f ff de bgu,pt %icc, -0x88
<memset+0x94>
memset+0x120: 9a 03 61 00 add %o5, 0x100, %o5
memset+0x124: 80 a3 20 40 cmp %o4, 0x40
memset+0x128: 0a 48 00 0d bcs,pt %icc, +0x34
<memset+0x15c>
memset+0x12c: 01 00 00 00 nop
memset+0x130: d2 f3 60 00 stxa %o1, [%o5 + 0x0] %asi
memset+0x134: d2 f3 60 08 stxa %o1, [%o5 + 0x8] %asi
memset+0x138: d2 f3 60 10 stxa %o1, [%o5 + 0x10] %asi
memset+0x13c: d2 f3 60 18 stxa %o1, [%o5 + 0x18] %asi
memset+0x140: d2 f3 60 20 stxa %o1, [%o5 + 0x20] %asi
memset+0x144: d2 f3 60 28 stxa %o1, [%o5 + 0x28] %asi
memset+0x148: d2 f3 60 30 stxa %o1, [%o5 + 0x30] %asi
memset+0x14c: d2 f3 60 38 stxa %o1, [%o5 + 0x38] %asi
memset+0x150: 98 a3 20 40 subcc %o4, 0x40, %o4
memset+0x154: 18 4f ff f7 bgu,pt %icc, -0x24
<memset+0x130>
memset+0x158: 9a 03 60 40 add %o5, 0x40, %o5
memset+0x15c: 81 43 e0 40 membar #Sync
memset+0x160: 87 80 20 82 wr %g0, 0x82, %asi
memset+0x164: 96 a2 e0 08 subcc %o3, 0x8, %o3
memset+0x168: 0a 40 00 14 bcs,pn %icc, +0x50
<memset+0x1b8>
memset+0x16c: 94 0a a0 07 and %o2, 0x7, %o2
memset+0x170: d2 73 40 00 stx %o1, [%o5]
memset+0x174: 96 a2 e0 08 subcc %o3, 0x8, %o3
memset+0x178: 1a 4f ff fe bcc,pt %icc, -0x8
<memset+0x170>
memset+0x17c: 9a 03 60 08 add %o5, 0x8, %o5
memset+0x180: 10 80 00 0e ba +0x38
<memset+0x1b8>
memset+0x184: 01 00 00 00 nop
memset+0x188: 96 8b 60 03 andcc %o5, 0x3, %o3
memset+0x18c: 02 40 00 06 be,pn %icc, +0x18
<memset+0x1a4>
memset+0x190: 96 2a a0 03 andn %o2, 0x3, %o3
memset+0x194: 94 22 a0 01 sub %o2, 0x1, %o2
memset+0x198: d2 2b 40 00 stb %o1, [%o5]
memset+0x19c: 10 bf ff fb ba -0x14
<memset+0x188>
memset+0x1a0: 9a 03 60 01 add %o5, 0x1, %o5
memset+0x1a4: d2 23 40 00 st %o1, [%o5]
memset+0x1a8: 96 a2 e0 04 subcc %o3, 0x4, %o3
memset+0x1ac: 12 4f ff fe bne,pt %icc, -0x8
<memset+0x1a4>
memset+0x1b0: 9a 03 60 04 add %o5, 0x4, %o5
memset+0x1b4: 94 0a a0 03 and %o2, 0x3, %o2
memset+0x1b8: 02 ca 80 06 brz,pt %o2, +0x18
<memset+0x1d0>
memset+0x1bc: 01 00 00 00 nop
memset+0x1c0: 94 a2 a0 01 subcc %o2, 0x1, %o2
memset+0x1c4: d2 2b 40 00 stb %o1, [%o5]
memset+0x1c8: 18 4f ff fe bgu,pt %icc, -0x8
<memset+0x1c0>
memset+0x1cc: 9a 03 60 01 add %o5, 0x1, %o5
memset+0x1d0: 81 c3 e0 08 retl
memset+0x1d4: 01 00 00 00 nop
disassembly for a.out
dis: warning: failed to find function 'memset' in 'a.out'
Posted by Dave, 7/18/2009 7:52:04 PM

If this is a bug in mpfr on sun4v (or some subset of sun4v), a short
term solution would perhaps be to ensure a generic version of memset()
is called, rather than one optimised for this processor.
Perhaps this could be done in the link phase of mpfr.
From a practical point of view, I have a short-term solution developed
by an mpfr developer, but it's clear the longer-term solution would be for
Sun to fix this. I don't know whether the fix I have is guaranteed to work in
all cases, whereas calling another implementation of the library function would.
|
|
0
|
|
|
|
Reply
|
Dave
|
7/18/2009 8:01:53 PM
|
|
Dave wrote:
> If this is a bug in mpfr on sun4v (or some subset of sun4v), a short
> term solution would perhaps be to ensure a generic version of memset()
> is called, rather than one optimised for this processor.
>
> Perhaps this could be done in the link phase of mpfr.
>
> From a practical point of view, I have a short-term solution developed
> by an mpfr developer, but it's clear a longer term solution would be for
> Sun to fix this. I don't know if the fix I have can guarantee to work in
> all cases, whereas calling another implementation of the library would.
>
>
Sorry, it is a bug in *memset*, not mpfr. It seems Sun needs to fix
this. I suspect a Sun employee will see this soon and comment, and if so
perhaps raise it as a bug. If not, I'll do it via the Solaris support
channels, as this machine does have a support contract with Sun.
Dave
|
|
0
|
|
|
|
Reply
|
Dave
|
7/18/2009 8:04:07 PM
|
|
Dave wrote:
> Perhaps you will want some more tests, as the output says memset is not
> found.
No, that's fine, you added an extra "a.out" which causes this harmless
message.
> memset()
> memset: 9a 10 00 08 mov %o0, %o5
> memset+0x4: 80 a2 a0 07 cmp %o2, 0x7
> memset+0x8: 0a 40 00 6c bcs,pn %icc, +0x1b0
> <memset+0x1b8>
n (the third argument) is smaller than 7 when compared as a 32-bit number (%icc), so we jump...
> memset+0xc: 92 0a 60 ff and %o1, 0xff, %o1
> memset+0x10: 97 2a 60 08 sll %o1, 0x8, %o3
> memset+0x14: 92 12 40 0b or %o1, %o3, %o1
> memset+0x18: 97 2a 60 10 sll %o1, 0x10, %o3
> memset+0x1c: 80 a2 a0 20 cmp %o2, 0x20
> memset+0x20: 0a 40 00 5a bcs,pn %icc, +0x168
> <memset+0x188>
> memset+0x24: 92 12 40 0b or %o1, %o3, %o1
> memset+0x28: 97 2a 70 20 sllx %o1, 0x20, %o3
> memset+0x2c: 92 12 40 0b or %o1, %o3, %o1
> memset+0x30: 96 8b 60 07 andcc %o5, 0x7, %o3
> memset+0x34: 02 48 00 07 be,pt %icc, +0x1c
> <memset+0x50>
> memset+0x38: 96 22 e0 08 sub %o3, 0x8, %o3
> memset+0x3c: 94 02 80 0b add %o2, %o3, %o2
> memset+0x40: d2 2b 40 00 stb %o1, [%o5]
> memset+0x44: 96 82 e0 01 addcc %o3, 0x1, %o3
> memset+0x48: 06 4f ff fe bl,pt %icc, -0x8
> <memset+0x40>
> memset+0x4c: 9a 03 60 01 add %o5, 0x1, %o5
> memset+0x50: 87 80 20 e2 wr %g0, 0xe2, %asi
> memset+0x54: 80 a2 a0 40 cmp %o2, 0x40
> memset+0x58: 0a 40 00 41 bcs,pn %icc, +0x104
> <memset+0x15c>
> memset+0x5c: 96 10 00 0a mov %o2, %o3
> memset+0x60: 96 8b 60 3f andcc %o5, 0x3f, %o3
> memset+0x64: 02 48 00 07 be,pt %icc, +0x1c
> <memset+0x80>
> memset+0x68: 96 22 e0 40 sub %o3, 0x40, %o3
> memset+0x6c: 94 02 80 0b add %o2, %o3, %o2
> memset+0x70: d2 73 40 00 stx %o1, [%o5]
> memset+0x74: 96 82 e0 08 addcc %o3, 0x8, %o3
> memset+0x78: 06 4f ff fe bl,pt %icc, -0x8
> <memset+0x70>
> memset+0x7c: 9a 03 60 08 add %o5, 0x8, %o5
> memset+0x80: 96 0a a0 3f and %o2, 0x3f, %o3
> memset+0x84: 98 2a a0 3f andn %o2, 0x3f, %o4
> memset+0x88: 80 a3 21 00 cmp %o4, 0x100
> memset+0x8c: 0a 40 00 26 bcs,pn %icc, +0x98
> <memset+0x124>
> memset+0x90: 01 00 00 00 nop
> memset+0x94: d2 f3 60 00 stxa %o1, [%o5 + 0x0] %asi
> memset+0x98: d2 f3 60 40 stxa %o1, [%o5 + 0x40] %asi
> memset+0x9c: d2 f3 60 80 stxa %o1, [%o5 + 0x80] %asi
> memset+0xa0: d2 f3 60 c0 stxa %o1, [%o5 + 0xc0] %asi
> memset+0xa4: d2 f3 60 08 stxa %o1, [%o5 + 0x8] %asi
> memset+0xa8: d2 f3 60 10 stxa %o1, [%o5 + 0x10] %asi
> memset+0xac: d2 f3 60 18 stxa %o1, [%o5 + 0x18] %asi
> memset+0xb0: d2 f3 60 20 stxa %o1, [%o5 + 0x20] %asi
> memset+0xb4: d2 f3 60 28 stxa %o1, [%o5 + 0x28] %asi
> memset+0xb8: d2 f3 60 30 stxa %o1, [%o5 + 0x30] %asi
> memset+0xbc: d2 f3 60 38 stxa %o1, [%o5 + 0x38] %asi
> memset+0xc0: d2 f3 60 48 stxa %o1, [%o5 + 0x48] %asi
> memset+0xc4: d2 f3 60 50 stxa %o1, [%o5 + 0x50] %asi
> memset+0xc8: d2 f3 60 58 stxa %o1, [%o5 + 0x58] %asi
> memset+0xcc: d2 f3 60 60 stxa %o1, [%o5 + 0x60] %asi
> memset+0xd0: d2 f3 60 68 stxa %o1, [%o5 + 0x68] %asi
> memset+0xd4: d2 f3 60 70 stxa %o1, [%o5 + 0x70] %asi
> memset+0xd8: d2 f3 60 78 stxa %o1, [%o5 + 0x78] %asi
> memset+0xdc: d2 f3 60 88 stxa %o1, [%o5 + 0x88] %asi
> memset+0xe0: d2 f3 60 90 stxa %o1, [%o5 + 0x90] %asi
> memset+0xe4: d2 f3 60 98 stxa %o1, [%o5 + 0x98] %asi
> memset+0xe8: d2 f3 60 a0 stxa %o1, [%o5 + 0xa0] %asi
> memset+0xec: d2 f3 60 a8 stxa %o1, [%o5 + 0xa8] %asi
> memset+0xf0: d2 f3 60 b0 stxa %o1, [%o5 + 0xb0] %asi
> memset+0xf4: d2 f3 60 b8 stxa %o1, [%o5 + 0xb8] %asi
> memset+0xf8: d2 f3 60 c8 stxa %o1, [%o5 + 0xc8] %asi
> memset+0xfc: d2 f3 60 d0 stxa %o1, [%o5 + 0xd0] %asi
> memset+0x100: d2 f3 60 d8 stxa %o1, [%o5 + 0xd8] %asi
> memset+0x104: d2 f3 60 e0 stxa %o1, [%o5 + 0xe0] %asi
> memset+0x108: d2 f3 60 e8 stxa %o1, [%o5 + 0xe8] %asi
> memset+0x10c: d2 f3 60 f0 stxa %o1, [%o5 + 0xf0] %asi
> memset+0x110: d2 f3 60 f8 stxa %o1, [%o5 + 0xf8] %asi
> memset+0x114: 98 23 21 00 sub %o4, 0x100, %o4
> memset+0x118: 80 a3 21 00 cmp %o4, 0x100
> memset+0x11c: 18 4f ff de bgu,pt %icc, -0x88
> <memset+0x94>
> memset+0x120: 9a 03 61 00 add %o5, 0x100, %o5
> memset+0x124: 80 a3 20 40 cmp %o4, 0x40
> memset+0x128: 0a 48 00 0d bcs,pt %icc, +0x34
> <memset+0x15c>
> memset+0x12c: 01 00 00 00 nop
> memset+0x130: d2 f3 60 00 stxa %o1, [%o5 + 0x0] %asi
> memset+0x134: d2 f3 60 08 stxa %o1, [%o5 + 0x8] %asi
> memset+0x138: d2 f3 60 10 stxa %o1, [%o5 + 0x10] %asi
> memset+0x13c: d2 f3 60 18 stxa %o1, [%o5 + 0x18] %asi
> memset+0x140: d2 f3 60 20 stxa %o1, [%o5 + 0x20] %asi
> memset+0x144: d2 f3 60 28 stxa %o1, [%o5 + 0x28] %asi
> memset+0x148: d2 f3 60 30 stxa %o1, [%o5 + 0x30] %asi
> memset+0x14c: d2 f3 60 38 stxa %o1, [%o5 + 0x38] %asi
> memset+0x150: 98 a3 20 40 subcc %o4, 0x40, %o4
> memset+0x154: 18 4f ff f7 bgu,pt %icc, -0x24
> <memset+0x130>
> memset+0x158: 9a 03 60 40 add %o5, 0x40, %o5
> memset+0x15c: 81 43 e0 40 membar #Sync
> memset+0x160: 87 80 20 82 wr %g0, 0x82, %asi
> memset+0x164: 96 a2 e0 08 subcc %o3, 0x8, %o3
> memset+0x168: 0a 40 00 14 bcs,pn %icc, +0x50
> <memset+0x1b8>
> memset+0x16c: 94 0a a0 07 and %o2, 0x7, %o2
> memset+0x170: d2 73 40 00 stx %o1, [%o5]
> memset+0x174: 96 a2 e0 08 subcc %o3, 0x8, %o3
> memset+0x178: 1a 4f ff fe bcc,pt %icc, -0x8
> <memset+0x170>
> memset+0x17c: 9a 03 60 08 add %o5, 0x8, %o5
> memset+0x180: 10 80 00 0e ba +0x38
> <memset+0x1b8>
> memset+0x184: 01 00 00 00 nop
> memset+0x188: 96 8b 60 03 andcc %o5, 0x3, %o3
> memset+0x18c: 02 40 00 06 be,pn %icc, +0x18
> <memset+0x1a4>
> memset+0x190: 96 2a a0 03 andn %o2, 0x3, %o3
> memset+0x194: 94 22 a0 01 sub %o2, 0x1, %o2
> memset+0x198: d2 2b 40 00 stb %o1, [%o5]
> memset+0x19c: 10 bf ff fb ba -0x14
> <memset+0x188>
> memset+0x1a0: 9a 03 60 01 add %o5, 0x1, %o5
> memset+0x1a4: d2 23 40 00 st %o1, [%o5]
> memset+0x1a8: 96 a2 e0 04 subcc %o3, 0x4, %o3
> memset+0x1ac: 12 4f ff fe bne,pt %icc, -0x8
> <memset+0x1a4>
> memset+0x1b0: 9a 03 60 04 add %o5, 0x4, %o5
> memset+0x1b4: 94 0a a0 03 and %o2, 0x3, %o2
.... and land on the line below.
> memset+0x1b8: 02 ca 80 06 brz,pt %o2, +0x18
> <memset+0x1d0>
brz tests all 64 bits of the register, not just the low 32. 2^32 is thus
not treated as 0, and we don't jump straight to the exit.
> memset+0x1bc: 01 00 00 00 nop
> memset+0x1c0: 94 a2 a0 01 subcc %o2, 0x1, %o2
> memset+0x1c4: d2 2b 40 00 stb %o1, [%o5]
and here is a store.
> memset+0x1c8: 18 4f ff fe bgu,pt %icc, -0x8
> <memset+0x1c0>
> memset+0x1cc: 9a 03 60 01 add %o5, 0x1, %o5
> memset+0x1d0: 81 c3 e0 08 retl
> memset+0x1d4: 01 00 00 00 nop
> disassembly for a.out
With this analysis, I don't understand why the problem appears to depend
on the alignment (aborts for buf[1] but not buf[0] in the testcase), so I
have probably missed something.
The implementation above looks like the opensolaris one, but there are
some differences, and in particular the bug seems to be already fixed
there.
Now you need to get Sun to fix it in S10 (I think you said you had a
support contract for this machine?).
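To make the tail-loop behaviour concrete, here is a rough C model of the instructions from memset+0x1b8 to memset+0x1d0. It is only a reading of the disassembly quoted above, not Sun's source: the function name tail_stores is made up, the 64-bit variable o2 stands in for register %o2, and the carry and zero flags are computed by hand from the low 32 bits to mimic %icc.

```c
#include <stdint.h>

/* Hypothetical model of the byte-store tail loop (memset+0x1b8..0x1d0).
 * Returns how many byte stores the loop would perform for a given
 * 64-bit value of %o2. */
unsigned tail_stores(uint64_t o2)
{
    unsigned stores = 0;

    /* brz,pt %o2: tests all 64 bits of the register */
    if (o2 == 0)
        return 0;

    for (;;) {
        /* subcc %o2, 0x1, %o2: %icc flags come from the low 32 bits */
        int carry = ((uint32_t)o2 < 1);   /* borrow out of bit 31 */
        o2 -= 1;
        int zero = ((uint32_t)o2 == 0);

        /* stb %o1, [%o5]: one byte is written on every pass */
        stores++;

        /* bgu,pt %icc: loop again only if C == 0 and Z == 0 */
        if (carry || zero)
            break;
    }
    return stores;
}
```

Under this model, tail_stores(0) gives 0 and tail_stores(3) gives 3, as expected, but tail_stores((uint64_t)1 << 32) gives 1: a value that is zero as a 32-bit size_t still gets past the brz and causes one byte to be stored.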
|
|
0
|
|
|
|
Reply
|
Marc
|
7/18/2009 8:41:21 PM
|
|
Dave wrote:
> If this is a bug in mpfr on sun4v (or some subset of sun4v), a short
> term solution would perhaps be to ensure a generic version of memset()
> is called, rather than one optimised for this processor.
I was going to suggest calling bzero, but from what I can see there:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/sun4v/cpu/generic_copy.s#1568
it looks like it has the same problem and is not fixed in opensolaris.
> Perhaps this could be done in the link phase of mpfr.
>
> From a practical point of view, I have a short-term solution developed
> by an mpfr developer, but it's clear a longer term solution would be for
> Sun to fix this. I don't know if the fix I have can guarantee to work in
> all cases, whereas calling another implementation of the library would.
It looks like the problem is only for n==0 (the line just before the brz
clears the high 32 bits, if you don't just jump over it), so the
workaround (if(l!=0)memset(*,*,l);) should be fine.
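Spelled out as a small helper, the workaround could look like the sketch below. The name memset_nonzero is hypothetical (it is not part of MPFR or Solaris), and the sketch assumes that in a 32-bit build the compiler tests n with a 32-bit comparison, so stray upper register bits do not matter.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: never hand a zero length to memset(), so the
 * library routine's n == 0 path is never reached. */
void *memset_nonzero(void *dst, int c, size_t n)
{
    if (n != 0)
        memset(dst, c, n);
    return dst;
}
```

Call sites that may pass a computed length of zero would use memset_nonzero(buf, 0, len) in place of memset(buf, 0, len); everything else can stay as it is.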
|
|
0
|
|
|
|
Reply
|
Marc
|
7/18/2009 8:58:38 PM
|
|
Marc wrote:
> With this analysis, I don't understand why the problem appears to depend
> on the alignment (aborts for buf[1] but not buf[0] in the testcase), so I
> have probably missed something.
It's a long time since I've done any assembly code (the last was on a VAX,
I think, and a bit of x86 before that); that was two decades ago. I've
never looked at SPARC assembly code before.
> The implementation above looks like the opensolaris one, but there are
> some differences, and in particular the bug seems to be already fixed
> there.
>
> Now you need to get Sun to fix it in S10 (I think you said you had a
> support contract for this machine?).
I'll raise this with Sun on Monday. Yes, this machine is on a 'silver'
support contract.
|
|
0
|
|
|
|
Reply
|
Dave
|
7/18/2009 9:53:18 PM
|
|
Many people looked at this memset() issue. I now have some information
from Sun. I asked the engineer if I could make it public, and he has
said yes. He is in fact going to put some of it on a mailing list, and
it will eventually appear in Sunsolve.
I'm told the fix will be backported to Solaris 10 and I should have an
Interim Diagnostic Relief to test myself in a few weeks, but it won't be
a public patch for some time, until it's fully tested.
Dave
----
Your service case regarding memset(3C)'s behaviour on sun4v systems
when the size_t argument is nonzero but zero mod 2^32 has now been
transferred to Europe, and I have taken ownership. And I intend to
keep ownership until we've reached a mutually acceptable resolution,
barring vacation stand-ins and unforeseen events.
Let me quickly recapitulate the facts:
+ This is on record as a bug under Change Request id 6507249,
+ it's fixed in the internal development version of (future)
Solaris and thus in OpenSolaris based on builds snv_62 or later,
+ it affects only
+ 32-bit applications
+ running on Solaris 10
+ on all SPARC sun4v (CoolThreads^TM) platforms,
+ it originates in the hardware-optimized libc_psr_hwcap[12].so.1
which (by default) get mounted over /platform/sun4v/lib/libc_psr.so.1
during the Solaris boot sequence,
+ it affects invocations of memset(3C) where the third (size_t)
argument is nonzero but its low-order 32 bits are zero (thus
it ought to be zero considered as a size_t).
(A subtle point is that it won't affect the *first* call to memset()
after exec, as the runtime loader processing for lazy symbol binding
will clear the upper 32 bits as a side effect before passing control
to the newly-bound function entry point.)
The bugfix has not (yet) been backported to Solaris 10 because there
has not (yet) been any tangible demand for such a backport. Until
just now, there had not yet been a single external customer record
on CR#6507249!
I am adding one for this present service request now.
In fact, the vast majority of application code would not be at risk
of being affected by this bug. Most uses of memset() pass a compile-
time constant for the size, often some sizeof(struct such_and_such).
Passing a manifest 32-bit int variable for the size will also avoid
the bug. It can only happen when memset() is invoked with some
nontrivial arithmetic expression, or some explicit 64-bit variable
for the size. Such code idioms are quite rare.
Also, there are a number of workarounds to choose from, depending on
the situation:
+ When the application source code is available for modification:
+ store the expression result in a variable and then pass the
variable to memset() (though compiler optimizations might
subvert this),
+ test the variable for being 32-bit-equal-to-zero and bypass
memset() if it is,
+ or at runtime:
+ invoke the application with LD_NOAUXFLTR=1 in the environment
(this selectively disables the optimized libc_psr.so.1 just for
this process; cf. man ld.so.1(1)),
+ umount the optimized libc_psr.so.1 system-wide,
+ interpose a different memset() implementation e.g. via an
LD_PRELOAD'ed shared object.
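As an illustration of the last option, a replacement routine can be as simple as a byte-at-a-time fill. The sketch below is an assumption, not something supplied by Sun: it uses the made-up name generic_memset so that it can be tried on its own, and for actual interposition the function would have to be named memset and built into a shared object that is then listed in LD_PRELOAD.

```c
#include <stddef.h>

/* Hypothetical plain fill routine. The volatile pointer is there to
 * discourage the compiler from recognising the loop and turning it
 * back into a call to memset(), which would recurse once this
 * function is itself installed under that name. */
void *generic_memset(void *dst, int c, size_t n)
{
    volatile unsigned char *p = dst;

    while (n-- > 0)
        *p++ = (unsigned char)c;

    return dst;
}
```

This trades speed for simplicity, so it only makes sense for processes known or suspected to hit the bug, much like the LD_NOAUXFLTR=1 approach.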
> Since the MPFR library code we are using is open source, we have managed
> to work around this Solaris bug, by sticking an 'if' statement in front
> of the call to the macro which calls memset().
>
> Though of course I don't know if it will affect anything else. So I
> guess it is safer to unmount this, but I assume that will have quite a
> performance impact.
The performance impact of not using the optimized libc_psr.so.1
varies widely among applications, depending on how much memset()ing
and memcpy()ing and memmove()ing they do. It can range all the way
from negligible to a few tens of percent in benchmarks.
But the LD_NOAUXFLTR=1 approach limits the performance impact to
those applications which are known or suspected to be affected
by the bug.
---
<SNIP>
Would you be willing to test-drive any future binary fix in the
shape of an Interim Diagnostic Relief prior to patch creation,
as well as a release-candidate patch at the T-Patch stage prior
to patch release? For background information on IDRs, please see:
http://sunsolve.sun.com/show.do?target=IDR
Since the affected deliverables (libc_psr_hwcap1.so.1) have also
been modified by existing patches including some Kernel Update
patches, any such IDR would (have to) be built to fit onto a
particular set of patch revisions. The most recent change had
come in Kernel Update patch 127127-11, thus the easiest would
be an IDR with a hard dependency on this patch. Should you have
need for an IDR against older patch levels than this, please do
let me know!
|
|
0
|
|
|
|
Reply
|
Dave
|
7/23/2009 11:58:59 AM
|
|
Dave wrote:
> Many people looked at this memset() issue. I now have some information
> from Sun. I asked the engineer if I could make it public, and he has
> said yes. He is in fact going to put some of it on a mailing list, and
> it will eventually appear in Sunsolve.
Thanks for keeping us informed.
> Your service case regarding memset(3C)'s behaviour on sun4v systems
> when the size_t argument is nonzero but zero mod 2^32 has now been
> transferred to Europe, and I have taken ownership. And I intend to
> keep ownership until we've reached a mutually acceptable resolution,
> barring vacation stand-ins and unforeseen events.
I hope they also fix the other occurrences of the problem (there are
several in opensolaris, at least one for memset and one for bzero, that
haven't already been fixed, as opposed to this one).
> (A subtle point is that it won't affect the *first* call to memset()
> after exec, as the runtime loader processing for lazy symbol binding
> will clear the upper 32 bits as a side effect before passing control
> to the newly-bound function entry point.)
Argh! Ok, this is what I was missing in my analysis...
|
|
0
|
|
|
|
Reply
|
Marc
|
7/23/2009 10:18:55 PM
|
|
Marc wrote:
> Dave wrote:
>
>> Many people looked at this memset() issue. I now have some information
>> from Sun. I asked the engineer if I could make it public, and he has
>> said yes. He is in fact going to put some of it on a mailing list, and
>> it will eventually appear in Sunsolve.
>
> Thanks for keeping us informed.
No problem.
>> Your service case regarding memset(3C)'s behaviour on sun4v systems
>> when the size_t argument is nonzero but zero mod 2^32 has now been
>> transferred to Europe, and I have taken ownership. And I intend to
>> keep ownership until we've reached a mutually acceptable resolution,
>> barring vacation stand-ins and unforeseen events.
>
> I hope they also fix the other occurrences of the problem (there are
> several in opensolaris, at least one for memset and one for bzero, that
> haven't already been fixed, as opposed to this one).
Can you be more precise and give me the bug IDs? I'll pass that to the
Sun engineer dealing with this.
If there are no bug IDs created, perhaps you can create them.
However, it is clear that fixing bugs takes time, deploying the fixes
takes time for system admins, and there is always a risk of introducing
another bug. Hence Sun won't fix bugs unless there is some justification
for it. My report was the first time they were aware of a customer
having an issue with this, although they were aware of the bug. But they
have agreed to fix it.
This is particularly good given Sun donated the machine for nothing.
>> (A subtle point is that it won't affect the *first* call to memset()
>> after exec, as the runtime loader processing for lazy symbol binding
>> will clear the upper 32 bits as a side effect before passing control
>> to the newly-bound function entry point.)
>
> Argh! Ok, this is what I was missing in my analysis...
Yes, it was not so obvious, was it?
|
|
0
|
|
|
|
Reply
|
Dave
|
7/24/2009 12:25:34 AM
|
|
Dave wrote:
>> I hope they also fix the other occurrences of the problem (there are
>> several in opensolaris, at least one for memset and one for bzero, that
>> haven't already been fixed, as opposed to this one).
>
> Can you be more precise and give me the bug IDs? I'll pass that to the
> Sun engineer dealing with this.
>
> If there are no bug IDs created, perhaps you can create them.
I'll try to do that some time.
|
|
0
|
|
|
|
Reply
|
Marc
|
7/24/2009 1:09:11 AM
|
|
|