Books for advanced C++ debugging


I have trouble debugging C++. For instance, I learned recently that code
that compiled with gcc -Wall without a single warning can be completely 
buggy.

The reason is that the C++ standard says that using a pointer of another
type to change memory as the type declared that this memory should be
is "undefined"...

Well, gcc has an option (that I discovered later)

-Wstrict-aliasing

that should warn you if you are doing something wrong. Since that
warning is enabled when -Wall is used, I thought we were protected,
and filed a bug report when we discovered that gcc generated code that
reads a value from uninitialized memory.


Of course, we wanted to make things easy to the gcc folks, and
wasted a lot of time isolating the bug, and then we sent it
to gcc's bug database.

The answer was simply

"Your code has aliasing problems. This is not the place to
educate you about C/C++"

We forgot to pass the code snippet through -Wall and the gcc folks
apparently did not understand that this snippet wasn't the problem

Great.

But how come that gcc doesn't emit the slightest warning?

Well, we discovered that -Wall does enable the -Wstrict-aliasing but
that option has a "scale" i.e. you can say

-Wstrict-aliasing=1
up to
-Wstrict-aliasing=5

When you set -Wall you enable -Wstrict-aliasing=3.

OK. We increased the level to 5 (lengthening the compile time of our
software that is already a staggering 20 minutes in a 4 core machine)

Still, gcc doesn't emit A SINGLE WARNING!

Now, how can we discover this?

The problem is that this code was written well before the new C++
standard was written. It was written in a time when doing this was
correct (In 32 bit pointers)

struct twoPointers {
	void *a;
	void *b;
};

And you could manipulate that data as it would be a single 64 bit
integer.

Since we do NOT rewrite all our software every time the C++ standard
changes, how can we find this kind of bugs?

I thought there could be a book with *advanced* C++ debugging but a
Google search, then an Amazon search yielded nothing
but books for beginners or user manuals of Visual C++ debugger
written in a book form.

Is there a combination of gcc warnings (that is NOT included in Wall
since we already have that) that could be useful here?

Is there a tool somewhere that could diagnose this problem?

And last but not least: Is there a good book in C++ debugging?

Thanks in advance



jacob navia
Reply jacob24 (973) 7/9/2009 8:42:45 PM

jacob navia wrote, On 9.7.2009 22:42:
> I have trouble debugging C++. For instance, I learned recently that code
> that compiled with gcc -Wall without a single warning can be completely
> buggy.
> 
> The reason is that the C++ standard says that using a pointer of another
> type to change memory as the type declared that this memory should be
> is "undefined"...
> 
> Well, gcc has an option (that I discovered later)
> 
> -Wstrict-aliasing
> 
> that should warn you if you are doing something wrong. Since that
> warning is enabled when -Wall is used, I thought we were protected,
> and filed a bug report when we discovered that gcc generated code that
> reads a value from uninitialized memory.
> 
> 
> Of course, we wanted to make things easy to the gcc folks, and
> wasted a lot of time isolating the bug, and then we sent it
> to gcc's bug database.
> 
> The answer was simply
> 
> "Your code has aliasing problems. This is not the place to
> educate you about C/C++"
> 
> We forgot to pass the code snippet through -Wall and the gcc folks
> apparently did not understand that this snippet wasn't the problem
From your description lower in the email, it seems to me that your code is
really the problem.

> 
> Great.
> 
> But how come that gcc doesn't emit the slightest warning?
Violating aliasing rules is UB, UB does not require diagnostics.

> 
> Well, we discovered that -Wall does enable the -Wstrict-aliasing but
> that option has a "scale" i.e. you can say
> 
> -Wstrict-aliasing=1
> up to
> -Wstrict-aliasing=5
> 
> When you set -Wall you enable -Wstrict-aliasing=3.
> 
> OK. We increased the level to 5 (lengthening the compile time of our
> software that is already a staggering 20 minutes in a 4 core machine)
Just 20 minutes? That's nothing.

> 
> Still, gcc doesn't emit A SINGLE WARNING!
The levels exist because the higher levels can give false positives. And even
then they are not exhaustive. The compiler cannot see or recognize each and
every aliasing rules violation.

> 
> Now, how can we discover this?
> 
> The problem is that this code was written well before the new C++
> standard was written. It was written in a time when doing this was
> correct (In 32 bit pointers)
> 
> struct twoPointers {
>     void *a;
>     void *b;
> };
> 
> And you could manipulate that data as it would be a single 64 bit
> integer.
That was never correct, not since the first C standard which is 1990. You
were just lucky getting away with it. And before you think about trying to
solve this using union of the structure and some 64bit integer, no, that is
not allowed either.
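
For what it is worth, one reinterpretation that does stay within the rules is to copy the bytes with std::memcpy instead of casting the pointer. A minimal sketch, not the original code: PairOf32, to_u64, from_u64 and check_round_trip are names made up for illustration, and two fixed-width 32-bit integers stand in for the two 32-bit pointers so that the layout is the same on every platform.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the struct of two 32-bit pointers.
struct PairOf32 {
    std::uint32_t a;
    std::uint32_t b;
};

// The byte copies below only make sense if the struct is exactly 64 bits wide.
static_assert(sizeof(PairOf32) == sizeof(std::uint64_t),
              "PairOf32 must be exactly 8 bytes");

// Read the struct's bytes as one 64-bit integer. memcpy copies the object
// representation, so no pointer of the wrong type is ever dereferenced.
inline std::uint64_t to_u64(const PairOf32& p) {
    std::uint64_t bits;
    std::memcpy(&bits, &p, sizeof bits);
    return bits;
}

// Inverse operation: rebuild the struct from the 64-bit integer.
inline PairOf32 from_u64(std::uint64_t bits) {
    PairOf32 p;
    std::memcpy(&p, &bits, sizeof p);
    return p;
}

// Sanity check: converting to 64 bits and back must reproduce the input.
inline void check_round_trip(const PairOf32& p) {
    const PairOf32 q = from_u64(to_u64(p));
    assert(q.a == p.a && q.b == p.b);
}
```

Compilers generally turn such a fixed-size memcpy into a plain load or store, so the rewrite should not cost anything at run time, but that is an expectation to verify on your own build rather than a guarantee.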

> 
> Since we do NOT rewrite all our software every time the C++ standard
> changes, how can we find this kind of bugs?
The standard has not changed since 1998. Aren't 10+ years enough to learn the
language you are using? :)

> 
> I thought there could be a book with *advanced* C++ debugging but a
> Google search, then an Amazon search yielded nothing
> but books for beginners or user manuals of Visual C++ debugger
> written in a book form.
> 
> Is there a combination of gcc warnings (that is NOT included in Wall
> since we already have that) that could be useful here?
> 
> Is there a tool somewhere that could diagnose this problem?
> 
> And last but not least: Is there a good book in C++ debugging?
I don't think there is anything specific about C++ debugging. You just need
to know the language well enough.

--
VH
Reply v.haisman (98) 7/10/2009 5:53:23 AM


jacob navia wrote:
> I have trouble debugging C++. For instance, I learned recently that code
> that compiled with gcc -Wall without a single warning can be completely 
> buggy.

1) write decent unit tests.
2) compile (and run the tests) with more than one compiler.

> The problem is that this code was written well before the new C++
> standard was written. It was written in a time when doing this was
> correct (In 32 bit pointers)
> 
> struct twoPointers {
>     void *a;
>     void *b;
> };
> 
> And you could manipulate that data as it would be a single 64 bit
> integer.

This has nothing to do with the standard and everything to do with the 
platform.  16 and 64 bit system any standard and your code would fail on 
them.

3) compile (and run the tests) on more than one platform.

> Since we do NOT rewrite all our software every time the C++ standard
> changes, how can we find this kind of bugs?

Don't hack.

-- 
Ian Collins
Reply ian-news (9908) 7/10/2009 9:57:01 AM

Ian Collins wrote:
> jacob navia wrote:
>> I have trouble debugging C++. For instance, I learned recently that code
>> that compiled with gcc -Wall without a single warning can be 
>> completely buggy.
> 
> 1) write decent unit tests.
> 2) compile (and run the tests) with more than one compiler.
> 
>> The problem is that this code was written well before the new C++
>> standard was written. It was written in a time when doing this was
>> correct (In 32 bit pointers)
>>
>> struct twoPointers {
>>     void *a;
>>     void *b;
>> };
>>
>> And you could manipulate that data as it would be a single 64 bit
>> integer.
> 
> This has nothing to do with the standard and everything to do with the 
> platform.  16 and 64 bit system any standard and your code would fail on 
> them.

Make that "16 and 64 bit systems pre-date any C++ standard"

-- 
Ian Collins
Reply ian-news (9908) 7/10/2009 9:58:35 AM

jacob navia <jacob@nospam.org> writes:

> I have trouble debugging C++. For instance, I learned recently that code
> that compiled with gcc -Wall without a single warning can be
> completely buggy.
>
> The reason is that the C++ standard says that using a pointer of another
> type to change memory as the type declared that this memory should be
> is "undefined"...

Yes, that's what the standard says, but the compilers don't try to
catch this 'error'.  


> [...] The answer was simply
>
> "Your code has aliasing problems. This is not the place to
> educate you about C/C++"

Quite an "attitude", isn't it.


> Now, how can we discover this?
>
> The problem is that this code was written well before the new C++
> standard was written. It was written in a time when doing this was
> correct (In 32 bit pointers)
>
> struct twoPointers {
> 	void *a;
> 	void *b;
> };
>
> And you could manipulate that data as it would be a single 64 bit
> integer.
>
> Since we do NOT rewrite all our software every time the C++ standard
> changes, how can we find this kind of bugs?

Quite easily, by tagging the data.  


> I thought there could be a book with *advanced* C++ debugging but a
> Google search, then an Amazon search yielded nothing
> but books for beginners or user manuals of Visual C++ debugger
> written in a book form.
>
> Is there a combination of gcc warnings (that is NOT included in Wall
> since we already have that) that could be useful here?

I wouldn't hold my breath.


> Is there a tool somewhere that could diagnose this problem?

It's done by the Zeta-C compiler (since the target is the
LispMachine).  Of course, today it might be easier to build a time
machine than to find a LispMachine with the Zeta-C compiler, and
anyways, it doesn't solve the problem of C++.


Perhaps one of the C/C++ interpreters is doing this type check.  Try
them.

C INTERPRETERS:
    CINT - http://root.cern.ch/root/Cint.html
    EiC - http://eic.sourceforge.net/
    Ch - http://www.softintegration.com
    [ MPC (Multi-Platform C -> Java compiler) - http://www.axiomsol.com ]


Otherwise, your best chance would be to patch them, or gcc (or
lcc-win32), to generate tagged data and implement run-time type
checks.

Notice that array overflows and invalid pointer dereferences are the
same sort of bug that should be checked at run-time.  The C and
C++ standards explicitly say that dereferencing a pointer outside of its
pointed-to array is undefined; even holding a pointer outside of its
array limits (plus 1) is undefined...

char a[5];
char* p=a; // valid
p+=4; // valid
*p;   // valid
p++;  // valid
*p;   // undefined
p++;  // undefined

The problem is that C compiler writers don't bother writing the
run-time checks that would detect these bugs, much less doing the type
inference that would be needed to detecht a small number of them at
compilation-time.
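
A minimal sketch of the kind of run-time check meant here, using only what the C++ standard library already offers: std::vector::at throws std::out_of_range instead of reading outside the array. The function name checked_read is invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stores buf[index] in `out` and returns true when the index is in range.
// Returns false, without touching memory outside the buffer, otherwise.
inline bool checked_read(const std::vector<char>& buf, std::size_t index, char& out) {
    try {
        out = buf.at(index);   // at() performs the bounds check
        return true;
    } catch (const std::out_of_range&) {
        return false;          // the overflow is detected instead of being undefined
    }
}
```

With a five-element buffer, index 4 is accepted and index 5 is rejected. The price is one comparison per access, which is exactly the cost that plain pointer arithmetic avoids.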


> And last but not least: Is there a good book in C++ debugging?

Well, before writing a good book for C++ debugging, writing a good C++
debugger would be in order, don't you think?


-- 
__Pascal Bourguignon__
Reply pjb (7667) 7/10/2009 10:05:34 AM

Vaclav Haisman <v.haisman@sh.cvut.cz> writes:

> That was never correct, not since the first C standard which is 1990. 

Indeed.  And it is the shame of the C compiler industry not to have
produced in 19 years a single implementation able to detect this
error automatically.

> You were just lucky getting away with it. And before you think about trying to
> solve this using union of the structure and some 64bit integer, no, that is
> not allowed either.

This is not what is asked here.  What is asked is some help from the
compiler, so we can detect these errors (either at compilation time or
at run time).

If the standard says it's "Undefined behavior" in order to allow for small
barebone C implementations on small systems such as a PDP-11, so be it.
But this shouldn't prevent C compiler providers from offering more
sophisticated and helpful compilers on the multi-{giga-{byte,hertz},core}
machines we have today.


-- 
__Pascal Bourguignon__
Reply pjb (7667) 7/10/2009 10:10:27 AM

On Jul 9, 10:42 pm, jacob navia <ja...@nospam.org> wrote:
> I have trouble debugging C++. For instance, I learned recently that code
> that compiled with gcc -Wall without a single warning can be completely
> buggy.

And that comes as a surprise to a compiler writer?
I could probably write completely buggy code that is nevertheless
accepted without any complaint by lcc-win.

<snip>
> But how come that gcc doesn't emit the slightest warning?

Probably because your code used a type-cast operation, which the
compiler rightfully interprets as a message saying "Don't bother
complaining about this. I know what I am doing."

>
> Well, we discovered that -Wall does enable the -Wstrict-aliasing but
> that option has a "scale" i.e. you can say
>
> -Wstrict-aliasing=1
> up to
> -Wstrict-aliasing=5
>
> When you set -Wall you enable -Wstrict-aliasing=3.
>
> OK. We increased the level to 5 (lengthening the compile time of our
> software that is already a staggering 20 minutes in a 4 core machine)
>
> Still, gcc doesn't emit A SINGLE WARNING!
>
> Now, how can we discover this?
>
> The problem is that this code was written well before the new C++
> standard was written. It was written in a time when doing this was
> correct (In 32 bit pointers)
>
> struct twoPointers {
>         void *a;
>         void *b;
>
> };
>
> And you could manipulate that data as it would be a single 64 bit
> integer.

That has never been legal, and at best resulted in implementation-
defined behaviour.
To perform the re-interpretation, you need a type-cast. And as stated
before, the compiler then assumes you are aware of all the potential
problems. Including the aliasing problems.

>
> Thanks in advance
>
> jacob navia

Bart v Ingen Schenau
Reply bart855 (270) 7/10/2009 11:47:48 AM

jacob navia wrote:

> The problem is that this code was written well before the new C++
> standard was written. It was written in a time when doing this was
> correct (In 32 bit pointers)
> 
> struct twoPointers {
> 	void *a;
> 	void *b;
> };
> 
> And you could manipulate that data as it would be a single 64 bit
> integer.


Can you show an example of how you do that? Because it's not clear to me
from what you posted what the exact problem was.

Have you tried lint?  You can try it online.
http://www.gimpel-online.com/OnlineTesting.html

What other compilers have you tried your code on?  Some may give better
diagnostics than others. You can try Comeau's compiler online too.
http://www.comeaucomputing.com/tryitout/


LR
Reply lruss (582) 7/10/2009 12:45:24 PM

On 10 juil, 07:53, Vaclav Haisman <v.hais...@sh.cvut.cz> wrote:
> jacob navia wrote, On 9.7.2009 22:42:
>
> > I have trouble debugging C++. For instance, I learned recently that code
> > that compiled with gcc -Wall without a single warning can be completely
> > buggy.
>
> > The reason is that the C++ standard says that using a pointer of another
> > type to change memory as the type declared that this memory should be
> > is "undefined"...
>
> > Well, gcc has an option (that I discovered later)
>
> > -Wstrict-aliasing
>
> > that should warn you if you are doing something wrong. Since that
> > warning is enabled when -Wall is used, I thought we were protected,
> > and filed a bug report when we discovered that gcc generated code that
> > reads a value from uninitialized memory.
>
> > Of course, we wanted to make things easy to the gcc folks, and
> > wasted a lot of time isolating the bug, and then we sent it
> > to gcc's bug database.
>
> > The answer was simply
>
> > "Your code has aliasing problems. This is not the place to
> > educate you about C/C++"
>
> > We forgot to pass the code snippet through -Wall and the gcc folks
> > apparently did not understand that this snippet wasn't the problem
>
> From your description lower in the email, it seems to me that your code is
> really the problem.
>
>
>
> > Great.
>
> > But how come that gcc doesn't emit the slightest warning?
>
> Violating aliasing rules is UB, UB does not require diagnostics.
>
>
>
> > Well, we discovered that -Wall does enable the -Wstrict-aliasing but
> > that option has a "scale" i.e. you can say
>
> > -Wstrict-aliasing=1
> > up to
> > -Wstrict-aliasing=5
>
> > When you set -Wall you enable -Wstrict-aliasing=3.
>
> > OK. We increased the level to 5 (lengthening the compile time of our
> > software that is already a staggering 20 minutes in a 4 core machine)
>
> Just 20 minutes? That's nothing.
>
>
>
> > Still, gcc doesn't emit A SINGLE WARNING!
>
> The levels exist because the higher levels can give false positives. And even
> then they are not exhaustive. The compiler cannot see or recognize each and
> every aliasing rules violation.
>
>
>
> > Now, how can we discover this?
>
> > The problem is that this code was written well before the new C++
> > standard was written. It was written in a time when doing this was
> > correct (In 32 bit pointers)
>
> > struct twoPointers {
> >     void *a;
> >     void *b;
> > };
>
> > And you could manipulate that data as it would be a single 64 bit
> > integer.
>
> That was never correct, not since the first C standard which is 1990. You
> were just lucky getting away with it. And before you think about trying to
> solve this using union of the structure and some 64bit integer, no, that is
> not allowed either.
>
>
>
> > Since we do NOT rewrite all our software every time the C++ standard
> > changes, how can we find this kind of bugs?
>
> The standard has not changed since 1998. Aren't 10+ years enough to learn the
> language you are using? :)
>
>
>
> > I thought there could be a book with *advanced* C++ debugging but a
> > Google search, then an Amazon search yielded nothing
> > but books for beginners or user manuals of Visual C++ debugger
> > written in a book form.
>
> > Is there a combination of gcc warnings (that is NOT included in Wall
> > since we already have that) that could be useful here?
>
> > Is there a tool somewhere that could diagnose this problem?
>
> > And last but not least: Is there a good book in C++ debugging?
>
> I don't think there is anything specific about C++ debugging. You just need
> to know the language well enough.
>
> --
> VH

Reply jacob (2538) 7/10/2009 1:54:21 PM

On 10 juil, 07:53, Vaclav Haisman <v.hais...@sh.cvut.cz> wrote:
>
> The standard has not changed since 1998. Aren't 10+ years enough to learn the
> language you are using? :)
>

I did not write this code, I am just trying to make it work.
Thanks for your helpful message. This confirms the attitude of
many here, as if they never had any bugs of course.

GCC emits code that reads from an uninitialized place. This means that
it took us 2 weeks to get to the root of this problem.

But (of course) we are the stupids that "do not know how to write
C++"

The code base is around 7-8MB of C++

Reply jacob (2538) 7/10/2009 1:57:41 PM

On 10 juil, 11:57, Ian Collins <ian-n...@hotmail.com> wrote:
>
> Don't hack.
>
> --
> Ian Collins

Sure sure. How helpful. This is a HUGE code base of MB and MB of
C++. I did not write this code. It is my job to make it work, that's
all.

Obviously I am being blamed for asking a question, since asking
questions is obviously a NO NO here.

(If you ask a question it means you do not know everything,
contrary to the gurus here)

"Don't hack"

And how can I know if in those MBs of code there is a hack?

That was my question. Now, please answer THAT, and if you can't
I hope you can at least keep your mouth SHUT!

Reply jacob (2538) 7/10/2009 2:01:47 PM

jacob navia wrote:
> [..]
> That was my question. Now, please answer THAT, and if you can't
> I hope you can at least keep your mouth SHUT!
> 

There is no need to crawl into the bottle, jacob.  Let's not start 
telling anybody whose answers we don't like to keep their mouths shut, 
shall we?  This is a free forum, folks post what they see fit, and 
flames don't help accomplish your goals, do they?

You're frustrated beyond your usual level, and that's not so difficult 
to discern.  Every once in a while we get a piece of code to maintain 
and it turns out to be a hack.  Annoying?  You bet.  Infuriating 
sometimes.  Sometimes after beating my head against the wall, I ask, 
"why me?  What did I do to deserve it?"  And it often turns out that I 
was given that code because people trusted me to sort it out.  And that 
there was nobody else close-by who could do it.

I have no particular knowledge about debugging aliasing problems to 
share with you, sorry.  But my approach to debugging has usually been to 
replace the pieces of code that don't work (or those I don't understand) 
with something I do understand and know as working.  Try that.  I don't 
know if there is a book of recipes like that, but I don't think you have 
time to study.  You need to get it working.

Divide and conquer.  Figure out which pieces are OK, leave them as is. 
You probably already know what causes problems, see if you can replace 
it keeping the same interface.  Perhaps you need to introduce some 
interface (to emulate the behavior of the questionable piece of code).

Ask more specific questions about C++, and you will have better answers. 
  But you already knew that, didn't you?

V
-- 
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Reply v.Abazarov (13255) 7/10/2009 2:22:19 PM

On 9 July, 21:42, jacob navia <ja...@nospam.org> wrote:

> I have trouble debugging C++. For instance, I learned recently that code
> that compiled with gcc -Wall without a single warning can be completely
> buggy.

doesn't that apply to all languages and all compilers?

> The reason is that the C++ standard says that using a pointer of another
> type to change memory as the type declared that this memory should be
> is "undefined"...

this applies to C as well. Can your compiler diagnose
such problems in C?

<snip>

--
Nick Keighley
Reply nick_keighley_nospam (4575) 7/10/2009 4:59:16 PM

jacob navia wrote:
> On 10 juil, 11:57, Ian Collins <ian-n...@hotmail.com> wrote:
>> Don't hack.
> 
> Sure sure. How helpful. This is a HUGE code base of MB and MB of
> C++. I did not write this code. It is my job to make it work, that's
> all.

Which is why I suggested three techniques I use, which you chose to snip.

> Obviously I am being blamed for asking a question, since asking
> questions is obviously a NO NO here.

No, you are just a little paranoid.

> And how can I know if in those MBs of code there is a hack?
> 
> That was my question. Now, please answer THAT, and if you can't
> I hope you can at least keep your mouth SHUT!

I seldom type with my mouth open, it attracts unwelcome bugs!

As others have said, your code base must contain casts to do what you say 
it's doing.  Finding those would be a good start.  Then try the three 
techniques I suggested, which you chose to snip.
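
Some hedged notes on finding them. g++ has -Wold-style-cast, which reports C-style casts in C++ code, and named casts such as reinterpret_cast can be found with a plain text search. As far as the GCC manual goes, -Wstrict-aliasing is only active when -fstrict-aliasing is in effect (enabled at -O2 and above), and building with -fno-strict-aliasing is a common stopgap while the casts are being reviewed. The sketch below shows one shape a reviewed cast could take: a named cast restricted to a byte-wise view, which the aliasing rules permit. The name object_bytes is invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Legacy style, shown only as a comment because it is undefined behaviour
// and a C-style cast is hard to search for:
//     long long v = *(long long*)&pair;

// Named cast, limited to a byte-wise view of the object. Reading any object
// through unsigned char* is allowed by the aliasing rules.
template <typename T>
std::vector<unsigned char> object_bytes(const T& obj) {
    const unsigned char* first = reinterpret_cast<const unsigned char*>(&obj);
    return std::vector<unsigned char>(first, first + sizeof(T));
}
```

Every remaining reinterpret_cast is then one search away, and each hit is either a byte view like this one or a candidate for a memcpy-based rewrite.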

-- 
Ian Collins
Reply ian-news (9908) 7/10/2009 8:52:52 PM

In article <4a565651$0$17741$ba4acef3@news.orange.fr>, 
jacob@nospam.org says...

[ ... ]

> Now, how can we discover this?

It's certainly going to be nontrivial.

> The problem is that this code was written well before the new C++
> standard was written. It was written in a time when doing this was
> correct (In 32 bit pointers)
> 
> struct twoPointers {
> 	void *a;
> 	void *b;
> };
> 
> And you could manipulate that data as it would be a single 64 bit
> integer.

Sorry, but there never was such a time. Even the original C standard 
specified that (for example) there could be padding between members 
of a struct, so your code would give undefined results. The same was 
true before there was a standard for C, though obviously there wasn't 
any "official" document to state it.
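
Layout assumptions of this kind can at least be written down so that the compiler checks them. A minimal sketch, assuming a C++11 compiler for static_assert and constexpr (newer than the code base discussed here); TwoWords and two_pointers_fit_in_64_bits are names invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in with fixed-width members, so the checks below
// hold on both 32-bit and 64-bit targets.
struct TwoWords {
    std::uint32_t a;
    std::uint32_t b;
};

// The build fails here if the compiler inserts padding or changes the size.
static_assert(sizeof(TwoWords) == 8, "TwoWords must be exactly 64 bits");
static_assert(offsetof(TwoWords, b) == 4, "no padding expected between a and b");

// The struct from the question. Its layout depends on the pointer size.
struct twoPointers {
    void* a;
    void* b;
};

// True only when the two pointers occupy exactly 64 bits with no padding,
// which is the assumption the legacy code relied on.
constexpr bool two_pointers_fit_in_64_bits() {
    return sizeof(twoPointers) == 8 && offsetof(twoPointers, b) == 4;
}
```

Turning the last function into a static_assert next to the real struct would stop a 64-bit build at compile time instead of letting it fail at run time.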
 
> Since we do NOT rewrite all our software every time the C++ standard
> changes, how can we find this kind of bugs?

The standard has never changed in this respect, and it follows the 
example of the C standard, which codified the existing practice that 
your code gave undefined results.

[ ... ]
 
> Is there a combination of gcc warnings (that is NOT included in
> Wall since we already have that) that could be useful here?

I don't use gcc much, and use gdb even less, so I can't give you 
much help that's specific to it.

If I had to do this, I think I'd insert another member between the 
two you have right now:

struct twoPointers { 
	void *a;
	int ignore;
	void *b;
};

Then in the debugger I'd set a breakpoint on any write to 'ignore'. 
No existing code should use that member directly, so anything that 
writes to it is essentially certain to be doing so via some sort of 
undefined behavior, and needs to be fixed.
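
The same idea can be sketched without a debugger by giving the extra member a known value and checking it later. This is only an illustration under stated assumptions: GuardedPair, kGuardMagic, guard_intact and legacy_store64 are invented names, 32-bit integers stand in for the 32-bit pointers, and the legacy 64-bit write is simulated with memcpy so that the example itself stays well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

const std::uint32_t kGuardMagic = 0xDEADBEEFu;   // arbitrary marker value

struct GuardedPair {
    std::uint32_t a;
    std::uint32_t guard;   // plays the role of 'ignore'
    std::uint32_t b;
    GuardedPair() : a(0), guard(kGuardMagic), b(0) {}
};

static_assert(offsetof(GuardedPair, guard) == 4, "guard must directly follow a");

// No legitimate code writes to guard, so any change means that something
// wrote across the member boundary.
inline bool guard_intact(const GuardedPair& p) {
    return p.guard == kGuardMagic;
}

// Simulates legacy code that treats the first 8 bytes as one 64-bit integer.
inline void legacy_store64(GuardedPair& p, std::uint64_t value) {
    std::memcpy(&p, &value, sizeof value);   // overwrites a AND guard
}
```

In gdb, a watchpoint on the member (for example `watch obj.ignore`, where obj is whatever object is under suspicion) stops the program at the offending write, which is the breakpoint described above.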

-- 
    Later,
    Jerry.
Reply jerryvcoffin (233) 7/10/2009 9:26:55 PM

On Jul 10, 5:05 am, p...@informatimago.com (Pascal J. Bourguignon)
wrote:
(...)
>
> Notice that array overflows and invalid pointer dereferences are the
> same sort of bug that should be checked at run-time.  The C and
> C++ standards explicitly say that dereferencing a pointer outside of its
> pointed-to array is undefined; even holding a pointer outside of its
> array limits (plus 1) is undefined...
>

Trying to read the value of an uninitialised variable results in UB as
well.


> char a[5];
> char* p=a; // valid
> p+=4; // valid
> *p;   // valid
> p++;  // valid
> *p;   // undefined
> p++;  // undefined
>

The first *p that you state as valid results in undefined behaviour
because 'a' is not initialised.

- Anand
Reply mailto.anand.hariharan (149) 7/10/2009 9:32:02 PM

Ian Collins wrote:

> jacob navia wrote:

> > Obviously I am being blamed for asking a question, since asking
> > questions is obviously a NO NO here.
> 
> No, you are just a little paranoid.

Nice to see Jacob building friends in a newsgroup besides clc.




Brian
Reply defaultuserbr (3657) 7/10/2009 9:56:08 PM

Jerry Coffin wrote:
> 
> If I had to do this, I think I'd insert another member between the 
> two you have right now:
> 
> struct twoPointers { 
> 	void *a;
> 	int ignore;
> 	void *b;
> };
> 
> Then in the debugger I'd set a breakpoint on any write to 'ignore'. 
> No existing code should use that member directly, so anything that 
> writes to it is essentially certain to be doing so via some sort of 
> undefined behavior, and needs to be fixed.

Good tip Jerry!

Putting the dummy member first should also work and this would also help 
debug incorrect size assumptions when moving from 32 to 64 bit.

-- 
Ian Collins
Reply ian-news (9908) 7/10/2009 10:26:41 PM

In article <7bpth0F24ej6tU4@mid.individual.net>, ian-news@hotmail.com 
says...
> 
> Jerry Coffin wrote:
> > 
> > If I had to do this, I think I'd insert another member between the 
> > two you have right now:
> > 
> > struct twoPointers { 
> > 	void *a;
> > 	int ignore;
> > 	void *b;
> > };
> > 
> > Then in the debugger I'd set a breakpoint on any write to 'ignore'. 
> > No existing code should use that member directly, so anything that 
> > writes to it is essentially certain to be doing so via some sort of 
> > undefined behavior, and needs to be fixed.
> 
> Good tip Jerry!
> 
> Putting the dummy member first should also work and this would also help 
> debug incorrect size assumptions when moving from 32 to 64 bit.

Thanks. For that matter, you could perfectly well add both...

-- 
    Later,
    Jerry.
Reply jerryvcoffin (233) 7/10/2009 10:38:29 PM

jacob navia <ja...@nospam.org> writes:
> "Your code has aliasing problems. This is not the place to
> educate you about C/C++"

[snip]

> Since we do NOT rewrite all our software every time the C++ standard
> changes, how can we find this kind of bugs?

I'm sorry that you're working with code which violates the C89
standard and the C++ standard. Someone has to fix it, and that someone
appears to be you. I don't have much to add beyond that which has been
mentioned already in this thread as to strategies to accomplish this.
Either way, expecting help from gcc developers from a false bug report
is unreasonable. Maybe in a feature request ... :)

On Jul 10, 3:05 am, p...@informatimago.com (Pascal J. Bourguignon)
wrote:
> jacob navia <ja...@nospam.org> writes:
> > I thought there could be a book with *advanced* C++ debugging but a
> > Google search, then an Amazon search yielded nothing
> > but books for beginners or user manuals of Visual C++ debugger
> > written in a book form.
>
> > Is there a combination of gcc warnings (that is NOT included in Wall
> > since we already have that) that could be useful here?
>
> I wouldn't hold my breath.
>
> > Is there a tool somewhere that could diagnose this problem?
>
> It's done by the Zeta-C compiler (since the target is the
> LispMachine).  Of course, today it might be easier to build a time
> machine than to find a LispMachine with the Zeta-C compiler, and
> anyways, it doesn't solve the problem of C++.
>
> Perhaps one of the C/C++ interpreters is doing this type check.  Try
> them.
>
> C INTERPRETERS:
>     CINT - http://root.cern.ch/root/Cint.html
>     EiC - http://eic.sourceforge.net/
>     Ch - http://www.softintegration.com
>     [ MPC (Multi-Platform C -> Java compiler) - http://www.axiomsol.com ]
>
> Otherwise, your best chance would be to patch them, or gcc (or
> lcc-win32), to generate tagged data and implement run-time type
> checks.
>
> Notice that array overflows and invalid pointer dereferences are the
> same sort of bug that should be checked at run-time.  The C and
> C++ standards explicitly say that dereferencing a pointer outside of its
> pointed-to array is undefined; even holding a pointer outside of its
> array limits (plus 1) is undefined...

[snip]

> The problem is that C compiler writers don't bother writing the
> run-time checks that would detect these bugs, much less doing the type
> inference that would be needed to detecht a small number of them at
> compilation-time.

You seem to be taking the opinion that compilers should catch all
undefined behavior. C++ is not Java. C++'s stated primary design goals
include
- runtime performance comparable with assembly
- don't pay for what you don't use
- portable
- easy to write code / programmer productivity (with less relative
emphasis on this one IMHO)

With these design goals in mind, it is not reasonable to expect a
compiler to catch all possible undefined behavior or errors. To do
that would necessarily restrict the language so that it's less
comparable to assembly in speed and/or you start paying for things you
don't use.

In the C and C++ community, the assumption is that the programmer
knows what he's doing, and with that assumption, you can (relatively)
easily write really fast and portable code.

That someone hasn't written a "debugging" compiler which catches all
possible violations of the standard, as a debugging tool only, is
indeed a shame if true. However, Valgrind comes to mind as a useful tool
in this area. Also, various versions of MSVC do have optional runtime
bounds checking and other runtime checking. Finally, C interpreters
can catch all such misuse which occurs at runtime, the existence of
which you reference in your post. Thus, it appears the tools which you
bemoan do not exist, do indeed exist, and thus I am confused by your
self contradictions.
Reply joshuamaurice (576) 7/10/2009 10:54:57 PM

On Jul 10, 5:54 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> On Jul 10, 3:05 am, p...@informatimago.com (Pascal J. Bourguignon)
> wrote:
> [snip]
>
> > The problem is that C compiler writers don't bother writing the
> > run-time checks that would detect these bugs, much less doing the type
> > inference that would be needed to detecht a small number of them at
> > compilation-time.
>
> You seem to be taking the opinion that compilers should catch all
> undefined behavior.

No, he does not seem to.  He did say "detecht [sic] a *SMALL NUMBER*
of them at
compilation-time." (my emphasis).


> C++ is not Java. C++'s stated primary design goals
> include
> - runtime performance comparable with assembly
> - don't pay for what you don't use
> - portable
> - easy to write code / programmer productivity (with less relative
> emphasis on this one IMHO)
>

None of those goals gets in the way of the *compiler* pointing out
dubious code.  AFAICT, no one wishes C++ to prevent all instances of UB.
Code such as -
  int *p = reinterpret_cast<int *>(0x123abc);
  *p = 0x456def;
- is well out of bounds of the standard, but there are several people
who would want to be able to write code like that.  Apart from the
warning being an annoyance, I don't see why anyone would want the
compiler NOT to flag such code as dubious.


> With these design goals in mind, it is not reasonable to expect a
> compiler to catch all possible undefined behavior or errors. To do
> that would necessarily restrict the language so that it's less
> comparable to assembly in speed and/or you start paying for things you
> don't use.
>

You seem to be under the impression that the compiler "catching
undefined behaviour" is synonymous with either *disallowing* undefined
behaviour or imposing a runtime penalty to track them.  OP clearly
indicated that he only wishes the compiler to indicate to him that he
might be doing something that leads to UB.


> In the C and C++ community, the assumption is that the programmer
> knows what he's doing, and with that assumption, you can (relatively)
> easily write really fast and portable code.
>

That assumption is made within reason, of course.  Otherwise, the standard
would allow for many more implicit conversions than it currently
allows.

- Anand
Reply mailto.anand.hariharan (149) 7/11/2009 12:51:11 AM

On Jul 10, 5:51 pm, Anand Hariharan <mailto.anand.hariha...@gmail.com>
wrote:
> On Jul 10, 5:54 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> > With these design goals in mind, it is not reasonable to expect a
> > compiler to catch all possible undefined behavior or errors. To do
> > that would necessarily restrict the language so that it's less
> > comparable to assembly in speed and/or you start paying for things you
> > don't use.
>
> You seem to be under the impression that the compiler "catching
> undefined behaviour" is synonymous with either *disallowing* undefined
> behaviour or imposing a runtime penalty to track them.  OP clearly
> indicated that he only wishes the compiler to indicate to him that he
> might be doing something that leads to UB.

Within C++ as the language rules stand, determining at compile-time if
the program can give undefined behavior through an aliasing violation
is in general undecidable, equivalent to the halting problem.

I did not claim such "catching undefined behavior" and "disallowing
certain constructs and/or runtime checks" are not synonymous. However,
they are related.

I believe I was correct and reasonable when I interpreted that the OP
was asking for a compiler which caught all bad aliasing, and I believe
I was correct and reasonable when I stated that doing so is impossible
without disallowing certain kinds of casting or imposing additional
runtime checks (both of which are contrary to the design goals of
C++). I noted that it's quite reasonable and desirable for a "debugging
compiler" to add runtime checks to catch all such aliasing errors in
development. I disagree with the overall theme of your reply: that I
was incorrect in my statement of fact or that I was incorrect in my
interpretation of the OP's desire to have all aliasing violations
caught.
Reply joshuamaurice (576) 7/11/2009 8:08:27 AM

On Jul 11, 1:08 am, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> I did not claim such "catching undefined behavior" and "disallowing
> certain constructs andor runtime checks" are not synonymous. However,
> they are related.

I meant: "I did not claim they are synonymous". That's what I get for
typing late at night.
Reply joshuamaurice (576) 7/11/2009 8:09:51 AM

Joshua Maurice wrote:
> jacob navia <ja...@nospam.org> writes:
>> "Your code has aliasing problems. This is not the place to
>> educate you about C/C++"
> 
> [snip]
> 
>> Since we do NOT rewrite all our software every time the C++ standard
>> changes, how can we find this kind of bugs?
> 
> I'm sorry that you're working with code which violates the C89
> standard and the C++ standard. Someone has to fix it, and that someone
> appears to be you. I don't have much to add beyond that which has been
> mentioned already in this thread as to strategies to accomplish this.
> Either way, expecting help from gcc developers from a false bug report
> is unreasonable. Maybe in a feature request ... :)
> 

Look.

Using the gcc compiler without any optimizations produces perfectly 
valid code that works as intended. Using the 64 bit
gcc compiler (versions 3.3 to 4.3) produces the intended
result even with maximum optimization.

Using the PowerPC IBM compiler works with optimizations
and without them.

This code was working with gcc-3.3.6 and stopped working only with
gcc 4.1.2 with optimization levels higher than 2 and only in linux 32
bits. MSVC windows 32 compiler compiles that code correctly.

How are the maintainers supposed to know that?

Because after 2 weeks of work and work finally we examined the
gcc generated assembler and discovered that gcc generates code
to read from an UNINITIALIZED memory location.

When I write

char tab[5];
char *p = tab;

p += 10;
char c = *p;

this is UB too but will be UB in debug mode AND in release mode.
The value in C will be undefined, but it will be CONSISTENT.

You are just saying the obvious:

C++ is not maintainable without huge efforts.

It is very easy to laugh at the maintenance programmers here. They
are just stupid of course, since if they weren't, they wouldn't be
in maintenance of course!

 > Either way, expecting help from gcc developers from a false bug report
 > is unreasonable. Maybe in a feature request ... :)
 >

yes, ":)"

VERY funny.


gcc (and this is a feature of course, not a bug) generates code that
is impossible to follow with -O2 or -O3. Then, the gcc compiler
considers that it has the right to generate code that reads from
an uninitialized memory location without even caring to see if they
could (at least) emit a warning.
Reply jacob24 (973) 7/11/2009 12:27:28 PM

jacob navia wrote, On 11.7.2009 14:27:
> Joshua Maurice wrote:
>> jacob navia <ja...@nospam.org> writes:
>>> "Your code has aliasing problems. This is not the place to
>>> educate you about C/C++"
>>
>> [snip]
>>
>>> Since we do NOT rewrite all our software every time the C++ standard
>>> changes, how can we find this kind of bugs?
>>
>> I'm sorry that you're working with code which violates the C89
>> standard and the C++ standard. Someone has to fix it, and that someone
>> appears to be you. I don't have much to add beyond that which has been
>> mentioned already in this thread as to strategies to accomplish this.
>> Either way, expecting help from gcc developers from a false bug report
>> is unreasonable. Maybe in a feature request ... :)
>>
> 
> Look.
> 
> Using the gcc compiler without any optimizations produces perfectly
> valid code that works as intended. Using the 64 bit
> gcc compiler (versions 3.3 to 4.3) produces the intended
> result even with maximum optimization.

I think you misunderstand what UB means. There is no such thing as
"perfectly valid code" when you are invoking UB. Not from the POV of the
standard.

> 
> Using the PowerPC IBM compiler works with optimizations
> and without them.
> 
> This code was working with gcc-3.3.6 and stopped working only with
> gcc 4.1.2 with optimization levels higher than 2 and only in linux 32
> bits. MSVC windows 32 compiler compiles that code correctly.
> 
> How are the maintainers supposed to know that?

Maintainers are supposed to know the language. struct { void* a; void* b; }
x; int64_t y = *(int64_t*)&x; is a glaring bug screaming UB.
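
A minimal sketch of the usual well-defined alternative, assuming the
goal is only to view the two 32-bit fields as one 64-bit value: copy
the bytes with std::memcpy instead of dereferencing a cast pointer.
The names TwoWords, pack_bits and unpack_bits are hypothetical, and the
struct uses two 32-bit integers as stand-ins for the 32-bit pointers so
that the size assumption holds on any platform.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-in for the thread's struct: two 32-bit fields instead of two
// 32-bit pointers, so the layout is the same on 32- and 64-bit targets.
struct TwoWords {
    std::uint32_t a;
    std::uint32_t b;
};

static_assert(sizeof(TwoWords) == sizeof(std::uint64_t),
              "sketch assumes the struct is exactly 64 bits wide");

// Copies the object representation into a 64-bit integer. Unlike
// *(std::uint64_t*)&x, this never accesses the struct through an lvalue
// of an unrelated type, so the aliasing rules are not violated.
inline std::uint64_t pack_bits(const TwoWords& x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// The reverse copy, from the integer back into the struct.
inline TwoWords unpack_bits(std::uint64_t bits) {
    TwoWords x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}
```

A round trip such as unpack_bits(pack_bits(w)) should give back the
original fields; which field ends up in the high half of the integer
depends on the platform's byte order.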

> 
> Because after 2 weeks of work and work finally we examined the
> gcc generated assembler and discovered that gcc generates code
> to read from an UNINITIALIZED memory location.
> 
> When I write
> 
> char tab[5];
> char *p = tab;
> 
> p += 10;
> char c = *p;
> 
> this is UB too but will be UB in debug mode AND in release mode.
> The value in C will be undefined, but it will be CONSISTENT.
That is not true. It might be the case for some combinations of OS and
compiler but it is not universal.

> 
> You are just saying the obvious:
> 
> C++ is not maintainable without huge efforts.
> 
> It is very easy to laugh at the maintenance programmers here. They
> are just stupid of course, since if they weren't, they wouldn't be
> in maintenance of course!
> 
>> Either way, expecting help from gcc developers from a false bug report
>> is unreasonable. Maybe in a feature request ... :)
>>
> 
> yes, ":)"
> 
> VERY funny.
> 
> 
> gcc (and this is a feature of course, not a bug) generates code that it
> is impossible to follow with -O2 or -O3. Then, the gcc compiler
> considers that it has the right to generate code that reads from
> an uninitialized memory location without even caring to see if they
> could (at least) emit a warning.

--
VH
Reply v.haisman (98) 7/11/2009 3:31:13 PM

jacob navia wrote:
> 
> Using the gcc compiler without any optimizations produces perfectly 
> valid code that works as intended. Using the 64 bit
> gcc compiler (versions 3.3 to 4.3) produces the intended
> result even with maximum optimization.

By chance, if the construct invokes UB.

> Using the PowerPC IBM compiler works with optimizations
> and without them.

By chance, if the construct invokes UB.

> This code was working with gcc-3.3.6 and stopped working only with
> gcc 4.1.2 with optimization levels higher than 2 and only in linux 32
> bits. MSVC windows 32 compiler compiles that code correctly.

That's what happens if the construct invokes UB.  Any tool change can 
break the fragile code.

> How are the maintainers supposed to know that?

Should they care?

> Because after 2 weeks of work and work finally we examined the
> gcc generated assembler and discovered that gcc generates code
> to read from an UNINITIALIZED memory location.

Post the source and the generated assembler.

> When I write
> 
> char tab[5];
> char *p = tab;
> 
> p += 10;
> char c = *p;
> 
> this is UB too but will be UB in debug mode AND in release mode.
> The value in C will be undefined, but it will be CONSISTENT.

No, it won't.  It's undefined.  p might point at a location that was 
written by the last programme to run.  Even if the value did appear 
consistent, as soon as the surrounding code changes, it is likely to change.
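
One way to make this failure mode predictable, sketched under the
assumption that the buffer can be a std::array rather than a raw array:
the at() member checks the index and throws std::out_of_range instead
of reading past the end. The helper name read_checked is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Reads tab[index] into `out` and returns true, or returns false and
// leaves `out` untouched when the index lies outside the array.
// std::array::at performs the bounds check and throws on failure.
inline bool read_checked(const std::array<char, 5>& tab,
                         std::size_t index, char& out) {
    try {
        out = tab.at(index);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With this, the equivalent of reading tab[10] fails the same way in
debug and release builds, which is the consistency the raw pointer
version cannot offer.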

> You are just saying the obvious:
> 
> C++ is not maintainable without huge efforts.

Poorly written code in any language that relies on undefined behaviour 
is not maintainable.  C and C++ just happen to give you more rope to 
hang yourself with.  At least C++ has attempted to shorten the rope by 
adding specific and easily searchable casts.

> It is very easy to laugh at the maintenance programmers here. They
> are just stupid of course, since if they weren't, they wouldn't be
> in maintenance of course!

Boy you have a flea up your arse this weekend.  Most people here are 
probably maintenance programmers.  Anyone not working on a green field 
project can be considered a maintenance programmer.

-- 
Ian Collins
Reply ian-news (9908) 7/11/2009 10:08:28 PM

On 11 Jul., 00:54, Joshua Maurice <joshuamaur...@gmail.com> wrote:

[snipped discussion about run-time error when OP accessed
uninitialized memory. OP complained that C++ compiler (gcc) could not
detect this even though its warning level was set to highest]

> You seem to be taking the opinion that compilers should catch all
> undefined behavior. C++ is not Java. C++'s stated primary design goals
> include
> - runtime performance comparable with assembly
> - don't pay for what you don't use
> - portable
> - easy to write code / programmer productivity (with less relative
> emphasis on this one IMHO)
>
> With these design goals in mind, it is not reasonable to expect a
> compiler to catch all possible undefined behavior or errors. To do
> that would necessarily restrict the language so that it's less
> comparable to assembly in speed and/or you start paying for things you
> don't use.
>
> In the C and C++ community, the assumption is that the programmer
> knows what he's doing, and with that assumption, you can (relatively)
> easily write really fast and portable code.

Just to add my two cents:
1. C++ lets you do everything, so chances are not bad that you can go
beyond your depth. In contrast to this, JAVA restricts your abilities
(no messing around with pointers), which makes your code inherently
safer. I think both are inferior to programming languages like Ada95.
Ada has a real type system (something that neither C++ nor JAVA has)
and will perform zounds of checks (it is the only language I know that
handles integer overflows). Since these checks give you a lot of
performance penalties, you have to provide additional information
about which checks can be omitted. This is maybe the major difference
between C++ and Ada95: Out of the box C++ provides few checks in favor
of speed, whereas Ada95 has all checks turned on. So in C++ you have
to OPT-IN for run-time checks; Ada95 has the converse OPT-OUT philosophy.
Needless to say, nobody uses Ada95 except the Bundeswehr in Germany
(AFAIK).

2. Maybe even such fancy languages like Ada cannot reliably detect
memory aliasing issues because it may be the case that this task is
Turing hard. I haven't had time to think about it in detail, but I
think that you could reduce the HALTING problem to the problem of
accessing uninitialized memory through aliasing. This would explain
why the compiler industry didn't come up with a "decent" compiler: It
just may be that detecting _ALL_ such errors is simply impossible
(which doesn't mean that there can't be a good heuristic algorithm for
detecting most of the obvious bugs).
I further assume that most cases where you get UB are also due to the
impossibility of checking for such cases algorithmically.

@jacob:

Don't complain about the gcc team, the problem is definitely in your
code. Since you mess around with raw pointers, you're asking for
trouble (or rather the guy that wrote the code).
Cheer up, you have one of the worst jobs in the world of programming:
Inheriting code from your predecessor (some people say that this is
what object orientation is all about ;-), and having to find the bugs
in this code. Practically no one will give you credit for this; you're
more or less just a scapegoat. Personally, I have done little else
than rewrite code that has been written by physicists (who should
be prohibited by law from writing code :-) for the last ten years. I can
imagine that bugfixing such code must be a lot more frustrating, so be
assured that you have our deepest sympathy.

Regards,
Stuart
Reply DerTopper (388) 7/14/2009 8:40:48 AM

On Jul 14, 10:40 am, Stuart Redmann <DerTop...@web.de> wrote:
> On 11 Jul., 00:54, Joshua Maurice <joshuamaur...@gmail.com> wrote:

> [snipped discussion about run-time error when OP accessed
> uninitialized memory. OP complained that C++ compiler (gcc)
> could not detect this even though its warning level was set to
> highest]

> > You seem to be taking the opinion that compilers should
> > catch all undefined behavior. C++ is not Java. C++'s stated
> > primary design goals include
> > - runtime performance comparable with assembly
> > - don't pay for what you don't use
> > - portable
> > - easy to write code / programmer productivity (with less relative
> > emphasis on this one IMHO)

> > With these design goals in mind, it is not reasonable to
> > expect a compiler to catch all possible undefined behavior
> > or errors. To do that would necessarily restrict the
> > language so that it's less comparable to assembly in speed
> > and/or you start paying for things you don't use.

That's not strictly true.  Both the C and the C++ standards were
designed so that all undefined behavior can be caught.
Sometimes at a significant price, which means that very few
compilers do so.  But there have been some (CenterLine, I
think), and of course, tools like Purify and valgrind catch a
lot (but not all) of the undefined behavior (without rendering
the implementation non-conforming).

> > In the C and C++ community, the assumption is that the
> > programmer knows what he's doing, and with that assumption,
> > you can (relatively) easily write really fast and portable
> > code.

> Just to add my two cents:
> 1. C++ lets you do everything, so chances are not bad that you
> can go beyond your depth. In contrast to this, JAVA restricts
> your abilities (no messing around with pointers), which makes
> your code inherently safer.

That's provably false.  Java seriously restricts what you can
do, to the point of not allowing you to write safe code (for a
sufficiently high level of "safe").  Basically, C++
doesn't do anything by default to provide safety, but allows you
(or your organization) to take whatever steps are needed for the
level of safety you need.  Java imposes a very specific level of
safety.  If it's adequate, fine---you don't have to do anything
else.  If it's not, you're stuck, because there's nothing else
you can do.  (The specific level Java imposes is NOT adequate
for most of what I do.)

> I think both are inferior to programming languages like Ada95.

From what I've heard of it, you're probably right.  But I've
never had the occasion to really use it, to be sure.

> Ada has a real type system (something that neither C++ nor
> JAVA has) and will perform zounds of checks (it is the only
> language I know that handles integer overflows).

Again, C++ leaves the behavior on overflow of signed integral
types or floating point types undefined.  So
an implementation can perform all of the checks it wants.  The
problem is that most implementations define the behavior much
like Java does, which is useless (at least for "safe" software).
And the real problem is that most programmers accept such
implementations, and consider them normal---that most
programmers don't care about safety.  (I've written C code in
the past which verified integral overflow, and I could do it in
Java or C++.  But such code will never be as efficient as if the
compiler did it.)
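
A minimal sketch of what such hand-written overflow verification can
look like, assuming plain int addition; the function name checked_add
is hypothetical. The range test runs before the addition, so the
overflowing operation itself is never executed.

```cpp
#include <cassert>
#include <limits>

// Adds two ints, storing the sum in `result` only when it is
// representable. Returns false (and leaves `result` untouched) when the
// mathematical sum would not fit in an int.
inline bool checked_add(int a, int b, int& result) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if (b > 0 && a > max - b) return false;  // sum would exceed max
    if (b < 0 && a < min - b) return false;  // sum would fall below min
    result = a + b;
    return true;
}
```

The two comparisons are the cost referred to above: they are paid on
every call, whereas a compiler that knew the operand ranges could often
drop them.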

> Since these checks give you a lot of performance penalties,

Are you sure of that?  I seem to recall reading that in typical
programs, a decent compiler is able to eliminate 90% of the
checks entirely.  And if the compiler is generating the code,
it's one extra instruction per operation for the checks which
cannot be eliminated (at least on the machines I'm familiar
with).  Not a killer for most applications.

> you have to provide additional information about which checks
> can be omitted. This is maybe the major difference between C++
> and Ada95: Out of the box C++ provides few checks in favor of
> speed, whereas Ada95 has all checks turned on. So C++ you have
> to OPT-IN for run-time checks, Ada95 has the converse OPT-OUT
> philosophy.  Needless to say, nobody uses Ada95 except the
> Bundeswehr in Germany (AFAIK).

Most C++ compilers don't allow you to opt-in, even though it's
the only reasonable option for most software.

> 2. Maybe even such fancy languages like Ada cannot reliably
> detect memory aliasing issues because it may be the case that
> this task is Turing hard.

I'm not sure which aliasing issues you're concerned about, but a
lot of languages I've seen used in the past don't allow you to
take the address of a variable (so pointers can only come from
dynamic allocation), use garbage collection (so a pointer can
never point to a non-allocated object---or worse, memory that
has since been allocated to a different object), and don't
support pointer arithmetic, so pointers can't point into the
middle of objects.  Under such conditions, aliasing isn't
a difficult problem.

> I haven't had time to think about it in detail, but I think
> that you could reduce the HALTING problem to the problem of
> accessing uninitialized memory through aliasing. This would
> explain why the compiler industry didn't come up with a
> "decent" compiler: It just may be that detecting _ALL_ such
> errors is simply impossible (which doesn't mean that there may
> be a good heuristic algorithm for detecting most of the
> obvious bugs).  I further assume that most cases where you get
> UB are also due to the impossibiliy to check for such cases
> algorithmically.

Compile time or runtime.  The C++ standard certainly allows
"fat" pointers, which contain enough information for the runtime
to be able to detect all undefined behavior.  Such an
implementation would run slower; an even greater problem is that
it wouldn't be compatible with the defined ABI of most
platforms.
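
To make the idea concrete, here is a rough library-level sketch of a
"fat" pointer, assuming it only needs to track array bounds; a real
implementation inside a compiler would cover far more cases. The class
name FatPtr and its members are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// A pointer that carries the bounds of the array it was created from.
// The extra fields are why it is larger than a raw pointer and would
// not match a platform's usual pointer representation.
template <typename T>
class FatPtr {
public:
    FatPtr(T* base, std::size_t count)
        : base_(base), count_(count), index_(0) {}

    // Moving is allowed anywhere from the first element to one past the
    // last, mirroring what the language permits for raw pointers.
    FatPtr& operator+=(std::ptrdiff_t n) {
        const std::ptrdiff_t target =
            static_cast<std::ptrdiff_t>(index_) + n;
        if (target < 0 || static_cast<std::size_t>(target) > count_)
            throw std::out_of_range("pointer moved outside its array");
        index_ = static_cast<std::size_t>(target);
        return *this;
    }

    // Dereferencing is only allowed strictly inside the array.
    T& operator*() const {
        if (index_ >= count_)
            throw std::out_of_range("dereference outside the array");
        return base_[index_];
    }

private:
    T* base_;            // first element of the underlying array
    std::size_t count_;  // number of elements in the array
    std::size_t index_;  // current position, 0..count_ inclusive
};
```

Each operation pays for a comparison, and the object is three words
wide instead of one, which illustrates both the speed and the ABI
concerns mentioned above.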

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Reply james.kanze (9620) 7/14/2009 4:09:28 PM

On Jul 10, 9:01 am, jacob navia <ja...@jacob.remcomp.fr> wrote:
> On 10 juil, 11:57, Ian Collins <ian-n...@hotmail.com> wrote:
>
>
>
> > Don't hack.
>
> > --
> > Ian Collins
>
> Sure sure. How helpful. This is a HUGE code base of MB and MB of
> C++. I did not write this code. It is my job to make it work, that's
> all.
>
> Obviously I am being blamed for asking a question, since asking
> questions is obviously a NO NO here.
>
> (If you ask a question it means you do not know everything,
> contrary to the gurus here)
>
> "Don't hack"
>
> And how can I know if in those MBs of code there is a hack?
>
> That was my question. Now, please answer THAT, and if you can't
> I hope you can at least keep your mouth SHUT!

The job of the C++ compiler is simply to compile your code.  Yes, it
could in theory do all the things you want it to do, because
techniques that do such things have certainly been invented.  But might
I suggest your company invest in a static code analysis tool?  While a
C++ *compiler's* job is to compile your code according to the standard,
a static code analysis tool's purpose is exactly what you seem to be
looking for.  So there's really no point in GCC attempting to add
these kinds of features, because they take programmer time away from
actually making the compiler more robust, stronger, producing faster
code, and conforming to the evolving standard.  The static code
analysis writers, on the other hand, have all the time in the world to
do exactly what you're looking for.  There are a number of really good
ones available, possibly even some free ones.  I would have a look on
Google for some if I were you.
Reply divisortheory (94) 7/15/2009 3:18:04 AM

On Jul 11, 7:27 am, jacob navia <ja...@nospam.org> wrote:
> gcc (and this is a feature of course, not a bug) generates code that it
> is impossible to follow with -O2 or -O3. Then, the gcc compiler
> considers that it has the right to generate code that reads from
> an uninitialized memory location without even caring to see if they
> could (at least) emit a warning.


Warnings are emitted at compile time.  Reads from uninitialized memory
happen at run time.  Doing a complete static data flow analysis of
your program to detect this is not an easy problem in the general
case.  Use a combination of static & dynamic code analysis tools.
Honestly, you could have detected the exact location of the error in
about 5 minutes using Valgrind.  Although you then would have been
scratching your head, wondering why the heck that was uninitialized in
the first place.  Then a static analysis tool would have answered that
for you in about 5 minutes.

Make it part of your build process to fix all the code analysis
warnings in your codebase once a week from now on, much like you do to
fix all GCC warnings.  GCC's a compiler; software development isn't a
one-tool job.  You need debuggers, profilers, static analysis, dynamic
analysis, source code control, etc.  I realize you're frustrated
after spending 2 weeks fixing this bug, which you think is a stupid bug
that should never have happened in the first place.  But hey, you
learned an important lesson.  Don't let it happen in the first place
next time.
Use the right tool for the job.
Reply divisortheory (94) 7/15/2009 3:24:23 AM

Anand Hariharan <mailto.anand.hariharan@gmail.com> writes:

> On Jul 10, 5:05 am, p...@informatimago.com (Pascal J. Bourguignon)
> wrote:
> (...)
>>
>> Notice that of the same sort of bug that should be checked at run-time
>> are the array overflows and invalid pointer dereferences.  The C and
>> C++ standards explicitly say that dereferencing a pointer outside of its
>> pointed array is undefined, even holding a pointer outside of its
>> array limits (plus 1) is undefined...
>>
>
> Trying to read the value of an uninitialised variable results in UB as
> well.
>
>
>> char a[5];
>> char* p=a; // valid
>> p+=4; // valid
>> *p;   // valid
>> p++;  // valid
>> *p;   // undefined
>> p++;  // undefined
>>
>
> The first *p that you state as valid results in undefined behaviour
> because 'a' is not initialised.

Oops!  Make it: char a[5]="abcd";
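
Put together with that correction, the quoted example can be sketched
as a compilable function. The name walk_valid_steps is hypothetical,
and the two steps the quoted post marks as undefined are left as
comments so that nothing undefined is actually executed.

```cpp
#include <cassert>

// Walks the corrected example and returns the last character read.
// Only the steps marked "valid" in the quoted post are executed.
inline char walk_valid_steps() {
    char a[5] = "abcd";  // 'a','b','c','d','\0': every element initialized
    char* p = a;         // valid
    p += 4;              // valid: points at the terminating '\0'
    char last = *p;      // valid: reads an initialized element
    p++;                 // valid: a one-past-the-end pointer may be held
    // *p;               // undefined: one past the end must not be read
    // p++;              // undefined: two past the end must not be formed
    (void)p;             // silence "set but not used" warnings
    return last;
}
```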

-- 
__Pascal Bourguignon__
Reply pjb (7667) 7/15/2009 8:21:22 AM

Joshua Maurice <joshuamaurice@gmail.com> writes:

> On Jul 10, 3:05 am, p...@informatimago.com (Pascal J. Bourguignon)
> wrote:
>> jacob navia <ja...@nospam.org> writes:
>> > I thought there could be a book with *advanced* C++ debugging but a
>> > Google search, then an Amazon search yielded nothing
>> > but books for beginners or user manuals of Visual C++ debugger
>> > written in a book form.
>>
>> > Is there a combination of gcc warnings (that is NOT included in Wall
>> > since we already have that) that could be useful here?
>>
>> I wouldn't hold my breath.
>>
>> > Is there a tool somewhere that could diagnose this problem?
>>
>> It's done by the Zeta-C compiler (since the target is the
>> LispMachine).  Of course, today it might be easier to build a time
>> machine than to find a LispMachine with the Zeta-C compiler, and
>> anyways, it doesn't solve the problem of C++.
>>
>> Perhaps one of the C/C++ interpreters are doing this type check.  Try
>> them.
>>
>> C INTERPRETERS:
>>     CINT - http://root.cern.ch/root/Cint.html
>>     EiC - http://eic.sourceforge.net/
>>     Ch - http://www.softintegration.com
>>     [ MPC (Multi-Platform C -> Java compiler) - http://www.axiomsol.com ]
>>
>> Otherwise, your best chance would be to patch them, or gcc (or
>> lcc-win32), to generate tagged data and implement run-time type
>> checks.
>>
>> Notice that of the same sort of bug that should be checked at run-time
>> are the array overflows and invalid pointer dereferences.  The C and
>> C++ standards explicitly say that dereferencing a pointer outside of its
>> pointed array is undefined, even holding a pointer outside of its
>> array limits (plus 1) is undefined...
>
> [snip]
>
>> The problem is that C compiler writers don't bother writting the
>> run-time checks that would detect these bugs, much less doing the type
>> inference that would be needed to detecht a small number of them at
>> compilation-time.
>
> You seem to be taking the opinion that compilers should catch all
> undefined behavior. 

Not necessarily ALL the implementations (compilers or interpreters),
but there should be such implementations, and those should be the
implementation used most of the time, because most of the time, C++
programs are mere application programs that would benefit much more
from  run-time checking than from fast instructions (the more so on
modern processors, where it's pointless to go fast in the processor,
since you always are waiting on the RAM).


> C++ is not Java.  C++'s stated primary design goals
> include
> - runtime performance comparable with assembly

For most programs, we don't care about the speed.


> - don't pay for what you don't use

I wish you'd paid for the uncaught bugs left in executables that
affect the users.


> [...]

> - easy to write code / programmer productivity (with less relative
>   emphasis on this one IMHO)

Programmers would be more productive if the implementations helped to
catch bugs at run-time.



> With these design goals in mind, it is not reasonable to expect a
> compiler to catch all possible undefined behavior or errors. 

Implementations of other programming languages are able to do so, why
not implementations of C++?  It's perfectly reasonable to expect it,
and as a user of C++, I'd rather use such an implementation for 100%
of my C++ development, and 99% of my C++ program deployment.

> To do
> that would necessarily restrict the language so that it's less
> comparable to assembly in speed and/or you start paying for things you
> don't use.

Not at all, the restrictions are already in the language.  (Well,
s/undefined behavior/an error should be signaled at compilation time
or thrown at run-time/).


> In the C and C++ community, the assumption is that the programmer
> knows what he's doing, and with that assumption, you can (relatively)
> easily write really fast and portable code.

But nobody needs really fast code.  What we need is correct code, and
code that detects automatically when it goes awry, instead of going on
with invalid data in the memory, or worse, viruses and worms.



> That someone hasn't written a "debugging" compiler which catches all
> possible violations of the standard, as a debugging tool only, is
> indeed a shame if true. 

Ah!  You're conceding my point.  Thank you.


> However, Valgrind comes to mind as useful tool
> in this area. 

But it's far from what we could expect.


> Also, various versions MSVC do have optional runtime
> bounds checking and other runtime checking. 

Good!

Unfortunately, on unix I know of no compiler implementing run-time
checks (only interpreters do; unfortunately, C++ interpreters have too
many restrictions on the language implemented, so they're generally
useless).


> Finally, C interpreters
> can catch all such misuse which occurs at runtime, the existence of
> which you reference in your post. Thus, it appears the tools which you
> bemoan do not exist, do indeed exist, and thus I am confused by your
> self contradictions.

AFAIK, there's no production-level implementation of C++ on unix
(Linux) providing run-time checks for undefined behavior.

The interpreters that do provide run-time checks don't implement
the full C++ language, so they're not usable on real programs
(e.g. underC, http://home.mweb.co.za/sd/sdonovan/underc.html, doesn't
implement multiple inheritance).




Basically, what we'd like is an option of gcc/g++ (independent of the
optimization level) which would let you deploy programs with full
run-time checks.  No buffer overflow would go undetected in an
executable compiled with that option.


-- 
__Pascal Bourguignon__
Reply pjb (7667) 7/15/2009 8:39:13 AM

Joshua Maurice <joshuamaurice@gmail.com> writes:

> On Jul 10, 5:51 pm, Anand Hariharan <mailto.anand.hariha...@gmail.com>
> wrote:
>> On Jul 10, 5:54 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>> > With these design goals in mind, it is not reasonable to expect a
>> > compiler to catch all possible undefined behavior or errors. To do
>> > that would necessarily restrict the language so that it's less
>> > comparable to assembly in speed and/or you start paying for things you
>> > don't use.
>>
>> You seem to be under the impression that the compiler "catching
>> undefined behaviour" is synonymous with either *disallowing* undefined
>> behaviour or imposing a runtime penalty to track them.  OP clearly
>> indicated that he only wishes the compiler to indicate to him that he
>> might be doing something that leads to UB.
>
> Within C++ as the language rules stand, determining at compile-time if
> the program can give undefined behavior through an aliasing violation
> is in general undecidable, equivalent to the halting problem.

This is the reason why it has to be done at run-time, when it occurs.
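
For readers who have not met the problem, here is a minimal sketch of
the kind of aliasing being argued about, assuming the original code
resembled the twoPointers struct from the first post. The names
TwoHalves, to_bits and from_bits are invented for the illustration, and
fixed-width integers stand in for the two 32-bit pointers.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the two-pointer struct from the first post,
// using fixed-width integers so the layout does not depend on pointer size.
struct TwoHalves {
    std::uint32_t a;
    std::uint32_t b;
};

static_assert(sizeof(TwoHalves) == sizeof(std::uint64_t),
              "sketch assumes no padding between the two members");

// The pattern the thread is about reads the struct through an unrelated
// pointer type, which is an aliasing violation:
//     std::uint64_t bits = *reinterpret_cast<std::uint64_t*>(&h);  // UB

// Well-defined alternative: copy the object representation.
inline std::uint64_t to_bits(const TwoHalves& h) {
    std::uint64_t bits;
    std::memcpy(&bits, &h, sizeof bits);
    return bits;
}

inline TwoHalves from_bits(std::uint64_t bits) {
    TwoHalves h;
    std::memcpy(&h, &bits, sizeof h);
    return h;
}
```

Compilers generally turn a fixed-size memcpy like this into a plain
load or store, so the defined version need not cost anything.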


> I did not claim such "catching undefined behavior" and "disallowing
> certain constructs and/or runtime checks" are not synonymous. However,
> they are related.
>
> I believe I was correct and reasonable when I interpreted that the OP
> was asking for a compiler which caught all bad aliasing, and I believe
> I was correct and reasonable when I stated that doing so is impossible
> without disallowing certain kinds of casting or imposing additional
> runtime checks (both of which are contrary to the design goals of
> C++).

Notice that the design goals of Common Lisp are the same.  However,
most Common Lisp implementations implement run-time checks most of the
time. (It is possible to disable most of the run-time checks in speed
critical parts).


> I noted that it's quite reasonable and desirable for a "debugging
> compiler" to add runtime checks to catch all such aliasing errors in
> development. I disagree with the overall theme of your reply: that I
> was incorrect in my statement of fact or that I was incorrect in my
> interpretation of the OP's desire to have all aliasing violations
> caught.

-- 
__Pascal Bourguignon__
0
Reply pjb (7667) 7/15/2009 8:43:58 AM

jacob navia <jacob@nospam.org> writes:
> [...]
> How are the maintainers supposed to know that?

By knowing the language, indeed.  Some reading between the lines has
to be done, but still, it's well known that these constructs have no
standard defined behavior.

> [...]
>
> You are just saying the obvious:
>
> C++ is not maintainable without huge efforts.
>
> It is very easy to laugh at the maintenance programmers here. They
> are just stupid of course, since if they weren't, they wouldn't be
> in maintenance of course!

We may also laugh at the managers who chose to develop the software
in C++ in the first place, when better programming languages existed,
exist, and will exist.


> gcc (and this is a feature of course, not a bug) generates code that it
> is impossible to follow with -O2 or -O3. 

Yes, but it's FAST! :-)


> Then, the gcc compiler considers that it has the right to generate
> code that reads from an uninitialized memory location without even
> caring to see if they could (at least) emit a warning.

Yes, the C++ standard explicitly allows it to do so.
Bad standard, change standard.


That said, I don't know many languages whose standard doesn't give
a sizeable amount of leeway to the implementations.  Even Common Lisp
leaves a lot of freedom to the implementations, so you have a lot of
constructs that are implementation dependent.


When you want to write portable code, you have to be careful not to
use implementation dependent (including option dependent) constructs.
Yours was one.

-- 
__Pascal Bourguignon__
0
Reply pjb (7667) 7/15/2009 8:49:28 AM

James Kanze <james.kanze@gmail.com> writes:
>> Since these checks give you a lot of performance penalties,
>
> Are you sure of that.  I seem to recall reading that in typical
> programs, a decent compiler is able to eliminate 90% of the
> checks entirely.  And if the compiler is generating the code,
> it's one extra instruction per operation for the checks which
> cannot be eliminated (at least on the machines I'm familiar
> with).  Not a killer for most applications.

Indeed.  Modern processors (eg. as old as 680x0) provide software
traps to catch overflow/underflow that used to cost very little, and
that cost nothing with pipelined processors, when the trap is not
taken.


>> 2. Maybe even such fancy languages like Ada cannot reliably
>> detect memory aliasing issues because it may be the case that
>> this task is Turing hard.
>
> I'm not sure which aliasing issues you're concerned about, but a
> lot of languages I've seen used in the past don't allow you to
> take the address of a variable (so pointers can only come from
> dynamic allocation), use garbage collection (so a pointer can
> never point to a non-allocated object---or worse, memory that
> has since been allocated to a different object), and don't
> support pointer arithmetic, so pointers can't point into the
> middle of objects.  Under such conditions, aliasing isn't
> a difficult problem.
>
>> I haven't had time to think about it in detail, but I think
>> that you could reduce the HALTING problem to the problem of
>> accessing uninitialized memory through aliasing. This would
>> explain why the compiler industry didn't come up with a
>> "decent" compiler: It just may be that detecting _ALL_ such
>> errors is simply impossible (which doesn't mean that there may
>> not be a good heuristic algorithm for detecting most of the
>> obvious bugs).  I further assume that most cases where you get
>> UB are also due to the impossibility to check for such cases
>> algorithmically.
>
> Compile time or runtime.  The C++ standard certainly allows
> "fat" pointers, which contain enough information for the runtime
> to be able to detect all undefined behavior.  Such an
> implementation would run slower; an even greater problem is that
> it wouldn't be compatible with the defined ABI of most
> platforms.

Well, you would have to recompile the libraries, but since most if not
all libraries are written in C or C++, there would be no real
difficulty.  (Common Lisp has not the same luck here).

-- 
__Pascal Bourguignon__
0
Reply pjb (7667) 7/15/2009 8:57:29 AM

On Jul 15, 10:39 am, p...@informatimago.com (Pascal J. Bourguignon)
wrote:
> Joshua Maurice <joshuamaur...@gmail.com> writes:
> > On Jul 10, 3:05 am, p...@informatimago.com (Pascal J. Bourguignon)
> > wrote:
> > [snip]

> >> The problem is that C compiler writers don't bother
> >> writing the run-time checks that would detect these bugs,
> >> much less doing the type inference that would be needed to
> >> detect a small number of them at compilation-time.

> > You seem to be taking the opinion that compilers should
> > catch all undefined behavior.

> Not necessarily ALL the implementations (compilers or
> interpreters), but there should be such implementations, and
> those should be the implementation used most of the time,
> because most of the time, C++ programs are mere application
> programs that would benefit much more from  run-time checking
> than from fast instructions (the more so on modern processors,
> where it's pointless to go fast in the processor, since you
> always are waiting on the RAM).

I think that there are some implementations.  At least in the
past, CenterLine caught most cases of undefined behavior.  I
don't know what its current status is, but it is still being
sold.  (http://www.ics.com/products/centerline/objectcenter/,
for more information.)

I agree with you that such a compiler should be the default and
usually used compiler.  I have the impression, however, that we
are in a very small minority---at any rate, I don't have the
impression that CenterLine is a market leader.  (ICS, which owns
it, seems to push its GUI expertise and products considerably
more.)

    [...]
> > With these design goals in mind, it is not reasonable to
> > expect a compiler to catch all possible undefined behavior
> > or errors.

> Implementations of other programming languages are able to do
> so, why not implementations of C++?  It's perfectly reasonable
> to expect it, and as a user of C++, I'd rather use such an
> implementation for 100% of my C++ development, and 99% of my
> C++ program deployment.

Implementations of C++ are capable of doing a lot more than they
do.  Apparently, the market doesn't want it.  (Should we
conclude that C++ programmers don't care about quality, or
programmer productivity?)

    [...]
> > Also, various versions MSVC do have optional runtime bounds
> > checking and other runtime checking.

> Good!

But only in the standard library, I think.

> Unfortunately on unix I know of no compiler implementing
> run-time checks (only interpreters do, unfortunately, C++
> interpreters have too many restrictions on the language
> implemented so they're generally useless).

My impression is that g++ and VC++ are about equal with regards
to verifications.  (VC++ does emit a lot of warnings about using
functions which don't, or can't verify, e.g. strcpy and such.)

    [...]
> Basically, what we'd like is an option of gcc/g++ (independent
> of the optimization level) which would let you deploy programs
> with full run-time checks.  No buffer overflow would go
> undetected in an executable compiled with that option.

Arrays in C are very poorly designed, and C++ has inherited this.
In order to do full run-time checking, you need fat pointers.
Which not only slows the code down considerably, but also breaks
the ABI.  If you rigorously avoid C style arrays, and only use
std::vector, g++ does run-time check.  (But as soon as you do
something like &v[i], all bets are off with regards to the
resulting pointer.)
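
A small sketch of that distinction, under the assumption that the
checked access goes through std::vector::at (operator[] is only checked
in debugging builds; with g++ that means, as far as I know, compiling
with -D_GLIBCXX_DEBUG). The helper names at_rejects and raw_from are
made up for the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if v.at(i) rejected the index by throwing std::out_of_range.
inline bool at_rejects(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Once the address of an element has been taken, the result is a plain
// int*: it carries no bounds, so nothing can check p[k] afterwards.
inline int* raw_from(std::vector<int>& v, std::size_t i) {
    return &v.at(i);  // the index is checked here, later arithmetic is not
}
```

With a three-element vector, at_rejects(v, 3) reports the error, while
raw_from(v, 1)[5] would compile and run unchecked.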

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
0
Reply james.kanze (9620) 7/15/2009 9:07:51 AM

pjb@informatimago.com (Pascal J. Bourguignon) writes:

> Joshua Maurice <joshuamaurice@gmail.com> writes:
>
>> C++'s stated primary design goals include
>> - runtime performance comparable with assembly
>
> For most programs, we don't care about the speed.

And for programs that do care about speed, they still have a lot of
spare time to do checks:

http://www.cs.virginia.edu/papers/Hitting_Memory_Wall-wulf94.pdf


-- 
__Pascal Bourguignon__
0
Reply pjb (7667) 7/15/2009 9:33:09 AM

Stuart Redmann wrote:
> On 11 Jul., 00:54, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> 
> Just to add my two cents:
> 1. C++ lets you do everything, so chances are not bad that you can go
> beyond your depth. In contrast to this, JAVA restricts your abilities
> (no messing around with pointers), which makes your code inherently
> safer. I think both are inferior to programming languages like Ada95.
> Ada has a real type system (something that neither C++ nor JAVA has)
> and will perform zounds of checks (it is the only language I know that
> handles integer overflows). Since these checks give you a lot of
> performance penalties, you have to provide additional information
> about which checks can be omitted. 

Well, I am no expert on Ada, but I had a look at Ada 2005 when searching 
for other languages to learn and wrote only some simple programs. I 
finally changed to Haskell and OCaml just to learn some new principles 
of programming.

Anyway, the Ada people claim that a lot of these checks can be optimised 
out by the compiler and the remaining ones are rather inexpensive.
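
To give an idea of what such a check costs in C++ when written by hand,
here is a minimal sketch of an overflow-checked addition. The function
name checked_add and the choice of throwing std::overflow_error are
assumptions made for the example, not anything Ada or a particular
compiler prescribes.

```cpp
#include <limits>
#include <stdexcept>

// Adds two ints, throwing instead of overflowing. The test is done before
// the addition, because signed overflow itself is undefined behaviour.
inline int checked_add(int a, int b) {
    if (b > 0 && a > std::numeric_limits<int>::max() - b)
        throw std::overflow_error("int overflow in checked_add");
    if (b < 0 && a < std::numeric_limits<int>::min() - b)
        throw std::overflow_error("int underflow in checked_add");
    return a + b;
}
```

The check is two comparisons per addition; a compiler that knows the
ranges of a and b can often drop it, which is the optimisation the Ada
people refer to.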

> This is maybe the major difference
> between C++ and Ada95: Out of the box C++ provides few checks in favor
> of speed, whereas Ada95 has all checks turned on. So in C++ you have to
> OPT-IN for run-time checks, Ada95 has the converse OPT-OUT philosophy.
> Needless to say, nobody uses Ada95 except the Bundeswehr in Germany
> (AFAIK).

Even that is not quite true. Have a look at:
http://www.seas.gwu.edu/~mfeldman/ada-project-summary.html

Also, comp.lang.ada is quite active and there is even a new language for 
the dotnet framework called A# which is an Ada derivative (like F# is an 
ML derivative).

> 
> 2. Maybe even such fancy languages like Ada cannot reliably detect
> memory aliasing issues because it may be the case that this task is
> Turing hard. 

Ok, my information here is very imprecise, because I just skimmed 
over those chapters, but in Ada 2005 there is some construct where you 
have to declare e.g. a pointer to Integer with the keyword ALIASED when 
it should have the possibility to be set to already allocated memory, 
which allows the compiler to detect such things.

I am absolutely not sure how safe this is or what the compiler 
allows/disallows here, anyone more familiar with Ada could probably explain.


> Cheer up, you have one of the worst jobs of the world of programming:
> Inheriting code for your predecessor (some people say that this is
> what object orientation is all about ;-), and having to find the bugs
> in this code. Practically noone will give you credit for this, you're
> more or less just a scape-goat. 

I second that. I did a lot of maintenance/enhancements to existing C++ 
systems which sometimes leads you to ludicrous laughs and sometimes to 
deep depression :)


> Personally, I have made little else
> than re-write code that has been written by physicists (which should
> be prohibited to writing code by law :-) for the last ten years. 

Quite similar here: develop a System in C++, give it out to about 20 
companies to develop/extend/evolve this system, where it uses some very 
old libraries/methods which even prevent you from using e.g. valgrind or 
even gdb in some cases, feed all of this into the main line and then 
give it to poor developers to go on bug-hunt :)
Not to mention that the main reason why it is used is more of a 
political issue...

One of our favourite discussions between the developers in my 
department is about bashing this system...



lg,
Michael
0
Reply muell_om (47) 7/15/2009 11:36:47 AM

On 14 July, 17:09, James Kanze <james.ka...@gmail.com> wrote:
> On Jul 14, 10:40 am, Stuart Redmann <DerTop...@web.de> wrote:
> > On 11 Jul., 00:54, Joshua Maurice <joshuamaur...@gmail.com> wrote:

> > [snipped discussion about run-time error when OP accessed
> > uninitialized memory. OP complained that C++ compiler (gcc)
> > could not detect this even though its warning level was set to
> > highest]
>
> > > You seem to be taking the opinion that compilers should
> > > catch all undefined behavior. C++ is not Java. C++'s stated
> > > primary design goals include
> > > - runtime performance comparable with assembly
> > > - don't pay for what you don't use
> > > - portable
> > > - easy to write code / programmer productivity (with less relative
> > > emphasis on this one IMHO)
> > > With these design goals in mind, it is not reasonable to
> > > expect a compiler to catch all possible undefined behavior
> > > or errors. To do that would necessarily restrict the
> > > language so that it's less comparable to assembly in speed
> > > and/or you start paying for things you don't use.
>
> That's not strictly true.  Both the C and the C++ standards were
> designed so that all undefined behavior can be caught.

really? Where does it say that? Do you mean at compile time or at
run-time?

I'd always thought about half of UB was in the spec precisely because
it was too hard to detect. The other half was hardware stuff, things
like what the modulo operator does with negative numbers

gets()


> Sometimes at a significant price, which means that very few
> compilers do so.  But there have been some (CenterLine, I
> think), and of course, tools like Purify and valgrind catch a
> lot (but not all) of the undefined behavior (without rendering
> the implementation non-conforming).

<snip>

> > I haven't had time to think about it in detail, but I think
> > that you could reduce the HALTING problem to the problem of
> > accessing uninitialized memory through aliasing.

ITYM detecting the access of uninitialized memory through aliasing
at compile time is equivalent to the Halting Problem.

> > This would
> > explain why the compiler industry didn't come up with a
> > "decent" compiler: It just may be that detecting _ALL_ such
> > errors is simply impossible (which doesn't mean that there may
> > not be a good heuristic algorithm for detecting most of the
> > obvious bugs).  I further assume that most cases where you get
> > UB are also due to the impossibility to check for such cases
> > algorithmically.
>
> Compile time or runtime.  The C++ standard certainly allows
> "fat" pointers, which contain enough information for the runtime
> to be able to detect all undefined behavior.  Such an
> implementation would run slower; an even greater problem is that
> it wouldn't be compatible with the defined ABI of most
> platforms.

I can't quite work out how to break a fat-pointer implementation
but can't you do some very nasty things with printf("%p") and scanf
("%p")?

0
Reply nick_keighley_nospam (4575) 7/15/2009 1:18:17 PM

Nick Keighley wrote:
> 
> I'd always thought about half of UB was in the spec precisely because
> it was too hard to detect. The other half was hardware stuff things
> like
> what the modulo operator does with negative numbers

That's not undefined behavior. It's implementation defined.
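
A short sketch to make that concrete. In C++03 the sign of the
remainder is implementation-defined when an operand is negative; the
later C++11 standard requires division to truncate toward zero. The
identity checked by identity_holds is guaranteed either way. The
function names are invented for the example.

```cpp
// Division and remainder on possibly negative operands. In C++03 the sign
// of the remainder was implementation-defined; from C++11 on, '/' truncates
// toward zero, so the remainder takes the sign of the dividend.
inline int quot(int a, int b) { return a / b; }
inline int rem(int a, int b)  { return a % b; }

// Holds on every conforming implementation, whichever way it rounds
// (for b != 0 and a representable quotient).
inline bool identity_holds(int a, int b) {
    return (a / b) * b + a % b == a;
}
```

Under the C++11 rule, quot(-7, 2) is -3 and rem(-7, 2) is -1; an older
implementation was also allowed to give -4 and 1.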

-- 
   Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of
"The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)
0
Reply pete2666 (1733) 7/15/2009 2:03:03 PM

In article <026627ea-6afc-457e-9a2f-399baf9976f8
@c36g2000yqn.googlegroups.com>, james.kanze@gmail.com says...
> 
> On Jul 15, 10:39 am, p...@informatimago.com (Pascal J. Bourguignon)
> wrote:
> > Joshua Maurice <joshuamaur...@gmail.com> writes:

[ ... ]

>     [...]
> > > Also, various versions MSVC do have optional runtime bounds
> > > checking and other runtime checking.
> 
> > Good!
> 
> But only in the standard library, I think.

Not so -- recent versions have flags to tell it to include runtime 
checks in your code. A short description is available at:

http://msdn.microsoft.com/en-us/library/8wtf2dfz%28VS.80%29.aspx

[ ... ]
 
> Arrays in C are very poorly designed, and C++ has inherited this.
> In order to do full run-time checking, you need fat pointers.
> Which not only slows the code down considerably, but also breaks
> the ABI.  If you rigorously avoid C style arrays, and only use
> std::vector, g++ does run-time check.  (But as soon as you do
> something like &v[i], all bets are off with regards to the
> resulting pointer.)

Interestingly, the run-time checks provided by MS VC++ have almost 
exactly the same limitation in one respect -- they can track (to a 
degree) whether you use uninitialized variables, but taking the 
address is treated as equivalent to initialization.

-- 
    Later,
    Jerry.
0
Reply jerryvcoffin (233) 7/15/2009 7:18:11 PM

On Jul 15, 1:39 am, p...@informatimago.com (Pascal J. Bourguignon)
wrote:
> Joshua Maurice <joshuamaur...@gmail.com> writes:
> > C++ is not Java.  C++'s stated primary design goals
> > include
> > - runtime performance comparable with assembly
>
> For most programs, we don't care about the speed.
>
> > - don't pay for what you don't use
>
> I wish you'd paid for the uncaught bugs left in executables that
> affect the users.
>
> > [...]
> > - easy to write code / programmer productivity (with less relative
> >   emphasis on this one IMHO)
>
> Programmers would be more productive if the implementations helped to
> catch bugs at run-time.

I wholeheartedly agree that current C++ compilers make me sad. They
make me sad for lack of standard compliance (some recent versions of
MSVC don't support covariant return types with multiple inheritance,
all compilers have bugs:
http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf
etc.) and lack of developer-focused tools. I would very much want
every compiler out there to use "fat" pointers and other techniques to
catch all undefined behavior, either at compile time or runtime. I
also very much want this to be entirely optional, and for it to be
expressly stated that no "good" C++ program should depend upon such
checks; they should exist only as "terminate the process" asserts.

> For most programs, we don't care about the speed.

Agreed, and with the current state of the C++ industry, C++ is not the
best language for every situation. Perhaps Java is more useful for
most programs.

I very much want C++ to remain focused on runtime performance.
However, *at least* for development purposes, I would also very much
like *optional* Java-like runtime checks to catch all undefined
behavior. Unfortunately, it's impractical because it would break all
platform ABIs, and it requires compiler writers to write such things
which apparently isn't going to happen anytime soon.
0
Reply joshuamaurice (576) 7/15/2009 9:02:51 PM

On Jul 15, 2:07 am, James Kanze <james.ka...@gmail.com> wrote:
> On Jul 15, 10:39 am, p...@informatimago.com (Pascal J. Bourguignon)
> wrote:
>
> > Joshua Maurice <joshuamaur...@gmail.com> writes:
> > > On Jul 10, 3:05 am, p...@informatimago.com (Pascal J. Bourguignon)
> > > wrote:
> > > [snip]
> > >> The problem is that C compiler writers don't bother
> > >> writing the run-time checks that would detect these bugs,
> > >> much less doing the type inference that would be needed to
> > >> detect a small number of them at compilation-time.
> > > You seem to be taking the opinion that compilers should
> > > catch all undefined behavior.
> > Not necessarily ALL the implementations (compilers or
> > interpreters), but there should be such implementations, and
> > those should be the implementation used most of the time,
> > because most of the time, C++ programs are mere application
> > programs that would benefit much more from run-time checking
> > than from fast instructions (the more so on modern processors,
> > where it's pointless to go fast in the processor, since you
> > always are waiting on the RAM).
>
> I think that there are some implementations.  At least in the
> past, CenterLine caught most cases of undefined behavior.  I
> don't know what its current status is, but it is still being
> sold.  (http://www.ics.com/products/centerline/objectcenter/,
> for more information.)
>
> I agree with you that such a compiler should be the default and
> usually used compiler.  I have the impression, however, that we
> are in a very small minority---at any rate, I don't have the
> impression that CenterLine is a market leader.  (ICS, which owns
> it, seems to push its GUI expertise and products considerably
> more.)
>
>     [...]
>
> > > With these design goals in mind, it is not reasonable to
> > > expect a compiler to catch all possible undefined behavior
> > > or errors.
> > Implementations of other programming languages are able to do
> > so, why not implementations of C++?  It's perfectly reasonable
> > to expect it, and as a user of C++, I'd rather use such an
> > implementation for 100% of my C++ development, and 99% of my
> > C++ program deployment.
>
> Implementations of C++ are capable of doing a lot more than they
> do.  Apparently, the market doesn't want it.

My take is that most C++ compiler vendors were/are attempting
to ride 20th century business models.  They have not adapted
to an on line model and what worked well for decades is now
tanking.  I've said it before, but I think there are only
two C++ compilers with minor on line support.  Comeau hasn't
made much progress in its on line support in years.  They work
on adding new functionality to their existing products, but
not, from what I can tell, in reworking their products to
beef up their on line support.  There are some dinosaurs out
there company-wise that are being punished now for not
understanding the times ten years ago, let alone today.
Your remark about reading what the market wants is related to
on line products as well.  In the past, with very limited
feedback, vendors had to try to figure out what they should
do next.  With an on line approach there's much more concrete
information on which to base product development decisions.


> (Should we
> conclude that C++ programmers don't care about quality, or
> programmer productivity?)

Some programmers care primarily about money and only about
quality because it may affect how much money they make.
I'm thinking of that Russian guy who may have stolen a
bunch of software from Goldman Sachs.


Brian Wood
Ebenezer Enterprises
www.webEbenezer.net
0
Reply coal (257) 7/16/2009 5:59:34 AM

On Jul 15, 3:18 pm, Nick Keighley <nick_keighley_nos...@hotmail.com>
wrote:
> On 14 July, 17:09, James Kanze <james.ka...@gmail.com> wrote:
> > On Jul 14, 10:40 am, Stuart Redmann <DerTop...@web.de> wrote:
> > > On 11 Jul., 00:54, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> > > [snipped discussion about run-time error when OP accessed
> > > uninitialized memory. OP complained that C++ compiler (gcc)
> > > could not detect this even though its warning level was set to
> > > highest]

> > > > You seem to be taking the opinion that compilers should
> > > > catch all undefined behavior. C++ is not Java. C++'s stated
> > > > primary design goals include
> > > > - runtime performance comparable with assembly
> > > > - don't pay for what you don't use
> > > > - portable
> > > > - easy to write code / programmer productivity (with less relative
> > > > emphasis on this one IMHO)
> > > > With these design goals in mind, it is not reasonable to
> > > > expect a compiler to catch all possible undefined behavior
> > > > or errors. To do that would necessarily restrict the
> > > > language so that it's less comparable to assembly in speed
> > > > and/or you start paying for things you don't use.

> > That's not strictly true.  Both the C and the C++ standards
> > were designed so that all undefined behavior can be caught.

> really? Where does it say that? Do you mean at compile time or
> at run-time?

At run-time, at the latest.  (I think that there is some which
can't be detected at compile time.  But probably a lot less than
one might think---compilers have gotten quite good at tracing
intermodular code flow.)  And it's scattered throughout the
standard.  Mostly in the form of "undefined behavior"---the
behavior is undefined precisely so that a checking
implementation can trap it.

> I'd always thought about half of UB was in the spec precisely
> because it was too hard to detect.

Too hard, no.  Too expensive, perhaps: to catch all pointer
violations, you need "fat" pointers---each pointer contains a
current address, plus the limits, each modification of the
pointer value verifies that the current address stays in the
limits, and each access through the pointer verifies that it
isn't using the end pointer (and that the pointer isn't null,
but most hardware traps this already today).

Of course, a good compiler could eliminate a certain number of
these checks, or at least hoist them outside of a loop.  But I
don't think it could easily avoid the fact that the size of a
pointer is multiplied by three, which makes things like copying
significantly more expensive, and can have very negative effects
on locality.
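
A rough sketch of such a pointer, written as a library class rather
than as something the compiler generates. FatPtr and its member names
are hypothetical; a real checking implementation would do this behind
ordinary T* syntax.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical "fat" pointer: the current address plus the limits of the
// array it was derived from. Not how any particular compiler does it.
template <typename T>
class FatPtr {
public:
    FatPtr(T* base, std::size_t count)
        : base_(base), cur_(base), end_(base + count) {}

    // Pointer arithmetic: the result must stay within [base, end].
    FatPtr operator+(std::ptrdiff_t n) const {
        if (n < base_ - cur_ || n > end_ - cur_)
            throw std::out_of_range("FatPtr: arithmetic leaves the array");
        FatPtr r(*this);
        r.cur_ = cur_ + n;
        return r;
    }

    // Access: the one-past-the-end position may be formed, but not used.
    T& operator*() const {
        if (cur_ == end_)
            throw std::out_of_range("FatPtr: dereference of end pointer");
        return *cur_;
    }

private:
    T* base_;  // first element of the array
    T* cur_;   // where the pointer currently points
    T* end_;   // one past the last element
};
```

On the usual platforms sizeof(FatPtr<int>) comes out as three times
sizeof(int*), which is the size cost described above.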

> The other half was hardware stuff things like what the modulo
> operator does with negative numbers

That's implementation-defined, not undefined behavior.

> gets()

Takes a pointer.  If the pointer contains the bounds, then it
can easily check.

> <snip>

> > > I haven't had time to think about it in detail, but I
> > > think that you could reduce the HALTING problem to the
> > > problem of accessing uninitialized memory through
> > > aliasing.

> ITYM detecting the access of uninitialized memory through
> aliasing at compile time is equivalent to the Halting Problem.

> > > This would explain why the compiler industry didn't come
> > > up with a "decent" compiler: It just may be that detecting
> > > _ALL_ such errors is simply impossible (which doesn't mean
> > > that there may not be a good heuristic algorithm for detecting
> > > most of the obvious bugs).  I further assume that most
> > > cases where you get UB are also due to the impossibility to
> > > check for such cases algorithmically.

> > Compile time or runtime.  The C++ standard certainly allows
> > "fat" pointers, which contain enough information for the
> > runtime to be able to detect all undefined behavior.  Such
> > an implementation would run slower; an even greater problem
> > is that it wouldn't be compatible with the defined ABI of
> > most platforms.

> I can't quite work out how to break a fat-pointer
> implementation but can't you do some very nasty things with
> printf("%p") and scanf ("%p")?

Given that the standard makes these implementation defined, I
don't think so.  It might make detecting undefined behavior
expensive, however.  About the only way I think that an
implementation could determine that the value read by
scanf("%p") is "a value converted earlier during the same
program execution" is by saving all of the pointers output by
printf("%p") somewhere.  (Inputting any other value is undefined
behavior.)
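
The defined case, for comparison, is a round trip within one program
run. A minimal sketch, assuming C++11 for std::snprintf and nullptr;
the function name round_trip and the 64-character buffer are choices
made for the example.

```cpp
#include <cstdio>

// Writes p with "%p" and reads it back with "%p" in the same run.
// Returns the pointer read, or a null pointer if either step failed.
inline void* round_trip(void* p) {
    char text[64];
    int n = std::snprintf(text, sizeof text, "%p", p);
    if (n < 0 || n >= static_cast<int>(sizeof text))
        return nullptr;
    void* back = nullptr;
    if (std::sscanf(text, "%p", &back) != 1)
        return nullptr;
    return back;
}
```

The value read back compares equal to the original. Feeding sscanf a
string that did not come from "%p" in the same run is the case a
checking implementation would have to catch.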

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
0
Reply james.kanze (9620) 7/16/2009 8:47:26 AM

On Jul 15, 9:18 pm, Jerry Coffin <jerryvcof...@yahoo.com> wrote:
> In article <026627ea-6afc-457e-9a2f-399baf9976f8
> @c36g2000yqn.googlegroups.com>, james.ka...@gmail.com says...
> > On Jul 15, 10:39 am, p...@informatimago.com (Pascal J. Bourguignon)
> > wrote:
> > > Joshua Maurice <joshuamaur...@gmail.com> writes:

> [ ... ]

> >     [...]
> > > > Also, various versions MSVC do have optional runtime bounds
> > > > checking and other runtime checking.

> > > Good!

> > But only in the standard library, I think.

> Not so -- recent versions have flags to tell it to include
> runtime checks in your code. A short description is available
> at:

> http://msdn.microsoft.com/en-us/library/8wtf2dfz%28VS.80%29.aspx

Well, it's a start, although it's still very limited.

> [ ... ]

> > Arrays in C are very poorly designed, and C++ has inherited
> > this.  In order to do full run-time checking, you need fat
> > pointers.  Which not only slows the code down considerably,
> > but also breaks the ABI.  If you rigorously avoid C style
> > arrays, and only use std::vector, g++ does run-time check.
> > (But as soon as you do something like &v[i], all bets are
> > off with regards to the resulting pointer.)

> Interestingly, the run-time checks provided by MS VC++ have
> almost exactly the same limitation in one respect -- they can
> track (to a degree) whether you use uninitialized variables,
> but taking the address is treated as equivalent to
> initialization.

Which wasn't really what I was talking about.  My point was that
having done &v[i], you have a raw pointer, on which you can do
pointer arithmetic, and access out of bounds without checking,
even if the implementation checks in vector<>::operator[].

But the documentation on how VC++ detects uninitialized
variables did seem weird to me.  For a runtime check, I'd have
just associated an additional flag somewhere, setting it when
the variable was set, and checking it otherwise.  Perhaps the
problem is that if initialization occurs through a pointer, the
compiler can't generate the code to update the flag, so if the
address is taken (which would allow such initialization), it
just gives up.

Come to think of it, that's very likely the reason.  The obvious
way of being able to find the flag through the pointer would
change the size or the range of the data type.  And the other
ways I can think of are fairly complex, and probably rather
expensive in run-time.  And you're right that that's sort of the
same problem as with &v[i]---the use of a pointer causes the
code to "lose" any associated data.
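
A sketch of the flag idea and of where it breaks down. Tracked is a
hypothetical wrapper class, not a description of what VC++ actually
generates.

```cpp
#include <stdexcept>

// Hypothetical wrapper that carries a "has been set" flag next to the value.
template <typename T>
class Tracked {
public:
    Tracked() : value_(), set_(false) {}

    Tracked& operator=(const T& v) {
        value_ = v;
        set_ = true;
        return *this;
    }

    // Reading before any assignment is reported instead of returning junk.
    const T& get() const {
        if (!set_)
            throw std::logic_error("Tracked: read before initialization");
        return value_;
    }

    // Handing out a raw pointer loses the flag: a write through it cannot
    // update set_, so the sketch gives up and treats the value as set,
    // which is the behaviour described for VC++ above.
    T* address() {
        set_ = true;
        return &value_;
    }

private:
    T value_;
    bool set_;
};
```

Note that sizeof(Tracked<int>) is larger than sizeof(int), so even this
simple scheme changes object layout.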

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
0
Reply james.kanze (9620) 7/16/2009 8:59:14 AM

On Jul 15, 11:02 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> On Jul 15, 1:39 am, p...@informatimago.com (Pascal J. Bourguignon)
> wrote:

    [...]
> I would very much want
> every compiler out there to use "fat" pointers and other techniques to
> catch all undefined behavior, either at compile time or runtime. I
> also very much want this to be entirely optional, and for it to be
> expressly stated that no "good" C++ program should depend upon such
> checks; they should exist only as "terminate the process" asserts
> only.

The "option" is trickier than you seem to realize.  Anything
which changes the size of an object (e.g. fat pointers vs.
normal pointers) breaks the ABI.  You can't link object files
compiled with different options.  (Or maybe you can link, but
the resulting program will just crash.)

Note that this is already the case with compilers which provide
"debugging" versions of std::vector and others.  The debugging
changes the size and the behavior of std::vector, and mixing
code with and without debugging causes core dumps.
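A rough sketch of why such an option changes the ABI. `FatPtr` is a hypothetical type, not something any compiler mentioned here provides: it stores the current address plus both limits, so it is at least three raw pointers wide, and any struct or function signature containing one has a different layout from the same code built with plain pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical fat pointer: current address plus the limits of the array.
template <typename T>
struct FatPtr {
    T* cur;
    T* begin;
    T* end;   // one past the last element

    FatPtr(T* base, std::size_t n) : cur(base), begin(base), end(base + n) {}

    // Every change of the address checks that it stays within [begin, end].
    FatPtr& operator+=(std::ptrdiff_t d) {
        if (d < begin - cur || d > end - cur) {
            throw std::out_of_range("pointer arithmetic out of bounds");
        }
        cur += d;
        return *this;
    }

    // Every access checks that the pointer is not the end pointer.
    T& operator*() const {
        if (cur == end) throw std::out_of_range("dereference of end pointer");
        return *cur;
    }
};

// Three pointer members: the object cannot be smaller than three raw pointers.
static_assert(sizeof(FatPtr<int>) >= 3 * sizeof(int*), "fat pointers are fat");
```

The sketch throws where a real checking implementation would more likely terminate the process; the exception is used here only to keep the example easy to exercise.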

> > For most programs, we don't care about the speed.

> Agreed, and with the current state of the C++ industry, C++ is
> not the best language for every situation. Perhaps Java is
> more useful for most programs.

Only if you can accept a fairly low level of robustness.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Reply james.kanze (9620) 7/16/2009 9:09:21 AM

Michael Oswald wrote:
> Stuart Redmann wrote:
>> On 11 Jul., 00:54, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>>
>> Just to add my two cents:
>> 1. C++ lets you do everything, so chances are not bad that you can
>> go beyond your depth. In contrast to this, JAVA restricts your
>> abilities (no messing around with pointers), which makes your code
>> inherently safer. I think both are inferior to programming
>> languages like Ada95. Ada has a real type system (something that
>> neither C++ nor JAVA has) and will perform zounds of checks (it is
>> the only language I know that handles integer overflows). Since
>> these checks give you a lot of performance penalties, you have to
>> provide additional information about which checks can be omitted.
>
> Well, I am no expert on Ada, but I had a look at Ada 2005 when
> searching for other languages to learn and wrote only some simple
> programs. I finally changed to Haskell and Ocaml just to learn some
> new principles of programming.
>
> Anyway, the Ada people claim that a lot of these checks can be
> optimised out by the compiler and the remaining ones are rather
> inexpensive.

This was designed into the language from the beginning, so Ada arrays 
know their size so you can iterate over a'range. No need to range 
check.

for i in a'range loop
  a(i) :=1;  -- always in range
end loop;


Also, the index type can be a subtype restricted to the allowed range 
of the array type. As the index then just cannot be out of range, 
there is no need to check a(i).


In C++ we have it differently.


Bo Persson


Reply bop (1070) 7/16/2009 12:26:48 PM

"Bo Persson" <bop@gmb.dk> writes:

> Michael Oswald wrote:
>> Stuart Redmann wrote:
>>> On 11 Jul., 00:54, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>>>
>>> Just to add my two cents:
>>> 1. C++ lets you do everything, so chances are not bad that you can
>>> go beyond your depth. In contrast to this, JAVA restricts your
>>> abilities (no messing around with pointers), which makes your code
>>> inherently safer. I think both are inferior to programming
>>> languages like Ada95. Ada has a real type system (something that
>>> neither C++ nor JAVA has) and will perform zounds of checks (it is
>>> the only language I know that handles integer overflows). Since
>>> these checks give you a lot of performance penalties, you have to
>>> provide additional information about which checks can be omitted.
>>
>> Well, I am no expert on Ada, but I had a look at Ada 2005 when
>> searching for other languages to learn and wrote only some simple
>> programs. I finally changed to Haskell and Ocaml just to learn some
>> new principles of programming.
>>
>> Anyway, the Ada people claim that a lot of these checks can be
>> optimised out by the compiler and the remaining ones are rather
>> inexpensive.
>
> This was designed into the language from the beginning, so Ada arrays 
> know their size so you can iterate over a'range. No need to range 
> check.
>
> for i in a'range loop
>   a(i) :=1;  -- always in range
> end loop;
>
>
> Also, the index type can be a subtype restricted to the allowed range 
> of the array type. As the index then just cannot be out of range, 
> there is no need to check a(i).
>
>
> In C++ we have it differently.

Notice that with intensive use of classes, we could achieve similar results:

template <int MIN,int MAX>class Integer{
int value;
public:
  Integer(int aValue){rangeCheck(MIN,MAX,aValue);value=aValue;}
  // operators...
};

std::vector<X> v;
for(Integer<0,v.size()-1> i=0;i<v.size()-1;i++){
    v[i]; // no check needed.
}


Ok, perhaps a more intelligent compiler and some work on the syntax is
needed, but you get the idea.
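One way to make the idea compile today is sketched below, under the assumption that the bounds are compile-time constants (so it works for built-in arrays, not for a `std::vector` whose size is only known at run time). `checked` and `sum_checked` are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>

// Sketch of a range-checked integer with compile-time bounds.
template <int Min, int Max>
class Integer {
public:
    Integer(int v) : value_(checked(v)) {}
    Integer& operator++() { value_ = checked(value_ + 1); return *this; }
    operator int() const { return value_; }

private:
    // Hypothetical helper: throws if v lies outside [Min, Max].
    static int checked(int v) {
        if (v < Min || v > Max) throw std::out_of_range("Integer out of range");
        return v;
    }
    int value_;
};

// Indexing an array of N elements with Integer<0, N - 1> needs no further check.
template <int N>
int sum_checked(const int (&a)[N]) {
    int total = 0;
    for (int k = 0; k < N; ++k) {
        Integer<0, N - 1> i = k;  // the range is checked here, once
        total += a[i];            // the index is in range by construction
    }
    return total;
}
```

The loop counter itself stays a plain `int`, because incrementing an `Integer<0, N - 1>` past its last valid value would throw at the end of the loop.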


Now, perhaps it might be slightly easier to write: for i in a'range...


-- 
__Pascal Bourguignon__
Reply pjb (7667) 7/16/2009 1:29:12 PM

Pascal J. Bourguignon wrote:
> "Bo Persson" <bop@gmb.dk> writes:
> 
>> Michael Oswald wrote:
>>> Stuart Redmann wrote:
>>>> On 11 Jul., 00:54, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>>>>
>>>> Just to add my two cents:
>>>> 1. C++ lets you do everything, so chances are not bad that you can
>>>> go beyond your depth. In contrast to this, JAVA restricts your
>>>> abilities (no messing around with pointers), which makes your code
>>>> inherently safer. I think both are inferior to programming
>>>> languages like Ada95. Ada has a real type system (something that
>>>> neither C++ nor JAVA has) and will perform zounds of checks (it is
>>>> the only language I know that handles integer overflows). Since
>>>> these checks give you a lot of performance penalties, you have to
>>>> provide additional information about which checks can be omitted.
>>> Well, I am no expert on Ada, but I had a look at Ada 2005 when
>>> searching for other languages to learn and wrote only some simple
>>> programs. I finally changed to Haskell and Ocaml just to learn some
>>> new principles of programming.
>>>
>>> Anyway, the Ada people claim that a lot of these checks can be
>>> optimised out by the compiler and the remaining ones are rather
>>> inexpensive.
>> This was designed into the language from the beginning, so Ada arrays 
>> know their size so you can iterate over a'range. No need to range 
>> check.
>>
>> for i in a'range loop
>>   a(i) :=1;  -- always in range
>> end loop;
>>
>>
>> Also, the index type can be a subtype restricted to the allowed range 
>> of the array type. As the index then just cannot be out of range, 
>> there is no need to check a(i).
>>
>>
>> In C++ we have it differently.
> 
> Notice that with intensive use of classes, we could achieve similar results:
> 
> template <int MIN,int MAX>class Integer{
> int value;
> public:
>   Integer(int aValue){rangeCheck(MIN,MAX,aValue);value=aValue;}
>   // operators...
> };
> 
> std::vector<X> v;
> for(Integer<0,v.size()-1> i=0;i<v.size()-1;i++){
>     v[i]; // no check needed.
> }

Or use tr1::array and iterators.  This is closer to the Ada concept of 
the array knowing its size.
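A small sketch of that suggestion, assuming C++11, where `tr1::array` became `std::array`. The function name `fill_ones` is made up. The size is part of the type, and a loop over the array's own iterators cannot step outside it, much like the Ada loop over `a'range`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: the array's size N travels with its type, and the
// range-based for loop walks begin() to end(), so no index can be out of range.
template <std::size_t N>
void fill_ones(std::array<int, N>& a) {
    for (int& element : a) {
        element = 1;   // always in range
    }
}
```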

-- 
Ian Collins
Reply ian-news (9908) 7/16/2009 8:24:12 PM

On Jul 16, 2:09 am, James Kanze <james.ka...@gmail.com> wrote:
> On Jul 15, 11:02 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>
> > On Jul 15, 1:39 am, p...@informatimago.com (Pascal J. Bourguignon)
> > wrote:
>
>     [...]
>
> > I would very much want
> > every compiler out there to use "fat" pointers and other techniques to
> > catch all undefined behavior, either at compile time or runtime. I
> > also very much want this to be entirely optional, and for it to be
> > expressly stated that no "good" C++ program should depend upon such
> > checks; they should exist only as "terminate the process" asserts
> > only.
>
> The "option" is trickier than you seem to realize.  Anything
> which changes the size of an object (e.g. fat pointers vs.
> normal pointers) breaks the ABI.  You can't link object files
> compiled with different options.  (Or maybe you can link, but
> the resulting program will just crash.)

Indeed. I mentioned exactly this in the exact same post you quote.

> Note that this is already the case with compilers which provide
> "debugging" versions of std::vector and others.  The debugging
> changes the size and the behavior of std::vector, and mixing
> code with and without debugging causes core dumps.

I just wish they actually caught all errors / undefined behavior in a
systematic fashion instead of in the current haphazard way.

> > > For most programs, we don't care about the speed.
> > Agreed, and with the current state of the C++ industry, C++ is
> > not the best language for every situation. Perhaps Java is
> > more useful for most programs.
>
> Only if you can accept a fairly low level of robustness.

This intrigues me. If you elaborate or point me to articles, I'd love
to read up on this. IMHO, I could probably write an application faster
in C++ and have it be "more correct" (aka less testing / bug-fixing
time), but the same probably isn't true of the average developer. I'm
just curious how you're defining "robustness". Are we talking real-
time? Or correct in the face of errors? Stuff like how it's easier to
leak file handles, other non-memory resources? Or how it's exceedingly
annoying and difficult to write correct code in the face of
"dispose" / "close" / "release" calls which can throw exceptions? Or
are we talking about how it's impossible to write correct code in the
face of asynchronous exceptions?
Reply joshuamaurice (576) 7/16/2009 9:34:02 PM

In article <32ad876d-2051-43d9-97a2-cd09026eeac8
@o6g2000yqj.googlegroups.com>, james.kanze@gmail.com says...

[ ... ]

> Well, it's a start, although it's still very limited.

Beyond a doubt -- I certainly didn't intend to imply that it was any 
sort of panacea. At the same time, I suspect for some types of code 
it's _really_ helpful.

[ ... ]

> Which wasn't really what I was talking about.  My point was that
> having done &v[i], you have a raw pointer, on which you can do
> pointer arithmetic, and access out of bounds without checking,
> even if the implementation checks in vector<>::operator[].

Right -- my point wasn't that the checks were the same, or anything 
like that, just that (interestingly enough) taking an address happens 
to break both.

> But the documentation on how VC++ detects uninitialized
> variables did seem weird to me.  For a runtime check, I'd have
> just associated an additional flag somewhere, setting it when
> the variable was set, and checking it otherwise.  Perhaps the
> problem is that if initialization occurs through a pointer, the
> compiler can't generate the code to update the flag, so if the
> address is taken (which would allow such initialization), it
> just gives up.

That's my guess. For code that knows to do so, updating the flag is 
easy -- but if you pass the address to the OS (for example) to read 
from a file into a buffer, it's going to take a (substantial) update 
to the ABI for the OS to find and update the flag appropriately. 
Basically, you wouldn't be able to pass raw addresses to the OS 
anymore -- you'd have to use some sort of fat pointer. I'm not sure 
that's an entirely bad idea either, but I'm afraid in the open market 
such an OS would tend to disappear without a trace. Too much emphasis 
is still placed on raw speed for such things to survive.

OTOH, I'd almost bet that the little bit MS has done basically came 
from the OS side of the house -- specifically, I'd almost bet that 
some bright boy (and I do NOT mean that pejoratively at all) thought 
about the number of times they've run into problems from simple 
buffer overruns and such, and thought that since the programmers 
weren't preventing or catching such errors dependably, it would be a 
good idea to see how much they could do in the compiler instead.

-- 
    Later,
    Jerry.
Reply jerryvcoffin (233) 7/17/2009 3:19:22 AM

this was originally on comp.lang.c++
but discussions about Undefined Behaviour seem on-topic to comp.lang.c
as well


On 16 July, 09:47, James Kanze <james.ka...@gmail.com> wrote:
> On Jul 15, 3:18 pm, NickKeighley<nick_keighley_nos...@hotmail.com>
> wrote:
> > On 14 July, 17:09, James Kanze <james.ka...@gmail.com> wrote:
> > > On Jul 14, 10:40 am, Stuart Redmann <DerTop...@web.de> wrote:
> > > > On 11 Jul., 00:54, Joshua Maurice <joshuamaur...@gmail.com> wrote:


> > > > [snipped discussion about run-time error when OP accessed
> > > > uninitialized memory. OP complained that C++ compiler (gcc)
> > > > could not detect this even though its warning level was set to
> > > > highest]
>
> > > > > You seem to be taking the opinion that compilers should
> > > > > catch all undefined behavior. C++ is not Java. C++'s stated
> > > > > primary design goals include
>
> > > > > - runtime performance comparable with assembly
> > > > > - don't pay for what you don't use
> > > > > - portable
> > > > > - easy to write code / programmer productivity (with less relative
> > > > > emphasis on this one IMHO)
> > > > > With these design goals in mind, it is not reasonable to
> > > > > expect a compiler to catch all possible undefined behavior
> > > > > or errors. To do that would necessarily restrict the
> > > > > language so that it's less comparable to assembly in speed
> > > > > and/or you start paying for things you don't use.
>
> > > That's not strictly true.  Both the C and the C++ standards
> > > were designed so that all undefined behavior can be caught.

this surprised me


> > really? Where does it say that? Do you mean at compile time or
> > at run-time?
>
> At run-time, at the latest.  (I think that there is some which
> can't be detected at compile time.  But probably a lot less than
> one might think---compilers have gotten quite good at tracing
> intermodular code flow.)  And it's scattered throughout the
> standard.  Mostly in the form of "undefined behavior"---the
> behavior is undefined precisely so that a checking
> implementation can trap it.

I thought a fair amount of Undefined Behaviour was implicit.
That is, no behaviour was defined, therefore the behaviour was
undefined, rather than there being an explicit statement that
"this behaviour is not defined".  I'm pretty sure this is true
of C if not C++.


> > I'd always thought about half of UB was in the spec precisely
> > because it was too hard to detect.
>
> Too hard, no.  Too expensive, perhaps: to catch all pointer
> violations, you need "fat" pointers---each pointer contains a
> current address, plus the limits, each modification of the
> pointer value verifies that the current address stays in the
> limits, and each access through the pointer verifies that it
> isn't using the end pointer (and that the pointer isn't null,
> but most hardware traps this already today).
>
> Of course, a good compiler could eliminate a certain number of
> these checks, or at least hoist them outside of a loop.  But I
> don't think it could easily avoid the fact that the size of a
> pointer is multiplied by three, which makes things like copying
> significantly more expensive, and can have very negative effects
> on locality.
>
> > The other half was hardware stuff things like what the modulo
> > operator does with negative numbers
>
> That's unspecified, not undefined behavior.
>
> > gets()
>
> Takes a pointer.  If the pointer contains the bounds, then it
> can easily check.

<snip>


--
Nick Keighley

"Don't spare the neurogrinders!"
                    Toyd Numble V1
Reply nick_keighley_nospam (4575) 7/17/2009 7:15:20 AM

On Jul 17, 9:15 am, Nick Keighley <nick_keighley_nos...@hotmail.com>
wrote:
> this was originally on comp.lang.c++ but discussions about
> Undefined Behaviour seem on-topic to comp.lang.c as well

Given that C++ just takes over the C definition here.

> On 16 July, 09:47, James Kanze <james.ka...@gmail.com> wrote:
> > > > > > With these design goals in mind, it is not reasonable to
> > > > > > expect a compiler to catch all possible undefined behavior
> > > > > > or errors. To do that would necessarily restrict the
> > > > > > language so that it's less comparable to assembly in speed
> > > > > > and/or you start paying for things you don't use.

> > > > That's not strictly true.  Both the C and the C++ standards
> > > > were designed so that all undefined behavior can be caught.

> this surprised me

On thinking about it, I probably overstated it.  The context of
the discussion was things like array bounds and pointer errors,
and that's really what I had in mind.  Although I think things
like i = ++i are catchable, I don't think that the intent of
making it undefined was to allow it to be caught at runtime.

There are still large categories of behavior which are undefined
expressly to allow an implementation to catch them; arithmetic
overflow and array bounds and pointer errors are in this
category.
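For the arithmetic overflow case, a portable sketch of what such a check amounts to. `checked_add` is a hypothetical function; a checking implementation would presumably trap rather than throw. The test is done before the addition, so the undefined operation itself is never executed.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical helper: adds two ints, refusing any sum that would overflow.
// The comparisons are arranged so that they cannot overflow themselves.
int checked_add(int a, int b) {
    if (b > 0 && a > std::numeric_limits<int>::max() - b) {
        throw std::overflow_error("int overflow");
    }
    if (b < 0 && a < std::numeric_limits<int>::min() - b) {
        throw std::overflow_error("int overflow");
    }
    return a + b;
}
```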

> > > really? Where does it say that? Do you mean at compile time or
> > > at run-time?

> > At run-time, at the latest.  (I think that there is some which
> > can't be detected at compile time.  But probably a lot less than
> > one might think---compilers have gotten quite good at tracing
> > intermodular code flow.)  And it's scattered throughout the
> > standard.  Mostly in the form of "undefined behavior"---the
> > behavior is undefined precisely so that a checking
> > implementation can trap it.

> I thought a fair amount of Undefined Behaviour was implicit.
> That is, no behaviour was defined, therefore the behaviour was
> undefined, rather than there being an explicit statement that "this
> behaviour is not defined".  I'm pretty sure this is true of C
> if not C++.

I don't think so.  I think that almost all of the cases of
undefined behavior are explicitly stated as such.  I think that
the rule of undefined behavior when the standard doesn't say
anything is mainly there to catch oversights.  What "undefined
behaviors" did you have in mind?

(The typical examples of undefined behavior are all explicitly
stated as undefined: pointer and array bounds errors in the
specifications of the various operators on pointers, things like
i=++i in the header text for the Expressions section, illegal
operands to functions in the introductory text of the Library
section, and violations of what C++ calls the one definition
rule in section 3.2 in C++, and in section 6.2.7 in C.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Reply james.kanze (9620) 7/17/2009 8:40:48 AM

On Jul 17, 5:19 am, Jerry Coffin <jerryvcof...@yahoo.com> wrote:
> In article <32ad876d-2051-43d9-97a2-cd09026eeac8
> @o6g2000yqj.googlegroups.com>, james.ka...@gmail.com says...
> [ ... ]

> > Which wasn't really what I was talking about.  My point was
> > that having done &v[i], you have a raw pointer, on which you
> > can do pointer arithmetic, and access out of bounds without
> > checking, even if the implementation checks in
> > vector<>::operator[].

> Right -- my point wasn't that the checks were the same, or
> anything like that, just that (interestingly enough) taking an
> address happens to break both.

Yes.  The problem is that raw pointers are, well, very raw.  And
that in C and C++, you have to use them in contexts where you
really shouldn't, since arrays convert to a pointer at the drop
of a hat, and array indexing is defined in terms of pointer
arithmetic.  (Basically, pointer arithmetic should be reserved
for very low level code, like that inside malloc, and not be
used elsewhere.  In C or C++, however, you often don't have the
choice.)
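A short sketch of the decay being described, with hypothetical helper names `element_count` and `sum`. Binding an array by reference keeps its size in the type; passing it where a pointer is expected silently drops the size, so the length has to travel separately and nothing verifies it.

```cpp
#include <cassert>
#include <cstddef>

// Binding the array by reference keeps its size: N is deduced from the type.
template <typename T, std::size_t N>
std::size_t element_count(const T (&)[N]) { return N; }

// A pointer parameter has lost the size, so it must be passed separately,
// and nothing verifies that n matches the real array.
int sum(const int* p, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];   // p[i] is defined as *(p + i)
    }
    return total;
}
```

A call such as `sum(a, element_count(a))` is correct, but `sum(a, 100)` on a four-element array compiles just as readily.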

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Reply james.kanze (9620) 7/17/2009 8:49:01 AM

On Jul 16, 11:34 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> On Jul 16, 2:09 am, James Kanze <james.ka...@gmail.com> wrote:

    [...]
> > > > For most programs, we don't care about the speed.
> > > Agreed, and with the current state of the C++ industry, C++ is
> > > not the best language for every situation. Perhaps Java is
> > > more useful for most programs.

> > Only if you can accept a fairly low level of robustness.

> This intrigues me. If you elaborate or point me to articles,
> I'd love to read up on this. IMHO, I could probably write an
> application faster in C++ and have it be "more correct" (aka
> less testing / bug-fixing time), but the same probably isn't
> true of the average developer.

I don't think it's possible in Java to reach the level of
robustness I generally require, regardless of the developer.
There are too many things that simply aren't possible, like
programming by contract (which means executable code in the
"interface", which Java doesn't allow), or RAII.  Or simply
being able to abort on an assertion failure.
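One common C++ way to put executable contract code in an interface is a public non-virtual function that checks the contract and forwards to a private virtual function. The sketch below is only an illustration of that idiom; the classes `SquareRoot` and `LibmSquareRoot` and the tolerance used in the postcondition are invented for the example. `assert` aborts the process on failure unless `NDEBUG` is defined.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical interface: the contract lives in the non-virtual public
// function, so every implementation is checked in the same way.
class SquareRoot {
public:
    virtual ~SquareRoot() {}

    double compute(double x) const {
        assert(x >= 0.0);                       // precondition
        double r = do_compute(x);
        assert(r >= 0.0);                       // postcondition: non-negative
        assert(std::fabs(r * r - x) <= 1e-9 * (x + 1.0));  // and close to a root
        return r;
    }

private:
    // Implementations override this and cannot bypass the checks above.
    virtual double do_compute(double x) const = 0;
};

// Hypothetical implementation that simply forwards to std::sqrt.
class LibmSquareRoot : public SquareRoot {
private:
    double do_compute(double x) const { return std::sqrt(x); }
};
```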

I think that there are pre-processors which resolve some of
Java's problems in this respect---I've heard of one for
programming by contract, for example, and I'd like to see
something like ESC/Java for C++ (and a couple of quick checks on
the web suggest that Java is evolving to address these
problems).

> I'm just curious how you're defining "robustness".

Vaguely:-).  Basically, just that the code is known to be
correct, to a certain point, and that any errors will be
promptly detected and can easily be fixed.

> Are we talking real-time?

No.

> Or correct in the face of errors?

Possibly.  If you accept that the correct behavior in the case
of a programming error is to abort (which is usually a
requirement in my work), then it's impossible to write code with
this behavior in Java.

> Stuff like how it's easier to leak file handles, other
> non-memory resources? Or how it's exceedingly annoying and
> difficult to write correct code in the face of "dispose" /
> "close" / "release" calls which can throw exceptions?

Partially.  The lack of RAII does make certain types of errors
more likely, or harder to prevent (and missing finally blocks
are a common error in Java).
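For comparison, a minimal RAII sketch in C++. `Handle` and `use_and_fail` are hypothetical, and a counter stands in for a real file handle or lock. The destructor releases the resource on every exit path, including an exception, so there is no finally block to forget.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource: a counter stands in for a file handle or a lock.
class Handle {
public:
    explicit Handle(int& open_count) : open_count_(open_count) { ++open_count_; }
    ~Handle() { --open_count_; }   // release happens here, on every exit path

private:
    Handle(const Handle&);             // non-copyable (declared, not defined)
    Handle& operator=(const Handle&);
    int& open_count_;
};

// Acquires the resource, then fails. The destructor still runs during
// stack unwinding, so the resource is released without any cleanup code here.
void use_and_fail(int& open_count) {
    Handle h(open_count);
    assert(open_count == 1);
    throw std::runtime_error("something went wrong");
}
```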

> Or are we talking about how it's impossible to write correct
> code in the face of asynchronous exceptions?

I'm not too sure what you mean here.  As far as I know, neither
Java nor C++ support what I would call an asynchronous
exception.  On the other hand, the fact that you can't guarantee
a function to never raise an exception in Java does mean that
you can't write really exception safe code.

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Reply james.kanze (9620) 7/17/2009 9:01:13 AM

James Kanze wrote:
> On Jul 17, 9:15 am, Nick Keighley <nick_keighley_nos...@hotmail.com>
> wrote:
>> [...]
>> I thought a fair amount of Undefined Behaviour was implicit.
>> That is, no behaviour was defined, therefore the behaviour was
>> undefined, rather than there being an explicit statement that "this
>> behaviour is not defined".  I'm pretty sure this is true of C
>> if not C++.
> 
> I don't think so.  I think that almost all of the cases of
> undefined behavior are explicitly stated as such.  I think that
> the rule of undefined behavior when the standard doesn't say
> anything is mainly there to catch oversights.  What "undefined
> behaviors" did you have in mind?
> [...]

     The Committee's reasons for using three means to declare
behavior "undefined" are unknown to me, but there is no
difference in effect or in quality between "undefined due
to violation," "explicitly undefined" and "undefined by
omission."  ISO/IEC 9899:1999, section 4 paragraph 2:

	If a ‘‘shall’’ or ‘‘shall not’’ requirement that
	appears outside of a constraint is violated, the
	behavior is undefined. Undefined behavior is otherwise
	indicated in this International Standard by the words
	‘‘undefined behavior’’ or by the omission of any
	explicit definition of behavior. There is no difference
	in emphasis among these three; they all describe
	‘‘behavior that is undefined’’.

The final sentence says it all (and says it normatively!):
All three means of un-definition are equivalent.  (In C, at
any rate: I don't know That Other Language.)

-- 
Eric Sosman
esosman@ieee-dot-org.invalid
Reply esosman2 (2945) 7/17/2009 12:02:08 PM

Eric Sosman wrote:
> James Kanze wrote:
>> On Jul 17, 9:15 am, Nick Keighley <nick_keighley_nos...@hotmail.com>
>> wrote:
>>> [...]
>>> I thought a fair amount of Undefined Behaviour was implicit.
>>> That is, no behaviour was defined, therefore the behaviour was
>>> undefined, rather than there being an explicit statement that "this
>>> behaviour is not defined".  I'm pretty sure this is true of C
>>> if not C++.
>>
>> I don't think so.  I think that almost all of the cases of
>> undefined behavior are explicitly stated as such.  I think that
>> the rule of undefined behavior when the standard doesn't say
>> anything is mainly there to catch oversights.  What "undefined
>> behaviors" did you have in mind?
>> [...]
> 
>     The Committee's reasons for using three means to declare
> behavior "undefined" are unknown to me, but there is no
> difference in effect or in quality between "undefined due
> to violation," "explicitly undefined" and "undefined by
> omission."  ISO/IEC 9899:1999, section 4 paragraph 2:
> 
>     If a ‘‘shall’’ or ‘‘shall not’’ requirement that
>     appears outside of a constraint is violated, the
>     behavior is undefined. Undefined behavior is otherwise
>     indicated in this International Standard by the words
>     ‘‘undefined behavior’’ or by the omission of any
>     explicit definition of behavior. There is no difference
>     in emphasis among these three; they all describe
>     ‘‘behavior that is undefined’’.
> 
> The final sentence says it all (and says it normatively!):
> All three means of un-definition are equivalent.  (In C, at
> any rate: I don't know That Other Language.)

The C++ standard does not mention "shall" as a method of indicating 
undefined behavior. Section 1.3.13 says "Undefined behavior may also be 
expected when this International Standard omits the description of any 
explicit definition of behavior.", which strikes me as bad wording - the 
phrase "may ... be expected" reflects and reinforces the misconception 
that "undefined behavior" refers to a specific type of undesirable 
behavior.

I think that the C++ wording provides more support for James Kanze's 
opinion than the C wording does.
Reply jameskuyper (5207) 7/17/2009 12:26:34 PM

On 17 July, 09:40, James Kanze <james.ka...@gmail.com> wrote:
> On Jul 17, 9:15 am, Nick Keighley <nick_keighley_nos...@hotmail.com>
> wrote:
>
> > this was originally on comp.lang.c++ but discussions about
> > Undefined Behaviour seem on-topic to comp.lang.c as well
>
> Given that C++ just takes over the C definition here.

which is why I thought it was a legitimate x-post



> > On 16 July, 09:47, James Kanze <james.ka...@gmail.com> wrote:
> > > > > > > With these design goals in mind, it is not reasonable to
> > > > > > > expect a compiler to catch all possible undefined behavior
> > > > > > > or errors. To do that would necessarily restrict the
> > > > > > > language so that it's less comparable to assembly in speed
> > > > > > > and/or you start paying for things you don't use.
> > > > > > That's not strictly true.  Both the C and the C++ standards
> > > > > > were designed so that all undefined behavior can be caught.
> > > this surprised me
>
> On thinking about it, I probably overstated it.  The context of
> the discussion was things like array bounds and pointer errors,
> and that's really what I had in mind.  Although I think things
> like i = ++i are catchable, I don't think that the intent of
> making it undefined was to allow it to be caught at runtime.

Ah, that's what I was disputing. Or rather, that's what took me by
surprise: I'd always kind of assumed they bunged in UB just to make
the implementor's job easier.


> There are still large categories of behavior which are undefined
> expressly to allow an implementation to catch them; arithmetic
> overflow and array bounds and pointer errors are in this
> category.
>
> > > > really? Where does it say that? Do you mean at compile time or
> > > > at run-time?
> > > At run-time, at the latest.  (I think that there is some which
> > > can't be detected at compile time.  But probably a lot less than
> > > one might think---compilers have gotten quite good at tracing
> > > intermodular code flow.)  And it's scattered throughout the
> > > standard.  Mostly in the form of "undefined behavior"---the
> > > behavior is undefined precisely so that a checking
> > > implementation can trap it.
> > I thought a fair amount of Undefined Behaviour was implicit.
> > That is, no behaviour was defined, therefore the behaviour was
> > undefined, rather than there being an explicit statement that "this
> > behaviour is not defined".  I'm pretty sure this is true of C
> > if not C++.
>
> I don't think so.  I think that almost all of the cases of
> undefined behavior are explicitly stated as such.  I think that
> the rule of undefined behavior when the standard doesn't say
> anything is mainly there to catch oversights.  What "undefined
> behaviors" did you have in mind?
>
> (The typical examples of undefined behavior are all explicitly
> stated as undefined: pointer and array bounds errors in the
> specifications of the various operators on pointers, things like
> i=++i in the header text for the Expressions section, illegal
> operands to functions in the introductory text of the Library
> section, and violations of what C++ calls the one definition
> rule in section 3.2 in C++, and in section 6.2.7 in C.)
Reply nick_keighley_nospam (4575) 7/17/2009 2:04:34 PM

On Jul 17, 2:01 am, James Kanze <james.ka...@gmail.com> wrote:
> On Jul 16, 11:34 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> > Or are we talking about how it's impossible to write correct
> > code in the face of asynchronous exceptions?
>
> I'm not too sure what you mean here.  As far as I know, neither
> Java nor C++ support what I would call an asynchronous
> exception.  On the other hand, the fact that you can't guarantee
> a function to never raise an exception in Java does mean that
> you can't write really exception safe code.

http://java.sun.com/docs/books/jls/first_edition/html/11.doc.html

details asynchronous exceptions. I haven't thought it through
thoroughly enough, but at the very least it would be impossible in
practice to write correct code in the face of asynchronous
exceptions, and it may in fact be impossible outright. It depends on
exactly what they call "a transfer of control" and "statements" with
regard to when an asynchronous exception can be raised. However, this
is getting a little off topic, so I guess I'll leave it at that.
Reply joshuamaurice (576) 7/17/2009 6:01:45 PM

"Nick Keighley" <nick_keighley_nospam@hotmail.com> wrote in message
news:adfbac3b-0be5-4c03-beb5-53d06179f6cf@c1g2000yqi.googlegroups.com...
this was originally on comp.lang.c++
but discussions about Undefined Behaviour seem on-topic to comp.lang.c
as well

A CPU cannot have undefined behaviour, because
if the CPU is in state X and it reads the instruction "a",
the result will always be state X'.

The same holds for the CPU-OS pair:
if the CPU-OS pair is in state XX and it reads the instruction "a",
the result will always be state XX'.

UB exists only in the standards.



Reply io_x 7/18/2009 5:48:33 AM

In article <4a6160c1$0$47540$4fafbaef@reader1.news.tin.it>, 
a@b.c.invalid says...
> 
> "Nick Keighley" <nick_keighley_nospam@hotmail.com> wrote in message
> news:adfbac3b-0be5-4c03-beb5-53d06179f6cf@c1g2000yqi.googlegroups.com...
> > this was originally on comp.lang.c++
> > but discussions about Undefined Behaviour seem on-topic to comp.lang.c
> > as well
> 
> A CPU cannot have undefined behaviour, because
> if the CPU is in state X and it reads the instruction "a",
> the result will always be the state X'.
> 
> The same holds for the CPU-OS pair:
> if the CPU-OS is in state XX and it reads the instruction "a",
> the result will always be the state XX'.
> 
> UB exists only in the standards.

Not really.

First of all, some CPUs have instructions that cause undefined 
results -- and while on a _specific_ CPU, the result of execution may 
be predictable, different versions of the CPU, down to and including 
different steppings, may give different behavior for that 
instruction.

In other cases, the behavior even on a single CPU could be 
unpredictable -- just for example, Intel has included a thermal diode 
in some of their CPUs that's intended as high quality (albeit slow) 
source of truly random numbers. While there are certainly defined 
ways to access that diode, it's entirely possible that executing some 
undefined instruction could do so as well -- and at least part of the 
result state after doing so could be entirely unpredictable.

-- 
    Later,
    Jerry.
Reply jerryvcoffin (233) 7/18/2009 6:08:30 AM

>but discussions about Undefined Behaviour seem on-topic to comp.lang.c
>as well
>
>A CPU cannot have undefined behaviour, because
>if the CPU is in state X and it reads the instruction "a",
>the result will always be the state X'.

This isn't technically true if the CPU reads data from outside the
CPU (e.g. memory, memory-mapped I/O, I/O-port-mapped I/O, or even
the environmental sensors in the CPU for CPU temperature):  the
register contents (which is part of the state) will change depending
on the outside data.  Of course this is expected to happen, but the
resulting state is not fixed.  It's especially true if the CPU has
a special instruction to fetch the contents of a random-number
generator, which some CPUs in the Pentium class have.

Try reading the manuals for a CPU sometime.  It's not that uncommon
for there to be opcode combinations whose action is not defined
(and may differ between different models of the same CPU family).
It's also not that uncommon for a bit in a special-purpose register
to be documented as "reserved", and you're only supposed to write
back what was read out of it in the first place.  Future versions
of the same CPU family may define what this bit does.  Intel processor
documentation for the x86 family does this a lot, with things like
the EFLAGS register and various machine-specific registers.

Some CPUs dump out a bunch of status information for situations like
a "machine check" where some error occurred.  If you mess with the
status information, then try to use it to restart where you left off,
you may end up with undefined and bizarre behavior, since some of
this relates to the internal state of undocumented stuff like
CPU pipelining, assignment of physical registers to logical registers,
etc.

>The same holds for the CPU-OS pair:
>if the CPU-OS is in state XX and it reads the instruction "a",
>the result will always be the state XX'.

It's much harder to make hard statements like this for a multi-tasking
OS or a multi-core CPU.  Especially, external timing of things like
disk I/O or network packets can change things a lot.

>UB exists only in the standards.

There are standards for CPUs and CPU families also, even if it's
only one created by the manufacturer.

Reply gordonb.fye78 (1) 7/18/2009 6:24:35 AM

"Gordon Burditt" <gordonb.fye78@burditt.org> wrote in message
news:zeidnRAKJvA-9_zXnZ2dnUVZ_tmdnZ2d@posted.internetamerica...
>>but discussions about Undefined Behaviour seem on-topic to comp.lang.c
>>as well
>>
>>A CPU cannot have undefined behaviour, because
>>if the CPU is in state X and it reads the instruction "a",
>>the result will always be the state X'.

>>UB exists only in the standards.
>
> There are standards for CPUs and CPU families also, even if it's
> only one created by the manufacturer.

I am not talking about standards; I am talking about a real CPU.
If a 386 CPU in state X (eax=1, ebx=19, ecx=20 ...)
reads the binary encoding of "add eax, ebx",
the result will always be the CPU state
X' (eax=20, ebx=19, ecx=20 ...).

If it is not X', that is an error in the CPU.

If a 386 CPU in state X
reads the binary encoding of "Herbert",
and the binary encoding of "Herbert" is not an instruction,
the result will always be some state Y; it should be defined.




Reply io_x 7/18/2009 7:55:51 AM

In article <4a617e9c$0$832$4fafbaef@reader5.news.tin.it>, 
a@b.c.invalid says...

[ ... ]

> I am not talking about standards; I am talking about a real CPU.
> If a 386 CPU in state X (eax=1, ebx=19, ecx=20 ...)
> reads the binary encoding of "add eax, ebx",
> the result will always be the CPU state
> X' (eax=20, ebx=19, ecx=20 ...).

Yes, but what if what's executed is 'add eax, [ebx]' instead? If ebx 
happens to point to uninitialized memory, you don't know what you'll 
get in eax, and it will probably vary from one invocation of the 
program to the next. If ebx starts out set to zero (or another small 
number, typically anything less than 4 million or so) quite a few 
OSes will detect that you're accessing an illegal address, and halt 
the program with some sort of error message about it doing something 
illegal (of course, the exact message varies between OSes).

-- 
    Later,
    Jerry.
Reply jerryvcoffin (233) 7/18/2009 8:50:26 AM

"Jerry Coffin" <jerryvcoffin@yahoo.com> wrote in message
news:MPG.24cb3ab24a1089099896d6@news.sunsite.dk...
> In article <4a617e9c$0$832$4fafbaef@reader5.news.tin.it>,
> a@b.c.invalid says...
>
> [ ... ]
>
>> I am not talking about standards; I am talking about a real CPU.
>> If a 386 CPU in state X (eax=1, ebx=19, ecx=20 ...)
>> reads the binary encoding of "add eax, ebx",
>> the result will always be the CPU state
>> X' (eax=20, ebx=19, ecx=20 ...).
>
> Yes, but what if what's executed is 'add eax, [ebx]' instead? If ebx
> happens to point to uninitialized memory,

I see the 386 CPU hardware, together with the memory it can read, as one
system. Suppose this system is in the state

State(0)={X(eax=20, ebx=19, ecx=20 ...)
          Memory={1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,1 ...}
         }
and the CPU reads the instruction
'add eax, [ebx]'. The next state is

State(1)={X(eax=21, ebx=19, ecx=20 ...)
          Memory={1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,1,...}
         }
There is no UB.

It is like a physical system:
if I know the position at time 0, I know the position at the next time 1.

UB in that system could only come from a failure of the CPU.

> you don't know what you'll
> get in eax, and it will probably vary from one invocation of the
> program to the next.

It is the same for the random source:
if I know all the states, and the function that describes them,
nothing is random; everything has a function that describes it.

It is like installing the same OS on two PCs that have the same
hardware: if a program segfaults when the input to the system is X,
then on the other PC the same program also segfaults
if it gets the same input X.

> If ebx starts out set to zero (or another small
> number, typically anything less than 4 million or so) quite a few
> OSes will detect that you're accessing an illegal address, and halt
> the program with some sort of error message about it doing something
> illegal (of course, the exact message varies between OSes).
>
> -- 
>    Later,
>    Jerry.



Reply io_x 7/18/2009 5:58:10 PM

On Sat, 18 Jul 2009 19:58:10 +0200, "io_x" <a@b.c.invalid> wrote:

>
>"Jerry Coffin" <jerryvcoffin@yahoo.com> wrote in message
>news:MPG.24cb3ab24a1089099896d6@news.sunsite.dk...
>> In article <4a617e9c$0$832$4fafbaef@reader5.news.tin.it>,
>> a@b.c.invalid says...
>>
>> [ ... ]
>>
>>> I am not talking about standards; I am talking about a real CPU.
>>> If a 386 CPU in state X (eax=1, ebx=19, ecx=20 ...)
>>> reads the binary encoding of "add eax, ebx",
>>> the result will always be the CPU state
>>> X' (eax=20, ebx=19, ecx=20 ...).
>>
>> Yes, but what if what's executed is 'add eax, [ebx]' instead? If ebx
>> happens to point to uninitialized memory,
>
>I see the 386 CPU hardware, together with the memory it can read, as one
>system. Suppose this system is in the state
>
>State(0)={X(eax=20, ebx=19, ecx=20 ...)
>          Memory={1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,1 ...}
>         }
>and the CPU reads the instruction
>'add eax, [ebx]'. The next state is
>
>State(1)={X(eax=21, ebx=19, ecx=20 ...)
>          Memory={1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,1,...}
>         }
>There is no UB.
>
>It is like a physical system:
>if I know the position at time 0, I know the position at the next time 1.
>
>UB in that system could only come from a failure of the CPU.

Except you will never know all the states.  Every time you run the
program, the clock will be different.  If you are running on a typical
system, you share the CPU, memory, and OS with numerous other tasks
running "simultaneously" and their states will not be the same from
one execution to the next.

If you run the program in the morning and it produces a result of 42
and in the afternoon and it produces 24 or you recompile it on an odd
numbered Thursday during a full moon and it now produces 69 or you
change compilers and it now produces 7734, there is no way for you to
define the behavior of the program.  Hence, its behavior is undefined.

-- 
Remove del for email
Reply schwarzb3978 (1363) 7/18/2009 10:37:13 PM

On Jul 18, 7:48 am, "io_x" <a...@b.c.invalid> wrote:
> "Nick Keighley" <nick_keighley_nos...@hotmail.com> wrote in message
> news:adfbac3b-0be5-4c03-beb5-53d06179f6cf@c1g2000yqi.googlegroups.com...
> > this was originally on comp.lang.c++
> > but discussions about Undefined Behaviour seem on-topic to
> > comp.lang.c as well

> A CPU cannot have undefined behaviour, because if the CPU
> is in state X and it reads the instruction "a", the result
> will always be the state X'.

Obviously, you've never worked on real hardware.  CPUs have
certain rules which must be obeyed, and can have undefined
behavior if you violate them.

> The same holds for the CPU-OS pair: if the CPU-OS is in state
> XX and it reads the instruction "a", the result will always be
> the state XX'.

Ditto.

> UB exists only in the standards.

Not really.  What you can say is that most of the cases which
result in undefined behavior in the standard will still be
compiled to deterministic code on most machines.  What that
code does, however, is not defined, and can be pretty much
anything.  Including behavior not recognized by the standard
(like generating a core dump, crashing the system or the
processor, formatting the hard disk, sending spam emails to half
the world...).

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Reply james.kanze (9620) 7/19/2009 11:11:29 AM

On Jul 19, 12:37 am, Barry Schwarz <schwa...@dqel.com> wrote:
> On Sat, 18 Jul 2009 19:58:10 +0200, "io_x" <a...@b.c.invalid> wrote:

> >"Jerry Coffin" <jerryvcof...@yahoo.com> wrote in message
> >news:MPG.24cb3ab24a1089099896d6@news.sunsite.dk...
> >> In article <4a617e9c$0$832$4fafb...@reader5.news.tin.it>,
> >> a...@b.c.invalid says...

> >> [ ... ]
> >It is like a physical system: if I know the position at
> >time 0, I know the position at the next time 1.

> >UB in that system could only come from a failure of the CPU.

> Except you will never know all the states.  Every time you run the
> program, the clock will be different.  If you are running on a typical
> system, you share the CPU, memory, and OS with numerous other tasks
> running "simultaneously" and their states will not be the same from
> one execution to the next.

> If you run the program in the morning and it produces a result
> of 42 and in the afternoon and it produces 24 or you recompile
> it on an odd numbered Thursday during a full moon and it now
> produces 69 or you change compilers and it now produces 7734,
> there is no way for you to define the behavior of the program.
> Hence, its behavior is undefined.

The actual speed of the gates on the chip will depend on the
temperature.  Depending on this speed, the results of accessing
"nonexistent" memory may vary.  (Just to cite one case I've
actually encountered.  The program caused the machine to hang or
not, depending on how long the machine had been turned on.)

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Reply james.kanze (9620) 7/19/2009 11:16:10 AM

On Jul 17, 8:01 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> On Jul 17, 2:01 am, James Kanze <james.ka...@gmail.com> wrote:

> > On Jul 16, 11:34 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> > > Or are we talking about how it's impossible to write
> > > correct code in the face of asynchronous exceptions?

> > I'm not too sure what you mean here.  As far as I know,
> > neither Java nor C++ support what I would call an
> > asynchronous exception.  On the other hand, the fact that
> > you can't guarantee a function to never raise an exception
> > in Java does mean that you can't write really exception safe
> > code.

> http://java.sun.com/docs/books/jls/first_edition/html/11.doc.html

> details asynchronous exceptions.

OK.  It's true that things like internal errors in the virtual
machine would be asynchronous.  (The C++ standard says nothing
about what happens when there is a problem in the execution
platform, so it's undefined behavior.  In practice, of course,
regardless of what the language specification says, if the
execution platform doesn't work correctly, you can forget the
language standard.)  The case of Thread.stop() is a special
case; they've explicitly provided a means of creating an
asynchronous exception in another thread.  Current C++ doesn't
recognize threads, but I don't think that the next version
(which does support threading) will have anything similar.  (The
current Java specification deprecates Thread.stop().)

> I haven't thought it through thoroughly enough, but at the
> very least it would be in practice impossible to write correct
> code in the face of asynchronous exceptions, and may be
> in-fact impossible to write it. Depends on exactly what they
> call "a transfer of control" and "statements" in regards to
> when an asynchronous exception can be raised. However, this is
> getting a little off topic, so I guess I'll leave it at that.

Writing exception safe code is on topic, and the nothrow
guarantee for a few, primitive functions is necessary in order
to do so.
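
As a minimal sketch of why that guarantee matters (not code from this
thread): the strong exception guarantee of the hypothetical
Names::append_all below rests entirely on swap never throwing.  The
class and member names are invented for illustration, and noexcept is
the C++11 spelling; in the C++ current at the time of this thread the
same promise would be written throw().

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical class, for illustration only.
class Names {
public:
    // The primitive that must never throw: swapping two vectors that use
    // the default allocator only exchanges their internal pointers.
    void swap(Names& other) noexcept { items_.swap(other.items_); }

    // Strong guarantee: either every name is appended, or *this is unchanged.
    void append_all(const std::vector<std::string>& extra) {
        Names tmp(*this);                 // may throw; *this is untouched
        for (const std::string& s : extra) {
            if (s.empty())
                throw std::invalid_argument("empty name");
            tmp.items_.push_back(s);      // may throw; only tmp is affected
        }
        swap(tmp);                        // commit step; cannot throw
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::string> items_;
};
```

If append_all throws half-way through, only tmp has been modified and
it is simply destroyed; the caller's object still holds its old
contents.  Without a non-throwing swap there would be no safe way to
perform the final commit.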

--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Reply james.kanze (9620) 7/19/2009 11:23:38 AM

"James Kanze" <james.kanze@gmail.com> wrote in message
news:164a9ba3-0ce2-4386-b64e-aca4299f7f94@s15g2000yqs.googlegroups.com...
>On Jul 19, 12:37 am, Barry Schwarz <schwa...@dqel.com> wrote:
>> On Sat, 18 Jul 2009 19:58:10 +0200, "io_x" <a...@b.c.invalid> wrote:
>> >"Jerry Coffin" <jerryvcof...@yahoo.com> wrote in message
>> >news:MPG.24cb3ab24a1089099896d6@news.sunsite.dk...
>> >> In article <4a617e9c$0$832$4fafb...@reader5.news.tin.it>,
>> >> a...@b.c.invalid says...

>> >> [ ... ]
>> >It is like a physical system: if I know the position at
>> >time 0, I know the position at the next time 1.

>> >UB in that system could only come from a failure of the CPU.

>> Except you will never know all the states.  Every time you run the
>> program, the clock will be different.  If you are running on a typical
>> system, you share the CPU, memory, and OS with numerous other tasks
>> running "simultaneously" and their states will not be the same from
>> one execution to the next.

>> If you run the program in the morning and it produces a result
>> of 42 and in the afternoon and it produces 24 or you recompile
>> it on an odd numbered Thursday during a full moon and it now
>> produces 69 or you change compilers and it now produces 7734,
>> there is no way for you to define the behavior of the program.
>> Hence, its behavior is undefined.

>The actual speed of the gates on the chip will depend on the
>temperature.  Depending on this speed, the results of accessing
>"nonexistent" memory may vary.  (Just to cite one case I've
>actually encountered.  The program caused the machine to hang or
>not, depending on how long the machine had been turned on.)


And now you, too, know what has to go on in a PC that is OK,
and what has to go on in a PC that is not OK.
There is the good way (full control) and there
is the not-good way (UBs).



Reply io_x 7/20/2009 8:53:10 AM

io_x wrote:
> "Gordon Burditt" <gordonb.fye78@burditt.org> wrote in message
> news:zeidnRAKJvA-9_zXnZ2dnUVZ_tmdnZ2d@posted.internetamerica...
>>> but discussions about Undefined Behaviour seem on-topic to comp.lang.c
>>> as well
>>>
>>> A CPU cannot have undefined behaviour, because
>>> if the CPU is in state X and it reads the instruction "a",
>>> the result will always be the state X'.
> 
>>> UB exists only in the standards.
>> There are standards for CPUs and CPU families also, even if it's
>> only one created by the manufacturer.
> 
> I am not talking about standards; I am talking about a real CPU.
> If a 386 CPU in state X (eax=1, ebx=19, ecx=20 ...)
> reads the binary encoding of "add eax, ebx",
> the result will always be the CPU state
> X' (eax=20, ebx=19, ecx=20 ...).

[...]

So, what you are saying is, if you execute a well-defined operation, you get 
well-defined results?  Nothing new here.

Perhaps you only have experience with the 386 CPU?

I have used CPUs which actually execute two instructions in parallel.  If 
you do something such that one of those instructions reads from address X 
and the other writes to address X, "bad things" can happen.  I don't know if 
those "bad things" are defined or not, however I do know that the C 
compilers I have used know to avoid such constructs, and will insert a no-op 
into one of the instructions if necessary.

Even something which is well defined in the high-level language could do 
"bad things" if the compiler didn't take this into account.

Consider:

     int a,b;
     ...
     a = b++;

What would happen if the compiler generated something like this?

     load  r1,b
     incr  b

If b initially holds the value 42, what value gets stored in register r1? 
Remember, both instructions are executed in parallel.
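
For comparison, the result the language itself requires can be
sketched in C++ as below: a receives the old value of b, and b ends up
one larger, whatever the hardware does with the two instructions.  The
struct and function names are made up for this illustration.

```cpp
// Holds both values after the statement `a = b++;` has run.
struct PostIncrementResult {
    int a;  // value stored by the assignment
    int b;  // value of b afterwards
};

// Hypothetical helper: runs `a = b++;` on a local copy of b.
PostIncrementResult assign_post_increment(int b) {
    int a = 0;
    a = b++;  // a gets the value b had before the increment
    PostIncrementResult result = {a, b};
    return result;
}
```

Called with 42, the helper yields a == 42 and b == 43; a compiler
targeting a machine like the one described has to schedule the load
and the increment so that this is what the program observes.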

-- 
Kenneth Brody
Reply kenbrody (1861) 7/20/2009 4:13:28 PM

On Jul 19, 4:23 am, James Kanze <james.ka...@gmail.com> wrote:
> On Jul 17, 8:01 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>
> > On Jul 17, 2:01 am, James Kanze <james.ka...@gmail.com> wrote:
> > > On Jul 16, 11:34 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> > > > Or are we talking about how it's impossible to write
> > > > correct code in the face of asynchronous exceptions?
> > > I'm not too sure what you mean here.  As far as I know,
> > > neither Java nor C++ support what I would call an
> > > asynchronous exception.  On the other hand, the fact that
> > > you can't guarantee a function to never raise an exception
> > > in Java does mean that you can't write really exception safe
> > > code.
> >http://java.sun.com/docs/books/jls/first_edition/html/11.doc.html
> > details asynchronous exceptions.
>
> OK.  It's true that things like internal errors in the virtual
> machine would be asynchronous.  (The C++ standard says nothing
> about what happens when there is a problem in the execution
> platform, so it's undefined behavior.  In practice, of course,
> regardless of what the language specification says, if the
> execution platform doesn't work correctly, you can forget the
> language standard.)  The case of Thread.stop() is a special
> case; they've explicitly provided a means of creating an
> asynchronous exception in another thread.  Current C++ doesn't
> recognize threads, but I don't think that the next version
> (which does support threading) will have anything similar.  (The
> current Java specification deprecates Thread.stop().)
>
> > I haven't thought it through thoroughly enough, but at the
> > very least it would be in practice impossible to write correct
> > code in the face of asynchronous exceptions, and may be
> > in-fact impossible to write it. Depends on exactly what they
> > call "a transfer of control" and "statements" in regards to
> > when an asynchronous exception can be raised. However, this is
> > getting a little off topic, so I guess I'll leave it at that.
>
> Writing exception safe code is on topic, and the nothrow
> guarantee for a few, primitive functions is necessary in order
> to do so.

From that little excerpt, I'm not clear on exactly when a Java
asynchronous exception can be thrown. I'm not sure what my no-throwing
primitives are, if any. Then again, it's sort of an anal point. As you
noted, if the platform encounters a problem it can't handle, your
program is FUBAR anyway. I do strongly dislike how the stack will be
unwound if the JVM hits some kinds of internal errors (see finally
blocks).
Reply joshuamaurice (576) 7/20/2009 8:19:05 PM

jacob navia wrote:

> And last but not least: Is there a good book in C++ debugging?

I don't know if it's good but I just came across this reference:

Zeller - Why Programs Fail
http://www.whyprogramsfail.com/


Martin

-- 
Quidquid latine scriptum est, altum videtur.
Reply martin.eisenberg (676) 7/23/2009 6:30:21 PM

