Changing the meaning of the array indexing brackets "[ ]" to do bounds
checking and then using another notation for array indexing without
bounds checking, say "[| |]", might help minimize buffer overruns.
By default, previous programs would have bounds checking turned on,
unsless explicilty turned off.
--
Regards,
Casey
|
|
0
|
|
|
|
Reply
|
caseyhHAMMER_TIME (219)
|
11/2/2009 8:21:06 PM |
|
On 2009-11-02, Casey Hawthorne <caseyhHAMMER_TIME@istar.ca> wrote:
> Changing the meaning of the array indexing brackets "[ ]" to do bounds
> checking and then using another notation for array indexing without
> bounds checking, say "[| |]", might help minimize buffer overruns.
> By default, previous programs would have bounds checking turned on,
> unsless explicilty turned off.
It would also cause thousands of programs which currently work by blind
luck to cease working -- not a completely unacceptable outcome, I might
think in my more irritable days, but probably not something acceptable.
That said, I sort of like the notion of a notation for bounds-checked
array access -- I just think that it'd have to be a new notation rather
than a change to the old one.
-s
--
Copyright 2009, all wrongs reversed. Peter Seebach / usenet-nospam@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
|
|
0
|
|
|
|
Reply
|
usenet-nospam (2199)
|
11/2/2009 8:24:50 PM
|
|
On Nov 2, 12:21=A0pm, Casey Hawthorne <caseyhHAMMER_T...@istar.ca>
wrote:
> Changing the meaning of the array indexing brackets "[ ]" to do bounds
> checking and then using another notation for array indexing without
> bounds checking, say "[| |]", might help minimize buffer overruns.
>
> By default, previous programs would have bounds checking turned on,
> unsless explicilty turned off.
Won't work in C, as C does not have operator overloading.
However, you could use this, which is easier and works with atomic
types like int:
http://duma.sourceforge.net/
|
|
0
|
|
|
|
Reply
|
dcorbit (2696)
|
11/2/2009 8:27:37 PM
|
|
On Nov 2, 2:21=A0pm, Casey Hawthorne <caseyhHAMMER_T...@istar.ca> wrote:
> Changing the meaning of the array indexing brackets "[ ]" to do bounds
> checking and then using another notation for array indexing without
> bounds checking, say "[| |]", might help minimize buffer overruns.
>
> By default, previous programs would have bounds checking turned on,
> unsless explicilty turned off.
There's nothing to prevent someone from doing that now, nor is it a
new idea. It might break some programs which were relying on
undefined behavior, but I don't think too many people would find that
too objectionable (and assuming there was a way to turn off the
function, presumably with a compiler switch).
There are two problems. The minor one is overhead. Even if you know
the object being accessed, bounds checking is not free, and that will
cause issues in many places where C is a good choice of languages. To
some extent clever compilers can minimize the overhead, but they
cannot eliminate it altogether.
The major problem is that at the point where the array access happens,
whether that be the brackets or pointer arithmetic with deferencing,
the meta information about the "array," or whatever other object is
being accessed, is not really available to the C compiler. So you
don't really know the bounds of the array (or whatever) you're
accessing at that point. The only real solution is a fat pointer of
some sort that contains information about the object it's pointing to,
but that has very considerable overhead in both size and space. And
again, while compilers can mitigate that somewhat, there are limits,
especially considering the space overhead.
|
|
0
|
|
|
|
Reply
|
robertwessel2 (1339)
|
11/2/2009 9:31:18 PM
|
|
Casey Hawthorne a �crit :
> Changing the meaning of the array indexing brackets "[ ]" to do bounds
> checking and then using another notation for array indexing without
> bounds checking, say "[| |]", might help minimize buffer overruns.
>
> By default, previous programs would have bounds checking turned on,
> unsless explicilty turned off.
> --
> Regards,
> Casey
The lcc-win C compiler implements this. It is called "operator overloading".
You can define your own [ ] operators, and the compiler will call your function
that can do bounds checking in the array.
You define a structure with size information, and a pointer to the data stored
(and maybe other data). For this structure you redefine the operator [ ] and
you check the array bounds.
When you do not want to check anymore you do NOT define any operator overloading
and you replace the array structure by a normal array. The code that uses the
array doesn't need to be modified.
I have been arguing for this extension to C since 2002.
jacob
|
|
0
|
|
|
|
Reply
|
jacob24 (973)
|
11/2/2009 9:35:44 PM
|
|
jacob navia <jacob@nospam.org> writes:
> Casey Hawthorne a écrit :
>> Changing the meaning of the array indexing brackets "[ ]" to do bounds
>> checking and then using another notation for array indexing without
>> bounds checking, say "[| |]", might help minimize buffer overruns.
>>
>> By default, previous programs would have bounds checking turned on,
>> unsless explicilty turned off.
>
> The lcc-win C compiler implements this. It is called "operator
> overloading".
>
> You can define your own [ ] operators, and the compiler will call
> your function that can do bounds checking in the array.
[...]
Does lcc-win allow a [] operator to be defined for existing types that
already have a built-in [] operator? For example, can I do something
like this (I'm not sure of the syntax)?
int operator[](int *arr, size_t index)
{
return /* something */;
}
I suspect the answer is no.
I think the OP is suggesting changing the semantics of the existing
built-in [] operator, so that existing code gains bounds checking
with no modifications to the source.
Of course implementations are already allowed to do this, even
without operator overloading. All cases in which a bounds check
would fail invoke undefined behavior anyway, so the implementation
is free to generate code that detects the error.
The overhead would be substantial, since you'd need fat pointers to
retain bounds information for function parameters. And it would
break some existing code whose behavior is, strictly speaking,
undefined, but that works in practice. The most obvious examples
are the "struct hack" (which could be modified to use the C99 feature
that replaces it, but the point is to avoid changing the source), and
code that treats a multidimensional array as a one-dimensional array.
Code that avoids these odd corners of the language could certainly
benefit from a bounds checking implementation. And in fact I believe
such implementations exist, though I don't know if they go so far as
to implement fat pointers to enable checking on pointer parameters.
(At the very least, *testing* code under such an implementation could
be instructive.)
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
|
|
0
|
|
|
|
Reply
|
kst-u (21464)
|
11/2/2009 11:18:21 PM
|
|
robertwessel2@yahoo.com wrote:
> On Nov 2, 2:21 pm, Casey Hawthorne <caseyhHAMMER_T...@istar.ca> wrote:
>> Changing the meaning of the array indexing brackets "[ ]" to do bounds
>> checking and then using another notation for array indexing without
>> bounds checking, say "[| |]", might help minimize buffer overruns.
>>
> [...]
> The major problem is that at the point where the array access happens,
> whether that be the brackets or pointer arithmetic with deferencing,
> the meta information about the "array," or whatever other object is
> being accessed, is not really available to the C compiler. So you
> don't really know the bounds of the array (or whatever) you're
> accessing at that point. The only real solution is a fat pointer of
> some sort that contains information about the object it's pointing to,
> but that has very considerable overhead in both size and space. And
> again, while compilers can mitigate that somewhat, there are limits,
> especially considering the space overhead.
Structs and unions are splendid sources of complication here.
For example,
union u { int i; char c[10]; };
union u *p = malloc(sizeof *p); /* assume success */
Imagine a not uncommon platform where sizeof(int) == 4 and
int requires 4-byte alignment; we'll probably wind up with a 12-
byte union, so the final line would be equivalent to
union u *p = malloc(12);
Now, what are the bounds associated with the "fat pointer" that
malloc() returns? malloc() doesn't know how the allocated area
will be used, so all twelve bytes are "in range." (In fact, they
must be "in range," because it's perfectly legal to copy all
twelve bytes of another union u instance into the allocated space.)
Okay, so malloc() returns a fat pointer saying "addresses foo
through foo+11." What bounds are stored in p? It looks liks p
must also say "foo through foo+11," because we must be able to
turn p back into a void* and hand the result to memcpy() or free()
or whatever, without setting off alarms when the final two bytes
are accessed. So p's bounds cover all of the allocated space.
But now it gets even more entertaining:
int *ip = &p->i;
char *cp = p->c;
What bounds are carried with the fat pointers ip and cp? For
purposes of assignment and so on only ip[0] is valid, and only
cp[0] through cp[9] are valid, so we'd like to be attach the
limits "foo through foo+3" and "foo through foo+9." But we
can't! Either pointer can be converted to a union u*, and
the resulting pointer must cover all of "foo through foo+11."
This must be the case even if the conversion is indirect, via
umpty-seven layers of void* and separately compiled modules
where union u has never been heard of:
strcpy (p->c, "x");
char *q = strchr(p->c, 'x');
union u *p2 = (union u*)q;
The pointer p2 must cover all twelve bytes, even though it is
derived from a pointer that we'd prefer to have cover only ten.
Seems to me that doing a thorough job might require not just
a pointer-plus-bounds, but a pointer-with-history -- not just a
fat pointer, but an obese pointer. More likely you'll get a
sort of "approximate bounds check," where the bounds of anything
derived from p would cover all the space returned by malloc(),
and boundaries inside that allocations wouldn't be checked.
(I read about a bounds-checking extension for gcc a dozen or so
years ago, and ISTR that's how it worked: Fat pointers for full
allocations, no intra-allocation checking.)
--
Eric.Sosman@sun.com
|
|
0
|
|
|
|
Reply
|
Eric.Sosman (4228)
|
11/2/2009 11:33:41 PM
|
|
"Eric Sosman" <Eric.Sosman@sun.com> wrote in message
news:1257204809.568894@news1nwk...
> robertwessel2@yahoo.com wrote:
>> On Nov 2, 2:21 pm, Casey Hawthorne <caseyhHAMMER_T...@istar.ca> wrote:
>>> Changing the meaning of the array indexing brackets "[ ]" to do bounds
>>> checking and then using another notation for array indexing without
>>> bounds checking, say "[| |]", might help minimize buffer overruns.
>>>
> Structs and unions are splendid sources of complication here.
> For example,
>
> union u { int i; char c[10]; };
> union u *p = malloc(sizeof *p); /* assume success */
>
> Imagine a not uncommon platform where sizeof(int) == 4 and
> int requires 4-byte alignment; we'll probably wind up with a 12-
> byte union, so the final line would be equivalent to
>
> union u *p = malloc(12);
....
> But now it gets even more entertaining:
>
> int *ip = &p->i;
> char *cp = p->c;
>
> What bounds are carried with the fat pointers ip and cp? For
....
This must be the case even if the conversion is indirect, via
> umpty-seven layers of void* and separately compiled modules
> where union u has never been heard of:
>
> strcpy (p->c, "x");
> char *q = strchr(p->c, 'x');
> union u *p2 = (union u*)q;
>
> The pointer p2 must cover all twelve bytes, even though it is
> derived from a pointer that we'd prefer to have cover only ten.
>
> Seems to me that doing a thorough job might require not just
> a pointer-plus-bounds, but a pointer-with-history -- not just a
> fat pointer, but an obese pointer.
Your points all seem valid, but they remind me of when my boss wanted some
special feature or wanted something to work as he thought it should. I would
then proceed to explain a dozen ways why it didn't make sense and was
completely impossible.
But a day or so later I got the thing (hardware, software or both) working
just like he wanted! It wasn't quite so impossible after all.
The OP is only talking about arrays, and not pointers. But these are
inextricably linked so I guess we need 'proper' arrays, ie. distinct from
the sort of arrays/pointers currently in C. Then there wouldn't be all this
baggage to make the implementation of array bounds 'impossible'. That's
still a major language extension though.
--
Bartc
|
|
0
|
|
|
|
Reply
|
bartc (783)
|
11/3/2009 12:22:26 AM
|
|
Keith Thompson wrote:
> The overhead would be substantial, since you'd need fat pointers to
> retain bounds information for function parameters. And it would
> break some existing code whose behavior is, strictly speaking,
> undefined, but that works in practice. The most obvious examples
> are the "struct hack" (which could be modified to use the C99 feature
> that replaces it, but the point is to avoid changing the source), and
> code that treats a multidimensional array as a one-dimensional array.
Why would it break the struct hack? The fat pointer would have the
bounds of the actual allocation, not the struct type, right?
Or are you proposing that converting a fat pointer from one type to
another changes the bounds? I can see some bugs that would catch, but
it would also break a lot of perfectly valid code, which IMHO is
unacceptable.
> Code that avoids these odd corners of the language could certainly
> benefit from a bounds checking implementation. And in fact I believe
> such implementations exist, though I don't know if they go so far as
> to implement fat pointers to enable checking on pointer parameters.
There was a GCC branch that did fat pointers and bounds checking, but it
didn't get very far since it changed the ABI, requiring a recompile of
every library on the system. Few people are willing to go through that.
Presumably they could have added another feature to disable bounds
checking on certain functions or arguments, but they apparently didn't
or that caused other problems...
> (At the very least, *testing* code under such an implementation could
> be instructive.)
It would be a good start, but presumably if you knew all the cases to
test, you could eliminate the bugs themselves and not need the bounds
checking in the first place. If you don't test every case, the bounds
violation will escape into the wild where there is no checking.
S
--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
|
|
0
|
|
|
|
Reply
|
stephen (1122)
|
11/3/2009 12:30:37 AM
|
|
On Mon, 02 Nov 2009 15:18:21 -0800, Keith Thompson <kst-u@mib.org>
wrote:
>jacob navia <jacob@nospam.org> writes:
>> Casey Hawthorne a ?crit :
>>> Changing the meaning of the array indexing brackets "[ ]" to do bounds
>>> checking and then using another notation for array indexing without
>>> bounds checking, say "[| |]", might help minimize buffer overruns.
>>>
>>> By default, previous programs would have bounds checking turned on,
>>> unsless explicilty turned off.
>>
>> The lcc-win C compiler implements this. It is called "operator
>> overloading".
>>
>> You can define your own [ ] operators, and the compiler will call
>> your function that can do bounds checking in the array.
>[...]
>
>Does lcc-win allow a [] operator to be defined for existing types that
>already have a built-in [] operator? For example, can I do something
>like this (I'm not sure of the syntax)?
>
> int operator[](int *arr, size_t index)
> {
> return /* something */;
> }
>
>I suspect the answer is no.
>
>I think the OP is suggesting changing the semantics of the existing
>built-in [] operator, so that existing code gains bounds checking
>with no modifications to the source.
>
>Of course implementations are already allowed to do this, even
>without operator overloading. All cases in which a bounds check
>would fail invoke undefined behavior anyway, so the implementation
>is free to generate code that detects the error.
>
>The overhead would be substantial, since you'd need fat pointers to
>retain bounds information for function parameters. And it would
>break some existing code whose behavior is, strictly speaking,
>undefined, but that works in practice. The most obvious examples
>are the "struct hack" (which could be modified to use the C99 feature
>that replaces it, but the point is to avoid changing the source), and
>code that treats a multidimensional array as a one-dimensional array.
>
>Code that avoids these odd corners of the language could certainly
>benefit from a bounds checking implementation. And in fact I believe
>such implementations exist, though I don't know if they go so far as
>to implement fat pointers to enable checking on pointer parameters.
>(At the very least, *testing* code under such an implementation could
>be instructive.)
I thought of this question, of buffer overruns, after one of the
people interviewed for the book "Coders at Work" said that C was great
for systems programming by well trained programmers, but that C had
leaked out into the applications area.
For systems programming you do need the access to the machine that C
provides, but for applications programming, you don't need/shouldn't
have such access.
--
Regards,
Casey
|
|
0
|
|
|
|
Reply
|
caseyhHAMMER_TIME (219)
|
11/3/2009 12:52:46 AM
|
|
On 2 Nov, 20:27, user923005 <dcor...@connx.com> wrote:
> On Nov 2, 12:21=A0pm, Casey Hawthorne <caseyhHAMMER_T...@istar.ca>
> > Changing the meaning of the array indexing brackets "[ ]" to do bounds
> > checking and then using another notation for array indexing without
> > bounds checking, say "[| |]", might help minimize buffer overruns.
note: C++'s vector class did this the other way round
> > By default, previous programs would have bounds checking turned on,
> > unsless explicilty turned off.
>
> Won't work in C, as C does not have operator overloading.
what? He's (She's?) talking about making a change to the language.
> However, you could use this, which is easier and works with atomic
> types like int:http://duma.sourceforge.net/
--
"Resistance is futile. Read the C-faq."
-- James Hu (c.l.c.)
|
|
0
|
|
|
|
Reply
|
nick_keighley_nospam (4574)
|
11/3/2009 10:34:25 AM
|
|
On 2 Nov, 21:31, "robertwess...@yahoo.com" <robertwess...@yahoo.com>
wrote:
> On Nov 2, 2:21=A0pm, Casey Hawthorne <caseyhHAMMER_T...@istar.ca> wrote:
>
> > Changing the meaning of the array indexing brackets "[ ]" to do bounds
> > checking and then using another notation for array indexing without
> > bounds checking, say "[| |]", might help minimize buffer overruns.
>
> > By default, previous programs would have bounds checking turned on,
> > unsless explicilty turned off.
>
> There's nothing to prevent someone from doing that now, nor is it a
> new idea. =A0It might break some programs which were relying on
> undefined behavior, but I don't think too many people would find that
> too objectionable
the struct hack
typedef struct
{
Byte type;
Byte data[1];
} Message;
void process_msg (byte raw [64], size_t size)
{
Message *msg;
msg =3D (Message*)raw;
if (msg->data[1] =3D=3D LOCAL)
....
}
the struct hack may be deprecated but it does occur quite often. (I
prefer it to C99;s variable array)
> (and assuming there was a way to turn off the
> function, presumably with a compiler switch).
>
> There are two problems. =A0The minor one is overhead. =A0Even if you know
> the object being accessed, bounds checking is not free, and that will
> cause issues in many places where C is a good choice of languages. =A0To
> some extent clever compilers can minimize the overhead, but they
> cannot eliminate it altogether.
>
> The major problem is that at the point where the array access happens,
> whether that be the brackets or pointer arithmetic with deferencing,
> the meta information about the "array," or whatever other object is
> being accessed, is not really available to the C compiler. =A0So you
> don't really know the bounds of the array (or whatever) you're
> accessing at that point. =A0The only real solution is a fat pointer of
> some sort that contains information about the object it's pointing to,
> but that has very considerable overhead in both size and space. =A0And
> again, while compilers can mitigate that somewhat, there are limits,
> especially considering the space overhead.
Yeah, exactly, much of the time the compiler cannot easily determine
what the bounds are.
void refrobnicate (char*);
void snafu (size_t n)
{
char *fonz =3D malloc (n);
if (fonz =3D=3D 0) handle_memory_error();
if (fonz[42] =3D=3D 0) /* is there a bounds error here? */
{
char *yolonde =3D fonz + 42;
refrobnicate (yalonde);
}
}
void refrobnicate (char *yalonde)
{
if (yalonde [23] =3D=3D 0) /* or here? */
more_stuff();
}
so you have to fat pointerise the code (pass some hidden info around
with the memory block).
|
|
0
|
|
|
|
Reply
|
nick_keighley_nospam (4574)
|
11/3/2009 10:50:04 AM
|
|
On 3 Nov, 00:22, "bartc" <ba...@freeuk.com> wrote:
> "Eric Sosman" <Eric.Sos...@sun.com> wrote in message
>
> news:1257204809.568894@news1nwk...
>
>
>
>
>
> > robertwess...@yahoo.com wrote:
> >> On Nov 2, 2:21 pm, Casey Hawthorne <caseyhHAMMER_T...@istar.ca> wrote:
> >>> Changing the meaning of the array indexing brackets "[ ]" to do bound=
s
> >>> checking and then using another notation for array indexing without
> >>> bounds checking, say "[| |]", might help minimize buffer overruns.
>
> > =A0 =A0 Structs and unions are splendid sources of complication here.
> > For example,
>
> > union u { int i; char c[10]; };
> > union u *p =3D malloc(sizeof *p); =A0/* assume success */
>
> > =A0 =A0 Imagine a not uncommon platform where sizeof(int) =3D=3D 4 and
> > int requires 4-byte alignment; we'll probably wind up with a 12-
> > byte union, so the final line would be equivalent to
>
> > union u *p =3D malloc(12);
> ...
> > =A0 =A0 But now it gets even more entertaining:
>
> > int *ip =3D &p->i;
> > char *cp =3D p->c;
>
> > What bounds are carried with the fat pointers ip and cp? =A0For
>
> ...
> =A0This must be the case even if the conversion is indirect, via
>
> > umpty-seven layers of void* and separately compiled modules
> > where union u has never been heard of:
>
> > strcpy (p->c, "x");
> > char *q =3D strchr(p->c, 'x');
> > union u *p2 =3D (union u*)q;
>
> > The pointer p2 must cover all twelve bytes, even though it is
> > derived from a pointer that we'd prefer to have cover only ten.
>
> > =A0 =A0 Seems to me that doing a thorough job might require not just
> > a pointer-plus-bounds, but a pointer-with-history -- not just a
> > fat pointer, but an obese pointer.
>
> Your points all seem valid, but they remind me of when my boss wanted som=
e
> special feature or wanted something to work as he thought it should. I wo=
uld
> then proceed to explain a dozen ways why it didn't make sense and was
> completely impossible.
>
> But a day or so later I got the thing (hardware, software or both) workin=
g
> just like he wanted! It wasn't quite so impossible after all.
>
> The OP is only talking about arrays, and not pointers.
can you really do that?
void process_ptr (char*);
void process_arr (char[]);
Do they take the same type? They do in C but do they in Hawthorne-C?
> But these are
> inextricably linked so I guess we need 'proper' arrays, ie. distinct from
> the sort of arrays/pointers currently in C.
I don't think you have C anymore...
> Then there wouldn't be all this
> baggage to make the implementation of array bounds 'impossible'. That's
> still a major language extension though.
looks a language redesign to me. Why not just abolish pointers and
call it Java?
|
|
0
|
|
|
|
Reply
|
nick_keighley_nospam (4574)
|
11/3/2009 10:54:28 AM
|
|
On 3 Nov, 00:52, Casey Hawthorne <caseyhHAMMER_T...@istar.ca> wrote:
> On Mon, 02 Nov 2009 15:18:21 -0800, Keith Thompson <ks...@mib.org>
> wrote:
>
>
>
>
>
> >jacob navia <ja...@nospam.org> writes:
> >> Casey Hawthorne a ?crit :
> >>> Changing the meaning of the array indexing brackets "[ ]" to do bound=
s
> >>> checking and then using another notation for array indexing without
> >>> bounds checking, say "[| |]", might help minimize buffer overruns.
>
> >>> By default, previous programs would have bounds checking turned on,
> >>> unsless explicilty turned off.
>
> >> The lcc-win C compiler implements this. It is called "operator
> >> overloading".
>
> >> You can define your own [ ] operators, and the compiler will call
> >> your function that can do bounds checking in the array.
> >[...]
>
> >Does lcc-win allow a [] operator to be defined for existing types that
> >already have a built-in [] operator? =A0For example, can I do something
> >like this (I'm not sure of the syntax)?
>
> > =A0 =A0int operator[](int *arr, size_t index)
> > =A0 =A0{
> > =A0 =A0 =A0 =A0return /* something */;
> > =A0 =A0}
>
> >I suspect the answer is no.
>
> >I think the OP is suggesting changing the semantics of the existing
> >built-in [] operator, so that existing code gains bounds checking
> >with no modifications to the source.
>
> >Of course implementations are already allowed to do this, even
> >without operator overloading. =A0All cases in which a bounds check
> >would fail invoke undefined behavior anyway, so the implementation
> >is free to generate code that detects the error.
>
> >The overhead would be substantial, since you'd need fat pointers to
> >retain bounds information for function parameters. =A0And it would
> >break some existing code whose behavior is, strictly speaking,
> >undefined, but that works in practice. =A0The most obvious examples
> >are the "struct hack" (which could be modified to use the C99 feature
> >that replaces it, but the point is to avoid changing the source), and
> >code that treats a multidimensional array as a one-dimensional array.
>
> >Code that avoids these odd corners of the language could certainly
> >benefit from a bounds checking implementation. =A0And in fact I believe
> >such implementations exist, though I don't know if they go so far as
> >to implement fat pointers to enable checking on pointer parameters.
> >(At the very least, *testing* code under such an implementation could
> >be instructive.)
>
> I thought of this question, of buffer overruns, after one of the
> people interviewed for the book "Coders at Work" said that C was great
> for systems programming by well trained programmers, but that C had
> leaked out into the applications area.
> For systems programming you do need the access to the machine that C
> provides, but for applications programming, you don't need/shouldn't
> have such access.
I'm nor sure the distinction is as clean as you'd like. Are
communication protocols system or application? What about embedded
programming? I suppose C-Hash's Safe and Unsafe modes (or whatever
they're called) might be a way to go.
|
|
0
|
|
|
|
Reply
|
nick_keighley_nospam (4574)
|
11/3/2009 10:59:02 AM
|
|
Nick Keighley wrote:
> On 3 Nov, 00:22, "bartc" <ba...@freeuk.com> wrote:
>> The OP is only talking about arrays, and not pointers.
>
> can you really do that?
>
> void process_ptr (char*);
> void process_arr (char[]);
>
> Do they take the same type? They do in C but do they in Hawthorne-C?
>
>> But these are
>> inextricably linked so I guess we need 'proper' arrays, ie. distinct
>> from the sort of arrays/pointers currently in C.
>
> I don't think you have C anymore...
I think it can be done by adding a type, 'checked array' or some such name.
But you can't mess around with these as freely as you can with regular
arrays.
A pointer to an element is possible, but then you aren't protected if you
decide to derefence that pointer with a too-large offset. But you can
probably create a slice instead, which is another checked array.
The implementation? Probably just another fat pointer (pointer+length), but
ring-fenced for extra protection.
>> Then there wouldn't be all this
>> baggage to make the implementation of array bounds 'impossible'.
>> That's still a major language extension though.
>
> looks a language redesign to me. Why not just abolish pointers and
> call it Java?
Switching language is one solution.
--
Bartc
|
|
0
|
|
|
|
Reply
|
bartc (783)
|
11/3/2009 11:26:24 AM
|
|
Casey Hawthorne wrote:
> Changing the meaning of the array indexing brackets "[ ]" to do bounds
> checking and then using another notation for array indexing without
> bounds checking, say "[| |]", might help minimize buffer overruns.
>
> By default, previous programs would have bounds checking turned on,
> unless explicitly turned off.
You might find the following articles interesting.
http://gcc.gnu.org/wiki/Mudflap_Pointer_Debugging
http://gcc.fyxm.net/summit/2003/mudflap.pdf
On an unrelated note, it seems like the mudflap library itself recently
introduced a vulnerability.
http://xorl.wordpress.com/2009/11/01/gcc-mudflap-arbitrary-code-execution/
|
|
0
|
|
|
|
Reply
|
Noob
|
11/3/2009 12:33:17 PM
|
|
bartc wrote:
> [... array bounds checking ...]
>
> I think it can be done by adding a type, 'checked array' or some such
> name. But you can't mess around with these as freely as you can with
> regular arrays.
>
> A pointer to an element is possible, but then you aren't protected if
> you decide to derefence that pointer with a too-large offset. [...]
So, no protection for
checked_array char hello[5];
strcpy (hello, "Hello there, world, old buddy, old pal!");
? Seems to me this is exactly the use case the O.P. tries to
address! You might want to re-count the babies after throwing
out the bathwater ...
Or you could forbid using a checked_array with library
functions that take pointers, which would make checked_array
pretty useless. Or you could implement an entire parallel
suite of library functions to handle checked_arrays instead
of pointers. There'd be a certain amount of bloat: strcpy()
alone would need four versions, and I don't want to *think*
about scanf() ...
--
Eric Sosman
esosman@ieee-dot-org.invalid
|
|
0
|
|
|
|
Reply
|
esosman2 (2945)
|
11/3/2009 2:06:21 PM
|
|
"Eric Sosman" <esosman@ieee-dot-org.invalid> wrote in message
news:hcpddc$qra$1@news.eternal-september.org...
> bartc wrote:
>> [... array bounds checking ...]
>>
>> I think it can be done by adding a type, 'checked array' or some such
>> name. But you can't mess around with these as freely as you can with
>> regular arrays.
>>
>> A pointer to an element is possible, but then you aren't protected if you
>> decide to derefence that pointer with a too-large offset. [...]
>
> So, no protection for
>
> checked_array char hello[5];
> strcpy (hello, "Hello there, world, old buddy, old pal!");
The checked_array is just converted to a normal pointer in this case.
>
> ? Seems to me this is exactly the use case the O.P. tries to
> address! You might want to re-count the babies after throwing
> out the bathwater ...
The OP was talking about indexing arrays. strcpy() and friends are mainly
about pointers.
> Or you could forbid using a checked_array with library
> functions that take pointers, which would make checked_array
> pretty useless.
A warning would suffice.
Or you could implement an entire parallel
> suite of library functions to handle checked_arrays instead
> of pointers. There'd be a certain amount of bloat: strcpy()
> alone would need four versions, and I don't want to *think*
> about scanf() ...
It's possible an implementation might have parallel versions of some
functions (ones that write to an array), and make it so that their use is
transparent.
Strcpy wouldn't need 4 versions, just two. And the one taking the
checked_array param can just be implemented in terms of a regular function.
Or maybe there can just be one version, the one with checked_array
(requiring widening of an ordinary pointer to a fat pointer with no upper
limit).
--
bartc
|
|
0
|
|
|
|
Reply
|
bartc (783)
|
11/3/2009 4:49:25 PM
|
|
Casey Hawthorne wrote:
> Changing the meaning of the array indexing brackets "[ ]" to do bounds
> checking and then using another notation for array indexing without
> bounds checking, say "[| |]", might help minimize buffer overruns.
>
> By default, previous programs would have bounds checking turned on,
> unsless explicilty turned off.
What is your opinion of Cyclone?
http://cyclone.thelanguage.org/wiki/Cyclone%20for%20C%20Programmers
http://en.wikipedia.org/wiki/Cyclone_(programming_language)
"""
Cyclone attempts to avoid some of the common pitfalls of the C
programming language, while still maintaining the look and performance
of C. To this end, Cyclone places the following restrictions upon programs:
* NULL checks are inserted to prevent segmentation faults
* Pointer arithmetic is restricted
* Pointers must be initialized before use (this is enforced by
definite assignment analysis)
* Dangling pointers are prevented through region analysis and
limitations on free()
* Only "safe" casts and unions are allowed
* goto into scopes is disallowed
* switch labels in different scopes are disallowed
* Pointer-returning functions must execute return
* setjmp and longjmp are not supported
"""
Regards.
|
|
0
|
|
|
|
Reply
|
Noob
|
11/4/2009 3:42:47 PM
|
|
|
18 Replies
58 Views
(page loaded in 0.222 seconds)
Similiar Articles:7/27/2012 6:08:35 PM
|