Hi
I have put together a document explaining most extensions of lcc-win32,
why they were done, why they could be useful, and a documentation of the
string/container library that uses those extensions.
This document is not finished but I would appreciate your feedback.
jacob
Url:
ftp://ftp.cs.virginia.edu/pub/lcc-win32/proposal.pdf
jacob | 5/14/2006 8:35:14 PM

On Sun, 14 May 2006 22:35:14 +0200, jacob navia
<jacob@jacob.remcomp.fr> wrote Re New document:
>I have put together a document explaining most extensions of lcc-win32,
>why they were done, why they could be useful, and a documentation of the
>string/container library that uses those extensions.
Thanks Jacob!
--
To email me directly, remove CLUTTER.
Vic | 5/15/2006 10:03:29 AM

jacob navia wrote:
> Hi
>
> I have put together a document explaining most extensions of lcc-win32,
> why they were done, why they could be useful, and a documentation of the
> string/container library that uses those extensions.
>
> This document is not finished but I would appreciate your feedback.
>
> jacob
>
> Url:
>
> ftp://ftp.cs.virginia.edu/pub/lcc-win32/proposal.pdf
I think this type of document is a very good thing. C is very powerful
for a lot of small utilities (and sometimes for 'not so small'
utilities). There are indeed no good tools and bad tools; there are
tools we know and tools we don't know. The first ones are often (not
always) the best for getting things done.
For the part I've read, it's good work!
Bat | 5/15/2006 6:25:03 PM

jacob navia wrote:
> Hi
>
> I have put together a document explaining most extensions of lcc-win32,
> why they were done, why they could be useful, and a documentation of the
> string/container library that uses those extensions.
>
> This document is not finished but I would appreciate your feedback.
>
> jacob
>
> Url:
>
> ftp://ftp.cs.virginia.edu/pub/lcc-win32/proposal.pdf
I haven't read the entire document but here are some comments on what I
did review:
1.1 "Motivation":
"All development of C as an independent language has ceased and C has
been relegated to the past."
This is the kind of statement that most people here will consider
nonsense and is likely to keep many people from taking seriously
anything you say after that point. You even contradict yourself in the
very next sentence:
"The need for a simple and efficient language persists however, and C
is the language of choice for many systems running today and many new ones."
So C has been relegated to the past but is still the language of choice
for many current and new systems?
"The objective of this proposal is to correct certain missing features
of C like its string library and the lack of a container library with a
few improvements that have been part of many computer languages since a
long time without introducing any new complexity or performance loss."
I don't think you can claim that the ideas in your proposal don't add
complexity or can be implemented without incurring a performance loss.
You present a couple of good points in the document, ones that may be
worth considering on their own as solutions to recognized issues in the
language. As a whole though, you seem to be proposing a number of
(what many would consider) radical changes which would really result in
an entirely new language.
I don't understand the mentality of trying to change the language to
more closely resemble other higher-level languages as opposed to using
something else that is out there or proposing a separate language
altogether. If I need garbage collection, containers, operator
overloading, etc. I am going to use Java or Ruby or some other language
that better suits my needs. If C was retrofitted with all the extras
that you are proposing it wouldn't make it more attractive to me for
such tasks and would make the language less attractive for the things I
currently use it for. The net result would, I believe, push people
away from the language. There hasn't been much demand for many of the
features you discuss and given the reception of the C99 Standard, which
included many features that were in greater demand than your
suggestions, I don't think you are likely to get a warmer reception
from the C community.
My suggestion is to either focus on one or two ideas that address
issues that really matter to the C community (buffer overflow
prevention and string abstractions seem to be hot right now) and are
implemented in ways that are likely to be well-received or just propose
the creation of a new language (like D did).
Robert Gamble
Robert | 5/15/2006 7:59:54 PM

Robert Gamble wrote:
> jacob navia wrote:
>>
>> I have put together a document explaining most extensions of
>> lcc-win32, why they were done, why they could be useful, and a
>> documentation of the string/container library that uses those
>> extensions.
>>
>> This document is not finished but I would appreciate your
>> feedback.
>>
>> ftp://ftp.cs.virginia.edu/pub/lcc-win32/proposal.pdf
>
> I haven't read the entire document but here are some comments on
> what I did review:
>
> 1.1 "Motivation":
> "All development of C as an independent language has ceased and C
> has been relegated to the past."
>
> This is the kind of statement that most people here will consider
> nonsense and is likely to keep many people from taking seriously
> anything you say after that point. You even contradict yourself
> in the very next sentence:
I was planning to at least read his proposal, but your quote in
itself has deterred that.
Let me point out that all development of the French language has
ceased, not because of lack of innovation, but because of legal
barriers erected in both France and Quebec. I believe use of the
phrase "le hotdog" is now cause for incarceration in the Bastille.
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
CBFalconer | 5/15/2006 9:04:32 PM

On Sun, 14 May 2006 22:35:14 +0200, jacob navia
<jacob@jacob.remcomp.fr> wrote:
>Hi
>
>I have put together a document explaining most extensions of lcc-win32,
>why they were done, why they could be useful, and a documentation of the
>string/container library that uses those extensions.
>
>This document is not finished but I would appreciate your feedback.
>
>jacob
>
>Url:
>
>ftp://ftp.cs.virginia.edu/pub/lcc-win32/proposal.pdf
First paragraph, third sentence:
"All development of C as an independent language has ceased and C has
been relegated to the past."
Ah, well, it's just another advertisement for Jacob, after all. I
didn't bother to read further.
--
Al Balmer
Sun City, AZ
Al | 5/15/2006 10:55:36 PM

Al Balmer wrote:
> On Sun, 14 May 2006 22:35:14 +0200, jacob navia
> <jacob@jacob.remcomp.fr> wrote:
>
>
>>Hi
>>
>>I have put together a document explaining most extensions of lcc-win32,
>>why they were done, why they could be useful, and a documentation of the
>>string/container library that uses those extensions.
>>
>>This document is not finished but I would appreciate your feedback.
>>
>>jacob
>>
>>Url:
>>
>>ftp://ftp.cs.virginia.edu/pub/lcc-win32/proposal.pdf
>
>
> First paragraph, third sentence:
> "All development of C as an independent language has ceased and C has
> been relegated to the past."
>
> Ah, well, it's just another advertisement for Jacob, after all. I
> didn't bother to read further.
I read further, but eventually just skimmed. (I think
there's an error in the use of a va_alist in one of his code
samples, but wasn't sufficiently motivated to track it down.)
His contentious style is off-putting (one of his "Motivation"
sections consists entirely of "Everybody agrees FALSEHOOD,"
which doesn't encourage rational argument), but I think there
is serious thought behind the polemics. There may be something
worth while amid the noise.
"Worth while to C," however, is another matter. I believe
the wrenching changes he advocates would simply misappropriate
the label "C" to designate some entirely different language --
and it's likely this language would be unacceptable both to the
"C as it is" crowd and to the "C -- that's SO last millennium"
crowd. The language he describes isn't so much "C improved"
as "C with too much makeup."
Skimming, skimming ... He's got an example of how to overload
the == operator to apply to strings (sorry, Strings), and all of
a sudden there's this Exception doodad creeping into the picture.
Hello? Can't I compare two [Ss]trings without risking an Exception,
whatever that might be? I'd sure hate to be in the middle of the
ISR portion of a device driver when one of those cut loose ...
The paper is entitled "New Directions in C," and it seems to
me that both parts of the title are debatable. The "directions"
don't seem very "new" -- the paper makes a point of listing other
languages that incorporate similar features (but doesn't bother to
assess whether the result is good or bad), so the "newness" is
suspect. As for the second part, the "in C" portion seems
completely unjustified -- consider what the shibboleth of "safety"
has done to pointers in, in, well, we might call it J.
C is not perfect, and I am no defender of its holiness. But
the language JN describes is not C, not any more than Charlemagne
was Holy Roman Emperor or the Czars the inheritors of the Caesars.
--
Eric Sosman
esosman@acm-dot-org.invalid
Eric | 5/16/2006 1:12:02 AM

In article <YJOdnYxWl4N1u_TZRVn-vQ@comcast.com>,
Eric Sosman <esosman@acm-dot-org.invalid> wrote:
> The paper is entitled "New Directions in C," and it seems to
>me that both parts of the title are debatable.
>As for the second part, the "in C" portion seems
>completely unjustified -- consider what the shibboleth of "safety"
>has done to pointers in, in, well, we might call it J.
J is Iverson's post-modernization of APL.
http://www.jsoftware.com
--
There are some ideas so wrong that only a very intelligent person
could believe in them. -- George Orwell
roberson | 5/16/2006 1:47:42 AM

Al Balmer said:
> On Sun, 14 May 2006 22:35:14 +0200, jacob navia
> <jacob@jacob.remcomp.fr> wrote:
>
>>ftp://ftp.cs.virginia.edu/pub/lcc-win32/proposal.pdf
>
> First paragraph, third sentence:
> "All development of C as an independent language has ceased and C has
> been relegated to the past."
>
> Ah, well, it's just another advertisement for Jacob, after all. I
> didn't bother to read further.
I didn't even bother to read /that/ far, since he made no effort to make the
document easily readable. HTML would have worked. Text would have worked
even better. But I'm not going to fire up a program that isn't already
kicking around my /proc, just to read Navia's proposals - at least not
until it's evident that he's found a clue or two from somewhere.
Given the line you've quoted, though, it makes no sense whatsoever for him
to post his proposals in clc, since they clearly have nothing to do with
using the C language.
I fail to see how /anyone/ that - um - challenged can write a C compiler.
--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Richard | 5/16/2006 6:26:15 AM

Robert Gamble wrote:
>
> I don't think you can claim that the ideas in your proposal don't add
> complexity or can be implemented without incurring a performance loss.
>
There is NO performance loss.
None of the proposed changes is automatic and none affects any other
part of the language. This means that performance of C stays the same
when the changes are not used.
The objective of those changes is to furnish the tools for building a
good standard library, especially a good string library.
C has become a synonym for "buffer overflow vulnerability". Let's stop
this.
> You present a couple of good points in the document, ones that may be
> worth considering on their own as solutions to recognized issues in the
> language. As a whole though, you seem to be proposing a number of
> (what many would consider) radical changes which would really result in
> an entirely new language.
>
Fortran has adopted operator overloading, and so have Delphi and many
other languages, without becoming "a new language". Let's keep our senses.
> I don't understand the mentality of trying to change the language to
> more closely resemble other higher-level languages as opposed to using
> something else that is out there or proposing a separate language
> altogether. If I need garbage collection, containers, operator
> overloading, etc. I am going to use Java or Ruby or some other language
> that better suits my needs.
That is why C should not be used at all. Who doesn't need strings, or
containers, in any serious programming? Of course we should use "the
better C", i.e. C++, and stop whining about a primitive language that
should disappear, shouldn't we?
> If C was retrofitted with all the extras
> that you are proposing it wouldn't make it more attractive to me for
> such tasks and would make the language less attractive for the things I
> currently use it for.
Why? The changes in the lcc-win32 compiler needed to accommodate ALL that
were just 2000 lines of code!
> The net result would, I believe, push people
> away from the language.
Most people are pushed away from the language the nth time they have
to keep track of the possible buffer overflows, the nth time they have
to code a linked list!
> There hasn't been much demand for many of the
> features you discuss and given the reception of the C99 Standard, which
> included many features that were in greater demand than your
> suggestions, I don't think you are likely to get a warmer reception
> from the C community.
Complex arithmetic? In greater demand than a sane string library?
>
> My suggestion is to either focus on one or two ideas that address
> issues that really matter to the C community (buffer overflow
> prevention and string abstractions seem to be hot right now) and are
> implemented in ways that are likely to be well-received or just propose
> the creation of a new language (like D did).
>
My point is that without abandoning null terminated C strings there is
NO CHANCE we can write a sensible string library.
jacob | 5/16/2006 6:30:22 AM

jacob navia wrote:
> Robert Gamble wrote:
>
> Fortran has adopted operator overloading, and so have Delphi and many
> other languages, without becoming "a new language". Let's keep our senses.
>
>> I don't understand the mentality of trying to change the language to
>> more closely resemble other higher-level languages as opposed to using
>> something else that is out there or proposing a separate language
>> altogether. If I need garbage collection, containers, operator
>> overloading, etc. I am going to use Java or Ruby or some other language
>> that better suits my needs.
>
>
> That is why C should not be used at all. Who doesn't need strings, or
> containers, in any serious programming? Of course we should use "the
> better C", i.e. C++, and stop whining about a primitive language that
> should disappear, shouldn't we?
>
A lot of people have. I tend to use C++ as a first choice these days.
I still use C when that choice isn't available.
I'm sure a lot of people do the same, which might go some way to
explaining the lack of interest in extending C. C does what it does
very well; if you want to do something C doesn't do well, don't use it.
--
Ian Collins.
Ian | 5/16/2006 6:43:11 AM

On Tue, 16 May 2006 08:30:22 +0200, jacob navia
<jacob@jacob.remcomp.fr> wrote:
<snip>
>
>My point is that without abandoning null terminated C strings there is
>NO CHANCE we can write a sensible string library.
Sure there is. The way I handled this issue in BCET is by using
counted strings, and then after any string operation, tacking a null
byte on at the end. All of the BCET string operations simply ignore
that null, but it is there in case you need to pass the string to some
routine that expects it, like windows. :-) The counted string format
is essentially a 32 bit length field immediately preceding the string
itself. String constants are also allocated with the length field,
and the null.
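
The layout described above could be sketched in C roughly as follows. This is only an illustration under stated assumptions, not BCET's actual code: the names cstr, cstr_new and cstr_cat are hypothetical, and a C99 flexible array member stands in for "a 32 bit length field immediately preceding the string itself".

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: a 32-bit length immediately followed by the bytes,
   plus one extra NUL that the counted operations themselves ignore. */
typedef struct {
    uint32_t len;     /* number of bytes, not counting the trailing NUL */
    char     data[];  /* len bytes followed by '\0' */
} cstr;

/* Build a counted string from len bytes at src; returns NULL on failure. */
cstr *cstr_new(const char *src, uint32_t len)
{
    cstr *s = malloc(sizeof *s + (size_t)len + 1);
    if (s == NULL)
        return NULL;
    s->len = len;
    memcpy(s->data, src, len);
    s->data[len] = '\0';   /* tacked on for routines that expect a C string */
    return s;
}

/* Concatenate a and b using the stored lengths; no scan for a terminator. */
cstr *cstr_cat(const cstr *a, const cstr *b)
{
    uint64_t total = (uint64_t)a->len + b->len;
    cstr *s;

    if (total > UINT32_MAX)
        return NULL;       /* result would not fit in the 32-bit length */
    s = malloc(sizeof *s + (size_t)total + 1);
    if (s == NULL)
        return NULL;
    s->len = (uint32_t)total;
    memcpy(s->data, a->data, a->len);
    memcpy(s->data + a->len, b->data, b->len);
    s->data[total] = '\0';
    return s;
}
```

Because the NUL is always present, s->data can be handed unchanged to any routine that expects an ordinary C string, while the library's own operations rely only on s->len.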
--
ArarghMail605 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html
To reply by email, remove the garbage from the reply address.
ArarghMail605NOSPAM | 5/16/2006 6:44:50 AM

jacob navia said:
> C has become a synonym for "buffer overflow vulnerability".
Yes, in the same way that "jacob navia" has become a synonym for "off-topic
troll". And, just as it is not only possible but easy to avoid buffer
overflows in C, so it is not only possible but easy for Jacob Navia to stop
posting idiotic ramblings in clc.
> Let's stop this.
Whenever you're ready, just stop.
>> I don't understand the mentality of trying to change the language to
>> more closely resemble other higher-level languages as opposed to using
>> something else that is out there or proposing a separate language
>> altogether. If I need garbage collection, containers, operator
>> overloading, etc. I am going to use Java or Ruby or some other language
>> that better suits my needs.
>
> That is why C should not be used at all.
If you really believe that, then stop using it, and drop support for it from
your compiler. If you don't stop using it, or don't drop support for it
from your compiler, then you don't really believe it should not be used.
> Who doesn't need strings, or
> containers, in any serious programming? Of course we should use "the
> better C", i.e. C++, and stop whining about a primitive language that
> should disappear, shouldn't we?
The language I choose to use is my decision, not yours.
> Most people are pushed away from the language the nth time they have
> to keep track of the possible buffer overflows,
A little care is all that is required.
> the nth time they have to code a linked list!
n = 1, if you do it properly the first time.
> My point is that without abandoning null terminated C strings there is
> NO CHANCE we can write a sensible string library.
<shrug> I can write a sensible string library without rewriting C.
Conversion between C strings and my strings is reasonably trivial, so I get
the best of both worlds. I expect you could do that too, if you tried. But
wait! You think people should stop using C. Fine, so stop using it, end of
problem. The rest of us will carry on as *we* see fit.
--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Richard | 5/16/2006 7:20:43 AM

jacob navia wrote:
> Robert Gamble wrote:
>
.... snip ...
>
>> The net result would, I believe, push people
>> away from the language.
>
> Most people are pushed away from the language the nth time they
> have to keep track of the possible buffer overflows, the nth time
> they have to code a linked list!
Why do you think a linked list is so hard to code? You have to
define what you are linking to start with:

    struct node {
        struct node *next;
        T data;
    };

Now you need a function to create such a node:

    struct node *makenode(T datum) {
        struct node *newnode;

        /* malloc needs <stdlib.h>; newnode stays NULL on failure */
        if ((newnode = malloc(sizeof *newnode)) != NULL) {
            newnode->data = datum;
            newnode->next = NULL;
        }
        return newnode;
    }

and using it is dead simple:

    struct node *list = NULL;
    struct node *temp;
    ...
    while (getmoredata()) {
        if (NULL == (temp = makenode(datum)))
            giveupforlackofmemory();
        else {
            temp->next = list;   /* push onto the front of the list */
            list = temp;
        }
    }

and I think any quasi-capable programmer should be able to write
that in his moribund sleep.
You don't like the list order? Well, you are free to change it,
either by reversing the list when done, or by changing the actual
entry operations, depending on need. No need for a monstrous piece
of library code with twenty-eight and a half options.
You want a sorted list? Maybe a list isn't the optimum data
structure. Maybe sorting the completed list is suitable. Enter
mergesort, also extremely simple coding.
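
For illustration, the two follow-ups mentioned above (reversing the finished list and sorting it with mergesort) might look like the sketch below. It assumes T is int, which the post leaves unspecified, and the function names reverse, merge and mergesort_list are made up for this example.

```c
#include <stddef.h>

/* Assumption: T is int here; the original leaves T unspecified. */
struct node {
    struct node *next;
    int data;
};

/* Reverse the list in place and return the new head. */
struct node *reverse(struct node *list)
{
    struct node *prev = NULL;

    while (list != NULL) {
        struct node *next = list->next;
        list->next = prev;
        prev = list;
        list = next;
    }
    return prev;
}

/* Merge two sorted lists into one sorted list (stable). */
static struct node *merge(struct node *a, struct node *b)
{
    struct node head;
    struct node *tail = &head;

    head.next = NULL;
    while (a != NULL && b != NULL) {
        if (b->data < a->data) {
            tail->next = b;
            b = b->next;
        } else {
            tail->next = a;
            a = a->next;
        }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;   /* append whatever is left */
    return head.next;
}

/* Top-down mergesort: split with slow/fast pointers, sort halves, merge. */
struct node *mergesort_list(struct node *list)
{
    struct node *slow, *fast, *second;

    if (list == NULL || list->next == NULL)
        return list;                    /* 0 or 1 nodes: already sorted */
    slow = list;
    fast = list->next;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }
    second = slow->next;                /* start of the second half */
    slow->next = NULL;                  /* cut the list in two */
    return merge(mergesort_list(list), mergesort_list(second));
}
```

Neither function allocates memory; both only relink the existing nodes.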
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
CBFalconer | 5/16/2006 8:02:52 AM

Richard Heathfield wrote:
>
.... snip ...
>
> I fail to see how /anyone/ that - um - challenged can write a C
> compiler.
He didn't. He bought a licence for portable lcc, reworked the code
generator etc. to use Windoze, and has been modifying ever since.
His version of lcc is about 10 or more revisions behind the
original one. His original work has, I believe, been in the editor
and debugger areas, and in system integration to Windoze. These
are not trivial works.
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
CBFalconer | 5/16/2006 8:10:05 AM

Richard Heathfield <invalid@invalid.invalid> writes:
>
> I fail to see how /anyone/ that - um - challenged can write a C
> compiler.
How about cleaning the front of your house?
I fail to see how /anyone/ that - um - challenged can write a C book.
Friedrich
--
Please remove just-for-news- to reply via e-mail.
Friedrich | 5/16/2006 9:00:31 AM

Friedrich Dominicus said:
> Richard Heathfield <invalid@invalid.invalid> writes:
>
>>
>> I fail to see how /anyone/ that - um - challenged can write a C
>> compiler.
> How about cleaning the front of your house?
You lost me. Please make your meaning clearer.
> I fail to see how /anyone/ that - um - challenged can write a C book.
Oh, anyone can write a C book, irrespective of their level of C knowledge -
and many have. But to write a working C compiler from scratch takes real
knowledge and skill. (It has been claimed elsethread, however, that Jacob
Navia did not do this.)
--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Richard | 5/16/2006 9:12:13 AM

Richard Heathfield <invalid@invalid.invalid> writes:
> Friedrich Dominicus said:
>
>> Richard Heathfield <invalid@invalid.invalid> writes:
>>
>>>
>>> I fail to see how /anyone/ that - um - challenged can write a C
>>> compiler.
>> How about cleaning the front of your house?
>
> You lost me. Please make your meaning clearer.
C Unleashed chapter 11 p. 353
>
>> I fail to see how /anyone/ that - um - challenged can write a C book.
>
> Oh, anyone can write a C book, irrespective of their level of C knowledge -
> and many have. But to write a working C compiler from scratch takes real
> knowledge and skill. (It has been claimed elsethread, however, that Jacob
> Navia did not do this.)
Jacob has used lcc as a starting point. lcc and lcc-win32 now share
the base data structures; however, we have ported the compiler to some
other platforms, e.g. Linux 64, some DSPs and Windows 64. I doubt that lcc
is as close to C99 as lcc-win32 is. Ten years of constant work on the
same sources have obviously led to quite different source trees.
Friedrich
--
Please remove just-for-news- to reply via e-mail.
Friedrich | 5/16/2006 9:45:05 AM

jacob navia wrote On 05/16/06 02:30,:
> Robert Gamble wrote:
>
>>I don't think you can claim that the ideas in your proposal don't add
>>complexity or can be implemented without incurring a performance loss.
>>
>
>
> There is NO performance loss.
>
> None of the proposed changes is automatic and none affects any other
> part of the language. This means that performance of C stays the same
> when the changes are not used.
>
> The objective of those changes is to furnish the tools for building a
> good standard library, specially a good string library.
>
> C has become a synonym for "buffer overflow vulnerability". Let's stop
> this.
Ignoring the contentious tone, how does this jibe with
the earlier statement about performance staying the same if
the changes are not used? If the changes are not used there
is no improvement in safety from buffer overflow, is there?
Long ago there was a series of humorous advertisements
for a brand of gasoline, each beginning with an outrageous
claim made by a big-voiced announcer and followed by a sort
of fine-print disclaimer delivered sotto voce:
"One tank of Pluperfect Petrol will last for YEARS!!!"
"(([if you don't drive your car]))"
... and there's something about "NO performance loss (([if
you don't use the features]))" that reminds me of that old
ad campaign.
--
Eric.Sosman@sun.com
Eric | 5/16/2006 1:55:02 PM

Eric Sosman wrote:
>
> jacob navia wrote On 05/16/06 02:30,:
>
>>Robert Gamble wrote:
>>
>>
>>>I don't think you can claim that the ideas in your proposal don't add
>>>complexity or can be implemented without incurring a performance loss.
>>>
>>
>>
>>There is NO performance loss.
>>
>>None of the proposed changes is automatic and none affects any other
>>part of the language. This means that performance of C stays the same
>>when the changes are not used.
>>
>>The objective of those changes is to furnish the tools for building a
>>good standard library, specially a good string library.
>>
>>C has become a synonym for "buffer overflow vulnerability". Let's stop
>>this.
>
>
> Ignoring the contentious tone, how does this jibe with
> the earlier statement about performance staying the same if
> the changes are not used? If the changes are not used there
> is no improvement in safety from buffer overflow, is there?
Obviously you think that scanning memory for the terminating zero is
vastly more efficient than accessing it directly with the string length.
Each time you access the length of a zero terminated string you must
start that unbounded memory scan, source of countless errors. Operations
like strcat depend on the length of the first string, that must be
recalculated over and over.
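
A minimal sketch of the rescanning cost described above, assuming the caller supplies a destination buffer that is large enough. The function names join_rescan and join_tracked are hypothetical. The first version calls strcat in a loop, so each call walks the destination from its start to find the terminator; the second keeps the running length in a variable and never rescans.

```c
#include <stddef.h>
#include <string.h>

/* Join n words with strcat: every call rescans dst from the beginning to
   find its terminator, so total work grows quadratically with the length
   of the result. The caller must guarantee dst is large enough. */
void join_rescan(char *dst, const char *const *words, size_t n)
{
    size_t i;

    dst[0] = '\0';
    for (i = 0; i < n; i++)
        strcat(dst, words[i]);
}

/* Same result, but the current length is carried along, so each word is
   copied straight to the end without rescanning. Returns the final length. */
size_t join_tracked(char *dst, const char *const *words, size_t n)
{
    size_t len = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        size_t w = strlen(words[i]);   /* each source word is measured once */
        memcpy(dst + len, words[i], w);
        len += w;
    }
    dst[len] = '\0';
    return len;
}
```

Note that the second version still works on ordinary zero terminated strings; it only shows what is saved once the length is known rather than recomputed.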
Obviously you have a different concept for "efficiency" than I do.
Length delimited strings are INHERENTLY faster than zero terminated ones.
Is that too difficult for you to understand?
jacob | 5/16/2006 2:01:23 PM

jacob navia wrote On 05/16/06 10:01,:
> Eric Sosman wrote:
>
>>jacob navia wrote On 05/16/06 02:30,:
>>
>>
>>>Robert Gamble wrote:
>>>
>>>
>>>
>>>>I don't think you can claim that the ideas in your proposal don't add
>>>>complexity or can be implemented without incurring a performance loss.
>>>>
>>>
>>>
>>>There is NO performance loss.
>>>
>>>None of the proposed changes is automatic and none affects any other
>>>part of the language. This means that performance of C stays the same
>>>when the changes are not used.
>>>
>>>The objective of those changes is to furnish the tools for building a
>>>good standard library, specially a good string library.
>>>
>>>C has become a synonym for "buffer overflow vulnerability". Let's stop
>>>this.
>>
>>
>> Ignoring the contentious tone, how does this jibe with
>>the earlier statement about performance staying the same if
>>the changes are not used? If the changes are not used there
>>is no improvement in safety from buffer overflow, is there?
>
>
> Obviously you think that scanning memory for the terminating zero is
> vastly more efficient than accessing it directly with the string length.
>
> Each time you access the length of a zero terminated string you must
> start that unbounded memory scan, source of countless errors. Operations
> like strcat depend on the length of the first string, that must be
> recalculated over and over.
>
> Obviously you have a different concept for "efficiency" than I do.
>
> Length delimited strings are INHERENTLY faster than zero terminated ones.
>
> Is that too difficult for you to understand?
Not one word of this -- well, "response" isn't right,
because it's not responsive -- not one word of this post
addresses the question raised. Once again, for those who
may not have been paying attention:
The claim is made that "There is NO performance
loss." [JN's exact words]
This claim is defended by saying that "[...]
performance of C stays the same when the changes
are not used." [JN's exact words]
My question: If the changes are not used, is C
improved in any way at all?
If you read very v-e-r-y carefully, you will note that
I made no claims about whether counted strings were faster
or slower, safer or more hazardous, slimmer or more fattening.
I simply asked you to explain how "changes [...] not used"
can be an improvement.
Is that too difficult for you to understand?
--
Eric.Sosman@sun.com
Eric | 5/16/2006 2:47:10 PM

jacob navia wrote:
> Eric Sosman wrote:
>>
>> jacob navia wrote On 05/16/06 02:30,:
>>
>>> Robert Gamble wrote:
>>>
>>>
>>>> I don't think you can claim that the ideas in your proposal don't add
>>>> complexity or can be implemented without incurring a performance loss.
>>>>
>>>
>>>
>>> There is NO performance loss.
>>>
>>> None of the proposed changes is automatic and none affects any other
>>> part of the language. This means that performance of C stays the same
>>> when the changes are not used.
>>>
>>> The objective of those changes is to furnish the tools for building a
>>> good standard library, specially a good string library.
>>>
>>> C has become a synonym for "buffer overflow vulnerability". Let's
>>> stop this.
>>
>> Ignoring the contentious tone, how does this jibe with
>> the earlier statement about performance staying the same if
>> the changes are not used? If the changes are not used there
>> is no improvement in safety from buffer overflow, is there?
>
> Obviously you think that scanning memory for the terminating zero is
> vastly more efficient than accessing it directly with the string length.
>
> Each time you access the length of a zero terminated string you must
> start that unbounded memory scan, source of countless errors. Operations
> like strcat depend on the length of the first string, that must be
> recalculated over and over.
Strangely enough, my programs don't depend on doing lots of strcat and
strlen calls.
> Obviously you have a different concept for "efficiency" than I do.
>
> Length delimited strings are INHERENTLY faster than zero terminated ones.
>
> Is that too difficult for you to understand?
So tell me, if I'm adding one character at a time to a string, keeping a
pointer to the end of the string, how is something equivalent to:

    *p++ = whatever;

going to be slower than something equivalent to:

    *p++ = whatever;
    increment the length of the string p points into

It seems to me that the latter is going to be slower.
The same applies to pieces of code I have that build up strings from
constant strings. They keep track of the end of the string and a lot of
the time they know in advance how long the string is that will be added.
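
The comparison above could be written out as the following sketch. The struct counted layout and both function names are assumptions made for illustration; neither comes from any particular string library.

```c
#include <stddef.h>

/* NUL-terminated style: the caller keeps a pointer to the end of the
   string and writes the terminator once, when it has finished. */
char *append_ptr(char *p, char ch)
{
    *p++ = ch;          /* one store and one pointer increment */
    return p;
}

/* Counted style (hypothetical layout): the stored length is updated on
   every append, in addition to the store itself. */
struct counted {
    size_t len;         /* number of characters currently stored */
    char  *buf;         /* caller-provided storage, assumed large enough */
};

void append_counted(struct counted *s, char ch)
{
    s->buf[s->len] = ch;
    s->len++;           /* the extra bookkeeping per append */
}
```

Whether the extra update is measurable depends on the compiler and the surrounding code; the sketch only makes the two shapes concrete.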
There is no system that is going to be the fastest in every situation,
and if I want counted strings I can implement them just as Paul Heisch
(sorry, I've probably spelt your name wrong) has.
For some of my string handling I far prefer the way other languages do
it where you don't have to worry about allocating space but instead the
buffer grows as you add to it. It does a lot to get rid of buffer
overflows, but that does not mean I think it is right for all uses or for C.
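For readers who have not seen the "buffer grows as you add to it" style in C, here is a rough sketch under stated assumptions: struct strbuf and strbuf_putc are hypothetical names, the growth policy (doubling, starting at 16 bytes) is an arbitrary choice, and this is not the lcc-win32 string library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string: len bytes used, cap bytes allocated. */
struct strbuf {
    char  *data;
    size_t len;
    size_t cap;
};

/* Append one character, doubling the allocation when it is full.
   Returns 0 on success and -1 if memory could not be obtained. */
int strbuf_putc(struct strbuf *sb, char c)
{
    if (sb->len + 1 >= sb->cap) {            /* room for c and '\0'? */
        size_t ncap = sb->cap ? sb->cap * 2 : 16;
        char *p = realloc(sb->data, ncap);
        if (p == NULL)
            return -1;                       /* old buffer is still valid */
        sb->data = p;
        sb->cap = ncap;
    }
    sb->data[sb->len++] = c;
    sb->data[sb->len] = '\0';                /* stay usable as a C string */
    return 0;
}
```

A zero-initialised struct strbuf is a valid empty buffer; the caller frees data when done. The price is one capacity comparison per append plus the occasional realloc.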
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc
Flash | 5/16/2006 3:02:15 PM
jacob navia wrote:
> Robert Gamble wrote:
>>
>> I don't think you can claim that the ideas in your proposal don't add
>> complexity or can be implemented without incurring a performance loss.
>>
>
> There is NO performance loss.
>
> None of the proposed changes is automatic and none affects any other
> part of the language. This means that performance of C stays the same
> when the changes are not used.
>
> The objective of those changes is to furnish the tools for building a
> good standard library, specially a good string library.
>
> C has become a synonym for "buffer overflow vulnerability". Let's stop
> this.
If we did check every now and then for buffer overflows, there would be
some performance loss.
Do you have any hard evidence that C's performance is not impaired by the
new additions?
FYI: A colleague informed me that lcc-win32 gave extremely bad
performance results compared to MinGW. I cannot support such a statement,
as I only use Linux. Is this because of the new additions?
--
one's freedom stops where others' begin
Giannis Papadopoulos
Computer and Communications Engineering dept. (CCED)
University of Thessaly
http://dop.freegr.net/
ipapadop (140) | 5/16/2006 3:48:18 PM
Giannis Papadopoulos wrote:
> jacob navia wrote:
>
>>Robert Gamble wrote:
>>
>>>I don't think you can claim that the ideas in your proposal don't add
>>>complexity or can be implemented without incurring a performance loss.
>>>
>>
>>There is NO performance loss.
>>
>>None of the proposed changes is automatic and none affects any other
>>part of the language. This means that performance of C stays the same
>>when the changes are not used.
>>
>>The objective of those changes is to furnish the tools for building a
>>good standard library, specially a good string library.
>>
>>C has become a synonym for "buffer overflow vulnerability". Let's stop
>>this.
>
>
> If we did check every now and then for buffer overflows, there would be
> some performance loss.
>
Yes.
At 3 GHz the price to pay for a memory comparison with a constant should
be noticeable after some BILLION comparisons...
Buffer overflows, however, are not a "performance" problem?
Does incorrect software "perform" OK???
> Do you have any hard evidence that C's performance is not impaired by the
> new additions?
>
If you check bounds when using strings the performance loss will not be
noticeable on most PCs. It could be a problem in some old embedded
systems. Embedded systems, for instance the new Analog Devices CPUs, are
32-bit monsters where the above performance problems do not apply.
Besides, if you are working on a slow 50 MHz CPU, you can always use the
old interface. People with newer machines can use a better approach.
The question is: do we level down to the lowest common denominator?
Or up to the best?
> FYI: A colleague informed me that lcc-win32 gave extremely bad
> performance results compared to mingw. I cannot support such a statement,
> as I only use linux. Is this because of the new additions?
>
This is absolutely unverifiable. You should come up with *some* data to
justify that.
gcc, however, is a better compiler than lcc-win32 in general.
The source of gcc is approx. 15 MB of C; the source of lcc-win32 is just 800 KB.
Benchmarks tell me that I am running at approx. 70-80% of fully optimized
MSVC.
With gcc it depends, but it should be about 80% (i.e. I am running at 80% of
gcc's speed).
jacob (2538) | 5/16/2006 5:36:42 PM
Flash Gordon wrote:
>
> Strangely enough, my programs don't depend on doing lost of strca and
> strlen calls.
>
Ahhh OK. You do not use strcat.
But if you do a strchr, for instance, instead of just doing ONE test for
equality for some character you must do TWO tests because you should
test if you have reached the terminating zero...
Using length delimited strings this is reduced to a memchr.
Ahh but obviously you do NOT use strchr either.
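A small sketch of what "reduced to a memchr" could look like, assuming a hypothetical counted-string type (struct cstr and cstr_find are made-up names, not the lcc-win32 API). Note that memchr still has to respect its byte count on every step, so whether this really costs fewer tests than strchr depends on the library implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: the length travels with the data. */
struct cstr {
    const char *data;
    size_t      len;
};

/* Returns the index of the first c in s, or s.len if c is absent. */
size_t cstr_find(struct cstr s, char c)
{
    const char *hit = memchr(s.data, c, s.len);
    return hit ? (size_t)(hit - s.data) : s.len;
}
```

One visible difference from strchr is that an embedded '\0' does not stop the search, because the length, not the terminator, bounds it.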
>> Obviously you have a different concept for "efficiency" than I do.
>>
>> Length delimited strings are INHERENTLY faster than zero terminated ones.
>>
>> Is that too difficult for you to understand?
>
>
> So tell me, if I'm adding one character at a time to a string, keeping a
> pointer to the end of the string, how is something equivalent to:
> *p++ = whatever;
> going to be slower than something equivalent to:
> *p++ = whatever;
> increment the length of the string p points in to
>
> It seems to me that the latter is going to be slower.
>
Yes, it will be slower. On a normal PC you will notice the difference
after some billion additions.
But you are using a very error-prone construct. Do you ALWAYS test
beforehand, and test CORRECTLY, that you are not going just one
byte beyond the length of the string?
OF COURSE YOU never make such mistakes, your code is always 100%
right the first time.
But not everyone is like you, see?
There are stupid people like me who make mistakes sometimes.
> The same applies to pieces of code I have that build up strings from
> constant strings. They keep track of the end of the string and a lot of
> the time they know in advance how long the string is that will be added.
>
But... that is the same as length delimited strings... If you keep a
pointer to the end of the string it is conceptually the same as having a
length stored somewhere.
> There is no system that is going to be the fastest in every situation,
> and if I want counted strings I can implement them just as Paul Heisch
> (sorry, I've probably spelt your name wrong) has.
>
Of course, but then, you can't write:
String s = "abcd";
s[2] = 'm';
but you have to write:
String s = CreateStringFromCharP("abcd");
AssignCharAt(s,2,'m');
This means that porting the old code to the new code is much more difficult.
> For some of my string handling I far prefer the way other languages do
> it where you don't have to worry about allocating space but instead the
> buffer grows as you add to it.
Yes, like the string library of lcc-win32. Nice isn't it?
> It does a lot to get rid of buffer
> overflows, but that does not mean I think it is right for all uses or
> for C.
The string library is not "right for all uses" but I do not see why it
should not be right for C.
Why must C be kept artificially at such a low level that no sensible
programming is possible?
If you like those strings in other languages, why not do it in C?
jacob
jacob | 5/16/2006 5:48:53 PM
Eric Sosman <Eric.Sosman@sun.com> writes:
> jacob navia wrote On 05/16/06 02:30,:
>> Robert Gamble wrote:
>>>I don't think you can claim that the ideas in your proposal don't add
>>>complexity or can be implemented without incurring a performance loss.
>>
>> There is NO performance loss.
>>
>> None of the proposed changes is automatic and none affects any other
>> part of the language. This means that performance of C stays the same
>> when the changes are not used.
>>
>> The objective of those changes is to furnish the tools for building a
>> good standard library, specially a good string library.
>>
>> C has become a synonym for "buffer overflow vulnerability". Let's stop
>> this.
>
> Ignoring the contentious tone, how does this jibe with
> the earlier statement about performance staying the same if
> the changes are not used? If the changes are not used there
> is no improvement in safety from buffer overflow, is there?
>
> Long ago there was a series of humorous advertisements
> for a brand of gasoline, each beginning with an outrageous
> claim made by a big-voiced announcer and followed by a sort
> of fine-print disclaimer delivered sotto voce:
>
> "One tank of Pluperfect Petrol will last for YEARS!!!"
>
> "(([if you don't drive your car]))"
>
> ... and there's something about "NO performance loss (([if
> you don't use the features]))" that reminds me of that old
> ad campaign.
Eric, I think you're being unfair here.
lcc-win32 provides a number of extensions. jacob's claim is that the
addition of these extensions to the compiler does not affect the
performance of code that doesn't use the extensions (i.e., of portable
C code, which lcc-win32 does support). Unlike "one tank will last for
years (if you don't drive your car)", this is a significant claim.
For example, someone might hypothetically implement exceptions in a
way that causes all function calls, even in programs that don't use
exceptions, to be slower. What jacob is claiming is that you pay the
price of his extensions only if you use them.
I have no way to judge the truth of his claim (and I see no reason not
to give him the benefit of the doubt on this point), but it *is* a
significant point.
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Keith | 5/16/2006 5:54:21 PM
jacob navia wrote:
> Flash Gordon wrote:
>>
>> Strangely enough, my programs don't depend on doing lost of strca and
>> strlen calls.
>
> Ahhh OK. You do not use strcat.
>
> But if you do a strchr, for instance, instead of just doing ONE test for
> equality for some character you must do TWO tests because you should
> test if you have reached the terminating zero...
>
> Using length delimited strings this is reduced to a memchr.
>
> Ahh but obviously you do NOT use strchr either.
I said I don't do *lots* of strcat (well, actually, strca) and strlen
calls. I didn't say I don't do any. The same applies to strchr calls. I
will even make calls to a pcre library for doing much more complex
searching!
>>> Obviously you have a different concept for "efficiency" than I do.
>>>
>>> Length delimited strings are INHERENTLY faster than zero terminated
>>> ones.
>>>
>>> Is that too difficult for you to understand?
>>
>>
>> So tell me, if I'm adding one character at a time to a string, keeping
>> a pointer to the end of the string, how is something equivalent to:
>> *p++ = whatever;
>> going to be slower than something equivalent to:
>> *p++ = whatever;
>> increment the length of the string p points in to
>>
>> It seems to me that the latter is going to be slower.
>
> Yes, it will be slower. In a normal PC you will notice the difference
> after some billion additions.
The same applies to strcat, strlen et al. If you use them sensibly then
although they are slower than just looking up the length they won't have
a major impact on performance.
> But you are using a very error prone construct. Do you test
> ALWAYS beforehand and test CORRECTLY that you are not going just one
> byte beyond the length of the string?
>
> OF COURSE YOU never do such mistakes, your code is always 100%
> right the first time.
>
> But not everyone is like you see?
>
> There are stupids like me that make mistakes sometimes.
I've never claimed perfection. However, once I've fixed the
typographical errors that prevent building the program (undefined
reference to strca for example) I don't tend to find problems like that.
>> The same applies to pieces of code I have that build up strings from
>> constant strings. They keep track of the end of the string and a lot
>> of the time they know in advance how long the string is that will be
>> added.
>
> But... that is the same as length delimited strings... If you keep a
> pointer to the end of the string it is conceptually the same as having a
> length stored somewhere.
I use a pointer to the end when it is useful, I don't when it isn't.
Therefore, when it isn't useful, I don't pay the price of maintaining it!
>> There is no system that is going to be the fastest in every situation,
>> and if I want counted strings I can implement them just as Paul Heisch
>> (sorry, I've probably spelt your name wrong) has.
>
> Of course, but then, you can't write:
>
> String s = "abcd";
>
> s[2] = 'm';
>
> but you have to write:
>
> String s = CreateStringFromCharP("abcd");
> AssignCharAt(s,2,'m');
Now you are using long names to deliberately make it look worse.
String s = Strnew("abcd");
s.s[2] = 'm';
Or if you want checking:
StrAssChr(s,2,'m');
Although I would not use this function very often.
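For illustration only, one way the short-named functions above might be written. The post gives just the calls, so the String layout and both function bodies below are assumptions, not anyone's actual library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string, following the names used in the post. */
typedef struct {
    char  *s;     /* character data, also '\0'-terminated for convenience */
    size_t len;   /* number of characters, excluding the terminator */
} String;

/* Create a String from a C string; the s member is NULL if allocation fails. */
String Strnew(const char *init)
{
    String str = { NULL, 0 };
    size_t n = strlen(init);

    str.s = malloc(n + 1);
    if (str.s != NULL) {
        memcpy(str.s, init, n + 1);
        str.len = n;
    }
    return str;
}

/* Checked assignment: returns 0 on success, -1 if idx is out of range. */
int StrAssChr(String str, size_t idx, char c)
{
    if (str.s == NULL || idx >= str.len)
        return -1;
    str.s[idx] = c;
    return 0;
}
```

With this layout the unchecked form s.s[2] = 'm' and the checked form StrAssChr(s,2,'m') both work on the same object, so the caller chooses per call site whether to pay for the range test.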
> This means that porting the old code to the new code is much more
> difficult.
If you are fundamentally changing the software you *should* examine all
the code it impacts. So I would do this anyway and have editor macros
set up to do the bulk of the editing for me.
>> For some of my string handling I far prefer the way other languages do
>> it where you don't have to worry about allocating space but instead
>> the buffer grows as you add to it.
>
> Yes, like the string library of lcc-win32. Nice isn't it?
No, for a lot of tasks it is extremely horrible. For other tasks it is
useful. So I will use the appropriate language for each job.
>> It does a lot to get rid of buffer overflows, but that does not mean I
>> think it is right for all uses or for C.
>
> The string library is not "right for all uses" but I do not see why it
> should not be right for C.
>
> Why C must be kept artificially at such a low level that no sensible
> programming is possible?
Billions of lines of C code say that sensible programming in C is
possible. Mind you, at least one major application I've written has
exactly *no* string handling in it. Lots of maths, message packet
encoding/decoding, data moving, but no string handling. Not all the
world is a PC or server.
> If you like those strings in other languages why not doing it in C?
Because I use C for the things it is good at and other languages for the
things they are good at. It is impossible to create a language that is
good for every task, so why do you want to try and achieve this
impossible task with C?
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc
Flash | 5/16/2006 7:11:02 PM
jacob navia wrote:
> Eric Sosman wrote:
>
.... snip ...
>>
>> Ignoring the contentious tone, how does this jibe with
>> the earlier statement about performance staying the same if
>> the changes are not used? If the changes are not used there
>> is no improvement in safety from buffer overflow, is there?
>
> Obviously you think that scanning memory for the terminating zero
> is vastly more efficient than accessing it directly with the
> string length.
It may well be so. It depends on the operation. Did you ever hear
of improving algorithms by using a marker value? Consider the code
to upshift a complete string.
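The upshift case might look like the sketch below. This is an illustration of the marker-value idea, not code from the post: the terminating '\0' serves as the sentinel, so the first loop keeps no counter at all, while the counted variant (included for comparison) must decrement and test one.

```c
#include <ctype.h>
#include <stddef.h>

/* Upshift a '\0'-terminated string in place.  The terminator acts as
   the marker (sentinel), so no separate counter is maintained. */
void upshift(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* The counted equivalent needs the length and a running counter. */
void upshift_n(char *s, size_t n)
{
    while (n-- > 0) {
        *s = (char)toupper((unsigned char)*s);
        s++;
    }
}
```

Which of the two is faster in practice depends on the compiler and the machine; the sketch only shows where the extra bookkeeping sits.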
>
> Each time you access the length of a zero terminated string you
> must start that unbounded memory scan, source of countless errors.
> Operations like strcat depend on the length of the first string,
> that must be recalculated over and over.
You are allowed to remember a length. Consider my coding for
strlcpy, which never computes the length of any string, yet is
quite safe.
/* NOTE: these routines are deliberately designed to
   not require any assistance from the standard
   libraries.  This makes them more useful in any
   embedded systems that must minimize the load size.

   Public domain, by C.B. Falconer
   bug reports to mailto:cbfalconer@obfuscated.invalid
*/

/* ---------------------- */

size_t strlcpy(char *dst, const char *src, size_t sz)
{
   const char *start = src;

   if (src && sz--) {
      while ((*dst++ = *src))
         if (sz--) src++;
         else {
            *(--dst) = '\0';
            break;
         }
   }
   if (src) {
      while (*src++) continue;
      return src - start - 1;
   }
   else if (sz) *dst = '\0';
   return 0;
} /* strlcpy */
As a byproduct, and for error checking, it returns the length of
the resultant string. The user is quite free to retain this value
if needed for further operations.
You can see the whole thing at:
<http://cbfalconer.home.att.net/download/strlcpy.zip>
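A hedged usage sketch of the strlcpy contract: the return value is the length of the source string, so a result of sz or more signals truncation. To keep the example self-contained it uses a simplified stand-in, sketch_strlcpy, which (unlike the routine above) calls strlen and memcpy; copy_checked is likewise a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in with the strlcpy contract: copy at most sz-1
   bytes, always terminate when sz > 0, return the length of src. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t sz)
{
    size_t srclen = strlen(src);

    if (sz > 0) {
        size_t n = srclen < sz - 1 ? srclen : sz - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Returns 1 if src fit in dst, 0 if it was truncated.  *len receives
   the length of the string now held in dst. */
int copy_checked(char *dst, size_t sz, const char *src, size_t *len)
{
    size_t want = sketch_strlcpy(dst, src, sz);
    int fits = want < sz;

    *len = fits ? want : (sz ? sz - 1 : 0);
    return fits;
}
```

The caller can keep *len for later appends instead of calling strlen again.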
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
CBFalconer | 5/16/2006 7:42:15 PM
jacob navia <jacob@jacob.remcomp.fr> writes:
> Eric Sosman wrote:
[...]
>> Ignoring the contentious tone, how does this jibe with
>> the earlier statement about performance staying the same if
>> the changes are not used? If the changes are not used there
>> is no improvement in safety from buffer overflow, is there?
>
> Obviously you think that scanning memory for the terminating zero is
> vastly more efficient than accessing it directly with the string length.
>
> Each time you access the length of a zero terminated string you must
> start that unbounded memory scan, source of countless errors.
> Operations like strcat depend on the length of the first string,
> that must be recalculated over and over.
>
> Obviously you have a different concept for "efficiency" than I do.
>
> Length delimited strings are INHERENTLY faster than zero terminated ones.
>
> Is that too difficult for you to understand?
jacob, this kind of attitude is a very large part of the reason you're
not taken very seriously around here. You insist on using strawman
arguments, constructing parodies of what you *assume* other people
believe. And you're usually wrong in your assumptions.
I'm 99.9% certain that Eric does *not* "think that scanning memory for
the terminating zero is vastly more efficient than accessing it
directly with the string length".
Stop putting words in people's mouths. We don't care what you think
other people believe. We *might* care what you believe, if you would
take the time to state it without being condescending.
Or is that too difficult for you to understand? (I'm sure it isn't,
but you have yet to demonstrate it.)
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Keith | 5/16/2006 10:30:44 PM
jacob navia asked the hypothetical question:
> Buffer overflows however, are not a "performance" problem?
Buffer overflows are not a performance problem, they are
a programming error.
> Does incorrect software "perform" OK ???
No. It performs incorrectly.
> If you check bounds when using strings the performance loss will not be
> noticeable in most PCs.
The inefficiency involved in bounds checking for strings is
unacceptable in many instances. If I'm willing to accept that loss of
efficiency, I'm probably willing to write the code in Python. The
reason I choose C for a given task is precisely that I cannot afford
that loss.
bill.pursell (771) | 5/16/2006 11:01:19 PM
jacob navia wrote:
> Each time you access the length of a zero terminated string you must
> start that unbounded memory scan, source of countless errors. Operations
> like strcat depend on the length of the first string, that must be
> recalculated over and over.
These statements are absurd. Each time you **calculate** the length
of a zero terminated string, you must scan the string for the null.
However, if you fill the buffer in the first place, you can simply keep
track of the length. Or, if you didn't fill the buffer, you can compute
it once and then keep track of it. You only need to keep scanning the
buffer if you don't realize that you can keep track of the result of
the first calculation.
Bill | 5/16/2006 11:09:10 PM
Bill Pursell wrote:
> jacob navia wrote:
>
>
>>Each time you access the length of a zero terminated string you must
>>start that unbounded memory scan, source of countless errors. Operations
>>like strcat depend on the length of the first string, that must be
>>recalculated over and over.
>
>
>
> These statements are absurd. Each time you **calculate** the length
> of a zero terminated string, you must scan the string for the null.
> However, if you fill the buffer in the first place, you can simply keep
> track of the length. Or, if you didn't fill the buffer, you can compute
> it once and then keep track of it. You only need to keep scanning the
> buffer if you don't realize that you can keep track of the result of
> the first calculation.
>
I think you overlooked Jacob's mention of strcat and friends. These do
have to scan the string for the terminating 0.
--
Ian Collins.
Ian | 5/16/2006 11:21:31 PM
CBFalconer wrote:
> Robert Gamble wrote:
> > jacob navia wrote:
> >>
> >> I have put together a document explaining most extensions of
> >> lcc-win32, why they were done, why they could be useful, and a
> >> documentation of the string/container library that uses those
> >> extensions.
> >>
> >> This document is not finished but I would appreciate your
> >> feedback.
> >>
> >> ftp://ftp.cs.virginia.edu/pub/lcc-win32/proposal.pdf
> >
> > I haven't read the entire document but here are some comments on
> > what I did review:
> >
> > 1.1 "Motivation":
> > "All development of C as an independent language has ceased and C
> > has been relegated to the past."
> >
> > This is the kind of statement that most people here will consider
> > nonsense and is likely to keep many people from taking seriously
> > anything you say after that point. You even contradict yourself
> > in the very next sentence:
>
> I was planning to at least read his proposal, but your quote in
> itself has deterred that.
>
> Let me point out that all development of the French language has
> ceased, not because of lack of innovation, but because of legal
> barriers erected in both France and Quebec. I believe use of the
> phrase "le hotdog" is now cause for incarceration in the Bastille.
I can see why.
>
> --
> "If you want to post a followup via groups.google.com, don't use
> the broken "Reply" link at the bottom of the article. Click on
> "show options" at the top of the article, then click on the
> "Reply" at the bottom of the article headers." - Keith Thompson
> More details at: <http://cfaj.freeshell.org/google/>
> Also see <http://www.safalra.com/special/googlegroupsreply/>
toby | 5/17/2006 2:12:38 AM
Ian Collins wrote:
>
.... snip ...
>
> I think you overlooked Jacob's mention of strcat and friends.
> These do have to scan the string for the terminating 0.
Not if you have retained the length from earlier operations, or a
pointer to the terminal '\0'. Then str[someflavor]cat becomes:
str[someflavor]cpy(endptr, newstr, ...whatever)
or
str[someflavor]cpy(&old[lgh], newstr, ...whatever)
and I recommend using the (non-std) strlcpy and strlcat.
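A sketch of that replacement using only standard functions, assuming the caller has kept the current length. append_at is a hypothetical name; the point is that the copy starts at &dst[len], so dst is never rescanned the way strcat would rescan it.

```c
#include <stddef.h>
#include <string.h>

/* Append src at the known end of dst instead of calling strcat, which
   would rescan dst from the start.  len is the current length of dst,
   cap the size of its buffer.  Returns the new length, or len unchanged
   if src (plus terminator) does not fit. */
size_t append_at(char *dst, size_t len, size_t cap, const char *src)
{
    size_t n = strlen(src);

    if (len >= cap || n >= cap - len)
        return len;                      /* no room: dst is untouched */
    memcpy(&dst[len], src, n + 1);       /* copies the '\0' as well */
    return len + n;
}
```

Chaining calls as len = append_at(buf, len, sizeof buf, piece) keeps each append proportional to the size of the piece rather than to the size of the whole string.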
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
CBFalconer | 5/17/2006 2:14:30 AM
CBFalconer wrote:
> Ian Collins wrote:
>
> .... snip ...
>
>>I think you overlooked Jacob's mention of strcat and friends.
>>These do have to scan the string for the terminating 0.
>
>
> Not if you have retained the length from earlier operations, or a
> pointer to the terminal '\0'. Then str[someflavor]cat becomes:
>
> str[someflavor]cpy(endptr, newstr, ...whatever)
> or
> str[someflavor]cpy(&old[lgh], newstr, ...whatever)
>
Good point.
> and I recommend using the (non-std) strlcpy and strlcat.
>
They appear to be quite widely available.
--
Ian Collins.
Ian | 5/17/2006 4:54:54 AM
On Tue, 16 May 2006 14:01:23 UTC, jacob navia <jacob@jacob.remcomp.fr>
wrote:
> Eric Sosman wrote:
> >
> > jacob navia wrote On 05/16/06 02:30,:
> >
> >>Robert Gamble wrote:
> >>
> >>
> >>>I don't think you can claim that the ideas in your proposal don't add
> >>>complexity or can be implemented without incurring a performance loss.
> >>>
> >>
> >>
> >>There is NO performance loss.
> >>
> >>None of the proposed changes is automatic and none affects any other
> >>part of the language. This means that performance of C stays the same
> >>when the changes are not used.
> >>
> >>The objective of those changes is to furnish the tools for building a
> >>good standard library, specially a good string library.
> >>
> >>C has become a synonym for "buffer overflow vulnerability". Let's stop
> >>this.
> >
> >
> > Ignoring the contentious tone, how does this jibe with
> > the earlier statement about performance staying the same if
> > the changes are not used? If the changes are not used there
> > is no improvement in safety from buffer overflow, is there?
>
> Obviously you think that scanning memory for the terminating zero is
> vastly more efficient than accessing it directly with the string length.
>
> Each time you access the length of a zero terminated string you must
> start that unbounded memory scan, source of countless errors. Operations
> like strcat depend on the length of the first string, that must be
> recalculated over and over.
>
> Obviously you have a different concept for "efficiency" than I do.
>
> Length delimited strings are INHERENTLY faster than zero terminated ones.
>
> Is that too difficult for you to understand?
Navia proves once again that he is brain damaged. When is there a need
to count bytes when one knows how native C strings are designed? There
is no standard C function that has a need for a counter, because the C
string itself says where it ends. strcpy(), strcopy(), strchr() and
strstr() have no need to count bytes; they do not even have to know
how long the strings they handle are. But any string array that is
NOT nul terminated needs some extra operations to count against
something. That proves that Navia has not even a little bit of
knowledge about C. He works on a compiler that is incompatible with the
standard, unusable when one tries to write portable programs,
and at the very least absolutely superfluous, however loudly he cries.
Simply ignore anything the twit named Navia is squawking about,
because he has proven too often that he does not know what he quacks about.
--
Tschau/Bye
Herbert
Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 in German is here!
Herbert | 5/17/2006 5:58:04 AM
On Tue, 16 May 2006 15:02:15 UTC, Flash Gordon
<spam@flash-gordon.me.uk> wrote:
> So tell me, if I'm adding one character at a time to a string, keeping a
> pointer to the end of the string, how is something equivalent to:
> *p++ = whatever;
> going to be slower than something equivalent to:
> *p++ = whatever;
> increment the length of the string p points in to
>
> It seems to me that the latter is going to be slower.
Irrelevant! The "extensions" the twit praises are NOT slower than
standard C - when you do NOT use them. So he himself requires you NOT to
use his "extensions"! That means that you should never use anything
the twit has his fingers on. It would never work as expected.
> The same applies to pieces of code I have that build up strings from
> constant strings. They keep track of the end of the string and a lot of
> the time they know in advance how long the string is that will be added.
>
> There is no system that is going to be the fastest in every situation,
> and if I want counted strings I can implement them just as Paul Heisch
> (sorry, I've probably spelt your name wrong) has.
>
> For some of my string handling I far prefer the way other languages do
> it where you don't have to worry about allocating space but instead the
> buffer grows as you add to it. It does a lot to get rid of buffer
> overflows, but that does not mean I think it is right for all uses or for C.
I've written lots of applications, drivers and kernels in C. In every
case where moving, copying, comparing, splitting, combining or printing
of strings was needed, there was not a single place where it was
useful to know the size of a string.
In the rare cases where strlen() is required, calling it before or
after the operation is simply quicker than having to count each
byte during each operation on a string.
Jacob Navia has already proven himself too often to be a twit with
absolutely no knowledge of C. So the best one can do is to ignore him and
anything he has ever produced that claims to be C or related to C.
--
Tschau/Bye
Herbert
Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 in German is here!
Herbert | 5/17/2006 5:58:04 AM
On Tue, 16 May 2006 17:48:53 UTC, jacob navia <jacob@jacob.remcomp.fr>
wrote:
> Flash Gordon wrote:
> >
> > Strangely enough, my programs don't depend on doing lost of strca and
> > strlen calls.
> >
>
> Ahhh OK. You do not use strcat.
Only when I do not already have a pointer to the end of the string at
hand. Otherwise there is no need to count the number of bytes to copy.
In 99% of all jobs one has to do with one or more strings there is really
no need to know how long a string is, and no need to count the bytes
already handled or still left to handle. So in 99% of all cases there is
no need to set up a counter and to test it explicitly.
On most machines I know of, any halfway current compiler (that means
designed in 1980 or later!) knows to check the implicit result flag that
the machine instruction for *d++ = *s++ sets when the copied char is (not) 0.
So not even a compare instruction is needed. I know a machine where
even strlen() is reduced to a single machine instruction, because the
instruction stops counting when the byte it reads from memory is
0. The instruction costs only 1 clock cycle per 256 bytes, so it
is much quicker than handling a separate value used as length that
must be counted down separately (in the hope that Jacob Navia is actually
smart enough not to count upwards and compare with the value
in the length variable, but always counts backwards to 0 to save the
explicit compare against length).
>
> But if you do a strchr, for instance, instead of just doing ONE test for
> equality for some character you must do TWO tests because you should
> test if you have reached the terminating zero...
>
> Using length delimited strings this is reduced to a memchr.
Yea, maintaining a separate variable to count the number of chars to
copy/move/compare - that means an extra addition or subtraction besides
a compare - is really smarter than a simple compare without that extra
work.
What needs more time?
a) while (*d++ = *s++) ; /* standard C */
b) for (count = length; count; count--) *d++ = *s++; /* much quicker
and fewer instructions, as Jacob Navia claims */
Anybody who knows a bit of C will say the while loop is
- much quicker, because there is no need for count
- much shorter, because there is no need to handle count.
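For anyone who wants to measure rather than argue, the two loops can be written out as compilable functions. This is only a sketch with hypothetical names (copy_terminated, copy_counted); it makes no claim about which is faster, since that depends on the compiler, the target machine and whether a library routine such as memcpy would be used instead.

```c
#include <stddef.h>

/* (a) Terminator-driven copy: the copied byte itself ends the loop. */
void copy_terminated(char *d, const char *s)
{
    while ((*d++ = *s++) != '\0')
        ;
}

/* (b) Count-driven copy: a separate counter ends the loop.  Exactly
   length bytes are copied; no terminator is added. */
void copy_counted(char *d, const char *s, size_t length)
{
    size_t count;

    for (count = length; count; count--)
        *d++ = *s++;
}
```

Note that neither loop checks the size of the destination, which is the separate safety question raised elsewhere in the thread.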
So Jacob Navia proves himself again to be a twit without a bit of
knowledge of how to program in C.
> Ahh but obviously you do NOT use strchr either.
>
> >> Obviously you have a different concept for "efficiency" than I do.
> >>
> >> Length delimited strings are INHERENTLY faster than zero terminated ones.
> >>
> >> Is that too difficult for you to understand?
> >
> >
> > So tell me, if I'm adding one character at a time to a string, keeping a
> > pointer to the end of the string, how is something equivalent to:
> > *p++ = whatever;
> > going to be slower than something equivalent to:
> > *p++ = whatever;
> > increment the length of the string p points in to
> >
> > It seems to me that the latter is going to be slower.
> >
>
> Yes, it will be slower. In a normal PC you will notice the difference
> after some billion additions.
That mounts up quickly to a runtime of hours. It IS much
slower on each loop. When you have 1000 loops in one stage you give
away time needlessly. So instead of an 8088 you'll need to run a P7
to get the same performance, whereas the 8080 was fast enough
and saves some hundred $ per installation.
No, not every solution needs Windows eXperiment; most
fulfil their requirements even today with a cheap 8080 instead of an
expensive Pentium, not to mention the power, room and cooling requirements a
Pentium has and the cheap 8080 or Z80 does without.
Jacob Navia speaking to himself:
> But you are using a very error prone construct. Do you test
> ALWAYS beforehand and test CORRECTLY that you are not going just one
> byte beyond the length of the string?
With the little bit of brain a programmer should have, there is really no
need to check the same values a million times. When a check has to be
made, it is done when a value first comes into sight,
not each time its unchanged value is used. That is, when a string comes
in from an untrusted source it is rejected immediately as it comes in,
not proven guilty every time it is used. That check would be done
in any case unless the programmer is completely braindead or Jacob
Navia. But for that one has to learn how to program <any programming
language>.
> OF COURSE YOU never do such mistakes, your code is always 100%
> right the first time.
True - because validity checks are always made whenever a value comes
into sight, not after it has been accepted as good. Either one has learned
failsafe programming or all tests are of no avail. That is one of the
points Jacob Navia knows nothing about.
> But not everyone is like you see?
That means Jacob Navia and other twits.
> There are stupids like me that make mistakes sometimes.
A true, big understatement.
> > The same applies to pieces of code I have that build up strings from
> > constant strings. They keep track of the end of the string and a lot of
> > the time they know in advance how long the string is that will be added.
> >
>
> But... that is the same as length delimited strings... If you keep a
> pointer to the end of the string it is conceptually the same as having a
> length stored somewhere.
No, having to manipulate some pointer with some variables to get some
pointer is not the same as having the pointer already at hand.
>
> > There is no system that is going to be the fastest in every situation,
> > and if I want counted strings I can implement them just as Paul Heisch
> > (sorry, I've probably spelt your name wrong) has.
> >
>
> Of course, but then, you can't write:
>
> String s = "abcd";
>
> s[2] = 'm';
>
> but you have to write:
>
> String s = CreateStringFromCharP("abcd");
> AssignCharAt(s,2,'m');
Oh, Jacob Navia shows how ugly and unhandy his stupid counting string
is. Any C programmer knows that the method he says he can't use with
his stupid counting string is simpler and less time and space
intensive.
>
> This means that porting the old code to the new code is much more difficult.
Yes, that shows clearly that NOT having a native C string costs more
effort than simply handling null terminated strings. He contradicts
himself again.
>
> > For some of my string handling I far prefer the way other languages do
> > it where you don't have to worry about allocating space but instead the
> > buffer grows as you add to it.
>
> Yes, like the string library of lcc-win32. Nice isn't it?
Now he calls his incompatible crap nice, whereas he demanded in another
message that nobody should use it because it costs more time and
effort when one is using it.
> > It does a lot to get rid of buffer
> > overflows, but that does not mean I think it is right for all uses or
> > for C.
>
> The string library is not "right for all uses" but I do not see why it
> should not be right for C.
Liar. You yourself demand that nobody should use it because it is not as
quick as standard C string handling.
> Why C must be kept artificially at such a low level that no sensible
> programming is possible?
Because time is money; higher runtime costs more money than
sensible programming.
> If you like those strings in other languages why not doing it in C?
Because C strings are designed to be optimally quick.
--
Tschau/Bye
Herbert
Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!
Herbert | 5/17/2006 5:58:05 AM
Bill Pursell wrote:
> jacob navia wrote:
>
>
>>Each time you access the length of a zero terminated string you must
>>start that unbounded memory scan, source of countless errors. Operations
>>like strcat depend on the length of the first string, that must be
>>recalculated over and over.
>
>
>
> These statements are absurd. Each time you **calculate** the length
> of a zero terminated string, you must scan the string for the null.
> However, if you fill the buffer in the first place, you can simply
> keep track of the length.
But this is exactly what length delimited strings ARE :-)
"You keep track of the length".
> Or, if you didn't fill the buffer, you can compute it once and
> then keep track of it.
Exactly
> You only need to keep scanning the buffer
> if you don't realize that you can keep track of the result of the
> first calculation.
>
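To make "keeping track of the length" concrete, here is a minimal sketch in standard C. The names tracked_buf, tb_init and tb_append are hypothetical, made up for this illustration; they are not part of lcc-win32 or of any library mentioned in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: a NUL-terminated buffer plus its current length. */
typedef struct {
    char  *data;     /* caller-supplied storage */
    size_t capacity; /* total bytes available in data, including the NUL */
    size_t length;   /* characters stored so far, excluding the NUL */
} tracked_buf;

/* Start with an empty string in the caller's storage (capacity must be >= 1). */
static void tb_init(tracked_buf *b, char *storage, size_t capacity)
{
    b->data = storage;
    b->capacity = capacity;
    b->length = 0;
    b->data[0] = '\0';
}

/* Append src without rescanning what is already in the buffer.
   Returns 0 on success, -1 if src does not fit (buffer left unchanged). */
static int tb_append(tracked_buf *b, const char *src)
{
    size_t n = strlen(src);              /* only the new piece is scanned */

    if (n >= b->capacity - b->length)    /* need room for n chars plus the NUL */
        return -1;
    memcpy(b->data + b->length, src, n + 1);
    b->length += n;
    return 0;
}
```

The buffer stays a normal C string, so it can still be handed to any function that expects one; the stored length only saves the rescan that strcat would do.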
jacob | 5/17/2006 6:27:11 AM
Herbert Rosenau wrote:
>
.... snip ...
>
> I've written lots of applications, drivers, kernels in C in any case
> there was moving, copying, comparing, splitting, combining, printing
> of strings were needed there was not a single place where it were
> useful to know the size of a string.
>
> In the seldom cases strlen() is required any operation before or
> therafter that would simply quicker as having the need to count each
> byte during each operation on a string.
Something nobody seems to bother to notice is that C strings tend
to be short, so that execution of strlen() and similar on them is
not a bind. Somebody might care to instrument their actual use of
strlen so as to report the average (and possibly maximum) length at
program run conclusion. I am not about to bother to do so.
In most systems the instrumentation could be done by simply loading
the revised strlen module before searching the system library.
However this would not handle dumping the final results on program
exit. atexit() may be useful here, which in turn requires
auto-initialization in the strlen replacement function.
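A rough sketch of such instrumentation, under some assumptions: instead of replacing strlen itself at link time, it uses a wrapper with the hypothetical name counted_strlen that callers would have to use explicitly, and the stat_* variables and the report format are likewise invented here.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical statistics gathered across all calls. */
static unsigned long stat_calls;  /* number of calls */
static unsigned long stat_total;  /* sum of all lengths seen */
static size_t        stat_max;    /* longest string seen */
static int           stat_ready;  /* set once the exit hook is installed */

/* Exit hook: dump the results when the program terminates normally. */
static void strlen_report(void)
{
    fprintf(stderr, "strlen: %lu calls, average %.1f, maximum %lu\n",
            stat_calls,
            stat_calls ? (double)stat_total / (double)stat_calls : 0.0,
            (unsigned long)stat_max);
}

/* Wrapper: same result as strlen(), plus bookkeeping. */
static size_t counted_strlen(const char *s)
{
    size_t n = strlen(s);

    if (!stat_ready) {            /* auto-initialization on first use */
        stat_ready = 1;
        atexit(strlen_report);
    }
    stat_calls++;
    stat_total += (unsigned long)n;
    if (n > stat_max)
        stat_max = n;
    return n;
}
```

The report only appears on normal termination (return from main or exit()), since that is when atexit() handlers run.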
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
CBFalconer | 5/17/2006 9:27:11 AM
> The same applies to strcat, strlen et al. If you use them sensibly then
> although they are slower than just looking up the length
Who says they are?
robert
Robert | 5/17/2006 9:27:25 AM
Ian Collins wrote:
> CBFalconer wrote:
>
.... snip ...
>
>> and I recommend using the (non-std) strlcpy and strlcat.
>
> They appear to be quite widely available.
Universally, by simply downloading and compiling:
<http://cbfalconer.home.att.net/download/strlcpy.zip>
and I mean universally. Those are written in standard C,
deliberately do not use any routines in the standard library, are
re-entrant, so are quite suitable for the most resource limited
embedded systems as well as anything else.
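For readers who have not met strlcpy: the sketch below shows the usual strlcpy-style contract (copy at most size-1 characters, always terminate when size is nonzero, return the length of the source). It is an illustration written for this thread under the hypothetical name sketch_strlcpy, not the code in the zip file above, and like that code it calls nothing from the standard library.

```c
#include <stddef.h>

/* Hypothetical sketch of strlcpy-style semantics. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t i = 0;

    if (size > 0) {
        /* copy while there is still room for the terminator */
        while (i + 1 < size && src[i] != '\0') {
            dst[i] = src[i];
            i++;
        }
        dst[i] = '\0';
    }
    while (src[i] != '\0')   /* finish measuring src */
        i++;
    return i;                /* length of src; a value >= size means truncation */
}
```

A caller detects truncation by comparing the return value with the buffer size, which is the main practical difference from strncpy.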
--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
CBFalconer | 5/17/2006 9:32:39 AM
["Followup-To:" header set to comp.lang.c.]
On Wed, 17 May 2006 05:58:04 +0000 (UTC),
Herbert Rosenau <os2guy@pc-rosenau.de> wrote
in Msg. <wmzsGguTDN6N-pn2-jEijL9PgunPI@URANUS1.DV-ROSENAU.DE>
> strchr(), strstr() have no need to count bytes, the have not even to
> know how long the strings they hande are. But any string array that is
> NOT nul terminated needs some extra operations to count aginst
> something:
All these performance ramblings, besides being off-topic here, make
unwarranted assumptions about how an implementation or the underlying
CPU work. Nobody says that the implementation has to scan the entire
string each time strlen or strcat are called. On the other hand, copying
chunks of memory (of known size) from one place to another is such a
common operation that it is probably a very fast operation on most CPUs.
In other words: Unless you're an implementor, don't try to out-smart
your implementation.
Says:
robert (who often keeps copies of strlen()'s result around. I'm
superstitious.)
Robert | 5/17/2006 9:42:44 AM
Robert Latest wrote:
>>The same applies to strcat, strlen et al. If you use them sensibly then
>>although they are slower than just looking up the length
>
>
> Who says they are?
>
> robert
Well, I say that strlen will be always slower with zero terminated
strings and almost a NOP with length delimited strings.
strlen must always scan the characters to find the zero.
A length delimited string can just return the length immediately.
jacob (2538) | 5/17/2006 9:45:30 AM
CBFalconer <cbfalconer@yahoo.com> writes:
>
> Something nobody seems to bother to notice is that C strings tend
> to be short, so that execution of strlen() and similar on them is
> not a bind. Somebody might care to instrument their actual use of
> strlen so as to report the average (and possibly maximum) length at
> program run conclusion. I am not about to bother to do so.
You are probably right about that. However, you can check for
yourself. Just write a small string and add to it one char after the
other. Of course you will "tune" it, e.g. by allocating in larger
chunks or the like, but on the other hand building up a string is
probably not something you do seldom, especially now in the context of
"Web" programming. So you can probably burn a lot of time in strxxx
functions.
Regards
Friedrich
--
Please remove just-for-news- to reply via e-mail.
Friedrich | 5/17/2006 11:08:49 AM
CBFalconer said:
> Herbert Rosenau wrote:
>>
> ... snip ...
>>
>> I've written lots of applications, drivers, kernels in C in any case
>> there was moving, copying, comparing, splitting, combining, printing
>> of strings were needed there was not a single place where it were
>> useful to know the size of a string.
>>
>> In the seldom cases strlen() is required any operation before or
>> therafter that would simply quicker as having the need to count each
>> byte during each operation on a string.
>
> Something nobody seems to bother to notice is that C strings tend
> to be short, so that execution of strlen() and similar on them is
> not a bind.
To be more precise, strings in general tend to be short, and C strings are
no exception to this. I suspect that the distribution *range* is actually
very wide indeed, but that the distribution itself is extraordinarily skewed
towards the low end.
--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Richard | 5/17/2006 12:46:12 PM
On 2006-05-17, jacob navia <jacob@jacob.remcomp.fr> wrote:
> Robert Latest wrote:
>>>The same applies to strcat, strlen et al. If you use them sensibly then
>>>although they are slower than just looking up the length
>>
>>
>> Who says they are?
>>
>> robert
>
> Well, I say that strlen will be always slower with zero terminated
> strings and almost a NOP with length delimited strings.
>
> strlen must always scan the characters to find the zero.
>
> A length delimited string can just return the length immediately.
>
Since we're making assumptions about how functions work, here's an
idea: suppose that your length variable is stored on disk in swap;
loading that variable is going to take longer than the processor
using its predefined instructions (such as CMPSB) blowing through
a string.
Also, you have to consider how much more memory it takes to store
the lengths of strings, when it has been shown that in most cases
the length is /irrelevant/.
--
Andrew Poelstra <http://www.wpsoftware.net/blog>
apoelstra (387) | 5/17/2006 2:41:53 PM
Andrew Poelstra wrote:
> Also, you have to consider how much more memory it takes to store
> the lengths of strings, when it has been shown that in most cases
> the lengths is /irrelevant/.
Irrelevant of course. Completely irrelevant. All buffer overflows were
coded that way:
Who cares about how long that string is exactly?
It will fit in a 512 byte buffer anyway!
jacob (2538) | 5/17/2006 2:49:09 PM
On Wed, 17 May 2006 09:32:39 UTC, CBFalconer <cbfalconer@yahoo.com>
wrote:
> Ian Collins wrote:
> > CBFalconer wrote:
> >
> ... snip ...
> >
> >> and I recommend using the (non-std) strlcpy and strlcat.
> >
> > They appear to be quite widely available.
>
> Universally, by simply downloading and compiling:
>
> <http://cbfalconer.home.att.net/download/strlcpy.zip>
>
> and I mean universally. Those are written in standard C,
> deliberately do not use any routines in the standard library, are
> re-entrant, so are quite suitable for the most resource limited
> embedded systems as well as anything else.
>
My compiler comes with 2 different standard libraries:
- not thread safe; designed for single-threaded apps
- completely thread safe, because most apps will need that
for their requirements, being multithreaded
So you only tell the linker which implementation you use, single or
multithreaded. That does NOT stop the compiler from inlining functions
from the standard library whenever possible.
--
Tschau/Bye
Herbert
Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!
Herbert | 5/17/2006 6:12:34 PM
On Wed, 17 May 2006 14:49:09 UTC, jacob navia <jacob@jacob.remcomp.fr>
wrote:
> Andrew Poelstra wrote:
> > Also, you have to consider how much more memory it takes to store
> > the lengths of strings, when it has been shown that in most cases
> > the lengths is /irrelevant/.
>
> Irrelevant of course. Completely irrelevant. All buffer overflows were
> coded that way:
>
> Who cares about how long that string is exactly?
You are speaking about the Pascal programs I had to rewrite _in C_ because
buffer overflow was the cause of 93% of all the coredumps they had
produced.
A crazy programmer is in no way hindered from producing buffer overflows
in languages that check each and every statement for errors.
> It will fit in a 512 byte buffer anyway!
Really, you are speaking about Pascal or Fortran.
When you have a problem with that, simply use a language that checks
everything that can go wrong, like addition, subtraction,
multiplication and division of int; these are the most common causes of
the unwanted and incorrect results a program can produce. Buffer overflow
in C is always only the result of a dumb programmer who was unable to
do his job right, like Jacob Navia, who has never learned how to
program failsafe.
--
Tschau/Bye
Herbert
Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!
os2guy1 (1071) | 5/17/2006 6:12:35 PM
On Wed, 17 May 2006 11:08:49 UTC, Friedrich Dominicus
<just-for-news-frido@q-software-solutions.de> wrote:
> CBFalconer <cbfalconer@yahoo.com> writes:
>
> >
> > Something nobody seems to bother to notice is that C strings tend
> > to be short, so that execution of strlen() and similar on them is
> > not a bind. Somebody might care to instrument their actual use of
> > strlen so as to report the average (and possibly maximum) length at
> > program run conclusion. I am not about to bother to do so.
> You are proabably right about that. However you can check
> yourself. Just write a small string and add to it one char after the
> other. Of course you will "tune" it e.g that you allocate in larger
> chunks or the like, but on the other hand collecting a string is
> probalby not a thing you do seldom, expecially now in the context of
> "Web" programming. So you can probably burn a lot of time in strxxx
> functions
No. I would simply not use a single one of the str...() functions for that,
and I do not do so in the editor I am developing, because when I have to
split a string into multiple ones or concatenate multiple strings into
one single one, it is done by touching each char only once in
the whole run instead of wobbling forwards and backwards multiple times
through the same bytes.
While I read in the data I have to handle I know the maximum size I
need. Setting pointers to the right areas while scanning through the
data makes access to it easy and really quick while working on it;
the same happens when building the well-normalized output as it gets
written to external media.
getc()/putc() are nice functions for avoiding the need to handle each
single character by copying inside the system, inside the C runtime, inside
thousands of other functions. So there is no need to fiddle around with
...get....(), fread/fwrite, fprint... and so on. Use getc(), and when the
char read does not fit the current pass, unget it and go back to the higher
level, which will find it on its next get and do what is needed further.
No buffer that can overflow gets filled with unknown data, no need
to read a number of unknown bytes of unknown length, not even the need
for multiple ?alloc() calls with blindly calculated size parameters.
Get the size of the file, malloc() a buffer of the size needed to hold
it, malloc() the structs needed to maintain the encoded data and start
reading the file byte by byte. The runtime will internally reserve
enough buffer space to read the blocks/sectors/or whatever the
medium uses to organise a data stream from the external
medium into memory as quickly as possible.
When data has to be written the same goes on: write byte after byte,
encode and format on the fly. Nothing gives you more
control over a stream, or is done more quickly, than reading/writing on the
fly byte by byte instead of copying, with or without the help of a format
string, from one location to another only to get it copied from there
to somewhere else only to get it copied ......
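The getc()/ungetc() pattern described above might look roughly like this. It is only a sketch: read_number is a hypothetical helper invented for illustration, not code from the editor mentioned, and it does not guard against arithmetic overflow on very long digit runs.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: read an unsigned decimal number from fp.
   Returns 1 and stores the value if at least one digit was read,
   otherwise returns 0. The first character that does not fit is
   pushed back with ungetc() so the caller finds it on its next getc(). */
static int read_number(FILE *fp, unsigned long *out)
{
    unsigned long value = 0;
    int got_digit = 0;
    int c;

    while ((c = getc(fp)) != EOF && isdigit(c)) {
        value = value * 10 + (unsigned long)(c - '0');
        got_digit = 1;
    }
    if (c != EOF)
        ungetc(c, fp);   /* leave it for whoever reads next */
    if (got_digit)
        *out = value;
    return got_digit;
}
```

No intermediate line buffer is involved, so there is nothing here that can overflow; the stream's own buffering takes care of reading the medium in blocks.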
--
Tschau/Bye
Herbert
Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!
os2guy1 (1071) | 5/17/2006 6:12:35 PM
On 2006-05-17, jacob navia <jacob@jacob.remcomp.fr> wrote:
> Andrew Poelstra wrote:
>> Also, you have to consider how much more memory it takes to store
>> the lengths of strings, when it has been shown that in most cases
>> the lengths is /irrelevant/.
>
> Irrelevant of course. Completely irrelevant. All buffer overflows were
> coded that way:
>
> Who cares about how long that string is exactly?
>
> It will fit in a 512 byte buffer anyway!
>
>
That is the worst coding I have ever seen, mainly
because it is so common. However, fitting a string
into a buffer doesn't require you to know the string
length; it merely requires you to allocate more
memory every time you overflow the buffer.
Knowing a string length requires having a string,
which in turn requires memory already having been
allocated for it. You can't have the length before
the string, so your scenario has nothing to do with
the issue at hand.
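As a sketch of "allocate more memory every time you overflow the buffer", assuming a doubling policy and the hypothetical names grow_str and gs_push (neither comes from this thread or from any library):

```c
#include <stdlib.h>

/* Hypothetical growable string; zero-initialize before first use. */
typedef struct {
    char  *data;     /* NUL-terminated contents, or NULL when empty */
    size_t length;   /* characters stored, excluding the NUL */
    size_t capacity; /* bytes allocated for data */
} grow_str;

/* Append one character, doubling the allocation when it is full.
   Returns 0 on success, -1 if realloc() fails (contents unchanged). */
static int gs_push(grow_str *g, char c)
{
    if (g->length + 2 > g->capacity) {           /* char + NUL must fit */
        size_t new_cap = g->capacity ? g->capacity * 2 : 16;
        char *p = realloc(g->data, new_cap);

        if (p == NULL)
            return -1;
        g->data = p;
        g->capacity = new_cap;
    }
    g->data[g->length++] = c;
    g->data[g->length] = '\0';
    return 0;
}
```

Note that even this version has to record how much it has stored and how much it has allocated in order to know when it is about to overflow.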
--
Andrew Poelstra < http://www.wpsoftware.net/blog >
apoelstra (387) | 5/17/2006 6:59:16 PM
jacob navia <jacob@jacob.remcomp.fr> writes:
> Robert Latest wrote:
>>> The same applies to strcat, strlen et al. If you use them sensibly
>>> then although they are slower than just looking up the length
>> Who says they are?
>
> Well, I say that strlen will be always slower with zero terminated
> strings and almost a NOP with length delimited strings.
>
> strlen must always scan the characters to find the zero.
>
> A length delimited string can just return the length immediately.
And you need extra space to store the length, and you need to update
the length every time it might change.
Carefully written C code can avoid re-scanning strings in many cases,
and can also avoid storing the length when it's not needed.
I'm sure that storing the length of each string can improve
performance in some cases. This improvement would be greatest in
naively written code, things like:
for (i = 0; i < strlen(s); i ++) {
...
}
I also suspect that, for many applications, the overhead of
maintaining the length for each and every string, even when it's not
needed, could exceed the improvement from avoiding re-scanning,
particularly when (a) the application is carefully written, (b) most
strings are short, (c) the compiler is clever enough to hoist some
length calculations out of loops, and/or (d) the hardware is optimized
to work with NUL-terminated strings.
You seem to be assuming that storing the length for every string will
improve performance *by definition*. I don't believe that's true.
Have you *measured* the difference with any real applications? If
not, you're only guessing. (So am I, but I admit it.)
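For illustration, here are three ways to write the same loop in plain C. The function names are made up for this sketch, and whether a given compiler hoists the strlen() call in the first version is an assumption that would have to be checked case by case.

```c
#include <stddef.h>
#include <string.h>

/* Naive form: strlen(s) is re-evaluated on every iteration unless the
   compiler can prove the string is unchanged and hoists the call. */
static size_t count_char_naive(const char *s, char c)
{
    size_t i, hits = 0;

    for (i = 0; i < strlen(s); i++)
        if (s[i] == c)
            hits++;
    return hits;
}

/* Length computed once by hand. */
static size_t count_char_hoisted(const char *s, char c)
{
    size_t i, n = strlen(s), hits = 0;

    for (i = 0; i < n; i++)
        if (s[i] == c)
            hits++;
    return hits;
}

/* No length at all: stop at the terminator. */
static size_t count_char_scan(const char *s, char c)
{
    size_t hits = 0;

    for (; *s != '\0'; s++)
        if (*s == c)
            hits++;
    return hits;
}
```

Only the first form can degrade to quadratic time; the other two touch each character a small constant number of times without any stored length.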
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
kst-u (21469) | 5/17/2006 8:02:15 PM
"jacob navia" <jacob@jacob.remcomp.fr> wrote
> Robert Latest wrote:
>>>The same applies to strcat, strlen et al. If you use them sensibly then
>>>although they are slower than just looking up the length
>>
>> Who says they are?
>>
>
> Well, I say that strlen will be always slower with zero terminated strings
> and almost a NOP with length delimited strings.
>
> strlen must always scan the characters to find the zero.
>
> A length delimited string can just return the length immediately.
>
That's obviously right.
I can believe that, for non-trivial string processing, delimited strings are
better.
However there are certain problems. For instance, say we want to implement a
sub-string finder. strstr() returns a pointer to the first occurence of the
substring in its argument. How do we replace this by returning a
length-delimited string?
Do we make a local copy of the string? That seems pretty useless, and
expensive. Do we return a length-delimited string that points to the same
memory? The snag there is that when you extend or shrink the master string,
the sub-string has to update. You could say that lendelimited_strstr()
returns an integer to the first instance of the sub string. That's maybe the
answer, but then it puts the burden of constructing a usable substring on
the calling programmer.
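The last option, returning an offset, could be sketched as follows. The lstring type, lstr_find and LSTR_NOT_FOUND are hypothetical names for illustration only; this is not lcc-win32's string library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: not NUL-terminated, length stored alongside. */
typedef struct {
    const char *data;
    size_t      length;
} lstring;

#define LSTR_NOT_FOUND ((size_t)-1)

/* Return the offset of the first occurrence of needle in hay,
   or LSTR_NOT_FOUND. An empty needle matches at offset 0. */
static size_t lstr_find(lstring hay, lstring needle)
{
    size_t i;

    if (needle.length > hay.length)
        return LSTR_NOT_FOUND;
    for (i = 0; i + needle.length <= hay.length; i++)
        if (memcmp(hay.data + i, needle.data, needle.length) == 0)
            return i;
    return LSTR_NOT_FOUND;
}
```

An offset stays meaningful if the master string is reallocated or extended at the end, but the caller still has to combine it with the master string to get at the substring, which is the burden described above.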
By junking NUL-terminated strings, essentially you've destroyed the
cleanliness of the C string library.
The main problem with this whole venture, however, is political. If you were
Supreme Emperor of the West you could force the whole world to use lcc-win.
Maybe that would be a good thing. But since you are not, people don't want
to get locked into a standard that hardly anyone yet uses, and which may
easily fail. Also, there is a perception that using lcc-win would cede you
the status of Supreme Emperor. People will show submissive behaviour to
socially-dominant individuals, but not easily.
--
www.personal.leeds.ac.uk/~bgy1mm
regniztar (3128) | 5/17/2006 11:07:09 PM
jacob navia wrote:
> ...
> Length delimited strings are INHERENTLY faster than zero terminated ones.
This fuzzy statement may give a clue to why you oversell them.
In fact, only "finding the length of" is faster. Any per-character
operation (copying, translating, searching, examples abound) is no
faster at all.
Why is finding the length faster? Because you've cached it. That option
is equally available to terminated string users. None of this is rocket
surgery...
toby | 5/19/2006 10:30:52 PM
>Obviously you think that scanning memory for the terminating zero is
>vastly more efficient than accessing it directly with the string length.
Obviously you think that all operations for strings require the
length of the string, and that allocating memory and copying a
string costs nothing.
Describe how you'd implement the length-delimited-string version
of strchr() and strrchr(). They are not allowed to run out of
memory, calling malloc() is likely to be expensive, and memory leaks
are not allowed.
If p and p2 are strings, describe how you'd implement in
length-delimited-string what is done in C now with:
p[n] = '\0';
p2 = p+n+1;
That is, split a string creating two new ones.
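Wrapped in a function, the two-line split above looks like this (split_at is a hypothetical name; the logic is just the two statements shown):

```c
#include <stddef.h>

/* Split the NUL-terminated string p at index n, in place.
   p[n] is overwritten with the terminator, so that character is lost;
   the caller must ensure n is less than the string's length.
   Returns a pointer to the second part. No memory is allocated. */
static char *split_at(char *p, size_t n)
{
    p[n] = '\0';
    return p + n + 1;
}
```

For example, splitting "key=value" at index 3 leaves "key" in the original buffer and returns a pointer to "value" inside that same buffer.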
How do you intend to access the Nth character of a string,
(given that you already know the string is that long),
currently done with:
p[N]
?
>Each time you access the length of a zero terminated string you must
>start that unbounded memory scan, source of countless errors. Operations
Each time you need to create a substring of a length-delimited-string,
and keep a copy of the original, you need to allocate more memory
(and not leak it), a source of even more countless errors.
>like strcat depend on the length of the first string, that must be
>recalculated over and over.
>
>Obviously you have a different concept for "efficiency" than I do.
>
>Length delimited strings are INHERENTLY faster than zero terminated ones.
For certain operations.
Gordon L. Burditt
gordonb | 5/19/2006 11:34:11 PM
Gordon Burditt wrote:
>>Obviously you think that scanning memory for the terminating zero is
>>vastly more efficient than accessing it directly with the string length.
>
>
> Obviously you think that all operations for strings require the
> length of the string, and that allocating memory and copying a
> string costs nothing.
>
> Describe how you'd implement the length-delimited-string version
> of strchr() and strrchr(). They are not allowed to run out of
> memory, calling malloc() is likely to be expensive, and memory leaks
> are not allowed.
>
Let's do strrchr ok?
Here is the code for Strrchr:
StringpA EXPORT overloaded Strrchr(StringA & string, int element)
{
    int i;
    char *p;
    StringpA result;

    if (!StrvalidA(string))
        return invalid_stringpA;
    p=string.content + string.count - 1;
    for (i = string.count; i >0; --i){
        if (*p-- == element){
            result.count = string.count-i;
            result.content = string.content+i;
            result.parent = &string;
        }
    }
    return invalid_stringpA;
}
This function returns a structure describing a position within a String.
This structure is defined as follows:
typedef struct _stringpA {
    size_t count;
    char *content;
    StringA *parent;
} StringpA;
This "fat" pointer defines a position within its parent string, that is
left untouched. Note that the field "content" is redundant since it must
be always the "content" field of the parent string + the count. It is
easier this way though. Probably I will take this field away, but for
now it doesn't matter.
Now consider this test program:
#include <str.h>
extern int _stdcall GetTickCount(void);
int main(void)
{
    char buffer[2+64*1024];
    int i,t1;

    for (i=0; i<64*1024;i++)
        buffer[i] = 'M';
    buffer[i++] = 'Z';
    buffer[i] = 0;
    String &str = new_string(buffer);

    t1 = GetTickCount();
    for (i=0; i<10000;i++) {
        Strchr(str,'Z');
    }
    printf("Strchr takes %d ms\n",GetTickCount() - t1);

    t1 = GetTickCount();
    for (i=0; i<10000;i++) {
        strchr(buffer,'Z');
    }
    printf("strchr takes %d ms\n",GetTickCount() - t1);
}
This program builds a 64K buffer and at the end puts the character 'Z'.
The output of this program is:
Strchr takes 672 ms
strchr takes 1313 ms
Why?
Note that EACH CALL to Strrchr makes an expensive call to StrvalidA to
validate its arguments. Still Strchr is TWICE as fast.
Since Strchr knows the length of the result string, it can start
AT THE END of the buffer and scan backwards! strchr however, can't do
this and must start at the beginning, working all the way to the end of
the buffer.
OF COURSE this is a very artificial example, but it shows you that
knowing the length of the string can be very useful in many situations
besides just strlen.
In the normal case however, strchr could be slightly faster, especially
if the strings are short.
jacob
jacob | 5/20/2006 7:18:11 AM
"jacob navia" <jacob@jacob.remcomp.fr> wrote
> Here is the code for Strrchr:
> StringpA EXPORT overloaded Strrchr(StringA & string, int element)
> {
> int i;
> char *p;
> StringpA result;
>
> if (!StrvalidA(string))
> return invalid_stringpA;
> p=string.content + string.count - 1;
> for (i = string.count; i >0; --i){
> if (*p-- == element){
> result.count = string.count-i;
> result.content = string.content+i;
> result.parent = &string;
> }
> }
> return invalid_stringpA;
> }
>
> This function returns a structure describing a position within a String.
> This structure is defined as follows:
> typedef struct _stringpA {
> size_t count;
> char *content;
> StringA *parent;
> } StringpA;
>
> This "fat" pointer defines a position within its parent string, that is
> left untouched. Note that the field "content" is redundant since it must
> be always the "content" field of the parent string + the count. It is
> easier this way though. Probably I will take this field away, but for
> now it doesn't matter.
>
> Now consider this test program:
> #include <str.h>
> extern int _stdcall GetTickCount(void);
> int main(void)
> {
> char buffer[2+64*1024];
> int i,t1;
>
> for (i=0; i<64*1024;i++)
> buffer[i] = 'M';
> buffer[i++] = 'Z';
> buffer[i] = 0;
> String &str = new_string(buffer);
>
> t1 = GetTickCount();
> for (i=0; i<10000;i++) {
> Strchr(str,'Z');
> }
> printf("Strchr takes %d ms\n",GetTickCount() - t1);
>
> t1 = GetTickCount();
> for (i=0; i<10000;i++) {
> strchr(buffer,'Z');
> }
> printf("strchr takes %d ms\n",GetTickCount() - t1);
> }
>
> This program builds a 64K buffer and at the end puts the character 'Z'.
>
> The output of this program is:
> Strchr takes 672 ms
> strchr takes 1313 ms
>
The example, as you admit, is contrived to show the superiority of the
length delimited string, by passing in a 64K argument with the target at the
end. However you have less than an order of magnitude speed up, less than 2
times, in fact.
What I would conclude from this is that attempts to speed up the string
library are rather a waste of time, and the focus should be on safety and
usability.
--
www.personal.leeds.ack.uk/~bgy1mm
Malcolm | 5/21/2006 9:45:22 PM
Malcolm wrote:
>
> The example, as you admit, is contrived to show the superiority of the
> length delimited string, by passing in a 64K argument with the target at the
> end. However you have less than an order of magnitude speed up, less than 2
> times, in fact.
>
> What I would conclude from this is that attempts to speed up the string
> library are rather a waste of time, and the focus should be on safety and
> usability.
Yes, you are right. I just wanted to say that strrchr is an example
where length delimited strings make a more performant algorithm possible,
but we are talking here about some nanoseconds' worth of CPU time.
I agree that the emphasis should be in safety and usability.
jacob
jacob | 5/21/2006 10:47:30 PM
"jacob navia" <jacob@jacob.remcomp.fr> wrote
>
> Here is the code for Strrchr:
> StringpA EXPORT overloaded Strrchr(StringA & string, int element)
> {
> int i;
> char *p;
> StringpA result;
>
> if (!StrvalidA(string))
> return invalid_stringpA;
> p=string.content + string.count - 1;
> for (i = string.count; i >0; --i){
> if (*p-- == element){
> result.count = string.count-i;
> result.content = string.content+i;
> result.parent = &string;
>
^^^^^^^^^^^^^^^^^^^^^^^^^
return result; /* !!!!!!!!!!!!!! */
>
> }
> }
> return invalid_stringpA;
> }
The function is bugged. That was why it was giving an impossibly poor
performance gain on the problem tuned for it.
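To make the point concrete, here is a minimal sketch in plain C. It assumes I am reading the marked-up quote correctly, i.e. that the problem is the missing early return. CountedStr, rfind_early and rfind_no_return are hypothetical names made up for this illustration; they are not lcc-win32's StringA API. Both functions count how many characters they examine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counted string for illustration; NOT lcc-win32's StringA. */
typedef struct {
    const char *content;   /* character data, need not be NUL-terminated */
    size_t count;          /* number of characters in content */
} CountedStr;

/* Reverse search WITH the early return: stops at the first match found
   from the end. Returns the index of the match, or (size_t)-1 if c is
   absent. *steps receives the number of characters examined. */
size_t rfind_early(CountedStr s, char c, size_t *steps)
{
    assert(s.content != NULL && steps != NULL);
    *steps = 0;
    for (size_t i = s.count; i > 0; --i) {
        ++*steps;
        if (s.content[i - 1] == c)
            return i - 1;
    }
    return (size_t)-1;
}

/* Same loop WITHOUT the early return: a match is recorded, but the loop
   keeps going and the function falls through to "not found". */
size_t rfind_no_return(CountedStr s, char c, size_t *steps)
{
    size_t found = (size_t)-1;
    assert(s.content != NULL && steps != NULL);
    *steps = 0;
    for (size_t i = s.count; i > 0; --i) {
        ++*steps;
        if (s.content[i - 1] == c)
            found = i - 1;     /* recorded, never returned */
    }
    (void)found;
    return (size_t)-1;
}
```

With the target as the last character, rfind_early examines one character, while rfind_no_return examines all of them and still reports failure. That would explain a reverse search that is barely faster than a forward one.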
--
Buy my book 12 Common Atheist Arguments (refuted)
$1.25 download or $7.20 paper, available www.lulu.com/bgy1mm
|
|
0
|
|
|
|
Reply
|
Malcolm
|
5/23/2006 9:21:04 PM
|
|
Malcolm wrote:
> "jacob navia" <jacob@jacob.remcomp.fr> wrote
>
>>Here is the code for Strrchr:
>>StringpA EXPORT overloaded Strrchr(StringA & string, int element)
>>{
>> int i;
>> char *p;
>> StringpA result;
>>
>> if (!StrvalidA(string))
>> return invalid_stringpA;
>> p=string.content + string.count - 1;
>> for (i = string.count; i >0; --i){
>> if (*p-- == element){
>> result.count = string.count-i;
>> result.content = string.content+i;
>> result.parent = &string;
>>
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^
> return result; /* !!!!!!!!!!!!!! */
>
>> }
>> }
>> return invalid_stringpA;
>>}
>
>
> The function is bugged. That was why it was giving an impossibly poor
> performance gain on the problem tuned for it.
>
It may well be bugged.
Can you be more specific?
What is the bug?
jacob
|
|
0
|
|
|
|
Reply
|
jacob
|
5/23/2006 9:41:56 PM
|
|
# "jacob navia" <jacob@jacob.remcomp.fr> wrote
# >
# Here is the code for Strrchr:
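/* Forward-scanning equivalent of strrchr: remembers the latest match. */
char *strrchar(char *s,char c) {
    char *r = 0;            /* last match seen so far, null if none */
    for (;;) {
        if (*s==c) r = s;   /* tested before the end check, so c == 0 matches the terminator */
        if (!*s) break;     /* end of string */
        s++;
    }
    return r;
}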
Scanning backwards will only give a 2x speed-up: if c is equally
likely to be anywhere in s, then on average strlen(s)/2 characters
have to be scanned this way instead of strlen(s). Both scanning
directions are still O(strlen(s)), assuming no thrashing.
Scanning very long strings backwards (megabyte strings are not
uncommon in some modern programs) risks thrashing, since VM pagers
are not designed with backward paging heuristics as often as
forward paging heuristics. Also, if there is hardware to do the
scanning, it more likely runs forward only.
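The 2x figure can be checked by counting characters examined rather than by timing. The sketch below is only an illustration under stated assumptions: a single occurrence of the target, placed at each position in turn, in a 64-character string. The names N, backward_steps, forward_steps and average_backward_steps are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 64 };   /* assumed string length for the experiment */

/* Characters examined by a backward scan over s[0..n-1] that stops at
   the first match it meets. */
size_t backward_steps(const char *s, size_t n, char c)
{
    size_t steps = 0;
    assert(s != NULL);
    for (size_t i = n; i > 0; --i) {
        ++steps;
        if (s[i - 1] == c)
            break;
    }
    return steps;
}

/* Characters examined by a forward scan for the LAST occurrence. It
   cannot stop early, so it always reads up to and including the
   terminator. */
size_t forward_steps(const char *s, char c)
{
    size_t steps = 0;
    const char *last = NULL;
    assert(s != NULL);
    for (;; ++s) {
        ++steps;
        if (*s == c)
            last = s;
        if (!*s)
            break;
    }
    (void)last;
    return steps;
}

/* Average backward cost with the single 'Z' placed at every position. */
double average_backward_steps(void)
{
    char buf[N + 1];
    size_t total = 0;
    for (size_t k = 0; k < N; ++k) {
        for (size_t j = 0; j < N; ++j)
            buf[j] = 'a';
        buf[N] = '\0';
        buf[k] = 'Z';
        total += backward_steps(buf, N, 'Z');
    }
    return (double)total / N;
}
```

For N = 64 the backward scan averages (N + 1) / 2 = 32.5 characters, against N + 1 = 65 for the forward scan, which is the factor of two described above. Paging and hardware effects are not modelled here.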
--
SM Ryan http://www.rawbw.com/~wyrmwif/
GERBILS
GERBILS
GERBILS
|
|
0
|
|
|
|
Reply
|
SM
|
5/24/2006 2:41:59 AM
|
|