I was re-reading the introduction of Art of Assembly, and I noticed (not
that I hadn't noticed already, but I hadn't dedicated much time to it)
that Randall Hyde writes in the section "WHAT'S RIGHT WITH ASSEMBLY
LANGUAGE":
7§ "Assembly language programs are often less than one-half the size of
comparable HLL programs. This is especially impressive when you consider
the fact that data items generally consume the same amount of space in
both types of programs, and that data is responsible for a good amount
of the space used by a typical application."
Assuming that he is right, how can this be true? Perhaps it has to do
with the age of the book (it's the 16-bit version), since at the time
hardly anyone included high-resolution graphics in their programs...
JJ
Posted by João Jerónimo, 12/14/2006 8:43:16 PM

João Jerónimo <spamtrap@crayne.org> writes:
> I was re-reading the introduction of Art of Assembly, and I noticed
> (not that I hadn't noticed already, but I hadn't dedicated much time
> to it) that Randall Hyde writes in the section "WHAT'S RIGHT WITH
> ASSEMBLY LANGUAGE":
>
> 7§ "Assembly language programs are often less than one-half the size
> of comparable HLL programs. This is especially impressive when you
> consider the fact that data items generally consume the same amount of
> space in both types of programs, and that data is responsible for a
> good amount of the space used by a typical application."
>
>
> Assuming that he is right, how can this be true? Perhaps it has to
> do with the age of the book (it's the 16-bit version), since at the
> time hardly anyone included high-resolution graphics in their
> programs...
It's simply that human programmers can often write more concise code
than compilers.
But nowadays, with the right optimization level, compilers can do a
good job too, and in most situations, it's definitely cheaper to put
more RAM in the computer than to hire good assembly programmers and
pay them for the hard work needed to write highly hand-optimized code.
--
__Pascal Bourguignon__ http://www.informatimago.com/
This is a signature virus. Add me to your signature and help me to live.
Posted by Pascal, 12/14/2006 11:58:40 PM

João Jerónimo wrote:
> I was re-reading the introduction of Art of Assembly, and I noticed (not
> that I hadn't noticed already, but I hadn't dedicated much time to it)
> that Randall Hyde writes in the section "WHAT'S RIGHT WITH ASSEMBLY
> LANGUAGE":
>
> 7§ "Assembly language programs are often less than one-half the size of
> comparable HLL programs. This is especially impressive when you consider
> the fact that data items generally consume the same amount of space in
> both types of programs, and that data is responsible for a good amount
> of the space used by a typical application."
>
>
> Assuming that he is right, how can this be true? Perhaps it has to do
> with the age of the book (it's the 16-bit version), since at the time
> hardly anyone included high-resolution graphics in their programs...
>
> JJ
>
Hi JJ. Well, Randall does frequent this group, so expect to see him replying
soon. I'd say that an assembly program can be 1000x smaller than an HLL-coded
app, or the same size, or even larger. ASM doesn't make programs smaller;
programmers do. :-)
Posted by Mark, 12/15/2006 12:30:55 AM

On Thu, 14 Dec 2006 20:43:16 +0000, João Jerónimo
<spamtrap@crayne.org> wrote:
>I was re-reading the introduction of Art of Assembly, and I noticed (not
>that I hadn't noticed already, but I hadn't dedicated much time to it)
>that Randall Hyde writes in the section "WHAT'S RIGHT WITH ASSEMBLY
>LANGUAGE":
>
>7§ "Assembly language programs are often less than one-half the size of
>comparable HLL programs. This is especially impressive when you consider
>the fact that data items generally consume the same amount of space in
>both types of programs, and that data is responsible for a good amount
>of the space used by a typical application."
>
>
>Assuming that he is right, how can this be true? Perhaps it has to do
>with the age of the book (it's the 16-bit version), since at the time
>hardly anyone included high-resolution graphics in their programs...
>
To add to the other posts, perhaps one advantage of
assembly is that you are more familiar with the actual code,
so you see more opportunities for size optimization.
For example, in writing Windows programs there is a
certain general layout for a dialog handler. Each control,
however, might have its own particular operations that
are needed. For programs with multiple dialogs, the
tutorials just have separate dialog handlers for each,
since each dialog has different controls.
But after you've written a few of these you notice that
you are doing the same things over and over. You start
extracting the redundant control handler code to
subroutines that can be called by any dialog. In my own
case (with several dozen dialogs) it became obvious that
the way to go was with a single dialog handler that used
dialog-indexed tables to find the needed control routines.
When I add a new dialog, I just have to create a few tables
and write any unique control handlers. This saves an
enormous amount of code space. (As far as I can tell,
it also embodies the philosophy of "object-oriented" design,
even though I don't think the OOP compilers use this.)
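A minimal sketch of the table-driven approach described above, written in Python rather than assembly to keep it short. The dialog names, control IDs, and routines are all hypothetical; the only point is that one generic handler plus a small table per dialog replaces a separate handler for every dialog.

```python
# Shared control routines, written once and reused by any dialog.
def close_dialog(state):
    state["open"] = False
    return "closed"

def toggle_option(state):
    state["option"] = not state.get("option", False)
    return "option on" if state["option"] else "option off"

# A routine that only one dialog needs.
def reset_gain(state):
    state["gain"] = 0
    return "gain reset"

# Hypothetical control IDs.
ID_OK, ID_CHECK, ID_RESET = 1, 2, 3

# Dialog-indexed tables: dialog name -> {control ID -> routine}.
DIALOG_TABLES = {
    "settings": {ID_OK: close_dialog, ID_CHECK: toggle_option},
    "audio": {ID_OK: close_dialog, ID_RESET: reset_gain},
}

def dialog_handler(dialog, control_id, state):
    """Single handler for all dialogs: look up the routine and call it."""
    routine = DIALOG_TABLES[dialog].get(control_id)
    if routine is None:
        return None  # this dialog has no routine for that control
    return routine(state)
```

Adding a dialog then means adding one entry to DIALOG_TABLES and writing only the routines that are unique to it.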
But having said all that, note that it's not always easy to
know just how much size improvement there may be, since
you'd have to write the same program in assembly and
in your HLL in order to compare. I suspect this is
rather rare. More typical is an HLL program where you
re-write some small section in assembly, and the usual
reason is for speed improvement, not size reduction.
Best regards,
Bob Masta
dqatechATdaqartaDOTcom
D A Q A R T A
Data AcQuisition And Real-Time Analysis
www.daqarta.com
Home of DaqGen, the FREEWARE signal generator
Posted by NoSpam, 12/15/2006 1:50:26 PM

João Jerónimo wrote:
> 7§ "Assembly language programs are often less than one-half the size of
> comparable HLL programs. This is especially impressive when you consider
> the fact that data items generally consume the same amount of space in
> both types of programs, and that data is responsible for a good amount
> of the space used by a typical application."
>
> Assuming that he is right, how can this be true? Perhaps it has to do
> with the age of the book (it's the 16-bit version), since at the time
> hardly anyone included high-resolution graphics in their programs...
I haven't expressed myself well... What I was talking about is the fact
that data is responsible for a good amount of the space used by a
typical application... Which is entirely true...
JJ
Posted by spamtrap, 12/15/2006 6:12:06 PM

João Jerónimo wrote:
> 7§ "Assembly language programs are often less than one-half the size of
> comparable HLL programs. This is especially impressive when you consider
> the fact that data items generally consume the same amount of space in
> both types of programs, and that data is responsible for a good amount
> of the space used by a typical application."
>
> Assuming that he is right, how can this be true? Perhaps it has to do
> with the age of the book (it's the 16-bit version), since at the time
> hardly anyone included high-resolution graphics in their programs...
I think I haven't expressed myself quite well... What I was talking
about is the fact that data is responsible for a good amount of the
space used by a typical application... Which is entirely true...
JJ
Posted by spamtrap, 12/15/2006 6:13:26 PM

Bob Masta wrote:
> On Thu, 14 Dec 2006 20:43:16 +0000, João Jerónimo
> <spamtrap@crayne.org> wrote:
>
>> I was re-reading the introduction of Art of Assembly, and I noticed (not
>> that I hadn't noticed already, but I hadn't dedicated much time to it)
>> that Randall Hyde writes in the section "WHAT'S RIGHT WITH ASSEMBLY
>> LANGUAGE":
>>
>> 7§ "Assembly language programs are often less than one-half the size of
>> comparable HLL programs. This is especially impressive when you consider
>> the fact that data items generally consume the same amount of space in
>> both types of programs, and that data is responsible for a good amount
>> of the space used by a typical application."
>>
>>
>> Assuming that he is right, how can this be true? Perhaps it has to do
>> with the age of the book (it's the 16-bit version), since at the time
>> hardly anyone included high-resolution graphics in their programs...
>>
>
> To add to the other posts, perhaps one advantage of
> assembly is that you are more familiar with the actual code,
> so you see more opportunities for size optimization.
>
> For example, in writing Windows programs there is a
> certain general layout for a dialog handler. Each control,
> however, might have its own particular operations that
> are needed. For programs with multiple dialogs, the
> tutorials just have separate dialog handlers for each,
> since each dialog has different controls.
>
> But after you've written a few of these you notice that
> you are doing the same things over and over. You start
> extracting the redundant control handler code to
> subroutines that can be called by any dialog. In my own
> case (with several dozen dialogs) it became obvious that
> the way to go was with a single dialog handler that used
> dialog-indexed tables to find the needed control routines.
> When I add a new dialog, I just have to create a few tables
> and write any unique control handlers. This saves an
> enormous amount of code space. (As far as I can tell,
> it also embodies the philosophy of "object-oriented" design,
> even though I don't think the OOP compilers use this.)
This is indeed a form of OO programming. :-)
The real size advantage of asm only happens when you write everything in
asm, with the intention of making it tight.
Back in the MS-DOS days this was a lot more important, in particular when
you wrote BIOS/OS extensions in the form of TSRs ("Terminate & Stay
Resident").
Microsoft's handlers for the Norwegian keyboard layout and text mode
modified letters used to need between 20 and 60 KB (out of a maximum of
640 KB total), i.e. quite large even though they did write them in asm.
My replacement program used 1706 bytes (rounded up from sub-1700 to the
next 16-byte boundary), and it handled everything the Microsoft/IBM
versions did, plus it also fixed the fonts for 43 and 50-line text modes
on EGA/VGA adapters.
I once wrote a distributed print server system (for early Novell Netware
networks), my slave printer driver (to allow any personal printer to be
shared via a server queue) was written from scratch, in a single sitting
of about 5 hours, and it consisted of 1500-2000 lines of code (plus a
few comment lines etc).
After fixing about 3 syntax (i.e. typing) errors it assembled and ran
flawlessly:
Interrupt (& polled) drivers for serial and parallel ports, async
network packet handler, dual-buffered network comms, local stack,
self-relocating (saving the space for the serial port driver in case of
a parallel printer and vice versa), as well as relocating all the code
down into the PSP area.
Total memory space was about 1700 bytes.
A few commercial products came out a year or two later; they ran 4x
slower on the same hardware, and their slave printer drivers needed an
order of magnitude more RAM.
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Posted by Terje, 12/15/2006 9:09:55 PM

spamtrap@crayne.org wrote:
> João Jerónimo wrote:
>> 7§ "Assembly language programs are often less than one-half the size of
>> comparable HLL programs. This is especially impressive when you consider
>> the fact that data items generally consume the same amount of space in
>> both types of programs, and that data is responsible for a good amount
>> of the space used by a typical application."
>>
>> Assuming that he is right, how can this be true? Perhaps it has to do
>> with the age of the book (it's the 16-bit version), since at the time
>> hardly anyone included high-resolution graphics in their programs...
>
> I haven't expressed myself well... What I was talking about is the fact
> that data is responsible for a good amount of the space used by a
> typical application... Which is entirely true...
Which is why a friend of mine insists on paraphrasing my .sig line:
His version is
"almost all programming can be viewed as an exercise in compression"
I.e. one of the very first things you should do is to look at ways to
make your data smaller.
I wrote a custom DB 20+ years ago, to keep track of "bearer bonds" style
gift cards: Each gift card had a serial number (read with a huge OCR
machine), which could be converted to a value via a very small lookup table.
Each card then had to be checked against the DB to verify that (a) Yes,
it has been sold, and (b) No, it hasn't already been redeemed (to guard
against copying).
The commercial mainframe center which tried to do it gave up after a
month; the processing costs and DB usage were just too large.
I noticed that since these cards were sold in serial order (within a
limited number of values), question (a) above could be answered by
simply keeping around two numbers: The first and the last card sold so
far in this range.
At this point (b) becomes a single binary yes/no question, so I simply
made this a bitmap!
Each 4 KB disk block consisted of 96 bytes of header info and exactly
32000 bits.
The total processing time for a run of 10 K random updates to a DB with
10 M active records (too large for a 640 KB machine, even as a bitmap)
was just one second:
Half a second for a custom Quicksort to order the 10 K updates, and half
a second to do the actual DB updates.
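As a rough illustration of the scheme above (not the original Turbo Pascal code), here is a minimal Python sketch. It assumes a single sold range and 0-based card indices, keeps the "disk" as a dictionary of 4 KB blocks, and leaves the 96 header bytes unused, since their contents are not described. The names locate and redeem_batch are hypothetical.

```python
HEADER_BYTES = 96
BITS_PER_BLOCK = 32000                              # 4000 bytes of bitmap
BLOCK_BYTES = HEADER_BYTES + BITS_PER_BLOCK // 8    # 96 + 4000 = 4096

def locate(card):
    """Map a card index to (block number, byte offset in block, bit mask)."""
    block_no, bit = divmod(card, BITS_PER_BLOCK)
    return block_no, HEADER_BYTES + bit // 8, 1 << (bit % 8)

def redeem_batch(blocks, first_sold, last_sold, cards):
    """Apply a batch of redemptions and return the cards that were rejected.

    blocks maps block number -> bytearray(BLOCK_BYTES); it stands in for the
    disk file. Sorting the batch first means each block is visited in one run.
    """
    rejected = []
    for card in sorted(cards):
        if not first_sold <= card <= last_sold:
            rejected.append(card)           # (a) fails: never sold
            continue
        block_no, offset, mask = locate(card)
        block = blocks.setdefault(block_no, bytearray(BLOCK_BYTES))
        if block[offset] & mask:
            rejected.append(card)           # (b) fails: already redeemed
        else:
            block[offset] |= mask           # mark as redeemed
    return rejected
```

At one bit per card, 10 M records need roughly 1.25 MB of bitmap, which matches the remark that the whole DB would not fit in a 640 KB machine.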
Terje
PS. This entire app was written in Turbo Pascal, i.e. an un-optimizing
compiler: As you noted, keeping the data small was the important thing!
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Posted by Terje, 12/15/2006 9:23:24 PM

Pascal Bourguignon wrote:
> But nowadays, with the right optimization level, compilers can do a
> good job too, and in most situations, it's definitely cheaper to put
> more RAM in the computer than to hire good assembly programmers and
> pay them for the hard work needed to write highly hand-optimized code.
True, but (unfortunately) it's not possible to put more RAM where it's
most valuable: in the Level 1 CPU cache! Given that most current CPUs
have, I think, no more than 64K L1 cache there can still be some value
in squeezing down the code size.
Richard.
http://www.rtrussell.co.uk/
Posted by news, 12/15/2006 10:38:27 PM

João Jerónimo wrote:
>
> 7§ "Assembly language programs are often less than one-half the size of
> comparable HLL programs. This is especially impressive when you consider
> the fact that data items generally consume the same amount of space in
> both types of programs, and that data is responsible for a good amount
> of the space used by a typical application."
>
>
> Assuming that he is right, how can this be true? Perhaps it has to do
> with the age of the book (it's the 16-bit version), since at the time
> hardly anyone included high-resolution graphics in their programs...
Keep in mind that the 16-bit edition was written before people started
injecting multi-megabyte JPEG files into their executables. Since the
16-bit edition was written (and I began writing it back in 1989), the
amount of gratuitous data inserted into applications has increased
exponentially (just like memory, which is not surprising). In 1989,
typical applications had a balance of code and data; today, code
represents a tiny percentage of the overall application memory
footprint.
Of course, even today I'd make the claim because a *good* assembly
language programmer wouldn't include all that cruft in an application
program :-)
Cheers,
Randy Hyde
Posted by rhyde, 12/15/2006 11:34:30 PM

<spamtrap@crayne.org> wrote in message
news:1166206406.722508.36140@n67g2000cwd.googlegroups.com...
> João Jerónimo wrote:
> > 7§ "Assembly language programs are often less than one-half the size of
> > comparable HLL programs. This is especially impressive when you consider
> > the fact that data items generally consume the same amount of space in
> > both types of programs, and that data is responsible for a good amount
> > of the space used by a typical application."
> >
> > Assuming that he is right, how can this be true? Perhaps it has to do
> > with the age of the book (it's the 16-bit version), since at the time
> > hardly anyone included high-resolution graphics in their programs...
>
> I think I haven't expressed myself quite well... What I was talking
> about is the fact that data is responsible for a good amount of the
> space used by a typical application... Which is entirely true...
Which can't be changed by any programming language. Any data compression
you'd apply in an 'assembly version' might just as well be applied in an HLL
version.
BTW your typical modern app (-programmer) doesn't bother about data
compression; that's why they come on DVD nowadays :-)
On my site I describe how an entire galaxy of half a (European!) billion
individual star systems was created for a '90s-era game. Though the
original algorithms were, indeed, written in assembly, my C version -- which
is much larger -- creates the same galaxy. Written out as data it would
amount to many, many, many GBs.
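To illustrate the general idea of generating data instead of storing it, here is a minimal Python sketch. It is not the game's algorithm: the mixing function, its constants, and the fields of a "star system" are all assumptions made up for the example. The only property that matters is that the same seed and coordinates always give the same system, so nothing has to be written out.

```python
MASK32 = 0xFFFFFFFF

def mix(x):
    """Scramble a 32-bit integer with shift/xor/multiply steps.

    The multiplier is an arbitrary odd constant chosen for this sketch.
    """
    x &= MASK32
    x ^= x >> 16
    x = (x * 0x45D9F3B) & MASK32
    x ^= x >> 16
    x = (x * 0x45D9F3B) & MASK32
    x ^= x >> 16
    return x

def star_system(galaxy_seed, x, y):
    """Derive a system's properties from its coordinates alone."""
    h = mix(mix(galaxy_seed ^ x) ^ y)
    return {
        "planets": h % 8,
        "star_class": "OBAFGKM"[(h >> 8) % 7],
        "population": (h >> 16) % 1000,
    }
```

For scale: half a European billion is 5 x 10^11 systems, so even 8 bytes per system would come to about 4 TB if stored rather than regenerated on demand.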
[Jongware]
Posted by jongware, 12/15/2006 11:50:20 PM

Terje Mathisen wrote:
> I.e. one of the very first things you should do is to look at ways to
> make your data smaller.
>
> (...)
>
> PS. This entire app was written in Turbo Pascal, i.e. an un-optimizing
> compiler: As you noted, keeping the data small was the important thing!
Yes, but using assembly won't automatically optimize data
usage... Indeed, you can skip optimizing for size altogether, if you
want... not even code size (as Mark Jones said, "ASM doesn't make
programs smaller; programmers do") nor speed...
But I think you can optimize data usage in an HLL too... Well, perhaps
it's simpler to process compressed or "abbreviated" data in asm,
but I don't think so! At least my experience in asm doesn't tell me
so... It's not very easy to process e.g. bitmaps in asm (indeed in HLLs
it's easier!)...
JJ
Posted by João Jerónimo, 12/16/2006 2:27:00 PM

rhyde@cs.ucr.edu wrote:
> Keep in mind that the 16-bit edition was written before people started
> injecting multi-megabyte JPEG files into their executables. Since the
> 16-bit edition was written (and I began writing it back in 1989), the
> amount of gratuitous data inserted into applications has increased
> exponentially (just like memory, which is not surprising).
Just as an example of this: in an individual project that I'm doing for
school, one of the things I proposed involved some programming (not
in assembly, but perhaps in some object-oriented language) in the area of
artificial intelligence... One of my schoolmates asked me if I would
be distributing the program on CD!
Well, I know what he was thinking about (an application full of fancy
graphics!), but he didn't get the idea... what I want to do is much more
technical than that!...
Well, of course if someone wants their program to display or to process
an image, nothing should prevent them from doing so... But in this case,
please don't link it against the executable! That's why someone invented
dynamic memory allocation...
Really, if you weren't an experienced programming teacher :-), I
wouldn't believe you when you say that people do this!
> In 1989, typical applications had a balance of code and
> data, today, code represents a tiny percentage of the
> overall application memory footprint.
There is also another question: is all this really dispensable?
We can argue that some feature (one that needs a high-resolution
image to be loaded) isn't really useful, and that the programmer had
better not include it because it will dramatically increase the
memory footprint of the program... but including a feature (even
an unneeded one) isn't exactly bad programming practice...
So, now look at the current programming landscape. You will notice that
it is closely tied to eye candy, and programs are evaluated not by
what they can do but by how they look... Isn't all this memory bloat a
necessary thing?
(though everyone has the right to avoid this bloat if they want to)
The fact is that even if huge data is loaded dynamically by the program
when it's needed, one would still need all the files to use the program to
its full potential, so it would take the same disk space...
> Of course, even today I'd make the claim because a *good* assembly
> language programmer wouldn't include all that cruft in an application
> program :-)
Oh! No *good programmer* will ever link huge data against an executable!
It's a very bad programming practice, excluding some rare conditions...
It's not so hard to do open stat malloc read close (or the equivalents)
when you need an image...
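For illustration, the open / stat / malloc / read / close sequence looks roughly like this with Python's thin wrappers over the same system calls. The function name load_file is hypothetical, and the explicit allocation step disappears because os.read returns a freshly allocated bytes object.

```python
import os

def load_file(path):
    """Read a whole file into memory at run time instead of linking it in."""
    flags = os.O_RDONLY | getattr(os, "O_BINARY", 0)   # O_BINARY exists only on Windows
    fd = os.open(path, flags)                          # open
    try:
        remaining = os.fstat(fd).st_size               # stat: how big is it?
        chunks = []
        while remaining > 0:
            chunk = os.read(fd, remaining)             # malloc + read in one call
            if not chunk:                              # file shrank; stop early
                break
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)                                   # close
```

The loop is there because a single read may return fewer bytes than requested.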
JJ
PS: The 16-bit version of AoA has been a very good DOS assembly
programming resource since I started programming in asm... really! In
Linux I usually prefer using C, though...
Posted by João Jerónimo, 12/16/2006 3:43:54 PM

Terje Mathisen wrote:
> spamtrap@crayne.org wrote:
>> João Jerónimo wrote:
>>> 7§ "Assembly language programs are often less than one-half the size of
>>> comparable HLL programs. This is especially impressive when you consider
>>> the fact that data items generally consume the same amount of space in
>>> both types of programs, and that data is responsible for a good amount
>>> of the space used by a typical application."
>>>
>>> Assuming that he is right, how can this be true? Perhaps it has to do
>>> with the age of the book (it's the 16-bit version), since at the time
>>> hardly anyone included high-resolution graphics in their programs...
>>
>> I haven't expressed myself well... What I was talking about is the fact
>> that data is responsible for a good amount of the space used by a
>> typical application... Which is entirely true...
>
> Which is why a friend of mine insists on paraphrasing my .sig line:
>
> His version is
> "almost all programming can be viewed as an exercise in compression"
>
> I.e. one of the very first things you should do is to look at ways to
> make your data smaller.
>
> I wrote a custom DB 20+ years ago, to keep track of "bearer bonds" style
> gift cards: Each gift card had a serial number (read with a huge OCR
> machine), which could be converted to a value via a very small lookup
> table.
>
> Each card then had to be checked against the DB to verify that (a) Yes,
> it has been sold, and (b) No, it hasn't already been redeemed (to guard
> against copying).
>
> The commercial mainframe center which tried to do it gave up after a
> month; the processing costs and DB usage were just too large.
>
> I noticed that since these cards were sold in serial order (within a
> limited number of values), question (a) above could be answered by
> simply keeping around two numbers: The first and the last card sold so
> far in this range.
>
> At this point (b) becomes a single binary yes/no question, so I simply
> made this a bitmap!
>
> Each 4 KB disk block consisted of 96 bytes of header info and exactly
> 32000 bits.
>
> The total processing time for a run of 10 K random updates to a DB with
> 10 M active records (too large for a 640 KB machine, even as a bitmap)
> was just one second:
>
> Half a second for a custom Quicksort to order the 10 K updates, and half
> a second to do the actual DB updates.
>
> Terje
>
> PS. This entire app was written in Turbo Pascal, i.e. an un-optimizing
> compiler: As you noted, keeping the data small was the important thing!
Hi Terje, are you accepting students?
;-)
Posted by Mark, 12/16/2006 5:53:24 PM

João Jerónimo wrote:
>
> Oh! No *good programmer* will ever link huge data against an executable!
> It's a very bad programming practice, excluding some rare conditions...
> It's not so hard to do open stat malloc read close (or the equivalents)
> when you need an image...
Unfortunately, the world is full of programmers that you wouldn't
consider to be very good :-)
Cheers,
Randy Hyde
Posted by rhyde, 12/16/2006 7:56:44 PM

"Terje Mathisen" <spamtrap@crayne.org> wrote in message
news:ebgb54-lsc.ln1@osl016lin.hda.hydro.com...
> spamtrap@crayne.org wrote:
> I.e. one of the very first things you should do is to look at ways to
> make your data smaller.
>
> I wrote a custom DB 20+ years ago, to keep track of "bearer bonds" style
> gift cards: Each gift card had a serial number (read with a huge OCR
> machine), which could be converted to a value via a very small lookup
> table.
>
> Each card then had to be checked against the DB to verify that (a) Yes,
> it has been sold, and (b) No, it hasn't already been redeemed (to guard
> against copying).
>
> The commercial mainframe center which tried to do it gave up after a
> month; the processing costs and DB usage were just too large.
>
> I noticed that since these cards were sold in serial order (within a
> limited number of values), question (a) above could be answered by
> simply keeping around two numbers: The first and the last card sold so
> far in this range.
>
> At this point (b) becomes a single binary yes/no question, so I simply
> made this a bitmap!
>
> Each 4 KB disk block consisted of 96 bytes of header info and exactly
> 32000 bits.
>
> The total processing time for a run of 10 K random updates to a DB with
> 10 M active records (too large for a 640 KB machine, even as a bitmap)
> was just one second:
>
> Half a second for a custom Quicksort to order the 10 K updates, and half
> a second to do the actual DB updates.
>
After that (which reminds me of 6502 programming - no memory), I'm curious
as to how you'd implement a filesystem. One needs to index a large
number of sectors without using much space. And, to avoid wasting space
when files are allocated, the clusters need to be small. This requires
keeping track of more sectors... One of the things I noticed about the old
CBM format versus the old FAT format is that the CBM format is much easier
to scale to larger disk sizes than FAT.
Rod Pemberton
Posted by Rod, 12/16/2006 7:58:49 PM

"rhyde@cs.ucr.edu" <spamtrap@crayne.org> wrote in message
news:1166299004.538624.190050@j72g2000cwa.googlegroups.com...
>
> João Jerónimo wrote:
> >
> > Oh! No *good programmer* will ever link huge data against an executable!
> > It's a very bad programming practice, excluding some rare conditions...
> > It's not so hard to do open stat malloc read close (or the equivalents)
> > when you need an image...
>
> Unfortunately, the world is full of programmers that you wouldn't
> consider to be very good :-)
How do you define good? The definition of a "good" programmer is highly
biased towards an arbitrary set of requirements created by some individual
who is doing the judging.
Let's say you have a "detail oriented" programmer and a "big picture"
programmer. Then you ask both of them to write the same program. Under
typical circumstances, both will fail. Then you ask them to criticize each
other's failed program. The "big picture" programmer will point out that the
"detail oriented" programmer couldn't produce a complete program: just a
bunch of routines, most of which weren't needed. The "detail oriented"
programmer will point out that the "big picture" programmer couldn't produce
an accurate program: he got many of the details wrong. The "detail oriented"
programmer will then claim he is a "good" programmer because he was more
accurate. The "big picture" programmer will then claim he is a "good"
programmer because he was able to more completely implement a solution to
the original problem.
;)
Rod Pemberton
Posted by Rod, 12/17/2006 12:08:56 AM

Rod Pemberton wrote:
> >
> > Unfortunately, the world is full of programmers that you wouldn't
> > consider to be very good :-)
>
> How do you define good?
Perhaps you missed the comment I was addressing. To recap:
"Oh! No *good programmer* will ever link huge data against an
executable!"
> The definition of a "good" programmer is highly
> biased towards an arbitrary set of requirements created by some individual
> who is doing the judging.
My comment was strictly based on the criterion provided.
Cheers,
Randy Hyde
Posted by rhyde, 12/17/2006 5:50:19 AM

Mark Jones wrote:
> Terje Mathisen wrote:
>> PS. This entire app was written in Turbo Pascal, i.e. an un-optimizing
>> compiler: As you noted, keeping the data small was the important thing!
>
> Hi Terje, are you accepting students?
> ;-)
No, but I do try to 'pay forward' parts of the debt I have to those I've
learned from.
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Posted by Terje, 12/17/2006 10:57:36 AM

Rod Pemberton wrote:
> "Terje Mathisen" <spamtrap@crayne.org> wrote in message
>> Half a second for a custom Quicksort to order the 10 K updates, and half
>> a second to do the actual DB updates.
>>
>
> After that (which reminds me of 6502 programming - no memory), I'm curious
> as to how you'd implement a filesystem. One needs to index a large
> number of sectors without using much space. And, to not waste more space
> when files are allocated, the clusters need to be small. This requires
> keeping track of more sectors... One of the things I noticed about the old
> CBM format versus the old FAT format, is that the CBM format is much easier
> to scale to larger disk sizes than FAT.
Today there are just a few critical issues for a file system:
a) Never get into an inconsistent state (i.e. maintain ordering for
critical updates, or use a sequential log)
b) Handle both a few huge files and huge numbers of tiny files
effectively. This more or less requires a way to use sub-allocation for
small files and file tails, or possibly a way to handle multiple
allocation sizes
c) Recover gracefully from hw failures, including stuff like a disk
which reports a write as OK, even though it in reality never wrote
anything, or (even worse?) wrote it to the wrong spot.
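One common way to catch the failures in (c) is to keep a checksum for every block somewhere other than in the block itself, and to verify it on every read. The Python sketch below is a toy model under stated assumptions: the "disk" is a dictionary, the class and function names are hypothetical, and CRC-32 is used only because it is in the standard library. It is not a description of how any particular file system does it.

```python
import zlib

class ChecksumError(Exception):
    pass

def checksum(block_no, data):
    # Mixing in the block number ties the data to its intended location.
    return zlib.crc32(block_no.to_bytes(8, "little") + data)

class CheckedStore:
    """Toy block store whose checksums live apart from the blocks."""

    def __init__(self):
        self.disk = {}       # block number -> bytes the hardware really holds
        self.expected = {}   # block number -> checksum of what we meant to write

    def write(self, block_no, data, device_write):
        self.expected[block_no] = checksum(block_no, data)
        device_write(self.disk, block_no, data)   # the device may misbehave

    def read(self, block_no):
        data = self.disk.get(block_no, b"")
        if checksum(block_no, data) != self.expected.get(block_no):
            raise ChecksumError(f"block {block_no} failed verification")
        return data

# Three simulated devices: a correct one and the two failure modes.
def good_write(disk, block_no, data):
    disk[block_no] = data

def lost_write(disk, block_no, data):
    pass                                   # reports OK, writes nothing

def misdirected_write(disk, block_no, data):
    disk[block_no + 1] = data              # lands on the wrong block
```

A lost write leaves stale data that no longer matches the recorded checksum, and a misdirected write breaks verification both for the intended block and for the block that was overwritten.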
Sun has been open-sourcing a really interesting file system which seems
to handle most of these requirements, I'm waiting for stable
Linux/FreeBSD ports.
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Posted by Terje, 12/17/2006 11:04:54 AM

"Terje Mathisen" <spamtrap@crayne.org> wrote in message
news:qrkf54-qt6.ln1@osl016lin.hda.hydro.com...
> Rod Pemberton wrote:
> Today there are just a few critical issues for a file system:
>
> a) Never get into an inconsistent state (i.e. maintain ordering for
> critical updates, or use a sequential log)
>
journaling
The problem I have with journaling is that I suspect it may contribute
to failures under certain circumstances. When I shut down the computer via
software, I want the journaling to complete either before shutdown or upon
reboot. However, if I shut down the computer with the power switch or by
pulling the plug because of an uncontrollable event I wish to stop, then when
the OS boots I want anything in the journal to be purged: started delete
actions, virus activity... However, I think that the journaling restarts and
processes the queued actions.
> b) Handle both a few huge files and huge numbers of tiny files
> effectively. This more or less requires a way to use sub-allocation for
> small files and file tails, or possibly a way to handle multiple
> allocation sizes
>
The many-tiny-files issue is the one I'm most interested in.
> c) Recover gracefully from hw failures, including stuff like a disk
> which reports a write as OK, even though it in reality never wrote
> anything, or (even worse?) wrote it to the wrong spot.
>
> Sun has been open-sourcing a really interesting file system which seems
> to handle most of these requirements, I'm waiting for stable
> Linux/FreeBSD ports.
>
For those that follow, ZFS filesystem:
http://www.sun.com/2004-0914/feature/
"'We've rethought everything and rearchitected it,' says Jeff Bonwick."
Odd use of architect... I wonder why he didn't use "redesigned" or
"re-engineered". Honestly, I'd take the non-word "rearchitected" to refer
to the decorative fluff and not the important structural elements.
Hopefully, that doesn't really apply to ZFS.
"Neither architecture pays a byte-swapping tax due to Sun's patent-pending
'adaptive endian-ness' technology, which is unique to ZFS."
That worries me. Without confirming anything in regards to their pending
patent, it sounds like Microsoft's attempts to patent old existing
technology which becomes important. Adjusting for endian-ness is a common,
frequent, and extremely simple task. I'm curious as to how they could have
a new patentable technology in this area.
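To show how small the task usually is, here is a minimal C sketch of the two common approaches: assembling a big-endian value from bytes, and swapping the bytes of a value that is already loaded. The names `read_be32` and `swap32` are made up for this example and say nothing about what ZFS actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian value from a byte buffer.
   Gives the same result on little- and big-endian hosts. */
uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}
```

The first form avoids any host-endianness test at all, which is why it is hard to see what an "adaptive" scheme would add beyond recording the writer's byte order somewhere.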
Rod Pemberton
Rod | 12/17/2006 8:15:30 PM
Rod Pemberton wrote:
> "Terje Mathisen" <spamtrap@crayne.org> wrote in message
>> Sun has been open-sourcing a really interesting file system which seems
>> to handle most of these requirements, I'm waiting for stable
>> Linux/FreeBSD ports.
>>
>
> For those that follow, ZFS filesystem:
> http://www.sun.com/2004-0914/feature/
> "Neither architecture pays a byte-swapping tax due to Sun's patent-pending
> 'adaptive endian-ness' technology, which is unique to ZFS."
>
> That worries me. Without confirming anything in regards to their pending
> patent, it sounds like Microsoft's attempts to patent old existing
> technology which becomes important. Adjusting for endian-ness is a common,
> frequent, and extremely simple task. I'm curious as to how they could have
> a new patentable technology in this area.
Ouch!
Having read their white papers, I believe they are talking about the use
of a TCP/IP-style wrapping checksum for the first-level verification that
data is OK.
I can't see any way in which simply extending a 16-bit algorithm to 32
or 64 bits makes it patentable.
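For reference, a 16-bit wrapping (ones'-complement) sum of the kind used by the TCP/IP checksums looks roughly like the sketch below. The name `wrap_sum16` is invented here, the final bit inversion that the real protocols apply is left out, and this is only an illustration of the 16-bit idea, not Sun's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum 16-bit words, folding every carry out of bit 15 back into bit 0.
   count is limited so the 32-bit accumulator cannot overflow. */
uint16_t wrap_sum16(const uint16_t *words, size_t count)
{
    uint32_t sum = 0;
    size_t i;

    assert(words != NULL || count == 0);
    assert(count <= 65536u);
    for (i = 0; i < count; i++)
        sum += words[i];
    while (sum >> 16)                       /* end-around carry */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}
```

Going to 32- or 64-bit words only means a wider accumulator and a wider fold, which is the point made above.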
Very sad. :-(
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Terje | 12/18/2006 1:03:06 PM
"Rod Pemberton" <spamtrap@crayne.org> wrote:
>
>journaling
>
>The problem I have with journaling is that I suspect that it may contribute
>to failures under certain circumstances. When I shutdown the computer via
>software, I want the journaling to complete either before shutdown or upon
>reboot. However, if I shutdown the computer by the power switch or pulling
>the plug due to an uncontrollable event I wish to stop, then when the OS
>boots I want anything in the journal to be purged: started delete actions,
>virus activity... However, I think that the journaling restarts and
>processes the queued actions.
What you describe is not the point of journalling. The point of
journalling is to ensure that your disk is never left in an inconsistent
state. As long as stuff hasn't been trashed, you can go clean up damage
later.
--
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.
Tim | 12/20/2006 4:37:45 AM
Rod Pemberton wrote:
>>b) Handle both a few huge files and huge numbers of tiny files
>>effectively. This more or less requires a way to use sub-allocation for
>>small files and file tails, or possibly a way to handle multiple
>>allocation sizes
>
> The many tiny files issue is the one I'm most interested.
Reiser is famous for its ability to manage many tiny files...
JJ
ISO | 12/20/2006 5:19:13 AM
robertwessel2@yahoo.com wrote:
> The reason you can't make EXT2 "as reliable as FAT before adding
> journaling," is that it's impossible. FAT, by design or happenstance,
> has several features that make "careful write" reasonably effective
> (and fairly straightforward). For example, allocating a cluster to a
> file and removing it from the free space list is the same (and hence an
> atomic) operation on a FAT volume. Of course having to scan the FAT
> looking for a free cluster is worse than scanning a free space bitmap,
> and sequentially running through a FAT chain is a crappy way of seeking
> to a random point in a file compared to traversing a tree. Of course
> if the file structure is a tree, and the free space list is a separate
> bitmap, any updates are inherently far more fragile, since *any*
> allocation for a new cluster will require at least two separate disk
> pages to be updated, and the operation is no longer inherently atomic.
Related to this, take a look at the Simple File System.
http://bcos.hopto.org/sfs.html
If you write the file, then modify the Used Area size, and only then
register the file in the Index Area, the filesystem will always be in a
consistent state, because having unused blocks in the Used Area is not
against the SFS specification...
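The ordering can be sketched with a toy in-memory model. Everything below (`toy_volume`, `toy_add_file`, `toy_is_consistent`, one byte standing in for one block) is invented for illustration and does not follow the real SFS on-disk layout; the `steps` argument simulates a crash after the first, second or third step.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VOL_BLOCKS 16   /* toy volume: one byte stands in for one block */
#define MAX_FILES   4

struct toy_volume {
    unsigned char data[VOL_BLOCKS];
    size_t used_blocks;              /* size of the used area */
    size_t file_start[MAX_FILES];    /* toy index area */
    size_t file_len[MAX_FILES];
    size_t file_count;
};

/* Add a file in three ordered steps. steps (1..3) says how many steps
   complete before a simulated crash. Returns 0, or -1 if there is no room. */
int toy_add_file(struct toy_volume *v, const unsigned char *blocks,
                 size_t n, int steps)
{
    size_t start;

    assert(v != NULL && blocks != NULL);
    if (n > VOL_BLOCKS - v->used_blocks || v->file_count == MAX_FILES)
        return -1;
    start = v->used_blocks;
    memcpy(&v->data[start], blocks, n);      /* 1: write the file data  */
    if (steps < 2)
        return 0;
    v->used_blocks = start + n;              /* 2: grow the used area   */
    if (steps < 3)
        return 0;
    v->file_start[v->file_count] = start;    /* 3: register in the index */
    v->file_len[v->file_count] = n;
    v->file_count++;
    return 0;
}

/* Consistent here means: every indexed file lies inside the used area. */
int toy_is_consistent(const struct toy_volume *v)
{
    size_t i;

    assert(v != NULL);
    for (i = 0; i < v->file_count; i++)
        if (v->file_start[i] + v->file_len[i] > v->used_blocks)
            return 0;
    return 1;
}
```

Stopping after step 1 or step 2 only leaves unreferenced blocks behind, so the check still passes; doing the index entry first would not have that property.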
However, SFS is not exactly aimed at everyday use... for example, it
does not support file fragmentation or sparse files...
JJ
ISO | 12/20/2006 7:05:43 AM
"Tim Roberts" <spamtrap@crayne.org> wrote in message
news:odfho29t88vtgubf1od5ekkoa3ko54rbl5@4ax.com...
> "Rod Pemberton" <spamtrap@crayne.org> wrote:
> >
> >journaling
> >
> >The problem I have with journaling is that I suspect that it may
contribute
> >to failures under certain circumstances. When I shutdown the computer
via
> >software, I want the journaling to complete either before shutdown or
upon
> >reboot. However, if I shutdown the computer by the power switch or
pulling
> >the plug due to an uncontrollable event I wish to stop, then when the OS
> >boots I want anything in the journal to be purged: started delete
actions,
> >virus activity... However, I think that the journaling restarts and
> >processes the queued actions.
>
> What you describe is not the point of journalling.
I didn't describe the point of journaling. I described the problems I
perceive could occur with journaling. Journaling can cause damage in
specific situations which can be far worse than fixing an inconsistent
state.
> The point of
> journalling is to ensure that your disk is never left in an inconsistent
> state.
The disk is in a consistent state after executing each of the queued
actions. So, execution of the queued actions can stop after any one of them
without creating an inconsistent state. Unfortunately, executing all the
queued actions is undesirable under certain circumstances. Being able to
halt and purge all actions to prevent damage is desirable.
journal:
good action
<--consistent state
good action
<--consistent state
good action
<--consistent state
good action
<--consistent state, would like to purge queued bad actions here...
suspected or known bad action
<--consistent state with the effects of bad action, i.e. damage
suspected or known bad action
<--consistent state with the effects of bad action, i.e. more damage
> As long as stuff hasn't been trashed, you can go clean up damage
> later.
That's the problem. When the queued actions are known or suspected to be
bad, how does one get rid of them prior to the damage? Or, can you
roll-back the changes with journaling?
I believe that part of the problem is poor filesystem or poor OS design.
Scandisk has never found any problems on FAT32 with DOS 7.10.
Scandisk very rarely finds errors after running Win98 (on DOS 7.10).
When it does, it is almost always a problem with the free space count. However, I've
run different Linuxes with Ext2 and in each case fsck would run almost
every third boot. From what I've read, journaling was added to Ext2 to
create Ext3 and eliminate the fsck problem. But the questions in my mind
are: "Why did Ext2 fail so frequently, forcing fsck to run in the first
place? Why didn't they make Ext2 as reliable as FAT before adding journaling?"
Given a solid failure scenario like this, why shouldn't one be suspicious and
view journaling as a patch job which hides serious filesystem design
deficiencies? Is this perspective unreasonable?
Rod Pemberton
Rod | 12/20/2006 8:58:00 AM
Rod Pemberton wrote:
> I believe that part of the problem is poor filesystem or poor OS design.
> Scandisk has never found any problems on FAT32 with DOS 7.10.
> Scandisk very rarely finds errors after running Win98 (on DOS 7.10).
> This is almost always problem with the free space count. However, I've
> run different Linux's with Ext2 and in each case fsck would run almost
> every third boot. From what I've read, journaling was added to Ext2 to
> create Ext3 and eliminate the fsck problem. But, the questions in my mind
> are: "Why did Ext2 fail so frequently forcing fsck to run in the first
> place?
> Why didn't they make Ext2 as reliable as FAT before adding journaling?"
> Given a solid failure scenario like this, why shouldn't one be suspect and
> view journaling as a patch job which hides serious filesystem design
> deficiencies? Is this perspective unreasonable?
Yes, largely it is. Simple file systems are semi-reliable in the face
of system crashes because the OS can take pains to ensure that updates
to the meta data are carefully managed. For example, you might make
sure that a FAT entry is updated before writing to the allocated sector
and/or changing the file size in the directory entry. A more complex
example is creating a file - the FS has to create a directory entry
(possibly expanding the directory "file"), allocate the first cluster,
and then write to it. The idea is to ensure that these changes are
physically committed to the disk in an order that minimizes (or
eliminates) dangerous misinterpretations in case the process is
interrupted partway through. This is usually called "careful write."
Unfortunately this fails utterly, even for simple FS's like FAT, for
any more complex operations. For example, moving a file from one
directory to another (where you cannot avoid a transient state where
the file is physically in both directories or in neither), or moving a
disk block (for example during a reorg - which happens on some of the
more complex FS's quite often - for example a very short file might
have all its data stored within the directory entry, but if the file
grows it gets moved to a conventional disk page).
The problem only gets worse when multiple metadata updates are
happening at once, as on a busy system. And worse still this quickly
becomes very complex (especially annoying given that it's not all that
effective in the first place), and tends to be a significant
performance bottleneck (by insisting that things happen in a safe
order, you lose the ability to batch and optimize the physical volume
updates).
The point of journaling is to say, to heck with it, treat the metadata
updates like database changes (which are journaled in about the same
way, and for the same reason), and just ensure that all changes are
either fully completed, or wholly backed out in the case of a failure.
This allows you to greatly decouple metadata updates (which are spread
all over the disk, remember) from the physical writes, except for
synchronization points where all of the accumulated log data must be
physically written to disk (and the journals are largely sequential
writes anyway).
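A minimal sketch of the replay side of that idea, in C, may help. It assumes a redo-style log in which updates become valid only once a commit record for their transaction has been written; the record layout and the names `log_rec`, `is_committed` and `replay_log` are invented for illustration and do not describe any particular filesystem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rec_kind { REC_UPDATE, REC_COMMIT };

struct log_rec {
    enum rec_kind kind;
    uint32_t txn;    /* transaction the record belongs to */
    uint32_t block;  /* metadata block to change (REC_UPDATE only) */
    uint32_t value;  /* new block contents (REC_UPDATE only) */
};

/* A transaction counts only if its commit record reached the log. */
static int is_committed(const struct log_rec *recs, size_t n, uint32_t txn)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (recs[i].kind == REC_COMMIT && recs[i].txn == txn)
            return 1;
    return 0;
}

/* Replay after a crash: apply the updates of committed transactions in
   log order and skip everything else. Returns the number applied. */
size_t replay_log(const struct log_rec *recs, size_t n,
                  uint32_t *disk, size_t disk_blocks)
{
    size_t i, applied = 0;

    assert(recs != NULL && disk != NULL);
    for (i = 0; i < n; i++) {
        if (recs[i].kind != REC_UPDATE || recs[i].block >= disk_blocks)
            continue;
        if (!is_committed(recs, n, recs[i].txn))
            continue;   /* incomplete transaction: never applied */
        disk[recs[i].block] = recs[i].value;
        applied++;
    }
    return applied;
}
```

In this model a transaction that was cut off by the crash simply never takes effect, while a committed one is applied in full, whether or not its effects are wanted.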
Now journaling does *nothing* to prevent a buggy FS from performing a
bad update of the metadata on a volume, but it will ensure that the
update is either performed fully, or not at all. This is no different
than a buggy application connected to a database performing a bad
change.
The reason you can't make EXT2 "as reliable as FAT before adding
journaling," is that it's impossible. FAT, by design or happenstance,
has several features that make "careful write" reasonably effective
(and fairly straightforward). For example, allocating a cluster to a
file and removing it from the free space list is the same (and hence an
atomic) operation on a FAT volume. Of course having to scan the FAT
looking for a free cluster is worse than scanning a free space bitmap,
and sequentially running through a FAT chain is a crappy way of seeking
to a random point in a file compared to traversing a tree. Of course
if the file structure is a tree, and the free space list is a separate
bitmap, any updates are inherently far more fragile, since *any*
allocation for a new cluster will require at least two separate disk
pages to be updated, and the operation is no longer inherently atomic.
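The FAT case can be sketched in a few lines of C. This assumes FAT16 conventions (0x0000 marks a free cluster, 0xFFFF ends a chain, entries 0 and 1 are reserved) and uses an in-memory array in place of the on-disk table; the function name `fat_append_cluster` is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT_FREE 0x0000u
#define FAT_EOC  0xFFFFu   /* end-of-chain marker */

/* Append one cluster to the chain ending at tail (0 = new, empty file).
   Returns the new cluster number, or 0 if the volume is full. */
uint16_t fat_append_cluster(uint16_t *fat, size_t entries, uint16_t tail)
{
    size_t c;

    assert(fat != NULL && entries <= 0xFFF0u);
    for (c = 2; c < entries; c++) {          /* linear scan for a free slot */
        if (fat[c] == FAT_FREE) {
            /* One store both takes the cluster off the "free list"
               and makes it the end of a chain. */
            fat[c] = FAT_EOC;
            if (tail != 0)
                fat[tail] = (uint16_t)c;     /* then link it to the file */
            return (uint16_t)c;
        }
    }
    return 0;
}
```

The scan is the slow part mentioned above; the safety comes from the fact that there is no separate bitmap that could disagree with the chain.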
robertwessel2 | 12/20/2006 10:25:17 PM
robertwessel2@yahoo.com <spamtrap@crayne.org> wrote in part:
> The problem only gets worse when multiple metadata updates are
> happening at once, as on a busy system. And worse still this
> quickly becomes very complex (especially annoying given that
> it's not all that effective in the first place), and tends
> to be a significant performance bottleneck (by insisting that
> things happen in a safe order, you loose the ability to batch
> and optimize the physical volume updates).
Look into Kirk McKusick's "SoftUpdates" to be found on
*BSD systems. IIRC, the basic idea is to order writes, data
_before_ metadata, so the metadata is never pointing at invalid data.
Ordering works because a reasonable OS will run an elevator
algorithm to handle multiple disk requests to take advantage of
the hardware track-to-track and short range seek speed advantage.
You can/will lose the latest updates, but won't trash it. I pulled
the plug towards the end of FreeBSD kernel compiles. Three times,
`make` picked right up and I only lost about one minute's data.
The fourth got a bit more scrambled, and I needed `make clean`.
-- Robert
Robert | 12/20/2006 11:43:42 PM
"Rod Pemberton" <spamtrap@crayne.org> writes:
> run different Linux's with Ext2 and in each case fsck would run almost
> every third boot. From what I've read, journaling was added to Ext2 to
> create Ext3 and eliminate the fsck problem. But, the questions in my mind
> are: "Why did Ext2 fail so frequently forcing fsck to run in the first
> place?
That doesn't really make sense. 'ext2' cannot fail any more than
'ASCII' can fail. If you don't shut down cleanly, fsck will run
as a sanity check, that doesn't mean that any errors were detected,
merely that the possibility of same was detected.
Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.
Phil | 12/21/2006 7:20:23 AM
"Phil Carmody" <thefatphil_demunged@yahoo.co.uk> wrote in message
news:87wt4l4sw8.fsf@nonospaz.fatphil.org...
> "Rod Pemberton" <spamtrap@crayne.org> writes:
> > run different Linux's with Ext2 and in each case fsck would run almost
> > every third boot. From what I've read, journaling was added to Ext2 to
> > create Ext3 and eliminate the fsck problem. But, the questions in my
mind
> > are: "Why did Ext2 fail so frequently forcing fsck to run in the first
> > place?
>
> That doesn't really make sense. 'ext2' cannot fail any more than
> 'ASCII' can fail. If you don't shut down cleanly, fsck will run
> as a sanity check, that doesn't mean that any errors were detected,
> merely that the possibility of same was detected.
>
If "'ext2'" can't "fail any more than 'ASCII' can fail," then why does
fsck exist? And, why would it need to run?
Personally, I've never heard of 'acck' (Ascii Character ChecKer). But, I
guess one could write one to confirm that ASCII characters haven't switched
places without OS consent and to confirm that ASCII characters aren't
overwriting each other...
;-)
Rod Pemberton
Rod | 12/22/2006 6:15:09 AM
"Rod Pemberton" <spamtrap@crayne.org> writes:
> "Phil Carmody" <thefatphil_demunged@yahoo.co.uk> wrote in message
> news:87wt4l4sw8.fsf@nonospaz.fatphil.org...
> > "Rod Pemberton" <spamtrap@crayne.org> writes:
> > > run different Linux's with Ext2 and in each case fsck would run almost
> > > every third boot. From what I've read, journaling was added to Ext2 to
> > > create Ext3 and eliminate the fsck problem. But, the questions in my
> mind
> > > are: "Why did Ext2 fail so frequently forcing fsck to run in the first
> > > place?
> >
> > That doesn't really make sense. 'ext2' cannot fail any more than
> > 'ASCII' can fail. If you don't shut down cleanly, fsck will run
> > as a sanity check, that doesn't mean that any errors were detected,
> > merely that the possibility of same was detected.
> >
>
> If "'ext2'" can't "fail any more than 'ASCII' can fail," then why does
> fsck exist?
Because you can leave the partition in an inconsistent state if you do
not let the OS unmount it cleanly. fsck can return the vast majority
of these problems to a consistent state.
> And, why would it need to run?
It needs to run if you do not unmount the partition cleanly, as you
need to be sure the partition is in a consistent state before you do
anything with it.
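As far as I know, the boot-time decision looks at two things kept in the superblock: a flag saying whether the last unmount was clean, and a mount counter compared against a configurable maximum. A rough C sketch of that decision follows; the struct and the names `toy_superblock` and `needs_check` are hypothetical simplifications, not the real ext2 structures.

```c
#include <assert.h>
#include <stddef.h>

struct toy_superblock {
    int      clean;            /* 1 if the last unmount was clean */
    unsigned mount_count;      /* mounts since the last full check */
    int      max_mount_count;  /* <= 0 disables the periodic check */
};

/* Returns 1 if a consistency check should run before mounting. */
int needs_check(const struct toy_superblock *sb)
{
    assert(sb != NULL);
    if (!sb->clean)
        return 1;              /* possible inconsistency: must check */
    if (sb->max_mount_count > 0 &&
        sb->mount_count >= (unsigned)sb->max_mount_count)
        return 1;              /* periodic precaution, no error implied */
    return 0;
}
```

Under this reading, a check triggered by the counter says nothing about whether any error exists on the volume.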
Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.
Phil | 12/22/2006 8:45:57 AM
João Jerónimo wrote:
> Assuming that he is right, how can this be so true? Parhaps it has to do
> with the age of the book (it's the 16-bit version), since at the time
> barely no one included high resolution graphics in the programs...
>
> JJ
Custom calling conventions.
If function g is called from a minimal number of places, I can architect
g to work well with the callers to minimize/eliminate register moves to
set up parameters. Additionally, if g needs to return several values, it
can do so without the caller having to pass a pointer to a structure.
I've even used the condition code register as an output. No need to
check if an output is zero, just branch if the equal flag is set.
I've also used return pointers as part of a custom calling convention:
I worked on a decompression routine (in 68k assembly) where there were
3 tight loops, each having to call an analyze_next_bit() function.
Instead of using the bsr instruction which has the overhead of pushing
the return address to the stack, I loaded the return point in a spare
address register before the loop, called the subroutine with a jump,
then returned from the subroutine with a jump to the value in the
register.
Assembly is very flexible and its goto nature can be used without fear.
How do you do a multi-level break; in C without a goto? If you can't
use a goto (due to coding standards, or you're a wussy) you're gonna have
to add a flow variable and additional if statements. I know in Ada I
can name the loop levels and break out any number of levels by using
that name.
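For the record, the goto version in C is short. The sketch below searches a row-major grid and leaves both loops with a single jump; the function name `find_value` and its parameters are just an example.

```c
#include <assert.h>
#include <stddef.h>

/* Search a rows x cols grid stored row by row. On success stores the
   position in *row and *col and returns 1; otherwise returns 0. */
int find_value(const int *grid, size_t rows, size_t cols, int wanted,
               size_t *row, size_t *col)
{
    size_t r, c;
    int found = 0;

    assert(grid != NULL && row != NULL && col != NULL);
    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            if (grid[r * cols + c] == wanted) {
                *row = r;
                *col = c;
                found = 1;
                goto done;      /* leaves both loops at once */
            }
        }
    }
done:
    return found;
}
```

Without the goto, the outer loop condition would have to test a flag on every iteration, or the inner loop would have to move into its own function.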
Samuel | 12/22/2006 9:05:30 AM
João Jerónimo <spamtrap@crayne.org> wrote:
> I was re-reading the introduction of Art of Assembly, and I noticed (not
> that I hadn't noticed already, but i hadn't dedicated much time to it)
> that Randall Hyde writes in the section "WHAT'S RIGHT WITH ASSEMBLY
> LANGUAGE":
>
> 7§ "Assembly language programs are often less than one-half the size of
> comparable HLL programs.
>
> JJ
>
Compilers take into account only the things they were programmed to
consider. Humans can take much more into account (in code design)
than the machine. The idea that a good optimizing compiler can do as
well as handwritten ASM is still a myth as far as I can tell at this
point, although compilers are getting better all the time.
Steve
Steven | 12/22/2006 3:18:40 PM
"Phil Carmody" <thefatphil_demunged@yahoo.co.uk> wrote in message
news:87odpw48u2.fsf@nonospaz.fatphil.org...
> "Rod Pemberton" <spamtrap@crayne.org> writes:
> > "Phil Carmody" <thefatphil_demunged@yahoo.co.uk> wrote in message
> > news:87wt4l4sw8.fsf@nonospaz.fatphil.org...
> > > "Rod Pemberton" <spamtrap@crayne.org> writes:
> > > > run different Linux's with Ext2 and in each case fsck would run
almost
> > > > every third boot. From what I've read, journaling was added to Ext2
to
> > > > create Ext3 and eliminate the fsck problem. But, the questions in
my
> > mind
> > > > are: "Why did Ext2 fail so frequently forcing fsck to run in the
first
> > > > place?
> > >
> > > That doesn't really make sense. 'ext2' cannot fail any more than
> > > 'ASCII' can fail. If you don't shut down cleanly, fsck will run
> > > as a sanity check, that doesn't mean that any errors were detected,
> > > merely that the possibility of same was detected.
> > >
> >
> > If "'ext2'" can't "fail any more than 'ASCII' can fail," then why
does
> > fsck exist?
>
Both of the questions I posed were meant to be humorous and rhetorical due
to the blatantly obvious correctness of my position. But, you:
A) either chose to ignore that
B) or failed to comprehend that
?
(You're not an alias for Richard Heathfield, are you? He also snips context
into small pieces and then responds to the smaller pieces instead of
grasping the whole... He is one of two reasons I no longer post to c.l.c.)
The point was that fsck exists only to correct filesystem errors and is only
run to correct them. It's totally useless otherwise. It never would've
been written otherwise. Since IDE hardware is extremely reliable, fsck
wasn't written to correct hardware errors. So, it was written to correct
software errors. The errors which fsck checks for exist in the ext2
filesystem. If an error exists, this is a failure of the ext2 filesystem.
If there weren't any errors, then the ext2 would be useable as is, i.e.,
without need for fsck.
> Because you can leave the partition in an inconsistent state if you do
> not let the OS unmount it cleanly. fsck can return the vast majority
> of these problems to a consistent state.
>
In other words, ext2 failed. Or, if you prefer, the portion of the OS which
implements ext2 failed. You can attempt to separate the duties of the ext2
filesystem and/or the OS any way you wish. It doesn't change the issues:
A) fsck runs when the filesystem is in an inconsistent state, as it should.
B) fsck runs when the filesystem is in a consistent state, when it
shouldn't.
C) fsck exists to fix problems with ext2.
D) fsck corrupts disks, if it runs when the disk is in a consistent state.
You seem to think that:
1) fsck only runs and fixes problems when the partition is in an
inconsistent state. - False. On every Linux I've used, it runs based on the login
count.
2) it matters how software derived problems with the filesystem arise. - It
doesn't. If there is an error in the filesystem, it's a failure of the
filesystem (since hardware errors are extremely rare).
Rod Pemberton
Rod | 12/23/2006 4:00:28 AM
"Rod Pemberton" <spamtrap@crayne.org> writes:
> "Phil Carmody" <thefatphil_demunged@yahoo.co.uk> wrote in message
> Both of the questions I posed were meant to be humorous and rhetorical due
> to the blatantly obvious correctness of my position. But, you:
> A) either chose to ignore that
> B) or failed to comprehend that
> ?
I view your stance to be obviously incorrect. I do have a tendency to
ignore blatant incorrectness, but don't like to see it being propagated
on a serious newsgroup such as this one.
> (You're not an alias for Richard Heathfield, are you? He also snips context
> into small pieces and then responds to the smaller pieces instead of
> grasping the whole... He is one of two reasons I no longer post c.l.c.)
I am not Richard Heathfield. I find him to be one of the more reliable
posters on computer-language and computer-behaviour related issues.
Your comparison I take as a compliment.
If I am only responding to a small portion of your post I will
only quote that small portion. That's sensible etiquette.
> The point was that fsck exists only to correct filesystem errors and is only
> run to correct them.
That is blatantly incorrect, for reasons explained in my previous
posts.
> It's totally useless otherwise. It never would've
> been written otherwise.
Again incorrect.
> Since IDE hardware is extremely reliable, fsck
> wasn't written to correct hardware errors. So, it was written to correct
> software errors.
Again incorrect.
> The errors which fsck checks for exist in the ext2
> filesystem. If an error exists, this is a failure of the ext2 filesystem.
Again incorrect.
> If there weren't any errors, then the ext2 would be useable as is, i.e.,
> without need for fsck.
Again incorrect.
> > Because you can leave the partition in an inconsistent state if you do
> > not let the OS unmount it cleanly. fsck can return the vast majority
> > of these problems to a consistent state.
>
> In other words, ext2 failed.
Incorrect.
> Or, if you prefer, the portion of the OS which
> implements ext2 failed.
Incorrect.
> You can attempt to separate the duties of the ext2
> filesystem and/or the OS any way you wish. It doesn't change the issues:
> A) fsck runs when the filesystem is in a inconsistent state, as it should..
> B) fsck runs when the filesystem is in a consistent state, when it
> shouldn't.
Incorrect.
> C) fsck exists to fix problems with ext2.
Incorrect through incompleteness. That's not the only reason it exists.
> D) fsck corrupts disks, if it runs when the disk is in a consistent state.
Do you have any evidence for this bizarre claim?
Cite or retract.
> You seem think that:
> 1) fsck only runs and fixes problems when the partition is in an
> inconsistent state.
Incorrect. Cite or retract.
> - False. Every Linux I've used is based on the login
> count.
> 2) it matters how software derived problems with the filesystem arise.
Incorrect. Cite or retract.
> - It
> doesn't. If there is an error in the filesystem, it's a failure of the
> filesystem (since hardware errors are extremely rare).
Incorrect.
Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.
Phil | 12/23/2006 1:16:22 PM
Steven Nichols wrote:
> Compilers take the number of things into account that they they were
> programmed for. Humans can take much more into account (in code design)
> than the machine. The idea that a good optimizing compiler can do as
> well as handwritten ASM is still a myth as far as I can tell at this
> point. Compilers are getting better all the time.
I don't think there's any doubt that a good assembler programmer can,
given adequate time, outperform (or at least match) any compiler in
terms of code performance. Except perhaps in the minds of the
marketing departments of some compiler writers. ;-) That being said,
much of the complex instruction/register/functional unit/etc.
scheduling needed for many advanced CPUs is quite tedious and labor
intensive, and something that compilers can do reasonably well.
Unfortunately, (A) most assembler programmers aren't that good, and
(B), those that are usually don't have nearly the time needed to dedicate
to the task except in relatively small portions of the application.
It's the old 90/10 rule: 90% of the CPU time is spent in 10% of the
code - any sane programmer will concentrate on making the 10% fast, and
getting the 90% done as quickly as possible while still being correct.
That often means that the 90%, when written in assembler, ends up being
fairly stylized and simplistic code, and not very optimized at all.
And *that* code will very, very often be beaten by a good compiler.
The 10% is another story, but even there it's quite common to start
with either simplistic code or an HLL implementation, so that things
*work*, and then optimize the bottlenecks (which are often surprising,
and don't make themselves obvious until you profile the code).
None of which is to say that there aren't clearly areas where compilers
get terribly lost. If the problem doesn't fit the HLL well, it's
unlikely that the resulting code will be very good. A good example is
a software implementation of DES (although even the best assembler
implementations of DES still suck - DES being designed to be
implemented in hardware, not software - but they're an order of
magnitude faster than say a good C version). Another example is the
double width integer multiplication available on many CPUs - if you're
doing a bignum library that's very handy, and totally inaccessible from
C.
robertwessel2 | 12/26/2006 10:38:28 PM
João Jerónimo wrote:
>
> Related to this, take a looks at Simple File System.
>
> http://bcos.hopto.org/sfs.html
>
Interesting. Is that your draft?
Curious though, isn't this the same goal of the CD filesystem - portability and
compatibility?
It seems like SFS would simply add to the plethora of existing filesystems
already battling for dominance without adding much latent value. Why would SFS
be more enticing than an existing standard like CDFS? It is not practical for a
main partition, so would have to be set up as an "extra" partition, which right
there excludes the majority of end-users. I can see the desire for a global,
unified filesystem but the idea is inherently impractical.
Mark | 12/27/2006 3:08:23 AM
Is it possible to discuss something in this newsgroup without someone
feeling the need to tell us that we should be sure to only optimize the
bottlenecks?
robertwessel2@yahoo.com wrote:
> Steven Nichols wrote:
> > Compilers take the number of things into account that they they were
> > programmed for. Humans can take much more into account (in code design)
> > than the machine. The idea that a good optimizing compiler can do as
> > well as handwritten ASM is still a myth as far as I can tell at this
> > point. Compilers are getting better all the time.
>
>
> I don't think there's any doubt that a good assembler programmer can,
> given adequate time, outperform (or at least match) any compiler in
> terms of code performance. Except perhaps in the minds of the
> marketing departments of some compiler writers. ;-) That being said,
> much of the complex instruction/register/functional unit/etc.
> scheduling needed for many advanced CPUs is quite tedious and labor
> intensive, and something that compilers can do reasonably well.
>
> Unfortunately, (A) most assembler programmer's aren't that good, and
> (B), those that are usually don't have near the time needed to dedicate
> to the task except in relatively small portions of the application.
> It's the old 90/10 rule: 90% of the CPU time is spent in 10% of the
> code - any sane programmer will concentrate on making the 10% fast, and
> getting the 90% done as quickly as possible while still being correct.
> That often means that the 90%, when written in assembler, ends up being
> fairly stylized and simplistic code, and not very optimized at all.
> And *that* code will very, very often be beaten by a good compiler.
> The 10% is another story, but even there it's quite common to start
> with either simplistic code or an HLL implementation, so that things
> *work*, and then optimize the bottlenecks (which are often surprising,
> and don't make themselves obvious until you profile the code).
>
> None of which is to say that there aren't clearly areas where compilers
> get terribly lost. If the problem doesn't fit the HLL well, it's
> unlikely that the resulting code will be very good. A good example is
> a software implementation of DES (although even the best assembler
> implementations of DES still suck - DES being designed to be
> implemented in hardware, not software - but they're an order of
> magnitude faster than say a good C version). Another example is the
> double width integer multiplication available on many CPUs - if you're
> doing a bignum library that's very handy, and totally inaccessible from
> C.
Samuel | 12/27/2006 10:58:44 AM
robertwessel2@yahoo.com wrote:
> magnitude faster than say a good C version). Another example is the
> double width integer multiplication available on many CPUs - if you're
> doing a bignum library that's very handy, and totally inaccessible from
> C.
Even with simple multibyte addition, the carry bit, trivially simple to
use in assembly language, simply doesn't exist as a notion within C.
That makes multibyte addition (and subtraction) much simpler and faster
in assembly language than in C.
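
For comparison, here is a minimal sketch of what multiword addition tends to look like in portable C, where the carry has to be reconstructed from comparisons rather than read from a flag. The function name add_words and the little-endian word layout are assumptions made for illustration; in x86 assembly the loop body would typically be a single ADC per word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add two n-word numbers stored little-endian (word 0 is least significant).
 * Writes the n-word sum to `sum` and returns the final carry (0 or 1).
 * Hypothetical helper, not taken from the thread. */
uint32_t add_words(uint32_t *sum, const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t carry = 0;

    assert(sum != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t t  = a[i] + carry;   /* wraps only if a[i] is all ones and carry is 1 */
        uint32_t c1 = (t < carry);    /* 1 if that first addition wrapped */
        uint32_t s  = t + b[i];
        uint32_t c2 = (s < b[i]);     /* 1 if the second addition wrapped */

        sum[i] = s;
        carry  = c1 | c2;             /* at most one of the two additions can wrap */
    }
    return carry;
}
```

Each word costs two additions and two comparisons here, which is the overhead the carry flag removes. A compiler may or may not turn this pattern back into ADD/ADC.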
Ed
Ed
|
12/27/2006 12:37:52 PM
|
|
Samuel Stearley wrote:
> Is it possible to discuss something in this newsgroup without someone
> feeling the need to tell us that we should be sure to only optimize the
> bottlenecks?
>
>
>
> robertwessel2@yahoo.com wrote:
> > Steven Nichols wrote:
> > >(...)
Probably the same chances as getting people to stop top-posting.
robertwessel2
|
12/27/2006 9:54:50 PM
|
|
robertwessel2@yahoo.com wrote:
> magnitude faster than say a good C version). Another example is the
> double width integer multiplication available on many CPUs - if you're
> doing a bignum library that's very handy, and totally inaccessible from
> C.
That's not true, and hasn't been for some years:
Many compilers will recognize an idiom like this:
#include <stdint.h>

inline
uint64_t mul32x32(uint32_t a, uint32_t b)
{
    return (uint64_t) a * (uint64_t) b;
}
and figure out that since both inputs are 32 bits, a plain MUL will give
the required 64-bit result, without even having to make a function call.
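
As a rough illustration of why this matters for the bignum case, here is a sketch that uses the same widening idiom to multiply a multiword number by a single 32-bit word. The name mul_words_by_word and the little-endian word layout are assumptions for the example, not something from an existing library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply an n-word little-endian number by one 32-bit word.
 * Writes the low n words to `out` and returns the word that
 * overflows off the top. Hypothetical helper for illustration. */
uint32_t mul_words_by_word(uint32_t *out, const uint32_t *a, size_t n, uint32_t m)
{
    uint32_t carry = 0;

    assert(out != NULL && a != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Widen both operands first so the full 64-bit product is kept.
         * Adding the carry cannot overflow:
         * (2^32 - 1)^2 + (2^32 - 1) < 2^64. */
        uint64_t p = (uint64_t) a[i] * (uint64_t) m + carry;

        out[i] = (uint32_t) p;          /* low half is this result word */
        carry  = (uint32_t) (p >> 32);  /* high half carries into the next word */
    }
    return carry;
}
```

On x86-32, a compiler that knows the idiom can implement the multiplication in the loop with one MUL, whose EDX:EAX result supplies both halves.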
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Terje
|
12/27/2006 9:57:11 PM
|
|
Terje Mathisen wrote:
> robertwessel2@yahoo.com wrote:
> > magnitude faster than say a good C version). Another example is the
> > double width integer multiplication available on many CPUs - if you're
> > doing a bignum library that's very handy, and totally inaccessible from
> > C.
>
> That's not true, and hasn't been for some years:
>
> Many compilers will recognize an idiom like this:
>
> inline
> uint64_t mul32x32(uint32_t a, uint32_t b)
> {
> return (uint64_t) a * (uint64_t) b;
> }
>
> and figure out that since both inputs are 32 bits, a plain MUL will give
> the required 64-bit result, without even having to make a function call.
You're assuming the compiler supports a 64 bit integer type (true, of
course with C99, and available as a non-standard extension on several
important C89 compilers for x86-32). Although in this case the CPU
doesn't actually support "single width" (assuming we're talking about
x86-32) operations on the data type in question, so the example is
rather questionable.
If you'd like to talk about x86-64 instead, then what C idiom would
allow you access to the full 128 bit product of a 64x64 multiplication?
robertwessel2
|
12/27/2006 11:56:41 PM
|
|
robertwessel2@yahoo.com wrote:
> Terje Mathisen wrote:
>> robertwessel2@yahoo.com wrote:
>>> magnitude faster than say a good C version). Another example is the
>>> double width integer multiplication available on many CPUs - if you're
>>> doing a bignum library that's very handy, and totally inaccessible from
>>> C.
>> That's not true, and hasn't been for some years:
>>
>> Many compilers will recognize an idiom like this:
>>
>> inline
>> uint64_t mul32x32(uint32_t a, uint32_t b)
>> {
>> return (uint64_t) a * (uint64_t) b;
>> }
>>
>> and figure out that since both inputs are 32 bits, a plain MUL will give
>> the required 64-bit result, without even having to make a function call.
>
>
> You're assuming the compiler supports a 64 bit integer type (true, of
> course with C99, and available as a non-standard extension on several
> important C89 compilers for x86-32). Although in this case the CPU
> doesn't actually support "single width" (assuming we're talking about
> x86-32) operations on the data type in question, so the example is
> rather questionable.
This is the result from gcc (-O2):
mul32x32:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        mull    8(%ebp)
        leave
        ret
        .ident  "GCC: (GNU) 4.1.1 (Gentoo 4.1.1-r1)"
In general, gcc's support for uint64_t in 32-bit mode is not very
good, but it does know this idiom.
> If you'd like to talk about x86-64 instead, then what C idiom would
> allow you access to the full 128 bit product of a 64x64 multiplication?
There is no C idiom as long as there is no 128-bit integer type in the
C standard. But for gcc this works:
#include <stdint.h>

/* gcc extension: TI mode is a 128-bit integer (64-bit targets only) */
typedef unsigned int uint128_t __attribute__((mode(TI)));

uint128_t mul64x64(uint64_t a, uint64_t b)
{
    return (uint128_t) a * (uint128_t) b;
}
Result:
mul64x64:
        movq    %rdi, %rax
        mulq    %rsi
        ret
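
For a compiler without any 128-bit type, the same product can be assembled from four 32x32->64 partial products. The sketch below is only an illustration of what the single mulq replaces: the type u128_parts and the function names are made up for the example. Where GCC or Clang provide unsigned __int128 (detected here through the predefined macro __SIZEOF_INT128__), an optional helper cross-checks the result against it.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit value as two 64-bit halves (hypothetical type for this sketch). */
typedef struct { uint64_t hi, lo; } u128_parts;

/* Full 64x64->128 multiply using only 64-bit arithmetic. */
u128_parts mul64x64_portable(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t) a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t) b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* contributes at bit 0  */
    uint64_t p1 = a_lo * b_hi;   /* contributes at bit 32 */
    uint64_t p2 = a_hi * b_lo;   /* contributes at bit 32 */
    uint64_t p3 = a_hi * b_hi;   /* contributes at bit 64 */

    /* Sum of the three 32-bit pieces that land in bits 32..63.
     * At most 3 * (2^32 - 1), so it fits easily in 64 bits. */
    uint64_t mid = (p0 >> 32) + (uint32_t) p1 + (uint32_t) p2;

    u128_parts r;
    r.lo = (mid << 32) | (uint32_t) p0;
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}

#ifdef __SIZEOF_INT128__
/* Optional cross-check against the compiler's 128-bit extension type. */
void check_against_int128(uint64_t a, uint64_t b)
{
    unsigned __int128 p = (unsigned __int128) a * b;
    u128_parts r = mul64x64_portable(a, b);

    assert(r.lo == (uint64_t) p);
    assert(r.hi == (uint64_t) (p >> 64));
}
#endif
```

That is four multiplications plus a handful of shifts and additions, against the single mulq in the listing above.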
Sebastian
|
12/28/2006 12:46:03 AM
|
|
Sebastian Biallas wrote:
> uint128_t mul64x64(uint64_t a, uint64_t b)
> {
> return (uint128_t) a * (uint128_t) b;
> }
>
> Result:
> mul64x64:
> movq %rdi, %rax
> mulq %rsi
That's pretty good, particularly if/when you also persuade the compiler
to inline the code, getting rid of the call/return overhead.
Re. general 64x64->128 support:
A few years ago I worked on the pure asm version of DFC, which was one
of the AES contenders: The core decorrelation operation (DFC stands for
Decorrelated Fast Cipher) was a 64x64->128 multiplication which was used
to synthesize a modulo (2^64)+13 64-bit mul.
On Alpha, which doesn't have a 64x64->128 opcode, Robert Harvey used
either inline asm or a compiler intrinsic to access the MULH opcode,
which returns the top 64 bits of such a 64x64->128 mul: Together with the
plain 64x64->64 compiler-generated mul this was sufficient.
For our x86 version we tripled the speed by first hand-optimizing the
inner loop, then unrolling it completely for all 8 iterations, and
finally doing a little peephole optimization of the resulting code.
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Terje
|
12/28/2006 3:23:00 PM
|
|
"Samuel Stearley" <spamtrap@crayne.org> wrote:
>
>Is it possible to discuss something in this newsgroup without someone
>feeling the need to tell us that we should be sure to only optimize the
>bottlenecks?
Thanks to services like Google, these newsgroup postings now last forever.
Newbies who go searching for answers and advice next year or three years
from now find these posts, usually without much checking of context. If we
don't repeatedly emphasize the important points, those newbies will draw
the wrong conclusions.
--
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.
Tim
|
12/30/2006 4:47:33 AM
|
|
Terje Mathisen <spamtrap@crayne.org> writes:
> On Alpha, which doesn't have a 64x64->128 opcode, Robert Harvey used
Programming smarts + Alpha => Harley, not Harvey.
Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.
Phil
|
1/3/2007 10:19:02 PM
|
|
Tim Roberts wrote:
> Thanks to services like Google, these newsgroup postings now last forever.
> Newbies who go searching for answers and advice next year or three years
> from now find these posts, usually without much checking of context. If we
> don't repeatedly emphasize the important points, those newbies will draw
> the wrong conclusions.
It is an extremely smart newbie who is actually capable of wasting
their time by inappropriately optimizing something in asm. Like most
everything else, not optimizing prematurely is something you learn from
experience.
Samuel
|
1/4/2007 3:55:37 AM
|
|
Phil Carmody wrote:
> Terje Mathisen <spamtrap@crayne.org> writes:
>> On Alpha, which doesn't have a 64x64->128 opcode, Robert Harvey used
>
> Programming smarts + Alpha => Harley, not Harvey.
Oops, Mea Culpa! Sorry Robert!
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
Terje
|
1/4/2007 7:26:29 AM
|
|