I was inspired by an email message from a person by the name of Phil
Taylor who had visited my web site and in re-acquainting himself with
assembly language programming. As an exercise, he had written a program
to generate and print each term of the Fibonacci series from the first
to the Nth and he emailed his code and the timings that he had measured
for various numbers of terms.
I created a program to do the same thing, but using a much faster method
that doesn't require any mul or div instructions and I have posted the
source code (DOS and Linux versions) on my web site
http://www.beroset.com
Like most all of the code there, I have donated these programs to the
public domain, so you're free to use them for any purpose. I'd be
interested in seeing if anybody can come up with faster code to do this.
The algorithm is decent, but I am sure that I left a bunch of
potential optimizations lying around in there, and there may be a better
mechanism entirely. Have fun!
Ed
|
|
0
|
|
|
|
Reply
|
Ed
|
4/30/2005 12:41:01 AM |
|
Ed Beroset wrote:
> source code (DOS and Linux versions) on my web site
> http://www.beroset.com
Hi Ed,
As always, thanks for the example code. (thanks to Phil
Taylor for inspiring you, too :)
> Like most all of the code there, I have donated these programs to the
> public domain, so you're free to use them for any purpose.
Not concerned that MicroSoft will steal it, eh? :)
> I'd be
> interested in seeing if anybody can come up with faster code to do this.
> The algorithm is decent, but I am sure that I left a bunch of
> potential optimizations lying around in there, and there may be a better
> mechanism entirely. Have fun!
Okay! :)
So far, I've found (to my surprise!) that this is a little
faster:
...
cmp edi,ebp ; are we at a new significant digit?
; loopnz .top ; keep going until we've got them all
jz .predone
dec ecx
jnz .top
...predone:
cmp al,'0' ; is it a zero?
jnz .done ; yes, so keep
inc ebp ;
...done
...
I know "loop" is supposed to be slow compared to "dec
ecx"/"jnz ...", but I thought the "loopnz" would be a win
compared to *two* conditional jumps, but it seems to run
around 4m36.xxx vs 4m41.xxx for the "loopnz". I'm working
with a K6-300 - not exactly a "typical" machine to optimize
for, these days...
I guess that doesn't do quite the same thing (doesn't seem
to matter)... okay...
; loopnz .top ; keep going until we've got them all
dec ecx
jz .predone
cmp edi,ebp ; are we at a new significant digit?
jnz .top
...predone:
cmp al,'0' ; is it a zero?
jnz .done ; yes, so keep
inc ebp ;
...done
This seems maybe a little faster still... matter of seconds
on a nearly 5 minute run... What we need, I think, is a
"32-bit aaa", so we can add 4 digits at once... I'm not
coming up with it now... maybe another day...
I'm a little puzzled by your initialization strategy... you
initialize the counter buffer to ascii '0', but the number
buffers to 0, and then overwrite them with ascii '0' at
runtime. Eliminating this would surely be faster, but not
enough to matter... I'm even more mystified by initializing
"msg2" to "middle" at runtime. Did you have something in
mind that I'm missing, or did it just turn out that way?
Best,
Frank
|
|
0
|
|
|
|
Reply
|
Frank
|
4/30/2005 10:28:57 PM
|
|
Frank Kotler wrote:
> Ed Beroset wrote:
>
>
>>source code (DOS and Linux versions) on my web site
>>http://www.beroset.com
>
>
> Hi Ed,
>
> As always, thanks for the example code. (thanks to Phil
> Taylor for inspiring you, too :)
>
>
>>Like most all of the code there, I have donated these programs to the
>>public domain, so you're free to use them for any purpose.
>
>
> Not concerned that MicroSoft will steal it, eh? :)
Nope. In fact, if it inspires them to write code that's a little
faster, a little smaller, and a little less buggy, it would be well
worth it.
Thanks for your optimization ideas! I'll have to try them out on my
machine.
> I'm a little puzzled by your initialization strategy... you
> initialize the counter buffer to ascii '0', but the number
> buffers to 0, and then overwrite them with ascii '0' at
> runtime. Eliminating this would surely be faster, but not
> enough to matter... I'm even more mystified by initializing
> "msg2" to "middle" at runtime. Did you have something in
> mind that I'm missing, or did it just turn out that way?
Actually, I have something in mind that the *code* is missing. My
intent was to do two further optimizations. One is to dynamically
allocate the memory, in which case the initialization will be necessary.
The other is to reduce the number of system calls to two from the
current three per line. Since the first part of the line is of the form
"Fibonacci(x): ", my intent is to dynamically manage the term counter
(the "x" part) and then call the print sys_call just once to print that
whole thing. To do that, of course, I'd need to move back the "): "
part to make room as x expands. I hadn't done that yet but I figured
I'd post the code anyway.
I've been considering porting all of my example code to Linux using NASM
or GAS as the assembler, but I haven't yet gotten 'round to it. I'd be
more enthusiastic about it if NASM had both a better license and better
support for structures, but neither will ever happen and I don't care
enough about it to finish my own assembler at this point.
Ed
|
|
0
|
|
|
|
Reply
|
Ed
|
4/30/2005 11:37:46 PM
|
|
Ed Beroset wrote:
>
> Frank Kotler wrote:
> > I'm a little puzzled by your initialization strategy... you
> > initialize the counter buffer to ascii '0', but the number
> > buffers to 0, and then overwrite them with ascii '0' at
> > runtime. Eliminating this would surely be faster, but not
> > enough to matter... I'm even more mystified by initializing
> > "msg2" to "middle" at runtime. Did you have something in
> > mind that I'm missing, or did it just turn out that way?
>
> Actually, I have something in mind that the *code* is missing. My
> intent was to do two further optimizations. One is to dynamically
> allocate the memory, in which case the initialization will be necessary.
> The other is to reduce the number of system calls to two from the
> current three per line. Since the first part of the line is of the form
> "Fibonacci(x): ", my intent is to dynamically manage the term counter
> (the "x" part) and then call the print sys_call just once to print that
> whole thing. To do that, of course, I'd need to move back the "): "
> part to make room as x expands. I hadn't done that yet but I figured
> I'd post the code anyway.
I see. That would make sense. I wonder if you could, after
completing the "number" part, stick "middle" in there, then
the counter, then copy "(iccanobiF" to it, and print the
whole thing in *one* sys_call?
> I've been considering porting all of my example code to Linux using NASM
> or GAS as the assembler, but I haven't yet gotten 'round to it.
Cool!
> I'd be
> more enthusiastic about it if NASM had both a better license and better
> support for structures, but neither will ever happen and I don't care
> enough about it to finish my own assembler at this point.
Sorry to hear that (although I probably wouldn't have liked
it anyway :) I assume what you'd like for structure support
would involve "type checking"... or at least "type
remembering"? Have you looked at Fasm at all? I don't know
how their structure support is, but the license might be
more to your liking (... or not). At least, Fasm will
complain about:
myvar db 0
.....
mov [myvar], eax
..... which I take it you'd see as a good sign(?).
Best,
Frank
|
|
0
|
|
|
|
Reply
|
Frank
|
5/1/2005 1:55:21 AM
|
|
Frank Kotler wrote:
> Ed Beroset wrote:
>
>>Frank Kotler wrote:
>>
>>>I'm a little puzzled by your initialization strategy... you
>>>initialize the counter buffer to ascii '0', but the number
>>>buffers to 0, and then overwrite them with ascii '0' at
>>>runtime. Eliminating this would surely be faster, but not
>>>enough to matter... I'm even more mystified by initializing
>>>"msg2" to "middle" at runtime. Did you have something in
>>>mind that I'm missing, or did it just turn out that way?
>>
>>Actually, I have something in mind that the *code* is missing. My
>>intent was to do two further optimizations. One is to dynamically
>>allocate the memory, in which case the initialization will be necessary.
>> The other is to reduce the number of system calls to two from the
>>current three per line. Since the first part of the line is of the form
>>"Fibonacci(x): ", my intent is to dynamically manage the term counter
>>(the "x" part) and then call the print sys_call just once to print that
>>whole thing. To do that, of course, I'd need to move back the "): "
>>part to make room as x expands. I hadn't done that yet but I figured
>>I'd post the code anyway.
>
>
> I see. That would make sense. I wonder if you could, after
> completing the "number" part, stick "middle" in there, then
> the counter, then copy "(iccanobiF" to it, and print the
> whole thing in *one* sys_call?
Yes, it's possible, but I don't think I'd do it that way. The way to do
it would be to construct the first part of the next line "Fibonacci(x):
" and append it to the end of the term so that each call would print out
the number, the newline and then the first part of the next line.
>>I've been considering porting all of my example code to Linux using NASM
>>or GAS as the assembler, but I haven't yet gotten 'round to it.
>
> Cool!
'Taint done yet.
>>I'd be
>>more enthusiastic about it if NASM had both a better license and better
>>support for structures, but neither will ever happen and I don't care
>>enough about it to finish my own assembler at this point.
>
> Sorry to hear that (although I probably wouldn't have liked
> it anyway :)
Why don't you think you'd have liked it?
> I assume what you'd like for structure support
> would involve "type checking"... or at least "type
> remembering"? Have you looked at Fasm at all? I don't know
> how their structure support is, but the license might be
> more to your liking (... or not). At least, Fasm will
> complain about:
>
> myvar db 0
> ....
> mov [myvar], eax
>
> .... which I take it you'd see as a good sign(?).
It's a start, but FASM, NASM, and most of the other open source
assemblers have rather rotten support for structures. I think it's
because the people who write assemblers generally don't have much
experience using them.
First, a structure should have a real type associated with it, so that I
can't just reference any old chunk of memory and refer to it as though
structure member names were just offsets (a la NASM's lame add-on
macro-based structure support). Second, it should be simple to
instatiate structures just as it's easy to instatiate simple variables.
Third, sub-byte members should be supported.
As an example of why this would be useful, declare and instantiate a
copule of segment descriptors (data, code, stack, for example) using
NASM and compare it to how you'd do it with MASM. MASM's structure
support is far, far superior, and I would prefer even more than it provides.
Ed
|
|
0
|
|
|
|
Reply
|
Ed
|
5/1/2005 4:19:51 AM
|
|
Ed Beroset wrote:
> I was inspired by an email message from a person by the name of Phil
> Taylor who had visited my web site and in re-acquainting himself with
> assembly language programming. As an exercise, he had written a program
> to generate and print each term of the Fibonacci series from the first
> to the Nth and he emailed his code and the timings that he had measured
> for various numbers of terms.
>
> I created a program to do the same thing, but using a much faster method
> that doesn't require any mul or div instructions and I have posted the
> source code (DOS and Linux versions) on my web site
> http://www.beroset.com
>
> Like most all of the code there, I have donated these programs to the
> public domain, so you're free to use them for any purpose. I'd be
> interested in seeing if anybody can come up with faster code to do this.
> The algorithm is decent, but I am sure that I left a bunch of
> potential optimizations lying around in there, and there may be a better
> mechanism entirely. Have fun!
>
> Ed
Hi Ed. Congratulations on your work. I'd like to see the output of
FIBO.ASM but I haven't because I don't have NASM and I don't have that much
time to convert it.. Would you put the .COM on your site or send it over to
my email please?
Thanks.
Cheers.
|
|
0
|
|
|
|
Reply
|
CII
|
5/1/2005 6:02:16 AM
|
|
CII <spamtrap@crayne.org> wrote:
>
>Hi Ed. Congratulations on your work. I'd like to see the output of
>FIBO.ASM but I haven't because I don't have NASM and I don't have that much
>time to convert it.. Would you put the .COM on your site or send it over to
>my email please?
Oh, come on. It took me two minutes to get it running in MASM. I think
it's asking a bit much to expect him to send you the binary, when he has
already generously provided the source code.
--
- Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.
|
|
0
|
|
|
|
Reply
|
Tim
|
5/3/2005 5:06:32 AM
|
|
Tim Roberts wrote:
> CII <spamtrap@crayne.org> wrote:
> >
> >Hi Ed. Congratulations on your work. I'd like to see the output of
> >FIBO.ASM but I haven't because I don't have NASM and I don't have that much
> >time to convert it.. Would you put the .COM on your site or send it over to
> >my email please?
>
> Oh, come on. It took me two minutes to get it running in MASM. I think
> it's asking a bit much to expect him to send you the binary, when he has
> already generously provided the source code.
> --
> - Tim Roberts, timr@probo.com
> Providenza & Boekelheide, Inc.
Two minutes? Man that's fast. I tried to compile it to masm and got very many
errors. I tried tweaking around a bit but with no success. I apologize to
anyone here if a binary is too much to ask, but I think my request may be the one
of more people so it may be a valid one, but if it's trouble then I won't mind at
all. I was just interested in what Mr. Ed had done that's all, and I may not be
as experienced as many of you here, which I realize you are by the technical
quality of your responses.
Thanks everyone for the inputs anyway.
Cheers!
|
|
0
|
|
|
|
Reply
|
CII
|
5/3/2005 6:43:05 AM
|
|
CII wrote:
>
> Tim Roberts wrote:
>
> > CII <spamtrap@crayne.org> wrote:
> > >
> > >Hi Ed. Congratulations on your work. I'd like to see the output of
> > >FIBO.ASM but I haven't because I don't have NASM and I don't have that much
> > >time to convert it.. Would you put the .COM on your site or send it over to
> > >my email please?
> >
> > Oh, come on. It took me two minutes to get it running in MASM. I think
> > it's asking a bit much to expect him to send you the binary, when he has
> > already generously provided the source code.
> > --
> > - Tim Roberts, timr@probo.com
> > Providenza & Boekelheide, Inc.
>
> Two minutes? Man that's fast.
Hehe. Practice! :) Mostly a case of adding the "offset"
keyword where Masm needs it (and *not* where it doesn't!).
Or maybe it's easier to just d/l Nasm...
http://www.sf.net/projects/nasm
(presumably, you'd want the Windows build... just put it on
your path somewhere...)
I'd have to reboot to dos to build a .com file - *still*
haven't got dosemu installed... hmmm... no I wouldn't... I
can build a .com file in Linux, just can't test it... Hang
on a sec...
http://home.comcast.net/~fbkotler/fibofbk.zip
That is untested! (but looks okay) Probably want to unzip it
with the "-a" switch to get your CR's back (in the
source)...
I'd simply post the output, but it's kind of a long read :)
Best,
Frank
|
|
0
|
|
|
|
Reply
|
Frank
|
5/3/2005 12:23:46 PM
|
|
CII wrote:
>
> Tim Roberts wrote:
>
>
>>CII <spamtrap@crayne.org> wrote:
>>
>>>Hi Ed. Congratulations on your work. I'd like to see the output of
>>>FIBO.ASM but I haven't because I don't have NASM and I don't have that much
>>>time to convert it.. Would you put the .COM on your site or send it over to
>>>my email please?
>>
>>Oh, come on. It took me two minutes to get it running in MASM. I think
>>it's asking a bit much to expect him to send you the binary, when he has
>>already generously provided the source code.
>>--
>>- Tim Roberts, timr@probo.com
>> Providenza & Boekelheide, Inc.
>
>
> Two minutes? Man that's fast. I tried to compile it to masm and got very many
> errors. I tried tweaking around a bit but with no success. I apologize to
> anyone here if a binary is too much to ask, but I think my request may be the one
> of more people so it may be a valid one, but if it's trouble then I won't mind at
> all. I was just interested in what Mr. Ed had done that's all, and I may not be
> as experienced as many of you here, which I realize you are by the technical
> quality of your responses.
For what it's worth, I don't think it's an unreasonable request, but
I've got an alternative I'm working on. Probably by the time you read
this, I'll have NASM, MASM, and TASM source versions available and up on
the web site. I haven't put up binaries because I would hate for people
to get in the habit of downloading and running unknown executable code
from the web -- it's inherently a bad security practice that I'd hate to
even tacitly endorse.
I had hoped to be able to convert all of the software to run on any of
the three, but this actual work thing gets in the way sometimes! If
it's not up there yet, keep watching http://www.beroset.com -- it will
be there soon.
Thanks for your interest.
Ed
|
|
0
|
|
|
|
Reply
|
Ed
|
5/3/2005 7:11:41 PM
|
|
CII wrote:
>
> Tim Roberts wrote:
>>CII <spamtrap@crayne.org> wrote:
>>
>>>Hi Ed. Congratulations on your work. I'd like to see the output of
>>>FIBO.ASM but I haven't because I don't have NASM and I don't have that much
>>>time to convert it.. Would you put the .COM on your site or send it over to
>>>my email please?
>>
>>Oh, come on. It took me two minutes to get it running in MASM. I think
>>it's asking a bit much to expect him to send you the binary, when he has
>>already generously provided the source code.
>
> Two minutes? Man that's fast. I tried to compile it to masm and got very many
> errors. I tried tweaking around a bit but with no success. I apologize to
> anyone here if a binary is too much to ask, but I think my request may be the one
> of more people so it may be a valid one, but if it's trouble then I won't mind at
> all.
A previous response seems to have been lost, so I'll try again.
For what it's worth, I don't think your request is unreasonable but I
have an alternative solution you might like better. I ported the code
to MASM and posted it on the same web site ( http://www.beroset.com/ )
so that you can download it and play with it.
There's an optimization made to the Linux version that I haven't yet
propagated back to the DOS versions that speeds things up still further
(about 2x) but I'll probably get to it eventually. If you'd like to try
it, it's simply keeping track of where the most significant non-zero
digit is in each of the terms. That way, we can skip searching for
leading zeroes to suppress. Saves considerable time that way.
Ed
|
|
0
|
|
|
|
Reply
|
Ed
|
5/6/2005 2:51:46 AM
|
|
Frank Kotler wrote:
> CII wrote:
> >
> > Tim Roberts wrote:
> >
> > > CII <spamtrap@crayne.org> wrote:
> > > >
> > > >Hi Ed. Congratulations on your work. I'd like to see the output of
> > > >FIBO.ASM but I haven't because I don't have NASM and I don't have that much
> > > >time to convert it.. Would you put the .COM on your site or send it over to
> > > >my email please?
> > >
> > > Oh, come on. It took me two minutes to get it running in MASM. I think
> > > it's asking a bit much to expect him to send you the binary, when he has
> > > already generously provided the source code.
> > > --
> > > - Tim Roberts, timr@probo.com
> > > Providenza & Boekelheide, Inc.
> >
> > Two minutes? Man that's fast.
>
> Hehe. Practice! :) Mostly a case of adding the "offset"
> keyword where Masm needs it (and *not* where it doesn't!).
> Or maybe it's easier to just d/l Nasm...
>
> http://www.sf.net/projects/nasm
>
> (presumably, you'd want the Windows build... just put it on
> your path somewhere...)
>
> I'd have to reboot to dos to build a .com file - *still*
> haven't got dosemu installed... hmmm... no I wouldn't... I
> can build a .com file in Linux, just can't test it... Hang
> on a sec...
>
> http://home.comcast.net/~fbkotler/fibofbk.zip
>
> That is untested! (but looks okay) Probably want to unzip it
> with the "-a" switch to get your CR's back (in the
> source)...
>
> I'd simply post the output, but it's kind of a long read :)
>
> Best,
> Frank
Thanks Frank, just read the message now. I'll look into it.
|
|
0
|
|
|
|
Reply
|
CII
|
5/6/2005 6:46:47 AM
|
|
Ed Beroset wrote:
> CII wrote:
> >
> > Tim Roberts wrote:
> >>CII <spamtrap@crayne.org> wrote:
> >>
> >>>Hi Ed. Congratulations on your work. I'd like to see the output of
> >>>FIBO.ASM but I haven't because I don't have NASM and I don't have that much
> >>>time to convert it.. Would you put the .COM on your site or send it over to
> >>>my email please?
> >>
> >>Oh, come on. It took me two minutes to get it running in MASM. I think
> >>it's asking a bit much to expect him to send you the binary, when he has
> >>already generously provided the source code.
> >
> > Two minutes? Man that's fast. I tried to compile it to masm and got very many
> > errors. I tried tweaking around a bit but with no success. I apologize to
> > anyone here if a binary is too much to ask, but I think my request may be the one
> > of more people so it may be a valid one, but if it's trouble then I won't mind at
> > all.
>
> A previous response seems to have been lost, so I'll try again.
>
> For what it's worth, I don't think your request is unreasonable but I
> have an alternative solution you might like better. I ported the code
> to MASM and posted it on the same web site ( http://www.beroset.com/ )
> so that you can download it and play with it.
>
> There's an optimization made to the Linux version that I haven't yet
> propagated back to the DOS versions that speeds things up still further
> (about 2x) but I'll probably get to it eventually. If you'd like to try
> it, it's simply keeping track of where the most significant non-zero
> digit is in each of the terms. That way, we can skip searching for
> leading zeroes to suppress. Saves considerable time that way.
>
> Ed
Thanks Ed, I'll look into it. The previous response wasn't lost, it's just that I
didn't see it until now, thanks!
|
|
0
|
|
|
|
Reply
|
CII
|
5/6/2005 6:47:38 AM
|
|
--------------7E1138B866FD2E68360AB9FF
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Ed Beroset wrote:
> CII wrote:
> >
> > Tim Roberts wrote:
> >>CII <spamtrap@crayne.org> wrote:
> >>
> >>>Hi Ed. Congratulations on your work. I'd like to see the output of
> >>>FIBO.ASM but I haven't because I don't have NASM and I don't have that much
> >>>time to convert it.. Would you put the .COM on your site or send it over to
> >>>my email please?
> >>
> >>Oh, come on. It took me two minutes to get it running in MASM. I think
> >>it's asking a bit much to expect him to send you the binary, when he has
> >>already generously provided the source code.
> >
> > Two minutes? Man that's fast. I tried to compile it to masm and got very many
> > errors. I tried tweaking around a bit but with no success. I apologize to
> > anyone here if a binary is too much to ask, but I think my request may be the one
> > of more people so it may be a valid one, but if it's trouble then I won't mind at
> > all.
>
> A previous response seems to have been lost, so I'll try again.
>
> For what it's worth, I don't think your request is unreasonable but I
> have an alternative solution you might like better. I ported the code
> to MASM and posted it on the same web site ( http://www.beroset.com/ )
> so that you can download it and play with it.
>
> There's an optimization made to the Linux version that I haven't yet
> propagated back to the DOS versions that speeds things up still further
> (about 2x) but I'll probably get to it eventually. If you'd like to try
> it, it's simply keeping track of where the most significant non-zero
> digit is in each of the terms. That way, we can skip searching for
> leading zeroes to suppress. Saves considerable time that way.
>
> Ed
Hi Ed. After looking into the files that Mr. Frank Kotler kindly forwarded and you so
courteously provided, I decided to give it a try myself but without repeating your work
and perhaps from a different angle with which something "new" could be tried out.
Shown below is my take on the sequence. I put some additional notes and yet another
attempt that can be found at
http://www.geocities.com/yssmlp/
which I saved for personal (and public) reference. This code is not -not remotely- as
fine as yours but it's another perspective which I think is a good thing to have. The
code below operates on two decimal digits per byte and I know it can be improved to
maybe .7 digits more per byte (at a higher program code cost of course). Knowing how
much faster AH=40/INT21 is than AH=02/INT21, I fixed the code below to reflect the
difference and the difference is enormous. To look at it you may want to check the
website (F8.ASM).
Thanks again and I'll be here for comments.
..MODEL TINY
;----------
;PLAYING WITH FIBONACCI NUMBERS BY YSS (MAY 2005)
;
;F(N+2)=F(N+1)+F(N)
; F(0)=0, F(1)=1
;
;MAX TERMS = (60000 - LOG(SQR(5))/(LOG(1+SQR(5)/2))) / (LOG(1+SQR(5)/2))
; = 287090 APPROX.
;
;WHERE '60000' IS THE TOTAL NUMBER OF POSSIBLE DIGITS.
;
;THIS PROGRAM CALCS AND OUTPUTS THE 'FIBONACCI' SERIES TO STDOUT UP TO
;ABOUT 287000+ TERMS. IT'S NOT VERY FAST DUE TO:
;
; - NO OR LITTLE OPTIMIZATION (THIS CODE WAS WRITTEN 'ON THE FLY')
; - THE COMPRESSION OF THE DIGITS:
; EACH BYTE HOLDS TWO DECIMAL DIGITS IN LO-HI ORDER,
; THAT TAKES A LITTLE EXTRA TIME.
; - EACH DIGIT IS OUTPUT TO STDOUT ONE AT A TIME.
; A SPEED INCREASE OF OVER 32% CAN BE GAINED IF A FULL STRING
; IS OUTPUT INSTEAD.
;
;
;TWO ACCUMULATORS ARE USED, X1 AND X2, EACH CAPABLE OF HANDLING UP TO 60000
;DIGITS (30000 BYTES LONG EACH).
;
;USE:
;SET TL = MAX # OF TERMS -> ASSEMBLE -> F7 TO SEE RESULTS TO STDOUT
;TERMS ARE OUTPUT ONE LINE AT A TIME (ENDED WITH 0D 0A)
;
;
CSEG SEGMENT PARA PUBLIC 'CODE'
ORG 100H
ASSUME CS:CSEG,DS:CSEG
START:
X1 EQU 0
X2 EQU X1 + 30000
TL EQU 287000 ;TOTAL TERMS TO CALC
MOV DI,OFFSET A+X1
MOV CX,60000/2
REP STOSW ;CLR ALL
INC AX
MOV X1T,AX
MOV [A+X1],AX ;
MOV BP,TL ;TOTAL # TERMS [ F(X) ] CALC'D
S00:
MOV DI,OFFSET A+X2
MOV SI,OFFSET A+X1
TEST BL,1
JNE S001
XCHG DI,SI
S001:
CALL LOUT
ADD12:
PUSH DI
PUSH SI
PUSH BP
XOR BP,BP
MOV CX,X1T
CLC
ACXMRE:
MOV AL,[DI]
MOV DL,[SI]
MOV AH,DL
PUSH AX
CALL AAB1
MOV DH,AL
POP AX
PUSHF
SHR AX,1
SHR AX,1
SHR AX,1
SHR AX,1
POPF
CALL AAB1
PUSHF
AND DH,0FH
SHL AL,1
SHL AL,1
SHL AL,1
SHL AL,1
OR AL,DH
POPF
MOV [SI],AL
JNB ADOK
CMP CX,1
STC
JNE ADOK
INC CX
ADOK:
INC DI
INC SI
INC BP
LOOP ACXMRE
MOV X1T,BP
POP BP
POP SI
POP DI
DEC BL
DEC BP
JNE S00
RET
LOUT:
MOV DX,X1T
PUSH DI
PUSH SI
MOV AH,2
MOV SI,DI
DEC SI
ADD DI,DX
MOV CX,000FEH ;CH=0 FOR LEADING 0'S NOT OUT
LN0001:
MOV DL,[DI]
TEST CL,1
JNE LN0000
SHR DL,1
SHR DL,1
SHR DL,1
SHR DL,1
JMP SHORT LN0002
LN0000:
DEC DI
LN0002:
AND DL,0FH
OR DL,30H
CMP DL,30H
JE LN00201
OR CH,1
JMP LN0020
LN00201:
TEST CH,1
JE LN00202
LN0020:
INT 21H
LN00202:
DEC CL
CMP DI,SI
JNE LN0001
POP SI
POP DI
ODOAH:
MOV AH,9
MOV DX,OFFSET ODOAX
INT 21H
RET
ODOAX DB 13,10,36
AAB1:
PUSHF
AND AX,0F0FH
POPF
ADC AL,AH
AAA
RET
AAA
RET
AAA
RET
AAA
RET
X1T DW 0 ;#DIGITS SO FAR
A LABEL WORD
;-------------------------------
;http://www.geocities.com/yssmlp
;pixelrat@hotmail.com
;or google around for "yssmlp" ! :)
;-------------------------------
CSEG ENDS
END START
--------------7E1138B866FD2E68360AB9FF
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: 7bit
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<p>Ed Beroset wrote:
<blockquote TYPE=CITE>CII wrote:
<br>>
<br>> Tim Roberts wrote:
<br>>>CII <spamtrap@crayne.org> wrote:
<br>>>
<br>>>>Hi Ed. Congratulations on your work. I'd like to see
the output of
<br>>>>FIBO.ASM but I haven't because I don't have NASM and I don't
have that much
<br>>>>time to convert it.. Would you put the .COM on your site or
send it over to
<br>>>>my email please?
<br>>>
<br>>>Oh, come on. It took me two minutes to get it running in MASM.
I think
<br>>>it's asking a bit much to expect him to send you the binary, when
he has
<br>>>already generously provided the source code.
<br>>
<br>> Two minutes? Man that's fast. I tried to compile it to
masm and got very many
<br>> errors. I tried tweaking around a bit but with no success.
I apologize to
<br>> anyone here if a binary is too much to ask, but I think my request
may be the one
<br>> of more people so it may be a valid one, but if it's trouble then
I won't mind at
<br>> all.
<p>A previous response seems to have been lost, so I'll try again.
<p>For what it's worth, I don't think your request is unreasonable but
I
<br>have an alternative solution you might like better. I ported
the code
<br>to MASM and posted it on the same web site ( <a href="http://www.beroset.com/">http://www.beroset.com/</a>
)
<br>so that you can download it and play with it.
<p>There's an optimization made to the Linux version that I haven't yet
<br>propagated back to the DOS versions that speeds things up still further
<br>(about 2x) but I'll probably get to it eventually. If you'd like
to try
<br>it, it's simply keeping track of where the most significant non-zero
<br>digit is in each of the terms. That way, we can skip searching
for
<br>leading zeroes to suppress. Saves considerable time that way.
<p>Ed</blockquote>
Hi Ed. After looking into the files that Mr. Frank Kotler kindly
forwarded and you so courteously provided, I decided to give it a try myself
but without repeating your work and perhaps from a different angle with
which something "new" could be tried out.
<p>Shown below is my take on the sequence. I put some additional
notes and yet another attempt that can be found at
<p><A HREF="http://www.geocities.com/yssmlp/">http://www.geocities.com/yssmlp/</A>
<p>which I saved for personal (and public) reference. This code is
not -not remotely- as fine as yours but it's another perspective which
I think is a good thing to have. The code below operates on two decimal
digits per byte and I know it can be improved to maybe .7 digits more per
byte (at a higher program code cost of course). Knowing how much
faster AH=40/INT21 is than AH=02/INT21, I fixed the code below to reflect
the difference and the difference is enormous. To look at it you
may want to check the website (F8.ASM).
<p>Thanks again and I'll be here for comments.
<br><tt></tt> <tt></tt>
<p><tt>.MODEL TINY</tt>
<br><tt>;----------</tt>
<br><tt>;PLAYING WITH FIBONACCI NUMBERS BY YSS (MAY 2005)</tt>
<br><tt>;</tt>
<br><tt>;F(N+2)=F(N+1)+F(N)</tt>
<br><tt>; F(0)=0, F(1)=1</tt>
<br><tt>;</tt>
<br><tt>;MAX TERMS = (60000 - LOG(SQR(5))/(LOG(1+SQR(5)/2))) / (LOG(1+SQR(5)/2))</tt>
<br><tt>; = 287090 APPROX.</tt>
<br><tt>;</tt>
<br><tt>;WHERE '60000' IS THE TOTAL NUMBER OF POSSIBLE DIGITS.</tt>
<br><tt>;</tt>
<br><tt>;THIS PROGRAM CALCS AND OUTPUTS THE 'FIBONACCI' SERIES TO STDOUT
UP TO</tt>
<br><tt>;ABOUT 287000+ TERMS. IT'S NOT VERY FAST DUE TO:</tt>
<br><tt>;</tt>
<br><tt>; - NO OR LITTLE OPTIMIZATION (THIS CODE WAS WRITTEN 'ON THE FLY')</tt>
<br><tt>; - THE COMPRESSION OF THE DIGITS:</tt>
<br><tt>; EACH BYTE HOLDS TWO DECIMAL
DIGITS IN LO-HI ORDER,</tt>
<br><tt>; THAT TAKES A LITTLE EXTRA TIME.</tt>
<br><tt>; - EACH DIGIT IS OUTPUT TO STDOUT ONE AT A TIME.</tt>
<br><tt>; A SPEED INCREASE OF
OVER 32% CAN BE GAINED IF A FULL STRING</tt>
<br><tt>; IS OUTPUT INSTEAD.</tt>
<br><tt>;</tt>
<br><tt>;</tt>
<br><tt>;TWO ACCUMULATORS ARE USED, X1 AND X2, EACH CAPABLE OF HANDLING
UP TO 60000</tt>
<br><tt>;DIGITS (30000 BYTES LONG EACH).</tt>
<br><tt>;</tt>
<br><tt>;USE:</tt>
<br><tt>;SET TL = MAX # OF TERMS -> ASSEMBLE -> F7 TO SEE RESULTS TO STDOUT</tt>
<br><tt>;TERMS ARE OUTPUT ONE LINE AT A TIME (ENDED WITH 0D 0A)</tt>
<br><tt>;</tt>
<br><tt>;</tt><tt></tt>
<p><tt>CSEG SEGMENT PARA PUBLIC 'CODE'</tt>
<br><tt>ORG 100H</tt>
<br><tt> ASSUME CS:CSEG,DS:CSEG</tt>
<br><tt>START:</tt>
<br><tt>X1 EQU 0</tt>
<br><tt>X2 EQU X1 + 30000</tt>
<br><tt>TL EQU 287000 ;TOTAL TERMS TO CALC</tt><tt></tt>
<p><tt> MOV DI,OFFSET A+X1</tt><tt></tt>
<p><tt> MOV CX,60000/2</tt>
<br><tt> REP STOSW ;CLR ALL</tt><tt></tt>
<p><tt> INC AX</tt>
<br><tt> MOV X1T,AX</tt>
<br><tt> MOV [A+X1],AX ;</tt>
<br><tt> MOV BP,TL ;TOTAL # TERMS [ F(X) ] CALC'D</tt>
<br><tt>S00:</tt>
<br><tt> MOV DI,OFFSET A+X2</tt>
<br><tt> MOV SI,OFFSET A+X1</tt>
<br><tt> TEST BL,1</tt>
<br><tt> JNE S001</tt>
<br><tt> XCHG DI,SI</tt>
<br><tt>S001:</tt>
<br><tt> CALL LOUT</tt>
<br><tt>ADD12:</tt>
<br><tt> PUSH DI</tt>
<br><tt> PUSH SI</tt>
<br><tt> PUSH BP</tt>
<br><tt> XOR BP,BP</tt>
<br><tt> MOV CX,X1T</tt>
<br><tt> CLC</tt><tt></tt>
<p><tt> ACXMRE:</tt>
<br><tt> MOV AL,[DI]</tt>
<br><tt> MOV DL,[SI]</tt><tt></tt>
<p><tt> MOV AH,DL</tt>
<br><tt> PUSH AX</tt>
<br><tt> CALL AAB1</tt>
<br><tt> MOV DH,AL</tt>
<br><tt> POP AX</tt>
<br><tt> PUSHF</tt>
<br><tt> SHR AX,1</tt>
<br><tt> SHR AX,1</tt>
<br><tt> SHR AX,1</tt>
<br><tt> SHR AX,1</tt>
<br><tt> POPF</tt>
<br><tt> CALL AAB1</tt>
<br><tt> PUSHF</tt>
<br><tt> AND DH,0FH</tt>
<br><tt> SHL AL,1</tt>
<br><tt> SHL AL,1</tt>
<br><tt> SHL AL,1</tt>
<br><tt> SHL AL,1</tt>
<br><tt> OR AL,DH</tt>
<br><tt> POPF</tt><tt></tt>
<p><tt> MOV [SI],AL</tt>
<br><tt> JNB ADOK</tt>
<br><tt> CMP CX,1</tt>
<br><tt> STC</tt>
<br><tt> JNE ADOK</tt>
<br><tt> INC CX</tt>
<br><tt> ADOK:</tt>
<br><tt> INC DI</tt>
<br><tt> INC SI</tt>
<br><tt> INC BP</tt>
<br><tt> LOOP ACXMRE</tt>
<br><tt> MOV X1T,BP</tt>
<br><tt> POP BP</tt>
<br><tt> POP SI</tt>
<br><tt> POP DI</tt><tt></tt>
<p><tt> DEC BL</tt>
<br><tt> DEC BP</tt>
<br><tt> JNE S00</tt>
<br><tt> RET</tt><tt></tt>
<p><tt>LOUT:</tt>
<br><tt> MOV DX,X1T</tt>
<br><tt> PUSH DI</tt>
<br><tt> PUSH SI</tt>
<br><tt> MOV AH,2</tt>
<br><tt> MOV SI,DI</tt>
<br><tt> DEC SI</tt>
<br><tt> ADD DI,DX</tt>
<br><tt> MOV CX,000FEH ;CH=0
FOR LEADING 0'S NOT OUT</tt><tt></tt>
<p><tt>LN0001:</tt>
<br><tt> MOV DL,[DI]</tt>
<br><tt> TEST CL,1</tt>
<br><tt> JNE LN0000</tt><tt></tt>
<p><tt> SHR DL,1</tt>
<br><tt> SHR DL,1</tt>
<br><tt> SHR DL,1</tt>
<br><tt> SHR DL,1</tt>
<br><tt> JMP SHORT LN0002</tt>
<br><tt>LN0000:</tt>
<br><tt> DEC DI</tt>
<br><tt>LN0002:</tt>
<br><tt> AND DL,0FH</tt>
<br><tt> OR DL,30H</tt>
<br><tt> CMP DL,30H</tt>
<br><tt> JE LN00201</tt>
<br><tt> OR CH,1</tt>
<br><tt> JMP LN0020</tt>
<br><tt>LN00201:</tt>
<br><tt> TEST CH,1</tt>
<br><tt> JE LN00202</tt>
<br><tt>LN0020:</tt>
<br><tt> INT 21H</tt>
<br><tt>LN00202:</tt><tt></tt>
<p><tt> DEC CL</tt>
<br><tt> CMP DI,SI</tt>
<br><tt> JNE LN0001</tt>
<br><tt> POP SI</tt>
<br><tt> POP DI</tt>
<br><tt>ODOAH:</tt>
<br><tt> MOV AH,9</tt>
<br><tt> MOV DX,OFFSET ODOAX</tt>
<br><tt> INT 21H</tt>
<br><tt> RET</tt>
<br><tt>ODOAX DB 13,10,36</tt><tt></tt>
<p><tt>AAB1:</tt>
<br><tt> PUSHF</tt>
<br><tt> AND AX,0F0FH</tt>
<br><tt> POPF</tt>
<br><tt> ADC AL,AH</tt>
<br><tt> AAA</tt>
<br><tt> RET</tt>
<br><tt> AAA</tt>
<br><tt> RET</tt>
<br><tt> AAA</tt>
<br><tt> RET</tt>
<br><tt> AAA</tt>
<br><tt> RET</tt><tt></tt>
<p><tt>X1T DW 0 ;#DIGITS SO FAR</tt>
<br><tt>A LABEL WORD</tt>
<br><tt>;-------------------------------</tt>
<br><tt>;<A HREF="http://www.geocities.com/yssmlp">http://www.geocities.com/yssmlp</A></tt>
<br><tt>;pixelrat@hotmail.com</tt>
<br><tt>;or google around for "yssmlp" ! :)</tt>
<br><tt>;-------------------------------</tt>
<br><tt>CSEG ENDS</tt>
<br><tt> END START</tt></html>
--------------7E1138B866FD2E68360AB9FF--
|
|
0
|
|
|
|
Reply
|
CII
|
5/8/2005 6:34:09 PM
|
|
CII wrote:
> Ed Beroset wrote:
>
> > CII wrote:
> > >
> > > Tim Roberts wrote:
> > >>CII <spamtrap@crayne.org> wrote:
> > >>
> > >>>Hi Ed. Congratulations on your work. I'd like to see the output of
> > >>>FIBO.ASM but I haven't because I don't have NASM and I don't have that much
> > >>>time to convert it.. Would you put the .COM on your site or send it over to
> > >>>my email please?
> > >>
> > >>Oh, come on. It took me two minutes to get it running in MASM. I think
> > >>it's asking a bit much to expect him to send you the binary, when he has
> > >>already generously provided the source code.
> > >
> > > Two minutes? Man that's fast. I tried to compile it to masm and got very many
> > > errors. I tried tweaking around a bit but with no success. I apologize to
> > > anyone here if a binary is too much to ask, but I think my request may be the one
> > > of more people so it may be a valid one, but if it's trouble then I won't mind at
> > > all.
> >
> > A previous response seems to have been lost, so I'll try again.
> >
> > For what it's worth, I don't think your request is unreasonable but I
> > have an alternative solution you might like better. I ported the code
> > to MASM and posted it on the same web site ( http://www.beroset.com/ )
> > so that you can download it and play with it.
> >
> > There's an optimization made to the Linux version that I haven't yet
> > propagated back to the DOS versions that speeds things up still further
> > (about 2x) but I'll probably get to it eventually. If you'd like to try
> > it, it's simply keeping track of where the most significant non-zero
> > digit is in each of the terms. That way, we can skip searching for
> > leading zeroes to suppress. Saves considerable time that way.
> >
> > Ed
>
> Thanks Ed, I'll look into it. The previous response wasn't lost, it's just that I
> didn't see it until now, thanks!
I made a last post about two days back and I don't see it on, so I'll assume it's lost
and will repost again:
"Hi Ed. After looking into the files that Mr. Frank Kotler kindly forwarded and you
so courteously provided, I decided to give it a try myself but without repeating your
work and perhaps from a different angle with which something "new" could be tried
out.
Shown below is my take on the sequence. I put some additional notes and yet
another attempt that can be found at
http://www.geocities.com/yssmlp/
which I saved for personal reference. This code is not -not remotely-
as fine as yours but it's another perspective which I think is a good thing to have.
The code below operates on two decimal digits per byte and I know it can be
improved to maybe .7 digits more per byte (at a higher program code cost of
course). Knowing how much faster AH=40/INT21 is than AH=02/INT21, I fixed
the code below to reflect the difference and the difference is enormous. To look at it
you may want to check the website (F8.ASM).
Thanks again and I'll be here for comments.
..MODEL TINY
;----------
;PLAYING WITH FIBONACCI NUMBERS BY YSS (MAY 2005)
;
;F(N+2)=F(N+1)+F(N)
; F(0)=0, F(1)=1
;
;MAX TERMS = (60000 - LOG(SQR(5))/(LOG(1+SQR(5)/2))) / (LOG(1+SQR(5)/2))
; = 287090 APPROX.
;
;WHERE '60000' IS THE TOTAL NUMBER OF POSSIBLE DIGITS.
;
;THIS PROGRAM CALCS AND OUTPUTS THE 'FIBONACCI' SERIES TO STDOUT UP TO
;ABOUT 287000+ TERMS. IT'S NOT VERY FAST DUE TO:
;
; - NO OR LITTLE OPTIMIZATION (THIS CODE WAS WRITTEN 'ON THE FLY')
; - THE COMPRESSION OF THE DIGITS:
; EACH BYTE HOLDS TWO DECIMAL DIGITS IN LO-HI ORDER,
; THAT TAKES A LITTLE EXTRA TIME.
; - EACH DIGIT IS OUTPUT TO STDOUT ONE AT A TIME.
; A SPEED INCREASE OF OVER 32% CAN BE GAINED IF A FULL STRING
; IS OUTPUT INSTEAD.
;
;
;TWO ACCUMULATORS ARE USED, X1 AND X2, EACH CAPABLE OF HANDLING UP TO 60000
;DIGITS (30000 BYTES LONG EACH).
;
;USE:
;SET TL = MAX # OF TERMS -> ASSEMBLE -> F7 TO SEE RESULTS TO STDOUT
;TERMS ARE OUTPUT ONE LINE AT A TIME (ENDED WITH 0D 0A)
;
;
CSEG SEGMENT PARA PUBLIC 'CODE'
ORG 100H
ASSUME CS:CSEG,DS:CSEG
START:
X1 EQU 0
X2 EQU X1 + 30000
TL EQU 287000 ;TOTAL TERMS TO CALC
MOV DI,OFFSET A+X1
MOV CX,60000/2
REP STOSW ;CLR ALL
INC AX
MOV X1T,AX
MOV [A+X1],AX ;
MOV BP,TL ;TOTAL # TERMS [ F(X) ] CALC'D
S00:
MOV DI,OFFSET A+X2
MOV SI,OFFSET A+X1
TEST BL,1
JNE S001
XCHG DI,SI
S001:
CALL LOUT
ADD12:
PUSH DI
PUSH SI
PUSH BP
XOR BP,BP
MOV CX,X1T
CLC
ACXMRE:
MOV AL,[DI]
MOV DL,[SI]
MOV AH,DL
PUSH AX
CALL AAB1
MOV DH,AL
POP AX
PUSHF
SHR AX,1
SHR AX,1
SHR AX,1
SHR AX,1
POPF
CALL AAB1
PUSHF
AND DH,0FH
SHL AL,1
SHL AL,1
SHL AL,1
SHL AL,1
OR AL,DH
POPF
MOV [SI],AL
JNB ADOK
CMP CX,1
STC
JNE ADOK
INC CX
ADOK:
INC DI
INC SI
INC BP
LOOP ACXMRE
MOV X1T,BP
POP BP
POP SI
POP DI
DEC BL
DEC BP
JNE S00
RET
LOUT:
MOV DX,X1T
PUSH DI
PUSH SI
MOV AH,2
MOV SI,DI
DEC SI
ADD DI,DX
MOV CX,000FEH ;CH=0 FOR LEADING 0'S NOT OUT
LN0001:
MOV DL,[DI]
TEST CL,1
JNE LN0000
SHR DL,1
SHR DL,1
SHR DL,1
SHR DL,1
JMP SHORT LN0002
LN0000:
DEC DI
LN0002:
AND DL,0FH
OR DL,30H
CMP DL,30H
JE LN00201
OR CH,1
JMP LN0020
LN00201:
TEST CH,1
JE LN00202
LN0020:
INT 21H
LN00202:
DEC CL
CMP DI,SI
JNE LN0001
POP SI
POP DI
ODOAH:
MOV AH,9
MOV DX,OFFSET ODOAX
INT 21H
RET
ODOAX DB 13,10,36
AAB1:
PUSHF
AND AX,0F0FH
POPF
ADC AL,AH
AAA
RET
AAA
RET
AAA
RET
AAA
RET
X1T DW 0 ;#DIGITS SO FAR
A LABEL WORD
;-------------------------------
;http://www.geocities.com/yssmlp
;pixelrat@hotmail.com
;or google around for "yssmlp" ! :)
;-------------------------------
CSEG ENDS
END START
"
|
|
0
|
|
|
|
Reply
|
CII
|
5/9/2005 7:33:50 PM
|
|
--------------0FAACCA1DF9B021BA2DF3597
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
CII wrote:
> Ed Beroset wrote:
>
> > CII wrote:
> > >
> > > Tim Roberts wrote:
> > >>CII <spamtrap@crayne.org> wrote:
> > >>
> > >>>Hi Ed. Congratulations on your work. I'd like to see the output of
> > >>>FIBO.ASM but I haven't because I don't have NASM and I don't have that much
> > >>>time to convert it.. Would you put the .COM on your site or send it over to
> > >>>my email please?
> > >>
> > >>Oh, come on. It took me two minutes to get it running in MASM. I think
> > >>it's asking a bit much to expect him to send you the binary, when he has
> > >>already generously provided the source code.
> > >
> > > Two minutes? Man that's fast. I tried to compile it to masm and got very many
> > > errors. I tried tweaking around a bit but with no success. I apologize to
> > > anyone here if a binary is too much to ask, but I think my request may be the one
> > > of more people so it may be a valid one, but if it's trouble then I won't mind at
> > > all.
> >
> > A previous response seems to have been lost, so I'll try again.
> >
> > For what it's worth, I don't think your request is unreasonable but I
> > have an alternative solution you might like better. I ported the code
> > to MASM and posted it on the same web site ( http://www.beroset.com/ )
> > so that you can download it and play with it.
> >
> > There's an optimization made to the Linux version that I haven't yet
> > propagated back to the DOS versions that speeds things up still further
> > (about 2x) but I'll probably get to it eventually. If you'd like to try
> > it, it's simply keeping track of where the most significant non-zero
> > digit is in each of the terms. That way, we can skip searching for
> > leading zeroes to suppress. Saves considerable time that way.
> >
> > Ed
>
> Thanks Ed, I'll look into it. The previous response wasn't lost, it's just that I
> didn't see it until now, thanks!
Hello. I've posted twice already and the posts don't seem to be going through.
Here's what I've posted:
Hi Ed. After looking into the files that Mr. Frank Kotler kindly forwarded and you
so courteously provided, I decided to give it a try myself but without repeating your
work and perhaps from a different angle with which something "new" could be tried
out.
Shown below is my take on the sequence. I put some additional notes and yet
another attempt that can be found at
http://www.geocities.com/yssmlp/
which I saved for personal (and public) reference. This code is not -not remotely-
as fine as yours but it's another perspective which I think is a good thing to have.
The code below operates on two decimal digits per byte and I know it can be
improved to maybe .7 digits more per byte (at a higher program code cost of
course). Knowing how much faster AH=40/INT21 is than AH=02/INT21, I fixed
the code below to reflect the difference and the difference is enormous. To look at it
you may want to check the website (F8.ASM).
Thanks again and I'll be here for comments.
..MODEL TINY
;----------
;PLAYING WITH FIBONACCI NUMBERS BY YSS (MAY 2005)
;
;F(N+2)=F(N+1)+F(N)
; F(0)=0, F(1)=1
;
;MAX TERMS = (60000 - LOG(SQR(5))/(LOG(1+SQR(5)/2))) /
(LOG(1+SQR(5)/2))
; = 287090 APPROX.
;
;WHERE '60000' IS THE TOTAL NUMBER OF POSSIBLE DIGITS.
;
;THIS PROGRAM CALCS AND OUTPUTS THE 'FIBONACCI' SERIES TO
STDOUT UP TO
;ABOUT 287000+ TERMS. IT'S NOT VERY FAST DUE TO:
;
; - NO OR LITTLE OPTIMIZATION (THIS CODE WAS WRITTEN 'ON THE
FLY')
; - THE COMPRESSION OF THE DIGITS:
; EACH BYTE HOLDS TWO DECIMAL DIGITS IN LO-HI ORDER,
; THAT TAKES A LITTLE EXTRA TIME.
; - EACH DIGIT IS OUTPUT TO STDOUT ONE AT A TIME.
; A SPEED INCREASE OF OVER 32% CAN BE GAINED IF A FULL
STRING
; IS OUTPUT INSTEAD.
;
;
;TWO ACCUMULATORS ARE USED, X1 AND X2, EACH CAPABLE OF HANDLING
UP TO 60000
;DIGITS (30000 BYTES LONG EACH).
;
;USE:
;SET TL = MAX # OF TERMS -> ASSEMBLE -> F7 TO SEE RESULTS TO
STDOUT
;TERMS ARE OUTPUT ONE LINE AT A TIME (ENDED WITH 0D 0A)
;
;
CSEG SEGMENT PARA PUBLIC 'CODE'
ORG 100H
ASSUME CS:CSEG,DS:CSEG
START:
X1 EQU 0
X2 EQU X1 + 30000
TL EQU 287000 ;TOTAL TERMS TO CALC
MOV DI,OFFSET A+X1
MOV CX,60000/2
REP STOSW ;CLR ALL
INC AX
MOV X1T,AX
MOV [A+X1],AX ;
MOV BP,TL ;TOTAL # TERMS [ F(X) ] CALC'D
S00:
MOV DI,OFFSET A+X2
MOV SI,OFFSET A+X1
TEST BL,1
JNE S001
XCHG DI,SI
S001:
CALL LOUT
ADD12:
PUSH DI
PUSH SI
PUSH BP
XOR BP,BP
MOV CX,X1T
CLC
ACXMRE:
MOV AL,[DI]
MOV DL,[SI]
MOV AH,DL
PUSH AX
CALL AAB1
MOV DH,AL
POP AX
PUSHF
SHR AX,1
SHR AX,1
SHR AX,1
SHR AX,1
POPF
CALL AAB1
PUSHF
AND DH,0FH
SHL AL,1
SHL AL,1
SHL AL,1
SHL AL,1
OR AL,DH
POPF
MOV [SI],AL
JNB ADOK
CMP CX,1
STC
JNE ADOK
INC CX
ADOK:
INC DI
INC SI
INC BP
LOOP ACXMRE
MOV X1T,BP
POP BP
POP SI
POP DI
DEC BL
DEC BP
JNE S00
RET
LOUT:
MOV DX,X1T
PUSH DI
PUSH SI
MOV AH,2
MOV SI,DI
DEC SI
ADD DI,DX
MOV CX,000FEH ;CH=0 FOR LEADING 0'S NOT OUT
LN0001:
MOV DL,[DI]
TEST CL,1
JNE LN0000
SHR DL,1
SHR DL,1
SHR DL,1
SHR DL,1
JMP SHORT LN0002
LN0000:
DEC DI
LN0002:
AND DL,0FH
OR DL,30H
CMP DL,30H
JE LN00201
OR CH,1
JMP LN0020
LN00201:
TEST CH,1
JE LN00202
LN0020:
INT 21H
LN00202:
DEC CL
CMP DI,SI
JNE LN0001
POP SI
POP DI
ODOAH:
MOV AH,9
MOV DX,OFFSET ODOAX
INT 21H
RET
ODOAX DB 13,10,36
AAB1:
PUSHF
AND AX,0F0FH
POPF
ADC AL,AH
AAA
RET
AAA
RET
AAA
RET
AAA
RET
X1T DW 0 ;#DIGITS SO FAR
A LABEL WORD
;-------------------------------
;http://www.geocities.com/yssmlp
;pixelrat@hotmail.com
;or google around for "yssmlp" ! :)
;-------------------------------
CSEG ENDS
END START
--------------0FAACCA1DF9B021BA2DF3597
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: 7bit
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<html>
<p>CII wrote:
<blockquote TYPE=CITE>Ed Beroset wrote:
<p>> CII wrote:
<br>> >
<br>> > Tim Roberts wrote:
<br>> >>CII <spamtrap@crayne.org> wrote:
<br>> >>
<br>> >>>Hi Ed. Congratulations on your work. I'd like to see
the output of
<br>> >>>FIBO.ASM but I haven't because I don't have NASM and I don't
have that much
<br>> >>>time to convert it.. Would you put the .COM on your site
or send it over to
<br>> >>>my email please?
<br>> >>
<br>> >>Oh, come on. It took me two minutes to get it running in
MASM. I think
<br>> >>it's asking a bit much to expect him to send you the binary, when
he has
<br>> >>already generously provided the source code.
<br>> >
<br>> > Two minutes? Man that's fast. I tried to compile it
to masm and got very many
<br>> > errors. I tried tweaking around a bit but with no success.
I apologize to
<br>> > anyone here if a binary is too much to ask, but I think my request
may be the one
<br>> > of more people so it may be a valid one, but if it's trouble then
I won't mind at
<br>> > all.
<br>>
<br>> A previous response seems to have been lost, so I'll try again.
<br>>
<br>> For what it's worth, I don't think your request is unreasonable but
I
<br>> have an alternative solution you might like better. I ported
the code
<br>> to MASM and posted it on the same web site ( <a href="http://www.beroset.com/">http://www.beroset.com/</a>
)
<br>> so that you can download it and play with it.
<br>>
<br>> There's an optimization made to the Linux version that I haven't
yet
<br>> propagated back to the DOS versions that speeds things up still further
<br>> (about 2x) but I'll probably get to it eventually. If you'd
like to try
<br>> it, it's simply keeping track of where the most significant non-zero
<br>> digit is in each of the terms. That way, we can skip searching
for
<br>> leading zeroes to suppress. Saves considerable time that way.
<br>>
<br>> Ed
<p>Thanks Ed, I'll look into it. The previous response wasn't lost,
it's just that I
<br>didn't see it until now, thanks!</blockquote>
Hello. I've posted twice already and the posts don't seem to be going
through.
<br>Here's what I've posted:
<br>
<p>Hi Ed. After looking into the files that Mr. Frank Kotler kindly
forwarded and you
<br>so courteously provided, I decided to give it a try myself but without
repeating your
<br>work and perhaps from a different angle with which something "new"
could be tried
<br>out.
<p>Shown below is my take on the sequence. I put some additional
notes and yet
<br>another attempt that can be found at
<p><A HREF="http://www.geocities.com/yssmlp/">http://www.geocities.com/yssmlp/</A>
<p>which I saved for personal (and public) reference. This code is
not -not remotely-
<br>as fine as yours but it's another perspective which I think is a good
thing to have.
<br>The code below operates on two decimal digits per byte and I know it
can be
<br>improved to maybe .7 digits more per byte (at a higher program code
cost of
<br>course). Knowing how much faster AH=40/INT21 is than AH=02/INT21,
I fixed
<br>the code below to reflect the difference and the difference is enormous.
To look at it
<br>you may want to check the website (F8.ASM).
<p>Thanks again and I'll be here for comments.
<br> <tt></tt>
<p><tt>.MODEL TINY</tt>
<br><tt>;----------</tt>
<br><tt>;PLAYING WITH FIBONACCI NUMBERS BY YSS (MAY 2005)</tt>
<br><tt>;</tt>
<br><tt>;F(N+2)=F(N+1)+F(N)</tt>
<br><tt>; F(0)=0, F(1)=1</tt>
<br><tt>;</tt>
<br><tt>;MAX TERMS = (60000 - LOG(SQR(5))/(LOG(1+SQR(5)/2))) /</tt>
<br><tt>(LOG(1+SQR(5)/2))</tt>
<br><tt>; = 287090 APPROX.</tt>
<br><tt>;</tt>
<br><tt>;WHERE '60000' IS THE TOTAL NUMBER OF POSSIBLE DIGITS.</tt>
<br><tt>;</tt>
<br><tt>;THIS PROGRAM CALCS AND OUTPUTS THE 'FIBONACCI' SERIES TO</tt>
<br><tt>STDOUT UP TO</tt>
<br><tt>;ABOUT 287000+ TERMS. IT'S NOT VERY FAST DUE TO:</tt>
<br><tt>;</tt>
<br><tt>; - NO OR LITTLE OPTIMIZATION (THIS CODE WAS WRITTEN 'ON THE</tt>
<br><tt>FLY')</tt>
<br><tt>; - THE COMPRESSION OF THE DIGITS:</tt>
<br><tt>; EACH BYTE HOLDS TWO DECIMAL
DIGITS IN LO-HI ORDER,</tt>
<br><tt>; THAT TAKES A LITTLE EXTRA TIME.</tt>
<br><tt>; - EACH DIGIT IS OUTPUT TO STDOUT ONE AT A TIME.</tt>
<br><tt>; A SPEED INCREASE OF
OVER 32% CAN BE GAINED IF A FULL</tt>
<br><tt>STRING</tt>
<br><tt>; IS OUTPUT INSTEAD.</tt>
<br><tt>;</tt>
<br><tt>;</tt>
<br><tt>;TWO ACCUMULATORS ARE USED, X1 AND X2, EACH CAPABLE OF HANDLING</tt>
<br><tt>UP TO 60000</tt>
<br><tt>;DIGITS (30000 BYTES LONG EACH).</tt>
<br><tt>;</tt>
<br><tt>;USE:</tt>
<br><tt>;SET TL = MAX # OF TERMS -> ASSEMBLE -> F7 TO SEE RESULTS TO</tt>
<br><tt>STDOUT</tt>
<br><tt>;TERMS ARE OUTPUT ONE LINE AT A TIME (ENDED WITH 0D 0A)</tt>
<br><tt>;</tt>
<br><tt>;</tt><tt></tt>
<p><tt>CSEG SEGMENT PARA PUBLIC 'CODE'</tt>
<br><tt>ORG 100H</tt>
<br><tt> ASSUME CS:CSEG,DS:CSEG</tt>
<br><tt>START:</tt>
<br><tt>X1 EQU 0</tt>
<br><tt>X2 EQU X1 + 30000</tt>
<br><tt>TL EQU 287000 ;TOTAL TERMS TO CALC</tt><tt></tt>
<p><tt> MOV DI,OFFSET A+X1</tt><tt></tt>
<p><tt> MOV CX,60000/2</tt>
<br><tt> REP STOSW ;CLR ALL</tt><tt></tt>
<p><tt> INC AX</tt>
<br><tt> MOV X1T,AX</tt>
<br><tt> MOV [A+X1],AX ;</tt>
<br><tt> MOV BP,TL ;TOTAL # TERMS [ F(X) ] CALC'D</tt>
<br><tt>S00:</tt>
<br><tt> MOV DI,OFFSET A+X2</tt>
<br><tt> MOV SI,OFFSET A+X1</tt>
<br><tt> TEST BL,1</tt>
<br><tt> JNE S001</tt>
<br><tt> XCHG DI,SI</tt>
<br><tt>S001:</tt>
<br><tt> CALL LOUT</tt>
<br><tt>ADD12:</tt>
<br><tt> PUSH DI</tt>
<br><tt> PUSH SI</tt>
<br><tt> PUSH BP</tt>
<br><tt> XOR BP,BP</tt>
<br><tt> MOV CX,X1T</tt>
<br><tt> CLC</tt><tt></tt>
<p><tt> ACXMRE:</tt>
<br><tt> MOV AL,[DI]</tt>
<br><tt> MOV DL,[SI]</tt><tt></tt>
<p><tt> MOV AH,DL</tt>
<br><tt> PUSH AX</tt>
<br><tt> CALL AAB1</tt>
<br><tt> MOV DH,AL</tt>
<br><tt> POP AX</tt>
<br><tt> PUSHF</tt>
<br><tt> SHR AX,1</tt>
<br><tt> SHR AX,1</tt>
<br><tt> SHR AX,1</tt>
<br><tt> SHR AX,1</tt>
<br><tt> POPF</tt>
<br><tt> CALL AAB1</tt>
<br><tt> PUSHF</tt>
<br><tt> AND DH,0FH</tt>
<br><tt> SHL AL,1</tt>
<br><tt> SHL AL,1</tt>
<br><tt> SHL AL,1</tt>
<br><tt> SHL AL,1</tt>
<br><tt> OR AL,DH</tt>
<br><tt> POPF</tt><tt></tt>
<p><tt> MOV [SI],AL</tt>
<br><tt> JNB ADOK</tt>
<br><tt> CMP CX,1</tt>
<br><tt> STC</tt>
<br><tt> JNE ADOK</tt>
<br><tt> INC CX</tt>
<br><tt> ADOK:</tt>
<br><tt> INC DI</tt>
<br><tt> INC SI</tt>
<br><tt> INC BP</tt>
<br><tt> LOOP ACXMRE</tt>
<br><tt> MOV X1T,BP</tt>
<br><tt> POP BP</tt>
<br><tt> POP SI</tt>
<br><tt> POP DI</tt><tt></tt>
<p><tt> DEC BL</tt>
<br><tt> DEC BP</tt>
<br><tt> JNE S00</tt>
<br><tt> RET</tt><tt></tt>
<p><tt>LOUT:</tt>
<br><tt> MOV DX,X1T</tt>
<br><tt> PUSH DI</tt>
<br><tt> PUSH SI</tt>
<br><tt> MOV AH,2</tt>
<br><tt> MOV SI,DI</tt>
<br><tt> DEC SI</tt>
<br><tt> ADD DI,DX</tt>
<br><tt> MOV CX,000FEH ;CH=0
FOR LEADING 0'S NOT OUT</tt><tt></tt>
<p><tt>LN0001:</tt>
<br><tt> MOV DL,[DI]</tt>
<br><tt> TEST CL,1</tt>
<br><tt> JNE LN0000</tt><tt></tt>
<p><tt> SHR DL,1</tt>
<br><tt> SHR DL,1</tt>
<br><tt> SHR DL,1</tt>
<br><tt> SHR DL,1</tt>
<br><tt> JMP SHORT LN0002</tt>
<br><tt>LN0000:</tt>
<br><tt> DEC DI</tt>
<br><tt>LN0002:</tt>
<br><tt> AND DL,0FH</tt>
<br><tt> OR DL,30H</tt>
<br><tt> CMP DL,30H</tt>
<br><tt> JE LN00201</tt>
<br><tt> OR CH,1</tt>
<br><tt> JMP LN0020</tt>
<br><tt>LN00201:</tt>
<br><tt> TEST CH,1</tt>
<br><tt> JE LN00202</tt>
<br><tt>LN0020:</tt>
<br><tt> INT 21H</tt>
<br><tt>LN00202:</tt><tt></tt>
<p><tt> DEC CL</tt>
<br><tt> CMP DI,SI</tt>
<br><tt> JNE LN0001</tt>
<br><tt> POP SI</tt>
<br><tt> POP DI</tt>
<br><tt>ODOAH:</tt>
<br><tt> MOV AH,9</tt>
<br><tt> MOV DX,OFFSET ODOAX</tt>
<br><tt> INT 21H</tt>
<br><tt> RET</tt>
<br><tt>ODOAX DB 13,10,36</tt><tt></tt>
<p><tt>AAB1:</tt>
<br><tt> PUSHF</tt>
<br><tt> AND AX,0F0FH</tt>
<br><tt> POPF</tt>
<br><tt> ADC AL,AH</tt>
<br><tt> AAA</tt>
<br><tt> RET</tt>
<br><tt> AAA</tt>
<br><tt> RET</tt>
<br><tt> AAA</tt>
<br><tt> RET</tt>
<br><tt> AAA</tt>
<br><tt> RET</tt><tt></tt>
<p><tt>X1T DW 0 ;#DIGITS SO FAR</tt>
<br><tt>A LABEL WORD</tt>
<br><tt>;-------------------------------</tt>
<br><tt>;<A HREF="http://www.geocities.com/yssmlp">http://www.geocities.com/yssmlp</A></tt>
<br><tt>;pixelrat@hotmail.com</tt>
<br><tt>;or google around for "yssmlp" ! :)</tt>
<br><tt>;-------------------------------</tt>
<br><tt>CSEG ENDS</tt>
<br><tt> END START</tt></html>
--------------0FAACCA1DF9B021BA2DF3597--
|
|
0
|
|
|
|
Reply
|
CII
|
5/9/2005 9:19:49 PM
|
|
|
15 Replies
73 Views
(page loaded in 0.294 seconds)
|