64bit/64bit fixed point division?

  • Follow


Hi!
I wrote a C++ class for a 64bit fixed point numeric format,
32bit integral and 32bit fractional. I wrote addition,
subtraction and multiplication operator routines (the latter
in inline assembly). Now I need an assembly routine to inline
to compute division, but I'm having many problems finding the
fastest way to do it. How was easy with multiplication to get
advantage of the hardware multiplier!
How can it be that the "hardware" multiplier is useless in
this case? Ok, it has to do with laws of nature (a.k.a. math)
and such but, how would be the fastest way to perform a div
between two 64bit fixed point operands (with the point exactly
in the middle) and produce a 128bit result and/or a 64bit one
(I guess two versions of the routine will be necessary)?

Can anybody help with code, please? I managed to reckon that
the division will be equivalent to an integer division where
the dividend is shifted 32bits to the left, and the quotient
then is shifted the same amount to the right. But this lengthens
the division process even more, is there no way to avoid this?

Better than all, can some asm genious produce some code or
share some working code? I've spent weeks looking at the GMP
and other bignum libs code, but I'm more confused than before.

Thanks a lot!
Marie from Bozen, Italy

0
Reply spamtrap 8/22/2006 9:19:28 AM

spamtrap@crayne.org wrote in part:
> I wrote a C++ class for a 64bit fixed point numeric format,
> 32bit integral and 32bit fractional. I wrote addition,
> subtraction and multiplication operator routines (the latter
> in inline assembly). Now I need an assembly routine to inline
> to compute division, but I'm having many problems finding
> the fastest way to do it. How was easy with multiplication
> to get advantage of the hardware multiplier!  How can it be
> that the "hardware" multiplier is useless in this case? Ok,
> it has to do with laws of nature (a.k.a. math) and such but,
> how would be the fastest way to perform a div between two
> 64bit fixed point operands (with the point exactly in the
> middle) and produce a 128bit result and/or a 64bit one
> (I guess two versions of the routine will be necessary)?

Division is always messy business.  There are two main
algorithms: long-division (shift, compare, subtract) and
Newton-Raphson for 1/SQRT. Historically, integer division
is done via long-division, and floating point is done via
Newton-Raphson (which is faster at 32 bits, and even more
advantaged at more).  Both of these done in CPU microcode.

For 64bit fixed-point with an emphasis on speed (accuracy is
always lost in fractional parts), I'd go Newton-Raphson.

-- Robert

0
Reply Robert 8/22/2006 7:39:59 PM


--B=_H8082f536P2281Hydro+k7MLBOcf020490
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Robert Redelmeier wrote:
> spamtrap@crayne.org wrote in part:
>> I wrote a C++ class for a 64bit fixed point numeric format,
>> 32bit integral and 32bit fractional. I wrote addition,
>> subtraction and multiplication operator routines (the latter
>> in inline assembly). Now I need an assembly routine to inline
>> to compute division, but I'm having many problems finding
[snip]
> Division is always messy business.  There are two main
> algorithms: long-division (shift, compare, subtract) and
> Newton-Raphson for 1/SQRT. Historically, integer division
> is done via long-division, and floating point is done via
> Newton-Raphson (which is faster at 32 bits, and even more
> advantaged at more).  Both of these done in CPU microcode.
> 
> For 64bit fixed-point with an emphasis on speed (accuracy is
> always lost in fractional parts), I'd go Newton-Raphson.

Since you do have both an integer and a floating point divisor 
available, I'd look into using them to advantage:

Dividing a fixed-point 32:32 number by another number in the same 
format, is identical to dividing a 64-bit number by another, by shifting 
both operands 32 bits left first, then shifting the result left and 
storing as integer:

This means that you can use the 80-bit fp format to get exact results:

; Make sure that the x87 part is set to extended precision first!

   FILD qword ptr [a]
   FILD qword ptr [b]
   FDIVP st(1),st
   FMUL [two_to_32]	; Do a 32-bit left shift to convert fraction
   FISTP qword ptr [result]

This is about 50-60 cycles on anything from a classic Pentium and up.

Terje
-- 
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"


--B=_H8082f536P2281Hydro+k7MLBOcf020490
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline



***********************************************************************
NOTICE: This e-mail transmission, and any documents, files or previous
e-mail messages attached to it, may contain confidential or privileged
information. If you are not the intended recipient, or a person
responsible for delivering it to the intended recipient, you are
hereby notified that any disclosure, copying, distribution or use of
any of the information contained in or attached to this message is
STRICTLY PROHIBITED. If you have received this transmission in error,
please immediately notify the sender and delete the e-mail and attached
documents. Thank you.
***********************************************************************


--B=_H8082f536P2281Hydro+k7MLBOcf020490--

0
Reply Terje 8/22/2006 9:11:22 PM

Terje Mathisen  <spamtrap@crayne.org> writes:
> Robert Redelmeier wrote:
> > spamtrap@crayne.org wrote in part:
> >> I wrote a C++ class for a 64bit fixed point numeric format,
> >> 32bit integral and 32bit fractional. I wrote addition,
> >> subtraction and multiplication operator routines (the latter
> >> in inline assembly). Now I need an assembly routine to inline
> >> to compute division, but I'm having many problems finding
> [snip]
> > Division is always messy business.  There are two main
> > algorithms: long-division (shift, compare, subtract) and
> > Newton-Raphson for 1/SQRT. Historically, integer division
> > is done via long-division, and floating point is done via
> > Newton-Raphson (which is faster at 32 bits, and even more
> > advantaged at more).  Both of these done in CPU microcode.
> > For 64bit fixed-point with an emphasis on speed (accuracy is
> > always lost in fractional parts), I'd go Newton-Raphson.
> 
> Since you do have both an integer and a floating point divisor
> available, I'd look into using them to advantage:

I'd be curious how to use an integer divisor...

> Dividing a fixed-point 32:32 number by another number in the same
> format, is identical to dividing a 64-bit number by another, by
> shifting both operands 32 bits left first, then shifting the result
> left and storing as integer:
> 
> This means that you can use the 80-bit fp format to get exact results:
> 
> ; Make sure that the x87 part is set to extended precision first!
> 
>    FILD qword ptr [a]
>    FILD qword ptr [b]
>    FDIVP st(1),st
>    FMUL [two_to_32]	; Do a 32-bit left shift to convert fraction
>    FISTP qword ptr [result]

..... and I'm still interested quite how you'd use an integer divisor.
 
> This is about 50-60 cycles on anything from a classic Pentium and up.

And only works if the integer is 63 bits precision, /not/ 64.
That damn sign bit can be a pain in the arse sometimes...

Phil
-- 
"Home taping is killing big business profits. We left this side blank 
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.

0
Reply Phil 8/22/2006 10:39:20 PM

Phil Carmody wrote:
> I'd be curious how to use an integer divisor...

As Robert said: Division is messy. It's even slow if you let the
processor do it. Hey, even a fsqrt is faster than fdiv...

But if you know the dividend in advance (i.e. it is a constant), you can
greatly speed up the process by multiplying with its reciprocal. This
also works for the integer/fixpoint case if you take some care.

-- 
Gru�,
Sebastian

0
Reply Sebastian 8/23/2006 12:24:04 AM

--B=_H8082f536P2281Hydro+k7N8vOcf026374
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Phil Carmody wrote:
> Terje Mathisen  <spamtrap@crayne.org> writes:
>> ; Make sure that the x87 part is set to extended precision first!
>>
>>    FILD qword ptr [a]
>>    FILD qword ptr [b]
>>    FDIVP st(1),st
>>    FMUL [two_to_32]	; Do a 32-bit left shift to convert fraction
>>    FISTP qword ptr [result]
> 
> .... and I'm still interested quite how you'd use an integer divisor.
>  
>> This is about 50-60 cycles on anything from a classic Pentium and up.
> 
> And only works if the integer is 63 bits precision, /not/ 64.
> That damn sign bit can be a pain in the arse sometimes...

Phil, the original poster did not state that he worked with unsigned 
numbers, so my code should work as is. The extended format for fp 
numbers work with 1+15+64 bits, with an explicit leading 1 bit, so even 
unsigned 64-bit numbers will be stored exactly.

However, if you do need 64-bit unsigned, then you can still use the fp 
divisor, albeit with one or two small tweaks:

a) Pretend that the numbers are signed, then adjust after the fact if 
needed.

b) Load the 64-bit value as two 32-bit unsigned numbers, then do the 
FDIV and convert back.

c) Load only the divisor, convert to a reciprocal, store that as an 
unsigned integer, then do a reciprocal multiplication to generate the 
true result.

c) Inspect the numbers before you start, and then handle the different 
cases in the following way: (result = a/b)

1) (a) and (b) both positive: Use my original code.
2) At least one of (a) and (b) negative, i.e. top bit set:
    Load both numbers in two stages, then do the same division as in (1).

One possibility is that you can get into an overflow situation, where 
the result won't fit inside the 32:32 target field. How do youwant to 
handle this? Trap, overflow silently (truncating the most significant 
bits?), saturate to 0xffff... ?

Terje
-- 
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"


--B=_H8082f536P2281Hydro+k7N8vOcf026374
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline



***********************************************************************
NOTICE: This e-mail transmission, and any documents, files or previous
e-mail messages attached to it, may contain confidential or privileged
information. If you are not the intended recipient, or a person
responsible for delivering it to the intended recipient, you are
hereby notified that any disclosure, copying, distribution or use of
any of the information contained in or attached to this message is
STRICTLY PROHIBITED. If you have received this transmission in error,
please immediately notify the sender and delete the e-mail and attached
documents. Thank you.
***********************************************************************


--B=_H8082f536P2281Hydro+k7N8vOcf026374--

0
Reply Terje 8/23/2006 8:57:21 AM

Phil,
The mantissa for extended precision 80-bit floating point
is the full 64-bit, only bit number 63 is always 1.
The sign bit is bit number 79, see for example
http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html
and the book The Intel Microprocessors by Barry Brey.
So you can put a 64-bit integer into an extended precision
80-bit floating point without loss of precision.
Regards, Maarten.

"Phil Carmody" <thefatphil_demunged@yahoo.co.uk> wrote in message
news:87hd04gztz.fsf@nonospaz.fatphil.org...
> Terje Mathisen  <spamtrap@crayne.org> writes:
> > Robert Redelmeier wrote:
> > > spamtrap@crayne.org wrote in part:
> > >> I wrote a C++ class for a 64bit fixed point numeric format,
> > >> 32bit integral and 32bit fractional. I wrote addition,
> > >> subtraction and multiplication operator routines (the latter
> > >> in inline assembly). Now I need an assembly routine to inline
> > >> to compute division, but I'm having many problems finding
> > [snip]
> > > Division is always messy business.  There are two main
> > > algorithms: long-division (shift, compare, subtract) and
> > > Newton-Raphson for 1/SQRT. Historically, integer division
> > > is done via long-division, and floating point is done via
> > > Newton-Raphson (which is faster at 32 bits, and even more
> > > advantaged at more).  Both of these done in CPU microcode.
> > > For 64bit fixed-point with an emphasis on speed (accuracy is
> > > always lost in fractional parts), I'd go Newton-Raphson.
> >
> > Since you do have both an integer and a floating point divisor
> > available, I'd look into using them to advantage:
>
> I'd be curious how to use an integer divisor...
>
> > Dividing a fixed-point 32:32 number by another number in the same
> > format, is identical to dividing a 64-bit number by another, by
> > shifting both operands 32 bits left first, then shifting the result
> > left and storing as integer:
> >
> > This means that you can use the 80-bit fp format to get exact results:
> >
> > ; Make sure that the x87 part is set to extended precision first!
> >
> >    FILD qword ptr [a]
> >    FILD qword ptr [b]
> >    FDIVP st(1),st
> >    FMUL [two_to_32] ; Do a 32-bit left shift to convert fraction
> >    FISTP qword ptr [result]
>
> .... and I'm still interested quite how you'd use an integer divisor.
>
> > This is about 50-60 cycles on anything from a classic Pentium and up.
>
> And only works if the integer is 63 bits precision, /not/ 64.
> That damn sign bit can be a pain in the arse sometimes...
>
> Phil
> --
> "Home taping is killing big business profits. We left this side blank
> so you can help." -- Dead Kennedys, written upon the B-side of tapes of
> /In God We Trust, Inc./.
>

0
Reply Maarten 8/23/2006 2:43:54 PM

"Maarten Kronenburg"  <spamtrap@crayne.org> writes:
> Phil,
> The mantissa for extended precision 80-bit floating point
> is the full 64-bit, only bit number 63 is always 1.
> The sign bit is bit number 79, see for example
> http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html
> and the book The Intel Microprocessors by Barry Brey.
> So you can put a 64-bit integer into an extended precision
> 80-bit floating point without loss of precision.

I know. The FPU's fine, it's the interface from FP to integer
that's the problem. You can't (just) use a FISTP though.
(Unless you know that the answer is unique modulo 2^64.)

Phil
-- 
"Home taping is killing big business profits. We left this side blank 
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.

0
Reply Phil 8/25/2006 3:58:28 AM

Phil Carmody wrote:
> "Maarten Kronenburg"  <spamtrap@crayne.org> writes:
>> Phil,
>> The mantissa for extended precision 80-bit floating point
>> is the full 64-bit, only bit number 63 is always 1.
>> The sign bit is bit number 79, see for example
>> http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html
>> and the book The Intel Microprocessors by Barry Brey.
>> So you can put a 64-bit integer into an extended precision
>> 80-bit floating point without loss of precision.
> 
> I know. The FPU's fine, it's the interface from FP to integer
> that's the problem. You can't (just) use a FISTP though.
> (Unless you know that the answer is unique modulo 2^64.)

Loading the input values is relatively easy, OK?

The fastest might actually be to always load it as a signed 63-bit 
integer, and then use that "sign" bit to add either 0.0 or 2^63 to the 
initial result.

Assuming such maximally large inputs are very rare, it would probably be 
faster to use an actual branch around the fixup add instead:

   fild qword ptr [a]
   test byte ptr [a+7],80h
    jz a_done
   fadd dword ptr [two_to_64]
a_done:
   fild qword ptr [b]
   test byte ptr [b+7],80h
    jz b_done
   fadd dword ptr [two_to_64]
b_done:
   fdivp st(1), st
   fmul dword ptr [two_to_32]	;; Convert back to 32:32 fixed point!

At this point we have several possible outcomes, all positive:

a) Underflow: The result is less than 1 (corresponding to 2^-31)
Storing this as an integer (with truncation) will result in zero, which 
is correct.

b) In-range (2^63 > result >= 1): Storing as integer is OK

c) In-range but too large for 64-bit int (2^64 > result >= 2^63): 
Storing as integer will overflow.

d) Overflow (result >= 2^64): This cannot be stored as a 64-bit unsigned 
value, and must be handled by higher level sw.

I.e. only case (c) is hard, and it can be handled relatively easy by 
testing the result against 2^63, and if possible, subtract 2^64 to 
convert into an in-range 64-bit integer value which when stored will 
have the correct bitpattern, similar to my input adjustment:

   if (result >= two_to_63) {
     if (result >= two_to_64)
        trap(OVERFLOW);
     result -= two_to_64;
   }

Terje
-- 
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"

0
Reply Terje 8/25/2006 11:42:39 AM

But there is no need to use FIST/FILD.
You can also write/read an extended precision
floating point (10 bytes) in memory with a value,
and use fld tword [edx] where edx is the address.
(for Turbo Assembler it's fld [tbyte edx]),
and vice versa with fst.
Only then you must also write/read the exponent,
which will be a little slower.

"Phil Carmody" <thefatphil_demunged@yahoo.co.uk> wrote in message
news:87lkpdfouz.fsf@nonospaz.fatphil.org...
> "Maarten Kronenburg"  <spamtrap@crayne.org> writes:
> > Phil,
> > The mantissa for extended precision 80-bit floating point
> > is the full 64-bit, only bit number 63 is always 1.
> > The sign bit is bit number 79, see for example
> > http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html
> > and the book The Intel Microprocessors by Barry Brey.
> > So you can put a 64-bit integer into an extended precision
> > 80-bit floating point without loss of precision.
>
> I know. The FPU's fine, it's the interface from FP to integer
> that's the problem. You can't (just) use a FISTP though.
> (Unless you know that the answer is unique modulo 2^64.)
>
> Phil
> --
> "Home taping is killing big business profits. We left this side blank
> so you can help." -- Dead Kennedys, written upon the B-side of tapes of
> /In God We Trust, Inc./.
>

0
Reply Maarten 8/25/2006 5:23:42 PM

--B=_H8082f536P2281Hydro+k7QHfgcf022412
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Maarten Kronenburg wrote:
> But there is no need to use FIST/FILD.
> You can also write/read an extended precision
> floating point (10 bytes) in memory with a value,
> and use fld tword [edx] where edx is the address.
> (for Turbo Assembler it's fld [tbyte edx]),
> and vice versa with fst.
> Only then you must also write/read the exponent,
> which will be a little slower.

This will indeed be a bit slower!

You have to load the 64-bit mantissa into a pair of registers, and the 
16-bit sign+exponent field into another reg, then check the exponent to 
determine how much you need to shift the mantissa part...

   fstp tword ptr [result]
   mov eax,dword ptr [result]
   mov edx,dword ptr [result+4]
   movzx ecx, word ptr [result+8]

Since we want a fp 32:32-bit number which will fit into a 64-bit 
unsigned integer, we know that the result of the division should be 
multiplied by 2^32, i.e. shifted 32 bits, which is equal to adding 32 to 
the exponent. Since this exponent field is biased by 16384, we can 
correct this in one instruction:

   sub ecx,16384-32

If the result is negative, the 64-bit value must be shifted right, if it 
is zero then the result will fit exactly into the 64-bit EDX:EAX pair, 
and if it is positive then we have an overflow situation. :-(

    ja OverflowTrap
    jz Done

Now we have to check the shift count:

   neg ecx
   cmp ecx, 32
    jb DoTheShift
   mov eax,edx
   xor edx,edx
   cmp ecx, 64
    jb DoTheShift

The result underflows, so return zero!

   xor eax,eax
    jmp Done

The shift count in ECX will use the last 5 bits only!
DoTheShift:
   shrd edx,eax,cl
   shr edx,cl

Done:
   return	; Result in EDX:EAX

Terje

-- 
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"


--B=_H8082f536P2281Hydro+k7QHfgcf022412
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline



***********************************************************************
NOTICE: This e-mail transmission, and any documents, files or previous
e-mail messages attached to it, may contain confidential or privileged
information. If you are not the intended recipient, or a person
responsible for delivering it to the intended recipient, you are
hereby notified that any disclosure, copying, distribution or use of
any of the information contained in or attached to this message is
STRICTLY PROHIBITED. If you have received this transmission in error,
please immediately notify the sender and delete the e-mail and attached
documents. Thank you.
***********************************************************************


--B=_H8082f536P2281Hydro+k7QHfgcf022412--

0
Reply Terje 8/26/2006 5:41:23 PM

10 Replies
174 Views

(page loaded in 0.077 seconds)

Similiar Articles:













7/20/2012 4:14:34 AM


Reply: