Suppose I want to divide an integer with another integer,
how many cycles does it take in general on P4 and PIII?
I notice that the output of a divide is usually float, does
that mean the divide is only for floats, meaning those
two operands got converted into float?
How expensive is modulo in terms of cycle count?
Cheers,
Jimmy
Jimmy | 9/6/2004 8:46:29 AM

Jimmy zhang wrote:
> Suppose I want to divide an integer with another integer,
> how many cycles does it take in general on P4 and PIII?
IA-32 Optimization Reference Manual
Appendix C (IA-32 Instruction Latency and Throughput)
ftp://download.intel.com/design/Pentium4/manuals/24896611.pdf
Signed integer division:
Northwood = 56-70 cycles
Prescott = 66-80 cycles
Pentium M = undisclosed
I don't have the Pentium III latencies.
> I notice that the output of a divide is usually float, does
> that mean the divide is only for floats, meaning those
> two operands got converted into float?
??
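Integer division of a by b produces two integers: q (the quotient)
and r (the remainder) such that a = b*q + r and 0 <= r < b (for
non-negative a and positive b; signed IDIV truncates q toward zero,
so r takes the sign of a).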
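
A minimal C sketch of that identity, not part of the original post. It assumes C99 or later, where `/` truncates toward zero and `%` takes the sign of the dividend, which matches what IDIV does. The names `qr_t` and `divide_trunc` are made up for illustration.

```c
/* Hypothetical pair type holding both results of one integer division. */
typedef struct {
    int q; /* quotient, truncated toward zero */
    int r; /* remainder, same sign as the dividend */
} qr_t;

/* Hypothetical helper: returns q and r such that a == b*q + r.
 * Assumes b != 0 and that a / b does not overflow (i.e. not INT_MIN / -1). */
qr_t divide_trunc(int a, int b) {
    qr_t out;
    out.q = a / b;
    out.r = a % b;
    return out;
}
```

For example, `divide_trunc(-7, 2)` gives q = -3 and r = -1, and 2*(-3) + (-1) is -7 again.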
> How expensive is modulo in terms of cycle count?
Modulo 2^n is a simple bit mask operation. Otherwise, see above.
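
As a rough illustration of the bit mask (a sketch only, not from the original post): for unsigned values, a mod 2^n is just the low n bits of a. The function name `mod_pow2` is hypothetical, and the sketch assumes n is in the range 0..31. For negative signed values the mask does not give the same answer as the `%` operator, so this is shown for unsigned operands only.

```c
#include <stdint.h>

/* Hypothetical helper: a mod 2^n for unsigned a, assuming 0 <= n <= 31.
 * (1 << n) - 1 is a mask with the low n bits set; AND keeps only those bits. */
uint32_t mod_pow2(uint32_t a, unsigned n) {
    uint32_t mask = (UINT32_C(1) << n) - 1u;
    return a & mask;
}
```

For example, `mod_pow2(29, 3)` keeps the low three bits of 29 and returns 5, the same as 29 % 8.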
--
Regards, Grumble
Grumble | 9/6/2004 10:41:26 AM

"Jimmy zhang" <jzhang@ximpleware.com> wrote:
>Suppose I want to divide an integer with another integer,
>how many cycles does it take in general on P4 and PIII?
>...
>How expensive is modulo in terms of cycle count?
The integer division instruction produces two results: a quotient and a
remainder. The modulo operation is, of course, the remainder of a
division. So, the answer to your second question is exactly the same as
the answer to your first question.
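
In C, the standard library's `div()` exposes the same idea: one call returns both results. The sketch below is illustrative only, and the wrapper name `quot_and_rem` is made up. Whether a given compiler turns it (or a nearby `a / b` and `a % b` pair) into a single divide instruction depends on the compiler and optimization settings, so treat that part as an assumption to check in the generated code.

```c
#include <stdlib.h>

/* Hypothetical wrapper around the standard div(): returns the quotient
 * and stores the remainder through rem. Assumes b != 0 and rem != NULL. */
int quot_and_rem(int a, int b, int *rem) {
    div_t d = div(a, b); /* d.quot and d.rem come from the same division */
    *rem = d.rem;
    return d.quot;
}
```

Calling `quot_and_rem(17, 5, &r)` returns 3 and leaves 2 in `r`.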
--
- Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.
Tim | 9/7/2004 4:46:29 AM

If the divisor equals 2^x, where x is an integer, does the cycle count
remain the same?
Jimmy | 9/8/2004 8:23:42 AM

Jimmy zhang wrote:
> if the divisor equals 2^x, where x is an integer, does the cycle
> count remain the same?
In that case, you should use SHR instead of IDIV.
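
A hedged C sketch of the shift approach, not from the original post. SHR is a logical shift and matches DIV for unsigned operands. For signed operands the arithmetic shift (SAR) rounds toward negative infinity while IDIV truncates toward zero, so negative dividends need a bias of 2^n - 1 added first. The function names are hypothetical, n is assumed to be in 0..30, and the signed version assumes the compiler implements `>>` on negative values as an arithmetic shift (implementation-defined in C, though common compilers do this).

```c
#include <stdint.h>

/* Hypothetical helper: unsigned a / 2^n as a logical right shift. */
uint32_t udiv_pow2(uint32_t a, unsigned n) {
    return a >> n;
}

/* Hypothetical helper: signed a / 2^n with truncation toward zero.
 * Assumes >> on a negative int32_t is an arithmetic shift. */
int32_t sdiv_pow2(int32_t a, unsigned n) {
    /* Add 2^n - 1 to negative dividends so the shift rounds toward zero. */
    int32_t bias = (a < 0) ? (int32_t)((UINT32_C(1) << n) - 1u) : 0;
    return (a + bias) >> n;
}
```

Without the bias, shifting -7 right by one would give -4 rather than the -3 that IDIV produces.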
Grumble | 9/8/2004 6:58:08 PM

"Jimmy zhang" <spamtrap@crayne.org> wrote in message news:<NaV_c.6697$vy.1320@attbi_s52>...
> Suppose I want to divide an integer with another integer,
> how many cycles does it take in general on P4 and PIII?
> I notice that the output of a divide is usually float, does
> that mean the divide is only for floats, meaning those
> two operands got converted into float?
>
> How expensive is modulo in terms of cycle count?
It will depend on a number of factors. Assuming div reg32, m32 I think
it is 3 uops + 20 from microcode (on a NetBurst architecture) with
about a 50 cycle latency (if the next instruction is dependent). These
are, by necessity, only approximate values. Agner.org and Intel's
NetBurst documentation have some better details. Div, however, does not
return a float -- it returns 2 unsigned values (not quite the same
thing).
spamtrap | 9/17/2004 11:10:04 AM