Pentium 4's Latency
I read in the VTune Reference that many instructions have high latency
on the Pentium 4. Does anyone know whether Intel has promised to fix the
latency problem?
For example, SHL/SHR should be avoided in favor of up to 3 adds. How can
3 adds do the job? Suppose 20H is in the low byte and needs to be moved
to the high byte (AH) with shl eax, 08H. LEA might be an option, but I
don't think so. shl eax, 08H is the same as imul eax, 100H, but both of
them have high latency on the Pentium 4.
Please let me know if there is another way to avoid the latency by
writing clever code without using shl.
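The equivalence in the question can be checked with a small C model. This is only a sketch of the arithmetic, not a timing test: the function names (shl8, mul256, check_equivalence) are made up for illustration, and a uint32_t stands in for EAX.

```c
#include <assert.h>
#include <stdint.h>

/* Models `shl eax, 08H` on a 32-bit register. */
uint32_t shl8(uint32_t eax) {
    return eax << 8;
}

/* Models a multiply by 100H (256); the result wraps modulo 2^32,
   which matches the low 32 bits that the shift keeps. */
uint32_t mul256(uint32_t eax) {
    return eax * 0x100u;
}

/* Hypothetical helper: asserts both forms agree for one input. */
void check_equivalence(uint32_t eax) {
    assert(shl8(eax) == mul256(eax));
}
```

With 20H in the low byte, both forms give 2000H, so the value ends up in the AH position. A shift by 8 is a multiply by 100H, not by 0FFH.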
--
Bryan Parkoff
Bryan, 1/31/2004 3:27:51 AM
Bryan,
If you need to do bit shifts and cannot substitute the operation with
up to 3 adds, you are stuck with a slow operation, as the Pentium 4
performs badly there. With a bit of fiddling, you can schedule other
instructions before and after the shift to help hide the lag, but if you
can find another way to do it, that would be a better solution.
LEA is off the pace as well, but not as bad as bit shifts.
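The "up to 3 adds" substitution can be sketched in C as follows. It is a model of the arithmetic only and assumes the adds are of the `add eax, eax` form, where each one doubles the register and so shifts it left by one bit. The name shl_by_adds is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Shift left by n bits (n = 0..3) using only additions.
   Each iteration models one `add eax, eax`. */
uint32_t shl_by_adds(uint32_t eax, unsigned n) {
    assert(n <= 3);               /* the rule of thumb covers at most 3 adds */
    for (unsigned i = 0; i < n; ++i) {
        eax += eax;               /* add eax, eax: doubles, wraps modulo 2^32 */
    }
    return eax;
}
```

By the same reasoning, a shift by 8 would take eight such adds, which is outside the 3-add range. That is the case where the shift has to stay and the only help is scheduling around it.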
Regards,
hutch at movsd dot com
hutch, 1/31/2004 6:54:19 AM