I'm trying to use an assembly repeatting instruction to perform a case
insensitive string compare. This that possible? Or do I just have to
use the traditional loop and bitmask?
|
|
0
|
|
|
|
Reply
|
spamtrap2 (1628)
|
9/12/2005 5:41:50 PM |
|
On Mon, 12 Sep 2005 17:41:50 +0000 (UTC), spamtrap@crayne.org wrote in
comp.lang.asm.x86:
> I'm trying to use an assembly repeatting instruction to perform a case
> insensitive string compare. This that possible? Or do I just have to
> use the traditional loop and bitmask?
There is another way that takes more memory, and probably is not any
faster on the newer processors.
You could build yourself a 256 byte 'toupper' table, where all the
bytes held the same value as their position in the table, except for
the lower case letters which would have the values of their upper case
versions.
Then, assuming for example, (e)si points to one string, (e)di points
to another, and (e)bx points to the first byte of the table, you could
loop something like this:
repeat:
ldsb ; get byte from first string
xlat ; convert to upper case if lower
mov ah, al ; save in ah
mov al, [(e)di] ; get byte from second string
inc (e)di ; increment pointer into second string
xlat ; convert to upper case if lower
cmp al, ah ; see if they are equal
End of string checking, by length, or zero byte, or whatever you wish
to use, omitted.
But I think this would be quite a bit slower than the loop you're
thinking of.
--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
|
|
0
|
|
|
|
Reply
|
Jack
|
9/13/2005 2:43:46 AM
|
|
NONAME<spamtrap@crayne.org> asked:
| I'm trying to use an assembly repeatting instruction to perform a case
| insensitive string compare. This that possible? Or do I just have to
| use the traditional loop and bitmask?
By using REP ?
NO.
Except if you modify both strings before:
....
REP CMPSB
What's traditional ?
mov cx,size_of
Loop:
mov al,[si]
inc si
mov ah,[di]
inc di
and ax,0xdfdf ;or ax 0x2020 would do it also
cmp al,ah
jnz out ;*
dec cx ;*
jnz Loop ;*
out:
)* this lines can be replaced by LOOPE Loop (repeat lines until not equal),
even it may not be faster on latest CPUs.
REP,REPE,REPNE work only on single block instructions.
__
wolfgang
|
|
0
|
|
|
|
Reply
|
wolfgang
|
9/13/2005 7:56:52 PM
|
|
Jack Klein schrieb:
> On Mon, 12 Sep 2005 17:41:50 +0000 (UTC), spamtrap@crayne.org wrote in
> comp.lang.asm.x86:
>
> > I'm trying to use an assembly repeatting instruction to perform a case
> > insensitive string compare. This that possible? Or do I just have to
> > use the traditional loop and bitmask?
>
> There is another way that takes more memory, and probably is not any
> faster on the newer processors.
>
> You could build yourself a 256 byte 'toupper' table, where all the
> bytes held the same value as their position in the table, except for
> the lower case letters which would have the values of their upper case
> versions.
>
> Then, assuming for example, (e)si points to one string, (e)di points
> to another, and (e)bx points to the first byte of the table, you could
> loop something like this:
>
> repeat:
> ldsb ; get byte from first string
> xlat ; convert to upper case if lower
> mov ah, al ; save in ah
> mov al, [(e)di] ; get byte from second string
> inc (e)di ; increment pointer into second string
> xlat ; convert to upper case if lower
> cmp al, ah ; see if they are equal
>
> End of string checking, by length, or zero byte, or whatever you wish
> to use, omitted.
>
> But I think this would be quite a bit slower than the loop you're
> thinking of.
Your suggested lookup loop is fine, because both strings should be
converted to the same case anyway. But i would avoid xlat (5 cycles
vector path on Athlons) and replace it by standard instructions:
lea ebx, upperCase
repeat:
movzx eax, byte ptr [esi] ; get byte from first string
movzx edx, byte ptr [edi] ; get byte from second string
mov eax, [ebx+eax] ; convert to upper case if lower
inc esi ; increment pointer into first string
inc edi ; increment pointer into second string
cmp eax, [ebx+edx] ; see if they are equal
....
One possible improvement should be to process two characters at once,
eg. with a 64K entry lookup table. With SSE2 (for long strings!) one
may implement it SIMD wise with 16 byte characters at once and to
compare ASCII bytes (PCMPGTB) parallel with 0x6060..60 and 0x7a7a..7a.
If greater 0x60 and not greater 0x7a ('a'-'z') one resets 0x20 to
convert lower case to upper case:
; assuming 16 more aligned bytes in both strings
repeat16:
MOVDQA xmm0, dqword ptr [esi]
MOVDQA xmm4, dqword ptr [edi]
add esi, 16 ; increment pointer into first string
add edi, 16 ; increment pointer into second string
MOVDQA xmm1, xmm0
MOVDQA xmm2, xmm0
MOVDQA xmm5, xmm4
MOVDQA xmm6, xmm4
PCMPGTB xmm1, [seventyAHex] ; > 0x7a ? 0xff : 0x00
PCMPGTB xmm5, [seventyAHex] ; > 0x7a ? 0xff : 0x00
PCMPGTB xmm2, [sixtyHex] ; > 0x60 ? 0xff : 0x00
PCMPGTB xmm6, [sixtyHex] ; > 0x60 ? 0xff : 0x00
PANDN xmm1, xmm2 ; <= 0x7a && > 0x60 -> lower case in s1
PANDN xmm5, xmm6 ; <= 0x7a && > 0x60 -> lower case in s2
PAND xmm1, [twentyHex] ; mask 0x20 from 0xff in lower case s1
PAND xmm5, [twentyHex] ; mask 0x20 from 0xff in lower case s2
PXOR xmm1, xmm0 ; and toggle it to upper case s1
PXOR xmm5, xmm4 ; and toggle it to upper case s2
PCMPEQB xmm5, xmm1 ; final compare
PMOVMSKB eax, xmm5 ; get the 16 sign bits to ax
cmp eax, 0xffff ; all bytes equal ?
jne notEqual
....
On todays processors on should also avoid using Repeated String
Instructions. See Software Optimization Guide for AMD Athlon� 64 and
AMD Opteron� Processors Chapter 8 Integer Optimizations Page 169.
Gerd
>
> --
> Jack Klein
> Home: http://JK-Technology.Com
> FAQs for
> comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
> comp.lang.c++ http://www.parashift.com/c++-faq-lite/
> alt.comp.lang.learn.c-c++
> http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
|
|
0
|
|
|
|
Reply
|
Gerd
|
9/13/2005 8:21:44 PM
|
|
wolfgang kern wrote:
> NONAME<spamtrap@crayne.org> asked:
>
>> I'm trying to use an assembly repeatting instruction to perform a
>> case insensitive string compare. This that possible? Or do I just
>> have to use the traditional loop and bitmask?
>
> By using REP ?
> NO.
> Except if you modify both strings before:
> ...
> REP CMPSB
>
> What's traditional ?
>
> mov cx,size_of
> Loop:
> mov al,[si]
> inc si
lodsb
> mov ah,[di]
> inc di
> and ax,0xdfdf ;or ax 0x2020 would do it also
> cmp al,ah
> jnz out ;*
> dec cx ;*
> jnz Loop ;*
loop Loop
> out:
>
> )* this lines can be replaced by LOOPE Loop (repeat lines until not
> equal), even it may not be faster on latest CPUs.
>
> REP,REPE,REPNE work only on single block instructions.
Hot d*mn!
I reinvented this wheel too!
but you may get false matches with special chars.
|
|
0
|
|
|
|
Reply
|
Mike
|
9/14/2005 9:45:59 AM
|
|
Mike Jones wrote:
[about..]
| I reinvented this wheel too!
| but you may get false matches with special chars.
Sure,
if L/Ucase compare/convert shall include all-ASCII characters
then additional filters (40..5a/60..7a) are needed.
__
wolfgang
|
|
0
|
|
|
|
Reply
|
wolfgang
|
9/14/2005 11:24:54 AM
|
|
--B=_H8082f536Pd3Hydro+j8EDZPQS018821
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
wolfgang kern wrote:
> Mike Jones wrote:
>
> [about..]
> | I reinvented this wheel too!
> | but you may get false matches with special chars.
>
> Sure,
> if L/Ucase compare/convert shall include all-ASCII characters
> then additional filters (40..5a/60..7a) are needed.
There are two possible ways to do a case-insensitive compare (or search):
Either you convert both samples to the same case, via a upcase[] lookup
table, or you convert one of them to the inverse case, via another
lookup table, and then check both versions.
The first approach needs two table lookups, while the other needs two
compares.
The really interesting problem is case-ignore search though, in
particular Boyer-Moore:
For 8-bit characters my best code monocases the search pattern during
setup, then uses a case-ignore version of the skip table, where all
possible matches are encoded as length zero.
When you find a possible starting (actually ending!) char, then you do a
regular case-ignore compare on the remainder, using a lookup table to
convert each candidate char to the same case as the pattern.
Going to 16+ bits (unicode) make the problem much more interesting!
The fastest approach I've found is to generate two versions of the
search pattern during setup, one all uppercase and one all lowercase,
then compare the target against both of them at the same time!
This is much faster than doing an inline function call to uppercase each
character during the comparison process. :-)
Terje
--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
--B=_H8082f536Pd3Hydro+j8EDZPQS018821
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
***********************************************************************
NOTICE: This e-mail transmission, and any documents, files or previous
e-mail messages attached to it, may contain confidential or privileged
information. If you are not the intended recipient, or a person
responsible for delivering it to the intended recipient, you are
hereby notified that any disclosure, copying, distribution or use of
any of the information contained in or attached to this message is
STRICTLY PROHIBITED. If you have received this transmission in error,
please immediately notify the sender and delete the e-mail and attached
documents. Thank you.
***********************************************************************
--B=_H8082f536Pd3Hydro+j8EDZPQS018821--
|
|
0
|
|
|
|
Reply
|
Terje
|
9/14/2005 1:41:36 PM
|
|
Terje Mathisen
| Going to 16+ bits (unicode) make the problem much more interesting!
Yeah, this would need to add some lines :)
| The fastest approach I've found is to generate two versions of the
| search pattern during setup, one all uppercase and one all lowercase,
| then compare the target against both of them at the same time!
| This is much faster than doing an inline function call to uppercase each
| character during the comparison process. :-)
Even I (the library independent hexfreak) may not see it as the fastest
way, Yes Terje, if I would work with the 'Oh so common' HLL/Lib-approach,
I'd see your suggestion as to be the fastest way too :)
__
wolfgang
|
|
0
|
|
|
|
Reply
|
wolfgang
|
9/14/2005 4:50:45 PM
|
|
|
7 Replies
162 Views
(page loaded in 2.288 seconds)
|