This is a really cool algorithm I found online, and I'm trying to
clean up the code and make it more useful. arg1 is the source address
of the plaintext (8 bytes) and arg2 is the destination address. delta,
dis, and n are constants of the algorithm. k is the 16-byte key.
There are 2 things I'd like to improve. First, I'm not sure it uses
the best register choice in several places. I'd like any thoughts you
might have on that. Second, and this ties in with the first, the key
value should really be a parameter. But since it requires 16 bytes, it
wouldn't fit in a register. I'd like to avoid breaking it up into 4
parts. But it seems that every register is already being used. Maybe
locals are the solution.
So, is there a better choice of registers? And would it make
sense just to set up 4 locals with each of the 4 dwords from the key
before the algorithm starts, and just use them as necessary?
delta equ 0x9e3779b9
dis equ 0xc6ef3720
n equ 32
section .data
k dd 123,456,789,112
section .text
; --------------------------------------
tea_encode:
BEGIN
mov eax,arg1 ; eax = source address
mov ebx,[eax+4] ; ebx = second dword of the block
mov eax,[eax] ; eax = first dword of the block
mov ecx,n ; n (count)
xor edx,edx ; sum
.enstart:
mov esi,ebx
mov edi,esi ; esi, edi = z
shl esi,4
shr edi,5
xor edi,esi
add edi,ebx
mov esi,edx
and esi,3
mov esi,[k+esi*4]
add esi,edx
xor esi,edi
add eax,esi
add edx,delta
mov esi,eax
mov edi,esi
shl esi,4
shr edi,5
xor edi,esi
add edi,eax
mov esi,edx
shr esi,11
and esi,3
mov esi,[k+esi*4]
add esi,edx
xor esi,edi
add ebx,esi
loop .enstart
mov edi,arg2
mov [edi+0*4],eax
mov [edi+1*4],ebx
END
; --------------------------------------
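For cross-checking the listing above, here is a minimal C sketch of what the loop appears to compute. The function names tea_encode_ref and tea_decode_ref and the TEA_-prefixed constant names are hypothetical, made up for this example. The key indexing (sum & 3 and (sum >> 11) & 3) matches the variant usually called XTEA. The decoder is an assumption: it is not in the listing, but it shows where dis would be used, since dis equals delta * n modulo 2^32. The key is passed as a pointer here, which is one way to make it a parameter without splitting it into 4 parts.

```c
#include <stdint.h>

/* Constants copied from the listing (delta, dis, n). */
#define TEA_DELTA 0x9e3779b9u
#define TEA_DIS   0xc6ef3720u  /* TEA_DELTA * 32 mod 2^32: decoder's starting sum */
#define TEA_N     32u

/* Hypothetical reference encoder mirroring the assembly loop.
   y ~ eax, z ~ ebx, sum ~ edx, loop counter ~ ecx. */
void tea_encode_ref(const uint32_t src[2], uint32_t dst[2], const uint32_t k[4])
{
    uint32_t y = src[0], z = src[1], sum = 0;
    for (uint32_t i = 0; i < TEA_N; i++) {
        y   += (((z << 4) ^ (z >> 5)) + z) ^ (sum + k[sum & 3]);
        sum += TEA_DELTA;
        z   += (((y << 4) ^ (y >> 5)) + y) ^ (sum + k[(sum >> 11) & 3]);
    }
    dst[0] = y;
    dst[1] = z;
}

/* Assumed matching decoder: runs the same steps in reverse order,
   starting from sum = TEA_DIS. */
void tea_decode_ref(const uint32_t src[2], uint32_t dst[2], const uint32_t k[4])
{
    uint32_t y = src[0], z = src[1], sum = TEA_DIS;
    for (uint32_t i = 0; i < TEA_N; i++) {
        z   -= (((y << 4) ^ (y >> 5)) + y) ^ (sum + k[(sum >> 11) & 3]);
        sum -= TEA_DELTA;
        y   -= (((z << 4) ^ (z >> 5)) + z) ^ (sum + k[sum & 3]);
    }
    dst[0] = y;
    dst[1] = z;
}
```

Encoding a block with the key from the listing (123, 456, 789, 112) and then decoding the result should give back the original 8 bytes, which makes a quick sanity check for any register reshuffling in the assembly version.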