How to disassemble PPC 'signed immediate' operands?

I am writing my own replacement for the 'dis' command in the PPC Mac
Open Firmware command interpreter, since the built-in version fails to
disassemble the code; it merely dumps it unprocessed on screen.

I would be interested to hear from any veterans at reading disassembled
ppc code if they have any strong likes or dislikes about how other
disassemblers format their output.

More urgently, I notice that the other disassemblers I have seen seem to
output their "signed immediate" (16-bit integer) operands unsigned. Is
that a result of laziness, an oversight, or is there some valid reason
why it should be done that way?

Thanks,
             Tink
tinkerer (32)
4/10/2008 1:04:19 PM
comp.sys.powerpc.tech

Tinkerer wrote:
> likes or dislikes about how other disassemblers format their output.

See PowerISA V2.05. (No real world examples there, unfortunately.)
You could also adopt GCC asm output. Beware of operand order!

> More urgently,  I notice the other disassemblers I have seen seem to
>  output their "signed immediate" (16-bit integer) operands unsigned.

> Is that a result of laziness, an oversight, or is there some valid 
> reason why it should be done that way?

Very likely this is an oversight.

IMO, it is important to remind users that signed 16-bit values are
sign-extended to the full register length (32 or 64 bits).

So e.g. on a 64-bit PPC machine, 0x7fff becomes 0x0000000000007fff, but
0x8000 becomes 0xffffffffffff8000.

"li r4, 0x8000" (wrong, unsigned output) is therefore misleading - it
gives you the impression that r4 contains 0x0000000000008000 after the 
operation.

"li r4, -32768" (or "li r4, -0x8000"), however, shouts "remember sign
extension" at you, which is IMHO much better.
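
To make the sign extension concrete, here is a minimal Python sketch.
The function names sign_extend16 and as_signed16 are made up for
illustration, and the 64-bit register width is taken from the example
values above; treat it as a sketch, not as how any particular
disassembler does it.

```python
def as_signed16(raw: int) -> int:
    """Interpret the low 16 bits of raw as a two's-complement value."""
    raw &= 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw


def sign_extend16(raw: int, reg_bits: int = 64) -> int:
    """Register image after a 16-bit immediate is sign-extended.

    reg_bits is an assumption: 64 for a 64-bit PPC, 32 for a 32-bit one.
    """
    return as_signed16(raw) & ((1 << reg_bits) - 1)


for raw in (0x7FFF, 0x8000):
    # Show the raw field, the resulting register contents, and the
    # signed form a disassembler could print instead.
    print(f"0x{raw:04x} -> 0x{sign_extend16(raw):016x}"
          f"  (li r4, {as_signed16(raw)})")
```

Printing the operand through something like as_signed16 gives the
"-32768" form, which matches what actually ends up in the register.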

HTH -H
Hagen
4/14/2008 8:12:18 AM
Hagen Patzke <HPatzke@hotmail.com> wrote:

> Tinkerer wrote:
> > likes or dislikes about how other disassemblers format their output.
> 
> See PowerISA V2.05. (No real world examples there, unfortunately.)
> You could also adopt GCC asm output. Beware of operand order!

I had no idea those docs existed. I have been working from a 1995 book I
found, Gary Kacmarcik's "Optimizing PowerPC Code: Programming the PowerPC
Chip in Assembly Language", which contains hardly any examples of
assembly code. I copied the PPC instruction binary field maps into a
FileMaker Pro database, which I used to reveal patterns and calculate
opcodes.

At the very least it will be original :-)

> > More urgently,  I notice the other disassemblers I have seen seem to
> >  output their "signed immediate" (16-bit integer) operands unsigned.
> 
> > Is that a result of laziness, an oversight, or is there some valid 
> > reason why it should be done that way?
> 
> Very likely this is an oversight.
> 
> IMO, it is important to remind users that signed 16bit values are 
> sign-extended to the full register length (32/64bit).
> 
> So e.g. on a 64bit PPC machine, 0x7fff becomes 0x0000000000007fff, but
> 0x8000 becomes 0xffffffffffff8000.
> 
> "li r4, 0x8000" (wrong, unsigned output) is therefore misleading - it
> gives you the impression that r4 contains 0x0000000000008000 after the
> operation.
> 
> "li r4, -32768" (or "li r4, -0x8000"), however, shouts "remember sign
> extension" at you, which is IMHO much better.

I agree. 

The disassembler is almost finished - just a bit of tidying up the
source code and fine-tuning output remains to be done.

I think I'll redo the signed immediate output so that any value from
-9 to +9 is printed as a single decimal digit, any value outside that
range is prefixed with 0x, and the minus sign, if there is one, is moved
ahead of the 0x, as in your example.
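
The rule described above can be sketched in a few lines of Python (my
actual code is Forth; this is only an illustration). The name
format_simm is hypothetical, and lower-case hex digits are an arbitrary
choice.

```python
def format_simm(raw: int) -> str:
    """Format a 16-bit signed immediate field for disassembly output.

    Values from -9 to 9 print as a single decimal digit; anything else
    prints in hex with the minus sign (if any) ahead of the 0x prefix.
    """
    value = raw & 0xFFFF
    if value & 0x8000:          # sign bit set: negative value
        value -= 0x10000
    sign = "-" if value < 0 else ""
    magnitude = abs(value)
    if magnitude <= 9:
        return f"{sign}{magnitude}"
    return f"{sign}0x{magnitude:x}"


for raw in (0x0005, 0xFFFF, 0x000A, 0x7FFF, 0x8000):
    print(f"field 0x{raw:04x} -> {format_simm(raw)}")
```

With this rule the field 0x8000 comes out as -0x8000 and 0xFFFF as -1.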

I originally liked the idea of the number of digits reflecting the
register size the operand was sourced from. While that looks good for
high memory addresses, I now find it doesn't look so good for low
positive numbers where most of the digits are zeroes.

Cheers
tinkerer
4/14/2008 4:05:04 PM
>Hagen Patzke <HPatzke@hotmail.com> wrote:
>> See PowerISA V2.05. (No real world examples there, unfortunately.)

Tinkerer wrote:
> I had no idea those docs existed. 

IBM has a lot of resources for the Power Architecture:
http://www-03.ibm.com/technology/ges/semiconductor/power/index.html
http://www.ibm.com/developerworks/views/power/downloads.jsp

PowerISA V2.05 can be found here, at the Power.org site:
(http://www.power.org/resources/reading/)
http://www.power.org/resources/reading/PowerISA_V2.05.pdf

> I copied the ppc instruction binary field maps into a
> filemaker pro database which I used to reveal patterns and calculate
> opcodes.

Sounds a bit like my eighties' procedure for a 6502 disassembler. :-)
Except I had to do it by hand - but then the processor is much simpler.

> At the very least it will be original :-)

At the very least you will have a very good understanding of the opcode 
structure, instruction set, and even optimizations. Good!

The Instruction Set Architecture (ISA) documentation will help you to 
verify if you really found the correct structures. :-)

> The disassembler is almost finished - just a bit of tidying up the
> source code and fine-tuning output remains to be done.

If you have a PC with Win[95..Vista], you can compare your disassembler 
output with the one from a debugger demo. Download "Simulator for 
PowerPC" from http://www.lauterbach.com/download.html

> I originally liked the idea of the number of digits reflecting the
> register size the operand was sourced from. While that looks good for
> high memory addresses, I now find it doesn't look so good for low
> positive numbers where most of the digits are zeroes.

I agree. Too much visual clutter. GCC output for the GNU Assembler 
("Gas"), on the other hand, IMHO is extremely terse.

As said earlier, you have to be very careful with operand ordering.
Unfortunately, in the ISA assembly the ordering of source/destination in 
the assembly mnemonic is not completely consistent.

Good luck! -H
Hagen
4/15/2008 1:33:45 PM
On Apr 15, 11:33 pm, Hagen Patzke <HPat...@hotmail.com> wrote:
> >Hagen Patzke <HPat...@hotmail.com> wrote:

Apologies for not answering sooner. Your reply never appeared on my
network's news server.

> IBM has a lot of resources for the Power Architecture:
> http://www-03.ibm.com/technology/ges/semiconductor/power/index.html
> http://www.ibm.com/developerworks/views/power/downloads.jsp
>
> PowerISA V2.05 can be found here, at the Power.org site:
> (http://www.power.org/resources/reading/)
> http://www.power.org/resources/reading/PowerISA_V2.05.pdf

Definitely a must-read, but maybe I need to give priority to improving
my Forth technique in the meantime :-)
My disassembler seems to work, but I still haven't had a chance to
actually use it (as opposed to running a few systematic tests on it).
I wrote it in beginner Forth, mostly straight out of Leo Brodie's
"Starting FORTH", a book I still haven't had a chance to finish reading.

> > I copied the ppc instruction binary field maps into a
> > filemaker pro database which I used to reveal patterns and calculate
> > opcodes.
>
> Sounds a bit like my eighties' procedure for a 6502 disassembler. :-)
> Except I had to do it by hand - but then the processor is much simpler.

My favourite micro back then was the Commodore 128. I added equivalent
Z80 capabilities to its built-in 6502 Machine Language Monitor. (It
had both CPUs on board, but I doubt many people ever realized they
were capable of working co-operatively within the same memory map.)

> > At the very least it will be original :-)
>
> At the very least you will have a very good understanding of the opcode
> structure, instruction set, and even optimizations. Good!

Yes. Even before reading the docs you have pointed out, I acquired the
ability to see patterns which initially appeared way too complex.
Looking at my ppc disassembler again now, I could probably replace all
the individualized operand extraction functions with two or three
general-purpose ones. The way they use the stack is a natural for
recursion.
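
A general-purpose extractor along those lines might look like the
following Python sketch (the real disassembler is in Forth; Python is
used here only for illustration). The helper names field and
signed_field are hypothetical. The bit numbering assumed is the IBM
convention, in which bit 0 is the most significant bit of the 32-bit
instruction word.

```python
def field(insn: int, first: int, last: int) -> int:
    """Extract bits first..last (inclusive) of a 32-bit instruction.

    Uses IBM bit numbering: bit 0 is the MSB, bit 31 the LSB.
    """
    width = last - first + 1
    return (insn >> (31 - last)) & ((1 << width) - 1)


def signed_field(insn: int, first: int, last: int) -> int:
    """Same as field(), but interpret the result as two's complement."""
    width = last - first + 1
    value = field(insn, first, last)
    if value & (1 << (width - 1)):      # top bit of the field is set
        value -= 1 << width
    return value


insn = 0x3880FFFF                       # addi r4,0,-1 (i.e. li r4,-1)
print("opcode:", field(insn, 0, 5))     # primary opcode
print("RT:    ", field(insn, 6, 10))
print("RA:    ", field(insn, 11, 15))
print("SI:    ", signed_field(insn, 16, 31))
```

Two helpers like these cover every fixed-position field; only the
(first, last) pairs differ from one instruction form to the next.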

> The Instruction Set Architecture (ISA) documentation will help you to
> verify if you really found the correct structures. :-)
>
> > The disassembler is almost finished - just a bit of tidying up the
> > source code and fine-tuning output remains to be done.
>
> If you have a PC with Win[95..Vista], you can compare your disassembler
> output with the one from a debugger demo. Download "Simulator for
> PowerPC" from http://www.lauterbach.com/download.html

I'll give it a try next time I'm on one of those infernal machines
after I get past all the automatic updates for things I never
installed and haven't a clue what they are.

> > I originally liked the idea of the number of digits reflecting the
> > register size the operand was sourced from. While that looks good for
> > high memory addresses, I now find it doesn't look so good for low
> > positive numbers where most of the digits are zeroes.
>
> I agree. Too much visual clutter. GCC output for the GNU Assembler
> ("Gas"), on the other hand, IMHO is extremely terse.

One thing I am not satisfied with is the problem of disassembling code
temporarily located in non-aligned blocks of memory. If I give the
branch-target addresses the same 2-bit offset as the address of the
instruction being disassembled, the calculated branch targets remain
correct within a block being moved around in memory, but can no longer
be guaranteed correct when accessing properly installed code. And vice
versa. The disassembler was intended as a tool for exploration, so I
wanted it to disassemble anything in memory, regardless of where it
might be found. In the end, I sacrificed that design goal for accuracy.
(The built-in 'dis' will "disassemble" from non-aligned addresses, but
it just dumps the next 4 bytes without actually disassembling anything.)
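
For illustration, here is a rough Python sketch of the target
calculation for the unconditional I-form branches (b, ba, bl, bla,
primary opcode 18). The name branch_target is hypothetical and the
32-bit address wrap is an assumption. A relative target is the
instruction's own address plus the sign-extended displacement, so it
inherits whatever 2-bit offset the instruction address has; an absolute
target (AA bit set) does not.

```python
def branch_target(insn: int, addr: int) -> int:
    """Target address of an I-form branch located at addr."""
    insn &= 0xFFFFFFFF
    if (insn >> 26) != 18:
        raise ValueError("not an I-form branch (primary opcode 18)")
    disp = insn & 0x03FFFFFC            # LI field plus two implied zero bits
    if disp & 0x02000000:               # sign bit of the 26-bit displacement
        disp -= 0x04000000
    absolute = (insn >> 1) & 1          # AA bit: target is not addr-relative
    base = 0 if absolute else addr
    return (base + disp) & 0xFFFFFFFF   # assumes 32-bit addresses


# 0x48000008 encodes "b .+8"; 0x48000102 encodes "ba 0x100".
for addr in (0x1000, 0x1002):           # aligned, then off by 2 bytes
    print(f"at 0x{addr:04x}: b  -> 0x{branch_target(0x48000008, addr):04x}, "
          f"ba -> 0x{branch_target(0x48000102, addr):04x}")
```

At the unaligned address the relative branch reports 0x100a, which is
right for a copy sitting at that offset and wrong for properly
installed code.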

> As said earlier, you have to be very careful with operand ordering.
> Unfortunately, in the ISA assembly the ordering of source/destination in
> the assembly mnemonic is not completely consistent.

I had previously noticed in particular that fields B and C are always
swapped, T and A are never swapped. However field T ("Target") is
sometimes labelled S ("Source") and fields S and A are sometimes
swapped - and sometimes they're not (!)
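
A small Python sketch of the S/A swap, using two D-form instructions
whose layouts I am sure of: addi is written "addi RT,RA,SI" with the
operands in field order, while ori is written "ori RA,RS,UI", with the
second register field printed first. The function name disasm_dform is
hypothetical, and the sketch ignores the special case where RA = 0 in
addi means the literal value 0.

```python
def disasm_dform(insn: int) -> str:
    """Disassemble addi (opcode 14) or ori (opcode 24) only."""
    opcode = (insn >> 26) & 0x3F
    reg1 = (insn >> 21) & 0x1F          # bits 6-10: RT for addi, RS for ori
    reg2 = (insn >> 16) & 0x1F          # bits 11-15: RA in both cases
    imm = insn & 0xFFFF
    if opcode == 14:                    # addi RT,RA,SI: field order kept
        simm = imm - 0x10000 if imm & 0x8000 else imm
        return f"addi r{reg1},r{reg2},{simm}"
    if opcode == 24:                    # ori RA,RS,UI: registers swapped
        return f"ori r{reg2},r{reg1},0x{imm:x}"
    raise ValueError(f"opcode {opcode} not handled in this sketch")


print(disasm_dform(0x38640001))         # fields 3,4 -> addi r3,r4,1
print(disasm_dform(0x60830001))         # fields 4,3 -> ori r3,r4,0x1
```

The register fields of the two words hold (3, 4) and (4, 3), yet both
lines print r3 as the destination.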

> Good luck! -H

Cheers. Good to know I'm not the only one interested in such
mind-tickling esoterica. :-)

tinkerer
4/21/2008 4:05:44 AM