investigating ppc code defs on Open Firmware Macs

Now that a Public Domain 'asm' and 'dasm' have been made available for
PowerPC Macintosh Open Firmware, we finally get a look under the hood.

"dis.of", the PD source file, also patches the Apple-supplied 'see' word,
letting it use 'dis' or 'dasm' to disassemble the code definitions it
previously kept out of sight.

As I recall, the Open Firmware command

0 > see cell+

formerly responded with only

cell+  ok 

Let's take a look inside some of the common but previously inscrutable
forth code defs as implemented for my own G4 Macintosh, with the benefit
of these newly available commands:

0 > boot hd:\dis.of

0 > see cell+ code cell+  ( 1st two words = input, 2nd two = output)
ff8495a0: 3a940004   addi       r20,r20,4
ff8495a4: 4e800020   bclr       0x14,0

The second ppc instruction means "branch always" to the address in the
ppc's Link Register. For now, think of "bclr 0x14,0" as the code def
equivalent of the more familiar ';' colon def terminator.

As for the rest of that two-line 'cell+' definition, all it seems
to do is add 4 to the number in the ppc's r20 register.
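
If you want to check that reading away from the Mac, here is a minimal Python sketch that picks the two instruction words of 'cell+' apart and replays the addi on a pretend register file. The helper names (split_fields, execute_addi) are my own inventions for illustration, not anything from "dis.of"; the field positions are the standard PowerPC ones.

```python
def split_fields(word):
    """Return (primary opcode, field A, field B, low 16 bits) of a 32-bit word."""
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, word & 0xFFFF

def execute_addi(regs, word):
    """Apply one addi instruction word to a list of 32 register values."""
    opcode, rd, ra, imm = split_fields(word)
    if opcode != 14:
        raise ValueError("not an addi")
    if imm & 0x8000:                      # immediate is a signed 16-bit value
        imm -= 0x10000
    base = 0 if ra == 0 else regs[ra]     # rA field of 0 means literal zero
    regs[rd] = (base + imm) & 0xFFFFFFFF

# First word of cell+: addi r20,r20,4
regs = [0] * 32
regs[20] = 0x44
execute_addi(regs, 0x3A940004)
print(hex(regs[20]))                      # 0x48

# Second word of cell+: bclr 0x14,0
opcode, bo, bi, low = split_fields(0x4E800020)
extended = (low >> 1) & 0x3FF             # extended opcode; 16 selects bclr
print(opcode, hex(bo), bi, extended)      # 19 0x14 0 16
```

The 0x14 and 0 that come out of the second word are the same two operands the disassembler prints after "bclr".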

Let's review exactly what it does.
0 >   44  .s -> 44 <- Top  ok
1 > cell+ .s -> 48 <- Top  ok

It appears Open Firmware thinks r20 is the top of the forth parameter
stack. Surely there's more to it than that? We should be able to vanquish
that simplistic notion with a little code def of our own.

0 > 0 4+ . 4  ok
0 > 0 5+ . 
5+, unknown word
0 > 

code 5+
asm addi r20,r20,5
c;       \ automatically appends the requisite "bclr 0x14,0"

The way your code gets echoed back during assembly depends on whether you
have 'asm's def in the "dis.of" source file set to end lines with 'cr' or
'same-line'. (It's easy to change from the Mac desktop. 'cr' leaves both
input and output visible, whereas 'same-line' gives a more compact
disassembly of the instructions as reformatted by dasm.)

This is how it echoes back when 'asm' is set to 'cr':

0 > code 5+
0 > asm addi r20,r20,5 
ff9e7e90: 3a940005   addi       r20,r20,5

In either case you can easily confirm your code was properly installed
in the forth dictionary:

0 > see 5+
           code 5+
ff9e7e90: 3a940005   addi       r20,r20,5
ff9e7e94: 4e800020   bclr       0x14,0

Looks like c; appended the proper ending.
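
For anyone who wants to predict the hex column before typing a definition in, the following Python sketch builds the same two words by hand. It is only an illustration of the addi encoding under the usual PowerPC field layout; encode_addi is a hypothetical helper of mine and says nothing about how 'asm' itself is implemented.

```python
BLR = 0x4E800020   # the "bclr 0x14,0" word seen at the end of every code def above

def encode_addi(rd, ra, simm):
    """Encode addi rD,rA,SIMM (primary opcode 14) as a 32-bit word."""
    if not -0x8000 <= simm <= 0x7FFF:
        raise ValueError("immediate does not fit in 16 signed bits")
    return (14 << 26) | (rd << 21) | (ra << 16) | (simm & 0xFFFF)

body = [encode_addi(20, 20, 5), BLR]
for word in body:
    print(f"{word:08x}")
# 3a940005
# 4e800020
```

The same helper gives 3a940004 for the first word of 'cell+', matching its disassembly.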

Now to try it out. Expect an instant computer lock-up if the 'bclr 0x14,0'
is missing from the end.

0 > 2 5+ . 7  ok
0 > 4 5+ . 9  ok

It worked! r20 must indeed be the top of the forth parameter stack.

Let's see what happens if we use a forth word which increases the number
of items on the parameter stack:

0 > see 2 
          code 2
ff849388: 969ffffc   stwu       r20,-4(r31)
ff84938c: 3a800002   addi       r20,0,2
ff849390: 4e800020   bclr       0x14,0
0 > 

There's the literal '2' at the end of the second instruction, but it
looks like it has been encapsulated in some heavy duty ppc code. To help
understand, recall what happens when you enter '2' at a forth prompt.

0 > showstack  ok
-> <- Empty 3  ok
-> 3 <- Top 2  ok
-> 3 2 <- Top noshowstack  ok

Something has to put that 2 on the stack. Looks like '2' can do it
itself! Those forth experts aren't being pretentious when they talk about
"compiling a number". They mean it is being embedded in code which places
its nominal value on the stack whenever it is executed.

We have already noticed '2' does that by adding the ppc instruction's
immediate data operand 2 to the 0 embedded in another of its fields
(in an addi, a 0 in that field means the literal value zero, not the
contents of register r0). It then stores the result in r20 - the top of
the parameter stack. But what has happened to the '3' previously in the
r20 register?

2 > clear
0 > see 2 
           code 2
ff849388: 969ffffc   stwu       r20,-4(r31)
ff84938c: 3a800002   addi       r20,0,2
ff849390: 4e800020   bclr       0x14,0
0 > 

First we see it pushing r20 onto a real stack in memory prior to loading
r20's new value from the immediate data field of the addi instruction.
'stwu' is a powerful instruction designed to automate stack and frame
pointer manipulation. The 'stw' part refers to the 32-bit word in r20.
The "u" stands for "update" and applies to r31. Thus the forth parameter
stack (the one you see in response to typing .s) is really a hybrid
stack, r20 acting like a single-cell input/output cache for a real memory
stack which employs r31 as its stack pointer. This being the case,
'drop', a forth word which ends up with one less item on the stack than
when it started, should be expected to exhibit the same sequence of 
steps in reverse.
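
The push half of that description can be modelled in a few lines of Python. This is a sketch under my own assumptions: the Machine class, the dictionary standing in for memory, and the starting address 0x1000 are all made up for illustration; only the two-step sequence (stwu, then addi) comes from the disassembly of '2'.

```python
class Machine:
    """Toy model: r20 caches the top item, r31 points into a memory stack."""

    def __init__(self, stack_base=0x1000):   # hypothetical starting address
        self.r20 = 0
        self.r31 = stack_base
        self.mem = {}

    def stwu_r20(self):
        """stwu r20,-4(r31): store r20 at r31-4, then update r31 to that address."""
        ea = self.r31 - 4
        self.mem[ea] = self.r20
        self.r31 = ea

    def push_literal(self, n):
        """What the code def for a small number does: stwu, then addi r20,0,n."""
        self.stwu_r20()
        self.r20 = n

m = Machine()
m.push_literal(3)
m.push_literal(2)
print(m.r20, m.mem[m.r31])   # 2 3
```

After the two pushes the 2 sits in r20 and the 3 has moved out to the memory cell r31 points at, which is the hybrid arrangement described above.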

0 > see drop 
              code drop
ff848ec8: 829f0000   lwz        r20,0(r31)
ff848ecc: 3bff0004   addi       r31,r31,4
ff848ed0: 4e800020   bclr       0x14,0

Just as expected. The "s" ("store") in the instruction mnemonic has been
replaced by "l" ("load"), and the "u" ("update") dropped in favour of an
explicit "addi 4" to r31. The top item of the overall stack simply
vanishes: r20 is overwritten with the 2nd item (always pointed to by
r31), and r31 is then adjusted so that the original copy is excluded
from the memory component of the stack. lwzu could not be used to
update r31, because the zero offset required to read the top item of the
memory portion would leave r31 unchanged, so that item would still be
"on the stack" after it had been copied to r20. The top item on the
overall stack (as displayed by .s) would then appear to change value
rather than be removed, since the stack size would remain unaltered.
Don't despair if you can't make sense of all this. "stwu" and its
counterparts are
difficult to explain. There are lots of other attempts to do so on the
Web. You just have to keep puzzling over it and trying it out until it
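
One way to puzzle over it is to model both variants of 'drop' in Python and compare what .s would show. This is a simplified sketch, not the firmware's own bookkeeping: the addresses, the hypothetical stack base 0x1000, and the visible_stack helper (which treats every cell from r31 up to that base as a live item) are assumptions of mine.

```python
BASE = 0x1000   # hypothetical address just above the bottom memory cell

def drop_lwz_addi(r20, r31, mem):
    """The real 'drop': lwz r20,0(r31) followed by addi r31,r31,4."""
    r20 = mem[r31]
    r31 = r31 + 4
    return r20, r31

def drop_lwzu_zero(r20, r31, mem):
    """A would-be 'drop' using lwzu r20,0(r31): the update writes back r31+0."""
    ea = r31 + 0
    r20 = mem[ea]
    r31 = ea
    return r20, r31

def visible_stack(r20, r31, mem):
    """Items bottom to top: memory cells from BASE down to r31, then r20."""
    return [mem[a] for a in range(BASE - 4, r31 - 4, -4)] + [r20]

mem = {0xFFC: 7, 0xFF8: 3}          # stack holds 7 3 2, with the 2 cached in r20
print(visible_stack(2, 0xFF8, mem))                          # [7, 3, 2]
print(visible_stack(*drop_lwz_addi(2, 0xFF8, mem), mem))     # [7, 3]
print(visible_stack(*drop_lwzu_zero(2, 0xFF8, mem), mem))    # [7, 3, 3]
```

With lwz plus addi the 2 disappears and the stack shrinks; with the zero-offset lwzu the depth stays at three and the top item merely changes from 2 to 3.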

That was only a glimpse to get you started. Apart from a few "how to"
examples in the source file, 'asm' and 'dasm' use almost exclusively
what's known as the ppc's Basic Instruction Set. It's no breeze to
master. The best approach appears to be starting off by focussing on
a small high-use subset. That is exactly what you will be doing if you
carry on from here, initially disassembling the simpler standard forth
code definitions while improvising your own variations as you go.

If you have an adjacent computer connected to the Internet, the link


can be very helpful in getting information about the unfamiliar
instructions you encounter. Just change the instruction name between the
final slash and the ".html" to the instruction you are interested in.

Short of that, 'asm?' and 'asm-help' come in the "dis.of" source file.
They list an example of any/every instruction in the format required
by 'asm'.

A few more to try:

define a constant, eg

44 constant const

'see const' won't reveal how it is implemented because it's a constant,
not a code def.

To get around that:

0 > ' const dasm 
ff9e3f60: 969ffffc   stwu       r20,-4(r31)
ff9e3f64: 3a800044   addi       r20,0,0x44
ff9e3f68: 4e800020   bclr       0x14,0

It has exactly the same code template as '2'!

44445555 constant big-const
see big-const
' big-const dasm

The compiler chose an expanded template to accommodate the larger number.
Since all ppc instructions are exactly 32 bits long, there is obviously
not enough room to embed a number with 32 significant bits in a single
instruction. You can see by looking at the encoded instructions to the
left of their mnemonics, how two ppc instructions are needed to carry a
number with more than 16 bits. (The "s" in "addis" stands for "shifted".)
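
The split into two 16-bit halves can be sketched in Python. Note the assumption: this models an addis followed by an addi, where the low half is sign-extended and the high half may need a carry adjustment. I have not shown which instruction your firmware actually uses for the low half, so compare against your own disassembly of big-const; the function names here are hypothetical.

```python
def sign_extend16(x):
    """Interpret a 16-bit field as a signed value, as addi and addis do."""
    return x - 0x10000 if x & 0x8000 else x

def split_for_addis_addi(value):
    """Return (hi, lo) immediates so that addis then addi rebuild value."""
    lo = value & 0xFFFF
    hi = ((value - sign_extend16(lo)) >> 16) & 0xFFFF
    return hi, lo

def recombine(hi, lo):
    """What the pair computes: (hi shifted left 16) plus sign-extended lo."""
    return ((hi << 16) + sign_extend16(lo)) & 0xFFFFFFFF

hi, lo = split_for_addis_addi(0x44445555)
print(hex(hi), hex(lo), hex(recombine(hi, lo)))   # 0x4444 0x5555 0x44445555
```

For 44445555 the halves are simply 4444 and 5555. The carry adjustment only matters when the low half is 8000 or above, for example 1234abcd splits into 1235 and abcd.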

see dup
see 2dup
see 3dup
' 3dup dasm

What are those 'bl' ("branch-and-save-return-address-in-Link-Register")
instructions? To find out, enter their target addresses followed by
'.label' at the forth prompt, in the style of

ff848d50 .label

(The actual addresses are probably different on your computer. Copy them
from 3dup's disassembly).

Type 'see 3dup' again

Notice any correlation? 

Far from making things more complicated, having asm makes those
complexities quickly vanish.

tinkerer (32)
5/4/2009 12:29:25 AM
comp.sys.powerpc.tech