Dealer Demo, part 4: Some Forth at last
So we spent the last two posts working on tooling to reverse engineer the Dealer Demo, and got the bootloader disassembled for study. Now we can start disassembling the Forth kernel. Let's start with "-dis d00 8":
0D00: EA ORIG NOP 0D01: 4C 8D 1C JMP $1C8D 0D04: EA NOP 0D05: 4C A1 1C JMP $1CA1
Looking at the fig-Forth listing, this looks like the start of the kernel. 1C8D will be COLD+2, the cold start routine, and 1CA1 will be WARM, the warm start routine.
The next 13 words should be data words according to the fig-Forth listing, configuration parameters used where initializing the kernel:
0D08: 00 00 .WORD 0 0D0A: 01 00 .WORD 1 0D0C: 00 00 .WORD 0 0D0E: 7F 00 .WORD $7F 0D10: 80 04 .WORD $0480 0D12: EC 00 .WORD $EC 0D14: FF 01 .WORD $01FF 0D16: 00 01 .WORD $0100 0D18: 1F 00 .WORD $1F 0D1A: 00 00 .WORD 0 0D1C: FB 86 .WORD $86FB 0D1E: FB 86 .WORD $86FB 0D20: 01 1C .WORD $1C01
Of these, we can identify $EC as the TOS (top of stack), $86FB is the pointer to the top of memory (TOP), $1C01 should be the initial vocabulary pointer (VL0).
The next bit of data should be the first definition, the LIT word. In fig-Forth, all definitions start with a byte declaring the length of the word in the bottom 5 bits, then the word itself with the high bit set on the last letter, although in rare cases there will be an extra byte to avoid having the definition fall on a $..FF boundary (why will become clear later). This is called the name field, and its address is referenced as the NFA. There are then two words, the first is the link field which points to the last definition (and is called the LFA). The next word is the parameter field (at the PFA), which points to the code to run for this word (this is the value that must not be at $..FF). For primitive words, it is usually just .WORD *+2, meaning that the code immediately follows the definition.
Knowing this we could slowly disassemble the code definition by definition by hand, but we want our tools to do as much of this as possible for use, so let's implement a definition decompiler.
The first thing we need is a get_string method from a buffer. It should output something that can be appended to a a .BYTE directive. In most cases, surrounding the values with single quotes should be adequate, but recall that the last byte in a definition string will have it's high bit set and thus be unprintable, so let's handle that case in the most general fashion possible so we can use the code for other purposes later:
sub get_string { my ($buff) = @_; my ($string, $inquote) = ('', 0); foreach my $val (unpack "C*", $buff) { my $printable = $val >= 0x20 && $val <= 0x7e; ($inquote = 1, $string .= "'") if ($printable && !$inquote); ($inquote = 0, $string .= "',") if (!$printable && $inquote); $string .= sprintf "%c", $val if $printable && $val != 0x27; $string .= sprintf "''", $val if $val == 0x27; $string .= sprintf "\$%02X", $val if $val >= 10 && !$printable; $string .= $val if $val < 10; $string .= ',' if !$printable; } $string .= "'" if $inquote; $string =~ s/,$//; $string; }
OK, now let's define def_buf, the routine that will parse a definition:
sub def_buf { my ($buff, $addr, $size) = @_; my $first = unpack "C", substr($buff, 0, 1); my $nfaSize = $first & 0x3f; if (($first & 0x80) == 0 || $nfaSize == 0 || $nfaSize > 0x1f) { forth_buf($buff, $addr, $size); return; } $nfaSize += 1 if (($addr + $nfaSize + 3) & 0xff) == 0xff; # handle CFA = ..ff my $string = get_string(substr($buff, 1, $nfaSize)); my $count = $nfaSize + 1; $string = sprintf "%s.BYTE \$%02X,%s", get_label($addr), $first, $string; multi_buf($buff, $addr, $count, $string); forth_buf(substr($buff, $count), $addr + $count, $size - $count); } sub def { my ($buff, $addr, $size) = read_img(@_); def_buf($buff, $addr, $size); }
This looks for the length byte first, if the value found doesn't look like a length byte, we go to our general parser called forth_buf (which mostly just dumps .WORDs of data, more on that shortly). It then adjusts the size if we're at the $..FF boundary, gets the string, passes it to multi_buf to output, and then resumes in the general forth_buf parser.
So what does forth_buf look like:
sub print_word { my ($buff, $addr, $i) = @_; my ($b1, $b2) = unpack "C*", substr($buff, $i, 2); my $val = $b1 | ($b2 << ; my $sval = sprintf "\$%04X", $val; $sval = sprintf "\$%02X", $val if $val < 256; $sval = $names->{$val} if exists $names->{$val}; $sval = sprintf "\$%01X", $val if $val < 16; $sval = $val if $val < 10; $sval = '*+2' if $addr + $i + 2 == $val; my $string = sprintf "%s.WORD %s\n", get_label($addr + $i), $sval; print sb($addr + $i, $b1, $b2), $string; $val; } sub forth_buf { my ($buff, $addr, $size) = @_; my ($val, $val1) = (-1, -1); for (my $i = 0; $i + 2 < $size; ) { $val1 = $val; $val = print_word($buff, $addr, $i); $i += 2; if ($addr + $i == $val) { disasm_buf(substr($buff, $i), $addr + $i, $size - $i); last; } } } sub forth { my ($buff, $addr, $size) = read_img(@_); forth_buf($buff, $addr, $size); }
It calls print_word to print the word sensibly. When the word is *+2, we switch to disasm_buf, otherwise we call print_word again on the next word.
Now to hook it up into main:
elsif ($opt =~ /^\-def/) { def(@_); } else { forth(@_); }
So now let's run "-def d22"
0D22: 83 4C 49 .BYTE $83,'LI',$D4 0D25: D4 0D26: 00 00 .WORD 0 0D28: 2A 0D .WORD *+2 0D2A: B1 F8 LDA ($F8),Y 0D2C: 48 PHA 0D2D: E6 F8 INC $F8 ...
This exactly matches up to the fig-Forth listing, but with some missing constants. It looks like $F8 is the interpreter pointer (IP), so we need to add the following to the listing:
=00EC TOS = $EC =00F0 N = $F0 =00F8 IP = N+8 =00FB W = IP+3 =00FD UP = W+2 =00FF XSAVE = UP+2
We can now start disassembling and inserting the labels as we go along from the fig-Forth listing to get a more readable listing.
Continuing disassembling we see the PUSH, PUT, NEXT, CLIT and SETUP routines. The first difference between Dealer Demo and the published fig-Forth can now be seen. The NEXT routine doesn't have a JSR TRACE call and all the associated tracing code has been omitted. This is to be expected, since that code is there for the initial debugging of the kernel during bootstrapping, there's no need to keep it once the Forth is up an running.
There is another small difference, the single byte literal command CLIT is implemented, but the name and link for it have been dropped so you can't use it. Nonetheless we'll see it used in the code later, so how it got removed is a bit of a mystery.
NEXT is the most important routine in all of Forth, so it's helpful to study it for a moment.
0D42: A0 01 NEXT LDY #1 0D44: B1 F8 LDA (IP),Y 0D46: 85 FC STA W+1 0D48: 88 DEY 0D49: B1 F8 LDA (IP),Y 0D4B: 85 FB STA W 0D4D: 18 CLC 0D4E: A5 F8 LDA IP 0D50: 69 02 ADC #2 0D52: 85 F8 STA IP 0D54: 90 02 BCC L54 0D56: E6 F9 INC IP+1 0D58: 4C FA 00 L54 JMP W-1
How does this work? First, we move the value pointed to by the interpreter pointer (IP) into the W register. Then we increment IP by 2 and then JMP to W-1. W-1 was setup with the indirect jump opcode at startup (6C), so this immediately jumps again to the value at the address we just picked up from the instruction pointer. So Forth basically uses this NEXT routine to automatically jump to addresses held in memory one after the other. In a way it executes code a bit like the 6502, except there are no opcodes, every operation is a 2-byte pointer to an address in memory, which itself points to the code we need to execute. This implementation style for Forth is known as indirect threaded code.
Let's work through an example. Suppose the IP is currently pointing to an address defined with .WORD LIT (i.e. the code for a literal value). LIT equals $0D28, and the value at that address is $0D2A. So when NEXT is called, it copies the value $0D28 into the W register, increments IP by 2, and then calls JMP W-1. The next instruction is therefore JMP ($0D28), which sets the program counter to $0D2A and continues running LDA (IP),Y, et cetera. The code for LIT itself consumes the next word and puts it on the data stack, and increments IP by 2, then falls through to NEXT to fetch the next word.
It's now hopefully obvious why no definition's parameter field can be on address $..FF. JMP (xxxx) on the 6502 has a bug where it will incorrectly fetch the destination when the address crosses the page boundary, and the NEXT routine relies on JMP (xxxx) to function. This isn't a problem though, we just need to detect when this is about to happen and move the definition by one byte, which is why the name string sometime has an extra byte.
Continuing our disassembly, we find the next few words:
-def WORD NFA PFA same as fig-Forth? d74 EXECUTE L75 EXEC Yes d8d BRANCH L89 BRAN Yes dab 0BRANCH L107 ZBRAN Yes dcd (LOOP) L127 PLOOP Yes dfc (+LOOP) L154 PPLOO Yes e36 (DO) L185 PDO Yes
In each case we have to add at least a couple of labels to the list (one at the NFA, and then one two lines later at the PFA), more if there are branches in the code. In a few cases, we need to revert a symbol inserted to better match the fig-Forth listing, but largely the tool we have is doing much of the work once we recognize the start of a definition.
In all of these cases, the listing matches the implementation in the original fig-Forth listing.
As we've been disassembling, we have also been modifying set_name to cover more cases. In particular W-1, N-1, N+1, N+2, ..., N+7, NEXT+2, and BRAN+2 should be recognized, so we've add the following lines to set_name:
$names->{$val+2} = "$label+2" if grep { $_ eq $label } qw/NEXT BRAN/; $names->{$val-1} = "$label-1" if grep { $_ eq $label } qw/N W/; if ($label eq 'N') { $names->{$val+$_} = "$label+$_" for (2..7); }
That's enough for now, I'm attaching our progress and we'll pick it up again next time.
dd4.zip
0 Comments
Recommended Comments
There are no comments to display.