Atari_Ace's Blog - Dealer Demo part 2, Let's make a disassembler



So to decompile the Dealer Demo, we need to start by peeking at the boot sectors to see how it starts.

 
Writing a disassembler is pretty easy. You need an array containing a definition for each of the 256 possible opcode values. The ones that aren't documented 6502 instructions will just map to .BYTE <value>, but the 151 valid ones will have a name and a mode. The mode determines how many operand bytes to read after the opcode (0, 1 or 2) and how to format the line. Print the line, move the pointer past the opcode and its operands, and repeat the process. In other words, something like so:
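A minimal sketch of that loop follows. It is not the original listing: the three-entry table, the mode names 'IMP', 'IMM' and 'ABS', the output spacing, and the simplified print_op (which returns its text and size instead of printing) are all assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;

# Operand bytes per addressing mode. 'ILL' and 'ACC' appear in the text;
# 'IMP', 'IMM' and 'ABS' are assumed names for this sketch.
my %operand_len = (ILL => 0, IMP => 0, ACC => 0, IMM => 1, ABS => 2);

# Tiny stand-in for the full 256-entry table.
my @opcodes;
$opcodes[0x00] = { name => 'BRK', mode => 'IMP' };
$opcodes[0xA9] = { name => 'LDA', mode => 'IMM' };
$opcodes[0x8D] = { name => 'STA', mode => 'ABS' };

# Simplified print_op: returns the text for the instruction at index $pc
# and the number of bytes it occupies (opcode plus operands).
sub print_op {
    my ($bytes, $pc, $org) = @_;
    my $op  = $opcodes[ $bytes->[$pc] ] || { name => '.BYTE', mode => 'ILL' };
    my $len = $operand_len{ $op->{mode} };
    # Fall back to .BYTE when the operand would run past the end of the data.
    if ($pc + $len > $#{$bytes}) {
        $op  = { name => '.BYTE', mode => 'ILL' };
        $len = 0;
    }
    my $hex = join ' ', map { sprintf '%02X', $_ } @{$bytes}[ $pc .. $pc + $len ];
    return (sprintf('%04X: %-8s %s', $org + $pc, $hex, $op->{name}), 1 + $len);
}

# Walk the data one instruction at a time; $org is the load address.
sub disassemble {
    my ($data, $org) = @_;
    my @bytes = unpack 'C*', $data;
    my $pc = 0;
    my @lines;
    while ($pc < @bytes) {
        my ($line, $used) = print_op(\@bytes, $pc, $org);
        push @lines, $line;
        $pc += $used;
    }
    return @lines;
}

print "$_\n" for disassemble(pack('C*', 0xA9, 0x52, 0x8D, 0x02, 0x03), 0x492);
```

The loop itself knows nothing about the 6502; it only trusts print_op to report how many bytes each line consumed.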
Here all the heavy lifting will be done by print_op, which relies on the definitions from the opcodes() function to know what to print.

What does opcodes look like?
Although it looks formidable, it really isn't. The function creates an array of 256 definitions, each starting out with a name of ".BYTE $xx" and a mode of 'ILL', for illegal. It then takes data from the $modes hash table to fill in the valid opcodes. For example, the value under the key 'ACC' contains '0aASL', which expands to $opcodes->[0x0a] = { name => 'ASL', mode => 'ACC' }: opcode 0x0a is "ASL" in accumulator mode (i.e. ASL A). In other words, I've represented the valid opcode information in a compact form, and used a little code to expand it into a more useful representation.
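The expansion can be sketched as below. This is an illustration under assumptions, not the original function: the table lists only a dozen opcodes, the mode names other than 'ACC', 'REL' and 'ILL' are invented, and packing the entries back to back as two hex digits plus a three-letter mnemonic is a guess based on the '0aASL' example.

```perl
use strict;
use warnings;

# Compact table: mode => run of 5-character entries (2 hex digits + mnemonic).
# Deliberately partial; a complete table would cover all valid opcodes.
my $modes = {
    ACC => '0aASL4aLSR2aROL6aROR',
    IMP => '00BRK78SEIf8SEDeaNOP',
    IMM => 'a9LDAc0CPY',
    REL => 'd0BNEf0BEQ',
};

sub opcodes {
    # Start with every opcode value marked illegal.
    my @opcodes;
    for my $i (0 .. 255) {
        $opcodes[$i] = { name => sprintf('.BYTE $%02X', $i), mode => 'ILL' };
    }
    # Overwrite the entries that the compact table defines.
    for my $mode (keys %$modes) {
        while ($modes->{$mode} =~ /([0-9a-f]{2})([A-Z]{3})/g) {
            $opcodes[ hex $1 ] = { name => $2, mode => $mode };
        }
    }
    return \@opcodes;
}

my $table = opcodes();
printf "%02X: %s (%s)\n", $_, $table->[$_]{name}, $table->[$_]{mode} for 0x0a, 0x78, 0xff;
```

Keeping the source data in this dense form makes it easy to compare against a printed opcode chart one mode at a time.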

When I first did this, of course, I missed a few opcodes and swapped SEI and SED, which went undetected for a long time because both are rare. I've used this routine for several months since then, and I'm fairly certain it's accurate now.

print_op is a bit more complicated, but not too horrible. To implement it, let's first write a helper function.
This routine is designed to show bytes (thus the name sb) in the first 16 characters of a line. For instance, sb(0x480, 0xff, 0x01, 0x80) would show the address followed by the bytes in hex, "0480: FF 01 80", padded with spaces to fill the column.
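A minimal sketch of such a helper, assuming a four-digit uppercase hex address, a colon, and space-separated bytes as seen in the disassembly output; the exact padding is an assumption taken from the 16-character figure above.

```perl
use strict;
use warnings;

# Format an address and its bytes, left-justified in a 16-character column.
sub sb {
    my ($addr, @bytes) = @_;
    my $hex = join ' ', map { sprintf '%02X', $_ } @bytes;
    return sprintf('%-16s', sprintf('%04X: %s', $addr, $hex));
}

print sb(0x480, 0xff, 0x01, 0x80), "|\n";
```

Returning the string rather than printing it lets the caller append the mnemonic and operand on the same line.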
So for each mode, we compute the value to append after the definition name.  For instance, for 'ACC' we set $sval = ' A', and for 'REL' (a relative branch) we compute the destination and use that for $sval.
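As an illustration of that per-mode step, here is a sketch covering four modes. The function name operand, the mode names 'IMM' and 'ABS', and the exact output formats are assumptions; only 'ACC' and 'REL' are named in the text. The relative branch is the interesting case: its operand is a signed byte counted from the address of the following instruction.

```perl
use strict;
use warnings;

# Return the operand text for a few modes. $addr is the address of the
# opcode and @args are the operand bytes that follow it.
sub operand {
    my ($mode, $addr, @args) = @_;
    return ' A' if $mode eq 'ACC';
    return sprintf(' #$%02X', $args[0]) if $mode eq 'IMM';
    # Absolute addresses are stored low byte first.
    return sprintf(' $%02X%02X', $args[1], $args[0]) if $mode eq 'ABS';
    if ($mode eq 'REL') {
        # Sign-extend the offset, then add it to the next instruction's address.
        my $off = $args[0] < 0x80 ? $args[0] : $args[0] - 0x100;
        return sprintf(' $%04X', ($addr + 2 + $off) & 0xFFFF);
    }
    return '';
}

print 'BNE', operand('REL', 0x488, 0x06), "\n";
```

The branch at $0488 with offset $06 resolves to $0490 because the offset is applied after the two-byte branch instruction itself.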

In a few spots there are references to the $names hash. This is an empty hash currently, but eventually we will put well-known symbols into it so that we don't always output numbers.
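The lookup could work along these lines. The helper name addr_text is hypothetical, and the two entries are only illustrative; they use the standard Atari OS labels for $0302 and $0304, whereas the hash in the post is still empty at this point.

```perl
use strict;
use warnings;

# Illustrative entries only: standard Atari OS labels for two addresses.
my $names = { 0x0302 => 'DCOMND', 0x0304 => 'DBUFLO' };

# Prefer a symbolic name when one is known, else fall back to hex.
sub addr_text {
    my ($addr) = @_;
    return exists $names->{$addr} ? $names->{$addr} : sprintf('$%04X', $addr);
}

print 'STA ', addr_text(0x0302), "\n";
print 'LDA ', addr_text(0x048C), "\n";
```

With an empty hash every address falls through to the hex branch, so the same code works before and after symbols are added.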

OK, we are almost there. We need a read_file routine, of course. We also need a read_img routine that will help us translate addresses to offsets into the file. For now, we'll just hard-code that the first sector occupies 0x480 to 0x4ff, and change this in the future.
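A sketch of the two pieces, under assumptions: read_file slurps the file as raw bytes, and addr_to_offset (a hypothetical helper, not the post's read_img) does the hard-coded translation for the first sector. The optional $header argument is also an assumption, there to skip any image header the file may carry in front of the sector data.

```perl
use strict;
use warnings;

# Slurp a whole file as raw bytes.
sub read_file {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;              # read everything in one go
    my $data = <$fh>;
    close $fh;
    return $data;
}

# Map a memory address to a file offset, with the first 128-byte sector
# hard-coded to 0x480..0x4ff. $header is the size of any image header.
sub addr_to_offset {
    my ($addr, $header) = @_;
    $header //= 0;
    die sprintf('Address $%04X is outside the first sector', $addr)
        if $addr < 0x480 || $addr > 0x4ff;
    return $header + ($addr - 0x480);
}

print addr_to_offset(0x490), "\n";
```

Dying on out-of-range addresses makes the hard-coded limitation visible as soon as the disassembly wanders past the first sector.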
The path in read_img is for my particular setup; you may need to change it. OK, now let's hook it all up.
0480: FF                .BYTE $FF
0481: 01 80             ORA ($80,X)
0483: 04                .BYTE $04
0484: C0 E4             CPY #$E4
0486: A9 F4             LDA #$F4
0488: D0 06             BNE $0490
048A: 00                BRK
048B: 0D 01 00          ORA $0001
048E: 80                .BYTE $80
048F: 00                BRK
0490: 85 F0             STA $F0
0492: A9 52             LDA #$52
0494: 8D 02 03          STA $0302
0497: AD 8C 04          LDA $048C
049A: 8D 0A 03          STA $030A
049D: AD 8D 04          LDA $048D
04A0: 8D 0B 03          STA $030B
04A3: A9 01             LDA #1
04A5: 8D 01 03          STA $0301
04A8: AD 8A 04          LDA $048A
04AB: 8D 04 03          STA $0304
…
Not perfect, but a great start for only ~160 lines of code. We need to start populating the $names hash to make this a bit more readable, and improve read_img so we can target where the disassembly takes place, but we'll talk about how we're going to do that next time.



http://atariage.com/forums/blog/734/entry-14998-dealer-demo-part-2-lets-make-a-disassembler/