
How do I disassemble 8 KB Atari 2600 games?



This question has bothered me forever. My worry about the current version of Distella is that it cannot disassemble 8 KB games such as "Mr. Do's Castle" and "Smurfs Rescue in Gargamel's Castle".

 

If anyone (ESPECIALLY THE AUTHOR!!) rebuilds or remakes Distella to remove the size limit (only 2 KB and 4 KB ROMs are supported currently), I will credit you in my Atari games!


I wrote a little utility program called SplitFile that will take a ROM file larger than 4K and split it into segments that can then be (individually) disassembled with Distella. You can even specify the size of the segments you want to divide it up into, which is useful for disassembling ROMs which use a bankswitching method where each bank is 1K or 2K (since not all bankswitching methods have 4K banks). You can read about it, and download the .zip file, in the following thread:

 

http://www.atariage.com/forums/index.php?s...st&p=824663

 

Note that I posted two versions of it-- the second version is basically the same as the first, but has a few minor enhancements-- so you'll probably want to download the second version.

 

Michael


 

One major difficulty in trying to make a program like Distella work with larger ROMs is that for a disassembler to work it must be able to follow program flow. In a 2K or 4K program, this is usually pretty easy (though there are some notable exceptions). In an 8K or larger program it may often be more difficult.

 

If a program is set up nicely, you may be able to disassemble each of the parts and then search for accesses to $xFF8 or $xFF9 (for 16K games, also look for $xFF6 or $xFF7), where x is any odd hex digit. Simply searching the listings for "FF" followed by one of those digits should catch most of them.

 

If in the first bank of an 8K game you see a LDA $1FF9 (or a BIT $7FF9, or whatever), then you know that in the other bank, the instruction at the next address will be executable. For example, if you saw the LDA $1FF9 at address $1234, then in the other bank you would know that execution could reach address $1237.
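The search described above can be sketched in Python. This is only a rough illustration, not something Distella does: it scans the raw bytes of one 4K bank for a handful of absolute-addressed opcodes whose operand lands on an F8 hotspot once reduced to 13 bits. The function name, the opcode subset, and the default origin are assumptions made for the example, and because it does not follow program flow it can also flag bytes that are really data.

```python
# A few 3-byte absolute-addressed 6502 opcodes (deliberately not exhaustive).
ABSOLUTE_OPCODES = {
    0xAD: "LDA", 0x8D: "STA", 0x2C: "BIT", 0xAE: "LDX",
    0xAC: "LDY", 0x8E: "STX", 0x8C: "STY", 0xCD: "CMP",
}

# F8 hotspots after reduction to 13 bits, mapped to the bank they select.
F8_HOTSPOTS = {0x1FF8: 0, 0x1FF9: 1}


def find_hotspot_accesses(bank: bytes, org: int = 0xF000):
    """Return (address, mnemonic, operand, selected_bank, resume_address) tuples.

    resume_address is the address of the next instruction, i.e. the spot
    where execution can continue in the bank that was switched in.
    """
    hits = []
    for i in range(len(bank) - 2):
        mnemonic = ABSOLUTE_OPCODES.get(bank[i])
        if mnemonic is None:
            continue
        operand = bank[i + 1] | (bank[i + 2] << 8)  # little-endian operand
        selected = F8_HOTSPOTS.get(operand & 0x1FFF)  # only 13 bits matter
        if selected is not None:
            hits.append((org + i, mnemonic, operand, selected, org + i + 3))
    return hits


if __name__ == "__main__":
    demo = bytearray(4096)
    demo[0x234:0x237] = bytes([0xAD, 0xF9, 0x1F])  # LDA $1FF9 at $1234
    for addr, op, operand, sel, resume in find_hotspot_accesses(bytes(demo), org=0x1000):
        print(f"${addr:04X}: {op} ${operand:04X} selects bank {sel}, resumes at ${resume:04X}")
```

If the other bank was assembled to a different block, only the low 12 bits of the resume address carry over. Indexed accesses (LDA $1FF7,X and the like) are not caught by this sketch.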

 

In some games things will be easy. In other games they will be much more complicated. Parker Brothers games may be the trickiest, since multiple banks can be active at once. But that's a subject for another day.


Yup...there are a number of bankswitching types, and the bankswitch instruction itself might use an offbeat method of switching that Distella would miss regardless, such as indexing through a register, as in LDA $1FF7,X. If X is holding a 1, the access lands on $1FF8 and the first bank is switched in. While $1FF8 and $1FF9 are "known" hotspots that trigger the banks in F8 switching (and could be written into Distella fairly easily), the value that X holds would be difficult for the disassembler to discover.

So you have an unknown method of banking coupled with an unknown set of circumstances.

 

Better to just leave Distella under human guidance ;)

 

 

The "But anyway..." dept:

You need to split the file and create configuration files to help Distella interpret it...and correctly separate program instructions from data/gfx.

 

Here are the steps that I follow to disassemble an 8k game that uses the F8 bankswitch mode:

 

 

 

* Split the binary into 2 parts

There are freeware file splitters available on the 'net. I use HJSplit and set the segment size to 4k (4096 bytes). F8 banking swaps a full 4k block of ROM, so there are only 2 segments that you need to disassemble separately.
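If you'd rather not hunt down a splitter, the split itself is simple enough to sketch in a few lines of Python. This is a stand-in written for illustration, not HJSplit or SplitFile; the function names and the "_bankN.bin" output naming are made up for the example.

```python
from pathlib import Path


def split_rom(data: bytes, bank_size: int = 4096) -> list[bytes]:
    """Cut a ROM image into equal segments of bank_size bytes."""
    if bank_size <= 0 or len(data) % bank_size != 0:
        raise ValueError("ROM size must be a whole multiple of the bank size")
    return [data[i:i + bank_size] for i in range(0, len(data), bank_size)]


def write_banks(rom_path: str, bank_size: int = 4096) -> list[Path]:
    """Write each segment next to the ROM as <name>_bank1.bin, <name>_bank2.bin, ..."""
    src = Path(rom_path)
    written = []
    for number, bank in enumerate(split_rom(src.read_bytes(), bank_size), start=1):
        target = src.with_name(f"{src.stem}_bank{number}.bin")
        target.write_bytes(bank)
        written.append(target)
    return written


if __name__ == "__main__":
    fake_rom = bytes(8192)  # stand-in for an 8k F8 game
    print([len(bank) for bank in split_rom(fake_rom)])
```

Passing a smaller bank_size (1024 or 2048) covers the schemes that switch 1K or 2K banks.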

 

 

 

* Use the -d switch to disassemble each part to temp files

The -d switch instructs Distella to interpret the entire memory block as coded instructions. Because data tables and gfx are embedded within the program, the resulting temp disassemblies will be very large (due to all of the gibberish and undocumented instructions).

 

 

 

* Manually search through the temps to find...

- a) the origin of that 4k block

Because the 6507 only has 13 address lines (instead of the full 16), the 2600 doesn't care which block the 4k chunk was addressed to...so long as its top hex digit is odd (indicating ROM). To the 2600, LDA $FEE0 has the same effect as LDA $1EE0...1 and F are both odd. The digit is usually easy to discover: even if the "init vector" was not correct and caused Distella to miss labelling areas of data, you'll undoubtedly see lots of load instructions whose operands show the digit that was used. When in doubt, use D for the first chunk and F for the last (a common practice was to use F for the last bank and work backwards from there). The init vector (addresses $1FFC and $1FFD, i.e. offsets $FFC and $FFD within the 4k chunk) must exist in at least one of the banks...but at this point, you don't know if it's valid for the current chunk (or even if its top digit is correct for that bank!). A start vector of $3000 would still boot a bank addressed to $F000, due to the hobbled 13-bit address bus.
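The mirroring is easier to see as arithmetic. Here is a small Python sketch (the helper names are invented for the example) that reduces an address to the 13 bits the 6507 actually puts on the bus, checks whether it falls in cartridge ROM, and pulls the init vector out of a 4k chunk.

```python
def mirror(addr: int) -> int:
    """Reduce a 16-bit address to the 13 bits the 6507 actually drives."""
    return addr & 0x1FFF


def is_rom(addr: int) -> bool:
    """Cartridge ROM is selected whenever address bit 12 is set (odd top digit)."""
    return bool(addr & 0x1000)


def init_vector(bank: bytes) -> int:
    """Read the little-endian start vector stored at offsets $FFC/$FFD of a 4k chunk."""
    if len(bank) != 4096:
        raise ValueError("expected a 4096-byte chunk")
    return bank[0xFFC] | (bank[0xFFD] << 8)


def entry_offset(bank: bytes) -> int:
    """Offset within the chunk where the start vector points, whatever block was used."""
    return init_vector(bank) & 0x0FFF


if __name__ == "__main__":
    print(hex(mirror(0xFEE0)), hex(mirror(0x1EE0)))  # both reduce to the same address
    chunk = bytearray(4096)
    chunk[0xFFC], chunk[0xFFD] = 0x00, 0x30  # start vector of $3000
    print(hex(init_vector(bytes(chunk))), hex(entry_offset(bytes(chunk))))
```

Keep in mind that the vector read this way is only meaningful if the chunk is the bank that is active at power-up, which you may not know yet.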

 

- b) the starting / ending addresses of sections of program instructions

Here's where it gets tricky. You'll need to try to find some kind of order amongst all of the gibberish...and try to identify which lines in the temp files are coded instructions vs. data. Fortunately, the 6507 uses only a portion of the 256 possible byte values as "legal" instructions...so even with these temp files it shouldn't be too difficult to find the ends of routines once undocumented opcodes begin showing up (those beginning with a period in the disassemblies). But UNfortunately, program areas can begin at addresses that Distella might miss labelling (because bankswitching from one block to another can happen at any point in the program).

A followup problem is that routines don't always end with a JMP instruction or a bankswitch to reach the next piece of code (because memory was in short supply, a 2-byte "unconditional branch" might be used instead of a 3-byte JMP). The opposite, a "conditional branch", is like an if-goto statement in Basic: X = X - 1 : IF X = 0 THEN GOTO 500 works similarly to DEX : BEQ label in machine language. An unconditional branch is just the GOTO itself...it occurs when the condition always has the same result. Take LDX #$00 : BEQ label. In that case, the program will ALWAYS branch to the label...the "zero" result happened right in the loading of X. Load zero, branch if zero.

Again, it would be difficult for a disassembler to know the status of the flags before a branch instruction...so you'll need to spot these yourself...and be aware that the bytes that follow the branch might not be instructions. Distella won't know the difference. No JMP encountered to divert it, so it goes chugging merrily along.
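The simplest flavour of unconditional branch (an immediate load followed directly by a branch on the flag that load just set) can be hunted for mechanically. The Python sketch below is a naive byte scan written for illustration; the names are invented, it only knows the LDA/LDX/LDY immediate forms and four branch opcodes, and since it doesn't follow program flow it will happily match the same byte pattern inside data tables.

```python
# Immediate loads set the Z and N flags from the loaded value.
LOAD_IMMEDIATE = {0xA9: "LDA", 0xA2: "LDX", 0xA0: "LDY"}

# Branch opcode -> (mnemonic, test telling whether the branch is taken for a loaded value).
BRANCHES = {
    0xF0: ("BEQ", lambda value: value == 0),
    0xD0: ("BNE", lambda value: value != 0),
    0x30: ("BMI", lambda value: value >= 0x80),
    0x10: ("BPL", lambda value: value < 0x80),
}


def find_always_taken_branches(bank: bytes, org: int = 0xF000):
    """Return (branch_address, description, target_address) for load+branch pairs
    where the branch condition is guaranteed by the value just loaded."""
    hits = []
    for i in range(len(bank) - 3):
        load = LOAD_IMMEDIATE.get(bank[i])
        branch = BRANCHES.get(bank[i + 2])
        if load is None or branch is None:
            continue
        value = bank[i + 1]
        name, is_taken = branch
        if is_taken(value):
            raw = bank[i + 3]
            offset = raw - 256 if raw >= 0x80 else raw  # signed 8-bit displacement
            branch_addr = org + i + 2
            target = branch_addr + 2 + offset  # relative to the instruction after the branch
            hits.append((branch_addr, f"{load} #${value:02X} : {name}", target))
    return hits


if __name__ == "__main__":
    demo = bytes([0xA2, 0x00, 0xF0, 0x05]) + bytes(12)  # LDX #$00 : BEQ +5
    for addr, text, target in find_always_taken_branches(demo):
        print(f"${addr:04X}: {text} always branches to ${target:04X}")
```

Any hit is a hint that the bytes right after the branch may be data, and that the branch target deserves to be listed as CODE.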

 

Important note to save your sanity: Don't dwell on the bankswitch issue too much...save that for when you have both disassemblies merged into 1 file later...when it can very easily be cross-referenced. You WILL miss spots unless the program was laid out very simply. And you WILL miss spots that are handled via indirect jump. Trying to get everything in one go just ends up taking longer.

 

- c) the starting / ending addresses of sections of data/gfx.

If you did a fair job in b), this is simple. Just write down the address ranges that fall between the areas assumed to be instructions. Because data values might be indistinguishable from true gfx values at this stage, I just use GFX for all data. That works for both, and having all data bytes on separate lines will make adding labels later that much easier. You'll know which address each data value is coming from just by looking at the number thrown off to the side.

 

 

 

* Use the a/b/c results to create configuration files

These are just simple unformatted text files. You'll need a separate one for each 4k block, listing the addresses you found in the above step. This lets you create a "map" for Distella to follow. ORG sets the address of the block, CODE indicates program instruction areas, GFX indicates bitmap areas, and DATA (can be) used for all the rest.

Example:

ORG $F000

GFX $F000 $F0FF

CODE $F100 $F7FF

GFX $F800 $FFEF

CODE $FFF0 $FFF7

GFX $FFF8 $FFFF

 

Note that if you do not specify how to handle an area, Distella will go back to basics and try to figure it out on its own. Best to avoid confusion and just list 'em all.
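Since every address should end up listed, the config file can be generated from nothing more than the CODE ranges found in step b), with everything in between filled in as GFX. A minimal Python sketch, assuming one 4k block per file (the function name and arguments are made up for the example; only the ORG/CODE/GFX keywords come from the config format shown above):

```python
def build_config(org: int, code_ranges, size: int = 0x1000, filler: str = "GFX"):
    """Build config lines for one block: the given ranges become CODE, every gap becomes filler."""
    end = org + size - 1
    lines = [f"ORG ${org:04X}"]
    cursor = org  # first address not yet covered by a directive
    for start, stop in sorted(code_ranges):
        if start < cursor or stop < start or stop > end:
            raise ValueError(f"bad or overlapping range ${start:04X}-${stop:04X}")
        if start > cursor:
            lines.append(f"{filler} ${cursor:04X} ${start - 1:04X}")
        lines.append(f"CODE ${start:04X} ${stop:04X}")
        cursor = stop + 1
    if cursor <= end:
        lines.append(f"{filler} ${cursor:04X} ${end:04X}")
    return lines


if __name__ == "__main__":
    # Reproduces the example config above from its two CODE ranges.
    print("\n".join(build_config(0xF000, [(0xF100, 0xF7FF), (0xFFF0, 0xFFF7)])))
```

Writing the returned lines to a text file (one per bank) gives you something to hand to Distella, and re-running it after you discover another routine is quicker than editing the ranges by hand.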

 

 

 

* Use Distella to create "final" disassemblies based on your findings

When you instruct Distella to use one, Distella will check your config file to see how a byte from the binary file should be translated and follow your instructions (or "map"). You signify this with the -c switch followed by the name of the config file (such as -cbank1.cfg).

Note that if equates such as LF100 = $3100 still show up near the top of the file, this could point you to an area that wasn't handled correctly...but often the cause is simpler: the index register used when reading a table has a minimum value other than zero within the program (if X is always going to be between 5 and 9...the actual table probably begins 5 bytes PAST the label location). Either adjust the ranges within the .cfg, or trace the program above the load instruction (to try to discover the minimum value the index can have) and adjust the label location in the disassembly. In the example above, move the label 5 lines down and change the instruction's operand to read label-5.

 

 

 

* Paste the 2 disassemblies to 1 file

Because one bank can reference areas in the other (and vice versa), having both in one file will make working with it later (with Dasm) that much easier. You might find program areas that missed detection the first time around, just by finding bankswitch hotspot accesses in one section and checking the corresponding address in the other bank: it should contain valid opcodes that continue the routine/program.

You'll need to change all ORG commands in the disassembly to RORG (and add sequential ORG's above those), and paste all equates used in both files to the top of the new "merged" disassembly.

 

Example:

Assuming that bank 1 had an ORG of $D000 and bank 2 had an ORG of $F000...change them to this:

 

ORG $1000

RORG $D000

(bank 1 contents)

 

ORG $2000

RORG $F000

(bank 2 contents)

 

This is related to the 6507's funky 13-bit address bus. While the 2600 only sees ROM in the odd-numbered blocks, Dasm is not so limited. If the 2nd bank's origin is not exactly 4k after the first...Dasm will assume that there's an additional 4k of data in between (at $E000 to $EFFF) and pad the output. Using relocatable ORGs corrects this problem. In the example above, the ORG lines place the banks 4k apart in the output file, while the RORG lines set the addresses the code is assembled to run at.

 

NOTE: this assumes that both banks were not set to the same block #. If you run across one of those binaries where both banks were originally assembled to $F000-$FFFF...you should do an automatic search/replace on one of them first. Replace all occurrences of "LF" with "LD" in the first bank, for example. This will prevent the problem of duplicate label names.
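A blind search/replace works, but a slightly pickier version is easy to sketch in Python. The pattern below assumes labels always have the form L followed by four hex digits (as in LF100); that is a guess based on the listings discussed here, and the function name is invented for the example. It leaves mnemonics and raw $Fxxx operands alone, so those (and the RORG line) still need a look by hand.

```python
import re

# "L" followed by exactly four uppercase hex digits, e.g. LF100.
LABEL = re.compile(r"\bL([0-9A-F])([0-9A-F]{3})\b")


def rename_labels(source: str, old_digit: str = "F", new_digit: str = "D") -> str:
    """Rewrite labels whose block digit is old_digit so it becomes new_digit."""
    def swap(match: re.Match) -> str:
        if match.group(1) == old_digit:
            return f"L{new_digit}{match.group(2)}"
        return match.group(0)  # label belongs to another block, keep it as-is

    return LABEL.sub(swap, source)


if __name__ == "__main__":
    listing = "LF100: LDA LF234,X\n       JMP LF100\n"
    print(rename_labels(listing))
```

Run it on the first bank's disassembly only, before pasting the two files together.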

 

 

Whew! Seems like a lotta work...but it's really only about 30 minutes or so to create a -rough- disassembly once you have a basic understanding of asm.

 

Then comes the really hard part: finding and labelling all occurrences of indirect jumping/addressing (only then does it become reverse-engineered source code). But that's for another time. With the 8k disassembly ready...use Dasm to assemble it into a binary, and then check that against the original 8k game (I just use DOS' file compare in binary mode - fc /b file1 file2). There should be no differences listed if all went well.
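If fc isn't handy, the same check is a few lines of Python. This is a rough equivalent of the binary compare written for illustration (the function name and the mismatch limit are made up); it reports whether the two images are identical and lists the first few differing offsets.

```python
def compare_binaries(original: bytes, rebuilt: bytes, limit: int = 16):
    """Return (identical, mismatches); mismatches holds up to `limit` (offset, original_byte, rebuilt_byte) tuples."""
    mismatches = []
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            mismatches.append((offset, a, b))
            if len(mismatches) == limit:
                break
    # A length difference counts as a failure even if the common part matches.
    identical = len(original) == len(rebuilt) and not mismatches
    return identical, mismatches


if __name__ == "__main__":
    same, diffs = compare_binaries(b"\x01\x02\x03", b"\x01\xff\x03")
    print(same)
    for offset, a, b in diffs:
        print(f"{offset:08X}: {a:02X} {b:02X}")
```

For real files, read both with open(name, "rb").read() and pass the results in. A mismatch offset below $1000 is in the first bank of an 8k image, anything from $1000 up is in the second.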

Edited by Nukey Shay