New Disassembler


pcrow

Checked the ca65 output, and it's doing the same thing as with MADS: ".org:    = $A000"

You have to remove the '=' and the ':'.

The .org can be at the start of the line or have space(s) in front of it.

Edited by TGB1718

Just tried building this on MacOS. Pulled down the repo, and executing the `make` command results in:

 

```

 ~/Developer/Atari/atari_8bit_utils/disasm/ [main] make
cc -g -O0 -W -Wall  -c disasm.c
disasm.c:13:10: fatal error: 'endian.h' file not found
#include <endian.h>
         ^~~~~~~~~~
1 error generated.
make: *** [disasm.o] Error 1

```

I tried changing the line to:

#include <machine/endian.h>

on the advice I found looking up the issue, but that now gives a bunch of errors like:

 

disasm.c:1275:17: error: call to undeclared function 'le16toh'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
   int target = le16toh(*(uint16_t *)&load[2]);

 


5 minutes ago, tsom said:

disasm.c:13:10: fatal error: 'endian.h' file not found
#include <endian.h>
         ^~~~~~~~~~

That is for the function le16toh(), which probably only matters if you're on an older PowerPC Mac (those were big endian).  I'm under the impression that pretty much everything else is little endian, which matches the Atari.  I think I can put in a fix to avoid that on Macs.


FWIW, I replaced #include <endian.h> with the below (that I found) and it builds. May need to be tweaked?

 

#if defined(__APPLE__)

  #include <libkern/OSByteOrder.h>

  #define htobe16(x) OSSwapHostToBigInt16(x)
  #define htole16(x) OSSwapHostToLittleInt16(x)
  #define be16toh(x) OSSwapBigToHostInt16(x)
  #define le16toh(x) OSSwapLittleToHostInt16(x)

  #define htobe32(x) OSSwapHostToBigInt32(x)
  #define htole32(x) OSSwapHostToLittleInt32(x)
  #define be32toh(x) OSSwapBigToHostInt32(x)
  #define le32toh(x) OSSwapLittleToHostInt32(x)

  #define htobe64(x) OSSwapHostToBigInt64(x)
  #define htole64(x) OSSwapHostToLittleInt64(x)
  #define be64toh(x) OSSwapBigToHostInt64(x)
  #define le64toh(x) OSSwapLittleToHostInt64(x)

#elif defined(__linux__)

  #include <sys/types.h>

#endif

 


I pushed a change to eliminate the endian include on Mac and Windows, which should ease compilation.  I'm not sure if it will compile on a PowerPC Mac, as I don't have one to test what functions are available, but for newer ones it should be fine.


I tweaked my commit based on what you posted.  If the host system is already little endian, then it can just access the data directly, so it uses an empty macro.  That covers Windows and newer Macs, but for older PowerPC Macs, it now uses what you posted.
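
For anyone hitting the same build problem elsewhere, here is a minimal sketch of a different approach that needs no endian header at all: build the 16-bit value from its two bytes. This is not what the commit does, and `read_le16` is a hypothetical helper name, not a function in disasm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from disasm.c): read a little-endian 16-bit
 * value one byte at a time. The result is the same on little-endian and
 * big-endian hosts, and no unaligned uint16_t access is involved. */
static uint16_t read_le16(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

Under that assumption, the line from the error message could read `int target = read_le16(&load[2]);` with no le16toh() call.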

Cool! Builds and runs now! Now to have a play :)

 


5 hours ago, TGB1718 said:

Checked the ca65 output, again it's doing the same as MADS ".org:    = $A000"

you have to remove the '=' and the ':'

 

the .org can be at the start of the line or have space(s) in front of it.

I think I have it right now.


Is there any trick for telling assemblers that data is a string to be converted to screen memory?

That is, where space is $00 instead of $20, and so forth up through $3F, followed by the control characters, with $60-$7f (mostly lower-case letters) being the same?

If so, then I should support that data type.
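
The mapping described above can be sketched in C as follows. The function name `atascii_to_screen` is made up for illustration and is not taken from the disassembler; the sketch assumes bit 7 (inverse video) should pass through unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Convert one ATASCII byte to the Atari internal (screen memory) code.
 * Bit 7 (inverse video) is carried through unchanged.
 *   $00-$1F -> $40-$5F   (control/graphics characters)
 *   $20-$5F -> $00-$3F   (space, digits, punctuation, upper case)
 *   $60-$7F -> unchanged (mostly lower case)
 */
static uint8_t atascii_to_screen(uint8_t c)
{
    uint8_t inverse = c & 0x80;
    uint8_t low = c & 0x7F;

    if (low < 0x20)
        low += 0x40;
    else if (low < 0x60)
        low -= 0x20;
    return (uint8_t)(inverse | low);
}
```

With this mapping, 'H' ($48) becomes $28 and a space ($20) becomes $00, while 'e' ($65) stays $65.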


MADS uses single quotes ' ' to produce a normal ASCII string and double quotes " " to produce the screen bytes of the string.

This is example output from MADS showing the difference:

 

   206 A003 28 65 6C 6C 6F 00 +      .byte "Hello World"
   207 A00E 48 65 6C 6C 6F 20 +      .byte 'Hello World'

 

ca65 only accepts double quotes for strings, which produce the ASCII string. Single quotes are used for single characters only, and again it's the ASCII character,

i.e. .byte 'a','b','c','d'

Edited by TGB1718

I think it works.  I ran it against a directory full of binary load files and see a bunch of screen code strings detected when in MADS mode.

 

For now, I'm just scanning for "ATARI" and "COPYRIGHT" which finds a bunch without any false positives.  I could make it more generic, but I figure there would be a lot of false positives.  Sometimes screen graphics can have good long strings that look like ASCII without making any sense.
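
A rough sketch of what a known-word scan like this could look like, for readers who want to try the idea. The names `to_screen` and `find_screen_word` are hypothetical, and this is not the code in the disassembler; it only assumes the usual ATASCII to screen code mapping and plain (non-inverse) words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ATASCII -> screen code for a plain character (bit 7 ignored here):
 * $00-$1F -> $40-$5F, $20-$5F -> $00-$3F, $60-$7F unchanged. */
static uint8_t to_screen(uint8_t c)
{
    c &= 0x7F;
    if (c < 0x20)
        return (uint8_t)(c + 0x40);
    if (c < 0x60)
        return (uint8_t)(c - 0x20);
    return c;
}

/* Hypothetical scanner: return the offset of the first place where `word`
 * (given as plain ASCII) appears in `buf` as screen codes, or -1 if it
 * does not appear. */
static long find_screen_word(const uint8_t *buf, size_t len, const char *word)
{
    size_t wlen = strlen(word);

    assert(buf != NULL && wlen > 0);
    if (wlen > len)
        return -1;
    for (size_t i = 0; i + wlen <= len; i++) {
        size_t j = 0;
        while (j < wlen && buf[i + j] == to_screen((uint8_t)word[j]))
            j++;
        if (j == wlen)
            return (long)i;
    }
    return -1;
}
```

Matching only whole known words keeps false positives low, at the cost of missing any string that does not contain one of them.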


So I've been playing around a bit, using the boot for Ultima IV as a test.  I found some bugs and fixed them.  I also found a coding technique that I hadn't seen before, which probably shows I'm not much of an assembly coder:

 

There are a number of JSR routines that pop off the return address, save it, and use it to access the data after the JSR instruction.  That address is then adjusted, pushed back on the stack, and the RTS skips over the data.  I'm not sure why the push and RTS instead of doing a JMP indirect to the adjusted address.  Maybe it's just the coder's style, or maybe it used fewer bytes.  Regardless, this technique of embedding data for function parameters wreaks havoc with a disassembler.  But I've made it so that if the number of bytes consumed is consistent (not, say, until $00 or until negative), then you can specify a label for the routine and add '/p+4' if it consumes 4 bytes of data, and you'll see the bytes instead of attempts to parse code.

 

I'm sure they were using a macro assembler with some macros for function calls there.
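
From the disassembler's side, the fixed-size case boils down to "after a JSR to one of these routines, skip N data bytes before decoding code again". Here is a minimal sketch of that bookkeeping; the struct, the function name, and the table layout are all hypothetical and not how disasm.c actually stores the '/p+4' information.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a routine that consumes a fixed number of
 * data bytes placed directly after each JSR that calls it. */
struct inline_params {
    uint16_t routine;   /* address of the called routine */
    unsigned count;     /* data bytes following each JSR to it */
};

/* Given the address of a JSR instruction and its target, return the
 * address where code resumes: 3 bytes for the JSR itself, plus any
 * inline data the target routine is known to consume. */
static uint16_t resume_after_jsr(uint16_t jsr_addr, uint16_t target,
                                 const struct inline_params *tab, size_t n)
{
    assert(tab != NULL || n == 0);
    uint16_t next = (uint16_t)(jsr_addr + 3);

    for (size_t i = 0; i < n; i++) {
        if (tab[i].routine == target)
            return (uint16_t)(next + tab[i].count);
    }
    return next;
}
```

The bytes between `jsr_addr + 3` and the returned address would then be printed as data rather than decoded as instructions. Routines that read until $00 or until a negative byte do not fit this table, which matches the limitation described above.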

Yeah, the benefit of doing this is that you don't have to have the parameter block elsewhere, which is tough to do from an assembler macro. It also saves on code size, though it's usually slower.

 

The Happy 1050 firmware uses this technique to encode the target address when switching banks and it was very annoying to deal with when disassembling.


L6FAD:	.byte $D0 ; Inverse character 'P'
	.byte $CC ; Inverse character 'L'
	.byte $C5 ; Inverse character 'E'
	.byte $C1 ; Inverse character 'A'
	.byte $D3 ; Inverse character 'S'
	.byte $C5 ; Inverse character 'E'
	.byte $A0 ; Inverse character ' '
	.byte $D0 ; Inverse character 'P'
	.byte $CC ; Inverse character 'L'
	.byte $C1 ; Inverse character 'A'
	.byte $C3 ; Inverse character 'C'
	.byte $C5 ; Inverse character 'E'
	.byte $A0 ; Inverse character ' '
	.byte $D4 ; Inverse character 'T'
	.byte $C8 ; Inverse character 'H'
	.byte $C5 ; Inverse character 'E'

It now auto-detects inverse-video character strings with a short list of known words, including PLEASE.  I don't know of a better way of presenting them in the disassembly, but it becomes pretty obvious when one is detected, and then you know to look for more nearby.
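
In the listing above, each byte is the ATASCII character with bit 7 set ($D0 is 'P' | $80). A minimal sketch of a known-word check on that basis follows; `is_inverse_word` is a hypothetical name and this is not the detection code in the disassembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical check: does `buf` start with `word` stored as
 * inverse-video ATASCII, i.e. every character with bit 7 set?
 * Returns 1 on a match, 0 otherwise. */
static int is_inverse_word(const uint8_t *buf, size_t len, const char *word)
{
    size_t wlen = strlen(word);

    assert(buf != NULL);
    if (wlen == 0 || wlen > len)
        return 0;
    for (size_t i = 0; i < wlen; i++) {
        if (buf[i] != ((uint8_t)word[i] | 0x80))
            return 0;
    }
    return 1;
}
```

Calling it with the first bytes at L6FAD and the word "PLEASE" would report a match, after which the neighbouring bytes are worth checking for more inverse text.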

For MADS you can use the "dta" pseudo-command instead of ".byte" and then make them strings with an asterisk afterwards to signal inverted.

 

dta c'PLEASE'*

 

Also, I suspect all the assemblers allow multiple data values per ".byte" (or whatever) line using comma or space.

 
