New Disassembler

TGB1718 · January 22

Checked the ca65 output, again it's doing the same as MADS ".org: = $A000"

you have to remove the '=' and the ':'

the .org can be at the start of the line or have space(s) in front of it.

Edited January 22 by TGB1718

pcrow · January 22

So no '.' for MADS but yes '.' for ca65 and cc65?

Are ca65 and cc65 going to always be the same?

tsom · January 22

Just tried building this on MacOS. Pulled down the repo, and executing the `make` command results in:

```

 ~/Developer/Atari/atari_8bit_utils/disasm/ [main] make
cc -g -O0 -W -Wall -c disasm.c
disasm.c:13:10: fatal error: 'endian.h' file not found
#include <endian.h>
^~~~~~~~~~
1 error generated.
make: *** [disasm.o] Error 1

```

tsom · January 22

I tried changing the line to:

#include <machine/endian.h>

on the advice I found looking up the issue, but that now gives a bunch of errors like:

disasm.c:1275:17: error: call to undeclared function 'le16toh'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
int target = le16toh(*(uint16_t *)&load[2]);

pcrow · January 22

5 minutes ago, tsom said:

disasm.c:13:10: fatal error: 'endian.h' file not found
#include <endian.h>
^~~~~~~~~~

That include is there for the function le16toh(), which probably only matters on a big-endian host such as an older PowerPC Mac. I'm under the impression that pretty much everything else is little endian, which matches the Atari. I think I can put in a fix to avoid that on Macs.

tsom · January 22

FWIW, I replaced #include <endian.h> with the below (that I found) and it builds. May need to be tweaked?

#if defined(__APPLE__)

  #include <libkern/OSByteOrder.h>

  #define htobe16(x) OSSwapHostToBigInt16(x)
  #define htole16(x) OSSwapHostToLittleInt16(x)
  #define be16toh(x) OSSwapBigToHostInt16(x)
  #define le16toh(x) OSSwapLittleToHostInt16(x)

  #define htobe32(x) OSSwapHostToBigInt32(x)
  #define htole32(x) OSSwapHostToLittleInt32(x)
  #define be32toh(x) OSSwapBigToHostInt32(x)
  #define le32toh(x) OSSwapLittleToHostInt32(x)

  #define htobe64(x) OSSwapHostToBigInt64(x)
  #define htole64(x) OSSwapHostToLittleInt64(x)
  #define be64toh(x) OSSwapBigToHostInt64(x)
  #define le64toh(x) OSSwapLittleToHostInt64(x)

#elif defined(__linux__)

  #include <sys/types.h>

#endif

pcrow · January 22

I pushed a change to eliminate the endian include on Mac and Windows, which should ease compilation. I'm not sure if it will compile on a PowerPC Mac, as I don't have one to test what functions are available, but for newer ones it should be fine.

pcrow · January 22

I tweaked my commit based on what you posted. If the host system is already little endian, then it can just access the data directly, so it uses an empty macro. That covers Windows and newer Macs, but for older PowerPC Macs, it now uses what you posted.
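For anyone following along, here is a minimal sketch of that idea. It is not the actual commit: the names `LE16_TO_HOST` and `load_le16` are made up for illustration, and the endianness check assumes the `__BYTE_ORDER__` macros that GCC and Clang predefine (Windows is simply assumed to be little endian). A compiler that defines neither would fall into the swapping branch, so treat this as a starting point only.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the real disasm.c may be organized differently. */
#if defined(_WIN32) || \
    (defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
/* Host is little endian like the 6502: the macro does nothing. */
#define LE16_TO_HOST(x) (x)
#else
/* Big-endian host (e.g. PowerPC): swap the two bytes. */
#define LE16_TO_HOST(x) \
    ((uint16_t)((((x) & 0xFFu) << 8) | (((x) >> 8) & 0xFFu)))
#endif

/* Read a little-endian 16-bit word from a byte buffer.
   memcpy is used instead of a cast such as *(uint16_t *)&load[2]
   so that unaligned addresses are not a problem. */
static uint16_t load_le16(const uint8_t *p)
{
    uint16_t raw;
    memcpy(&raw, p, sizeof raw);
    return LE16_TO_HOST(raw);
}
```

With the bytes `00 A0` in the buffer, `load_le16()` yields `$A000` regardless of which branch was compiled in.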

tsom · January 22

Cool! Builds and runs now! Now to have a play

pcrow · January 22

5 hours ago, TGB1718 said:

Checked the ca65 output, again it's doing the same as MADS ".org: = $A000"

you have to remove the '=' and the ':'

the .org can be at the start of the line or have space(s) in front of it.

I think I have it right now.

TGB1718 · January 22

48 minutes ago, pcrow said:

Are ca65 and cc65 going to always be the same?

Yes, cc65 produces an intermediate .s file which is then used by ca65

pcrow · January 22

Is there any trick for telling assemblers that data is a string to be converted to screen memory?

That is, where space is $00 instead of $20, and so forth up through $3F, followed by the control characters, with $60-$7f (mostly lower-case letters) being the same?

If so, then I should support that data type.
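The mapping described above can be sketched in C as follows. The function names are made up for illustration; the ranges are the standard ATASCII to screen-code (internal code) mapping, with bit 7 (inverse video) passed through untouched.

```c
#include <stdint.h>

/* Convert one ATASCII byte to its screen (internal) code.
   Hypothetical helper name; bit 7 (inverse video) is preserved. */
static uint8_t atascii_to_screen(uint8_t c)
{
    uint8_t inverse = (uint8_t)(c & 0x80);
    uint8_t low = (uint8_t)(c & 0x7F);

    if (low < 0x20)
        low = (uint8_t)(low + 0x40);   /* control chars $00-$1F -> $40-$5F */
    else if (low < 0x60)
        low = (uint8_t)(low - 0x20);   /* $20-$5F -> $00-$3F */
    /* $60-$7F is the same in both encodings */

    return (uint8_t)(inverse | low);
}

/* The reverse direction, for showing screen data as text. */
static uint8_t screen_to_atascii(uint8_t c)
{
    uint8_t inverse = (uint8_t)(c & 0x80);
    uint8_t low = (uint8_t)(c & 0x7F);

    if (low < 0x40)
        low = (uint8_t)(low + 0x20);   /* $00-$3F -> $20-$5F */
    else if (low < 0x60)
        low = (uint8_t)(low - 0x40);   /* $40-$5F -> $00-$1F */

    return (uint8_t)(inverse | low);
}
```

For example, 'H' ($48) becomes $28 and space ($20) becomes $00, while 'e' ($65) is unchanged.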

TGB1718 · January 22

MADS uses single quotes ' ' to produce a normal ASCII string and double quotes " " to produce the screen bytes of the string.

This is an example of MADS output showing the difference:

206 A003 28 65 6C 6C 6F 00 + .byte "Hello World"
207 A00E 48 65 6C 6C 6F 20 + .byte 'Hello World'

ca65 only accepts double quotes for strings, and they produce the ASCII string. Single quotes are used for single characters only, and again it's the ASCII character, i.e. .byte 'a','b','c','d'

Edited January 22 by TGB1718

pcrow · January 22

Interesting. I've been using double-quotes for strings. This gets complicated.

I'm also adding some text string detection, which is why this came up.

pcrow · January 22

I think it works. I ran it against a directory full of binary load files and see a bunch of screen code strings detected when in MADS mode.

For now, I'm just scanning for "ATARI" and "COPYRIGHT" which finds a bunch without any false positives. I could make it more generic, but I figure there would be a lot of false positives. Sometimes screen graphics can have good long strings that look like ASCII without making any sense.
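A keyword scan of this kind can be sketched roughly as below. This is only an illustration, not the code in the repo: `find_screen_word` and `to_screen` are hypothetical names, and the sketch assumes the keyword consists of characters in the $20-$5F range (upper-case letters, digits, space), where the screen code is simply the ATASCII value minus $20.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Screen code for a character in $20-$5F (assumption: keywords
   only use upper-case letters, digits, punctuation and space). */
static uint8_t to_screen(char ch)
{
    return (uint8_t)((unsigned char)ch - 0x20);
}

/* Return the offset of the first place where `word` appears in
   `buf` encoded as screen codes, or -1 if it does not appear. */
static long find_screen_word(const uint8_t *buf, size_t len,
                             const char *word)
{
    size_t wlen = strlen(word);

    if (wlen == 0 || wlen > len)
        return -1;

    for (size_t i = 0; i + wlen <= len; i++) {
        size_t j = 0;
        while (j < wlen && buf[i + j] == to_screen(word[j]))
            j++;
        if (j == wlen)
            return (long)i;
    }
    return -1;
}
```

Calling it once per keyword ("ATARI", "COPYRIGHT") over each loaded segment gives a starting offset that can then be extended in both directions to find the full string.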

pcrow · January 23

So I've been playing around a bit, using the boot for Ultima IV as a test. I found some bugs and fixed them. I also found a coding technique that I hadn't seen before, which probably shows I'm not much of an assembly coder:

There are a number of JSR routines that pop off the return address, save it, and use it to access the data after the JSR instruction. That address is then adjusted, pushed back on the stack, and the RTS skips over the data. I'm not sure why the push and RTS instead of doing a JMP indirect to the adjusted address. Maybe it's just the coder's style, or maybe it used fewer bytes. Regardless, this technique of embedding data for function parameters wreaks havoc with a disassembler. But I've made it so that if the number of bytes consumed is consistent (not, say, until $00 or until negative), then you can specify a label for the routine and add '/p+4' if it consumes 4 bytes of data, and you'll see the bytes instead of attempts to parse code.

I'm sure they were using a macro assembler with some macros for function calls there.
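To make the disassembler side concrete, here is a rough sketch of skipping the inline bytes. It assumes a table of routines with a fixed parameter size (what a '/p+4' label would provide); the struct and function names are hypothetical and not taken from the actual source.

```c
#include <stddef.h>
#include <stdint.h>

#define OP_JSR 0x20   /* 6502 JSR absolute */

/* Hypothetical table entry: a routine that always consumes `bytes`
   bytes of data placed right after each JSR to it. */
struct inline_params {
    uint16_t routine;
    unsigned bytes;
};

/* Given the offset of an instruction in `mem`, return the offset at
   which code decoding should resume if it is a JSR. On the 6502, JSR
   pushes the address of its own last byte, so the routine reads its
   data from pushed+1 onward and pushes back pushed+N so that RTS
   lands just past the N data bytes. Returns `off` unchanged if the
   instruction is not a complete JSR. */
static size_t resume_after_jsr(const uint8_t *mem, size_t len, size_t off,
                               const struct inline_params *tbl, size_t n)
{
    if (off + 3 > len || mem[off] != OP_JSR)
        return off;

    uint16_t target = (uint16_t)(mem[off + 1] | (mem[off + 2] << 8));
    size_t next = off + 3;              /* a normal JSR is 3 bytes */

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].routine == target) {
            next += tbl[i].bytes;       /* skip the inline data */
            break;
        }
    }
    return next;
}
```

The bytes between `off + 3` and the returned offset would then be emitted as data instead of being parsed as instructions.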

phaeron · January 24

Yeah, the benefit of doing this is that you don't have to have the parameter block elsewhere, which is tough to do from an assembler macro. It also saves on code size, though it's usually slower.

The Happy 1050 firmware uses this technique to encode the target address when switching banks and it was very annoying to deal with when disassembling.

pcrow · January 25

L6FAD:	.byte $D0 ; Inverse character 'P'
	.byte $CC ; Inverse character 'L'
	.byte $C5 ; Inverse character 'E'
	.byte $C1 ; Inverse character 'A'
	.byte $D3 ; Inverse character 'S'
	.byte $C5 ; Inverse character 'E'
	.byte $A0 ; Inverse character ' '
	.byte $D0 ; Inverse character 'P'
	.byte $CC ; Inverse character 'L'
	.byte $C1 ; Inverse character 'A'
	.byte $C3 ; Inverse character 'C'
	.byte $C5 ; Inverse character 'E'
	.byte $A0 ; Inverse character ' '
	.byte $D4 ; Inverse character 'T'
	.byte $C8 ; Inverse character 'H'
	.byte $C5 ; Inverse character 'E'

It now auto-detects inverse-video character strings with a short list of known words, including PLEASE. I don't know of a better way of presenting them in the disassembly, but it becomes pretty obvious when one is detected, and then you know to look for more nearby.
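Detection along these lines could look something like the sketch below. It is not the actual implementation; `find_inverse_word` is a made-up name. It relies only on the fact that an inverse-video ATASCII character is the normal character with bit 7 set, which is why 'P' ($50) shows up as $D0 in the listing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the offset of the first place where `word` appears in
   `buf` as inverse-video ATASCII (each byte is the character with
   bit 7 set), or -1 if it is not found. */
static long find_inverse_word(const uint8_t *buf, size_t len,
                              const char *word)
{
    size_t wlen = strlen(word);

    if (wlen == 0 || wlen > len)
        return -1;

    for (size_t i = 0; i + wlen <= len; i++) {
        size_t j = 0;
        while (j < wlen &&
               buf[i + j] == (uint8_t)((unsigned char)word[j] | 0x80))
            j++;
        if (j == wlen)
            return (long)i;
    }
    return -1;
}
```

Once a known word such as "PLEASE" matches, neighbouring bytes with bit 7 set are good candidates for the rest of the string.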

Teapot · January 25

For MADS you can use the "dta" pseudo-command instead of ".byte" and then make them strings with an asterisk afterwards to signal inverted.

dta c'PLEASE'*

Also, I suspect all the assemblers allow multiple data values per ".byte" (or whatever) line using comma or space.

pcrow · January 25

L1649:	dta c'PLEASE PLACE THE'* ; inverse

That was an easy change, enabled with the --syntax=mads option.
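Roughly, producing that line from a run of inverse bytes can be sketched like this. The function name and the rejection rules are assumptions for illustration: the sketch only accepts characters where ATASCII and ASCII agree ($20-$5F and lower-case letters) and gives up on an apostrophe rather than guessing how MADS wants it quoted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write "dta c'TEXT'* ; inverse" into `out` for a run of
   inverse-video ATASCII bytes. Returns 0 on success, or -1 if the
   line does not fit or a byte cannot be shown as a plain character. */
static int format_dta_inverse(const uint8_t *bytes, size_t n,
                              char *out, size_t outsz)
{
    static const char prefix[] = "dta c'";
    static const char suffix[] = "'* ; inverse";
    size_t plen = sizeof prefix - 1;
    size_t slen = sizeof suffix - 1;

    if (plen + n + slen + 1 > outsz)
        return -1;

    memcpy(out, prefix, plen);
    for (size_t i = 0; i < n; i++) {
        uint8_t ch = (uint8_t)(bytes[i] & 0x7F);   /* strip inverse bit */
        int plain = (ch >= 0x20 && ch <= 0x5F) || (ch >= 'a' && ch <= 'z');

        if (!(bytes[i] & 0x80) || !plain || ch == '\'')
            return -1;
        out[plen + i] = (char)ch;
    }
    memcpy(out + plen + n, suffix, slen + 1);      /* copies the NUL too */
    return 0;
}
```

When the function returns -1, the caller would fall back to the one-byte-per-line `.byte` form shown earlier.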
