I don't mind taking credit for this "idiom".

Ms. Pac-Man does it in a couple of places to run the same section of code for both players.

 

Given no one used it before, you have it well deserved, I've added your credit to my comment :)


I didn't think this bug fix would get so much attention! Yeesh. :)

 

Especially, when it was such a bone-headed bug to begin with. :lol: (And I mean that in the "DOH!" sense, not as an insult; since the emulator is so good in every other way. :) )

Well, I am glad that there is a flash cart for Intellivision coming, because what do you think the odds are of playing Ms. Pac-Man on my PSP, my GCW-Zero, or the PPSSPP emulator?

 

I didn't do the ports to the PSP or other systems. But, the bug fixes are easy to pull into those ports. So, if you know who did those ports, you can point them to me to get the fixes.

I can always tell just how lazy/unclever I am as a programmer. I'd just have done

 

FOR i = 1 to 2

<<code goes here, i being a handy flag to set p1 vs p2>>

NEXT i

 

You guys and your fancy tricks. :P

 

 

I start there. But, as space or cycles become tighter, I end up using more and more tricks. :-)

 

Here's a particularly obscure one from Space Patrol. The Sprite Attribute Table (SPAT) has a list of indices into a larger table of sprite attributes. Each Sprite Entry (SPENT) is 4 words long, so the indices are all multiples of 4. (I store them scaled.) To distinguish activated sprites from deactivated sprites, I store a 1 in the LSB of the SPAT entry. That way "00" means "deactivated", and "01" means "active, index 0." Valid indexes have values like 1, 5, 9, etc.

 

The UPdate SPrite Active (UPSPA) routine needs to update the number of active sprites in Group 1 and Group 2. (Group 1 has the saucers and other bad guys, Group 2 has all of the bullets in flight.) The following code makes clever use of the fact that bit 0 will be 0 for inactive sprites and 1 for active sprites, while bit 1 will always be 0:


        ;;------------------------------------------------------------------;;
        ;;  Recalc GP1ACT, GP2ACT.                                          ;;
        ;;------------------------------------------------------------------;;
        MVII    #SPAT,  R4          ;   8 Point to sprite attribute table
        MVII    #3,     R3          ;   8 We reuse this 4 times.
                                    ;----
                                    ;  16

        ; totally abuse the fact bit 1 is always 0
        MVI@    R4,     R0          ;   8
        ADD@    R4,     R0          ;   8
        ADD@    R4,     R0          ;   8
        MVI@    R4,     R1          ;   8
        ADD@    R4,     R1          ;   8
        ANDR    R3,     R0          ;   6
        ANDR    R3,     R1          ;   6
        ADDR    R0,     R1          ;   6
        MVO     R1,     GP1ACT      ;  11
                                    ;----
                                    ;  69

        MVI@    R4,     R0          ;   8
        ADD@    R4,     R0          ;   8
        ADD@    R4,     R0          ;   8
        MVI@    R4,     R1          ;   8
        ADD@    R4,     R1          ;   8
        ADD@    R4,     R1          ;   8
        ANDR    R3,     R0          ;   6
        ANDR    R3,     R1          ;   6
        ADDR    R0,     R1          ;   6
        AND@    R4,     R3          ;   8
        ADDR    R3,     R1          ;   6
        MVO     R1,     GP2ACT      ;  11
                                    ;----
                                    ;  91
                                    ;  16 (carried forward)
                                    ;  69 (carried forward)
                                    ;----
                                    ; 176


That code adds up the SPAT entries in groups of 3, and ANDs the sums with the constant 3. This reduces the number of arithmetic operations I need to do to count up the number of active sprites.
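As a rough model of why the sum-and-mask trick works (a sketch in Python, not the Space Patrol code): every entry is a multiple of 4 plus an active flag in bit 0, so the sum of up to three entries has the number of active flags in its low two bits. The function name and the sample entry values below are made up for illustration.

```python
def count_active(entries):
    """Count entries whose bit 0 is set, assuming bit 1 of every entry is 0."""
    total = 0
    for start in range(0, len(entries), 3):
        # Sum up to three entries, wrapping at 16 bits like a CPU register.
        group_sum = sum(entries[start:start + 3]) & 0xFFFF
        # Each entry is (multiple of 4) + flag, so the low two bits of the
        # sum equal the number of active flags in the group (0..3).
        total += group_sum & 3
    return total


# Hypothetical SPAT contents: scaled index plus 1 if active, 0 if deactivated.
group1 = [1, 0, 9, 0, 17]            # five entries, three active
group2 = [21, 0, 0, 33, 37, 0, 45]   # seven entries, four active
```

Groups of three are the largest that fit, since a count of 4 would overflow the two bits kept by the mask.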

 

(A note on Space Patrol nomenclature: I used 'sprite' to refer to one of the many objects Space Patrol was tracking in the world. I used MOB to refer to the hardware movable object that could display a sprite. As long as there weren't more sprites than MOBs, each sprite would get displayed on a MOB. If there were more sprites than MOBs, then I'd multitask. Group 1 vs. Group 2 allowed me to preference saucers and bad guys over bullets.)

 

That saved me 56 cycles over the way I had been doing it, which relied on the fact that comparing 0 == 0 sets the carry bit....


        ;;------------------------------------------------------------------;;
        ;;  Recalc GP1ACT, GP2ACT.                                          ;;
        ;;------------------------------------------------------------------;;
        MVII    #SPAT,  R4          ;   8 Point to sprite attribute table
        CLRR    R0                  ;   6 We need 0 in a reg for this...
        MVII    #$10000-5, R1       ;   8 Start our count to -5

        CMP@    R4,     R0          ;   8 \
        ADCR    R1                  ;   6  |
        CMP@    R4,     R0          ;   8  |
        ADCR    R1                  ;   6  |    Count up number of zero
        CMP@    R4,     R0          ;   8  |__  entries.  This trick works
        ADCR    R1                  ;   6  |    because "CMP 0, 0" sets carry,
        CMP@    R4,     R0          ;   8  |    and "CMP positive, 0" doesn't.
        ADCR    R1                  ;   6  |
        CMP@    R4,     R0          ;   8  |
        ADCR    R1                  ;   6 /
        NEGR    R1                  ;   6 R1 is "5 - #_of_zeros"
        MVO     R1,     GP1ACT      ;  11

        MVII    #$10000-7, R1       ;   8
        CMP@    R4,     R0          ;   8 \
        ADCR    R1                  ;   6  |
        CMP@    R4,     R0          ;   8  |
        ADCR    R1                  ;   6  |
        CMP@    R4,     R0          ;   8  |
        ADCR    R1                  ;   6  |    Count up number of zero
        CMP@    R4,     R0          ;   8  |__  entries.  This trick works
        ADCR    R1                  ;   6  |    because "CMP 0, 0" sets carry,
        CMP@    R4,     R0          ;   8  |    and "CMP positive, 0" doesn't.
        ADCR    R1                  ;   6  |
        CMP@    R4,     R0          ;   8  |
        ADCR    R1                  ;   6  |
        CMP@    R4,     R0          ;   8  |
        ADCR    R1                  ;   6 /
        NEGR    R1                  ;   6 R1 is "7 - #_of_zeros"
        MVO     R1,     GP2ACT      ;  11
                                    ;----
                                    ; 232


And that 56 cycles really made a difference to the responsiveness of the game. That, and a few other optimizations that ultimately pulled ~200 cycles out of the main loop and got the total cycle count back under budget. At one point, I joked Space Patrol was borrowing cycles from other processors in the room.
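For comparison, the older carry-based count quoted above can be modelled roughly like this. This is a Python sketch that assumes, as the listing's comments suggest, that the compare sets carry when no borrow occurs; the helper names are hypothetical.

```python
def cmp_sets_carry(reg, mem):
    """Carry after computing reg - mem: 1 when no borrow occurs (assumed)."""
    return 1 if (reg & 0xFFFF) >= (mem & 0xFFFF) else 0


def count_active_by_carry(entries):
    r0 = 0                                      # CLRR R0
    r1 = (0x10000 - len(entries)) & 0xFFFF      # MVII #$10000-N, R1
    for entry in entries:
        # CMP@ R4, R0 followed by ADCR R1: add 1 for every zero entry.
        r1 = (r1 + cmp_sets_carry(r0, entry)) & 0xFFFF
    return (-r1) & 0xFFFF                       # NEGR R1: N - number_of_zeros
```

The model makes the cost visible: one compare and one add-with-carry per entry, versus one add per entry and one mask per group of three in the newer version.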

 

The straightforward loop "FOR I = 0 TO 4 : IF ... : NEXT" would have been way too slow. :-)

 

And... while I say I start with simpler and optimize later, it appears that the oldest version of UPSPA I can find (from 2002) looks pretty much like the older version I quoted above. I think I knew when I wrote that routine I was already getting short on cycles.

Edited by intvnut

Wow, that's very nifty. I'm adding that to my bag-o'-tricks, and I know just where to use it. :)

 

-dZ.


As a 20-year user of emulators, this has to be the most perfect one out there. Sure, others run great and are awesome, but none are as perfect as this one.

 

 

*takes his virtual hat off and looks bashful*

 

Garsh, thems mighty kind words. I appreci-date it.

 

*blush*

 

:)


 

I know, right? Maybe I should take up philately or numismatics.

Look at NES emulators: they have been near perfect since the '90s, yet most NES emulators cannot emulate the graphics in Mike Tyson's Punch-Out!! properly.

 

That saved me 56 cycles over the way I had been doing it, which relied on the fact that comparing 0 == 0 sets the carry bit....

 

 

The closest thing I have to that is trying to build a dispatch table on the stack by chaining a bunch of addresses as required. I do not know how clever it is, but I thought to share it anyway. Maybe others will find it useful, or even point out improvements. :)

 

I receive a vector that represents a "queue" of routines that need to execute, in order. If a bit is set, it means the routine executes; if cleared, it means we should skip it.

;; ======================================================================== ;;
KERNEL.DISP     PROC
                BEGIN                                   ; End of the call chain

                ; --------------------------------------
                ; NOTE: We complement the vector so that
                ;       we can "skip" a stage by its bit
                ;       being set.  Skipping is done by
                ;       shifting the stage bit into the
                ;       Carry flag, and using it as an
                ;       offset to add to the CPU Program
                ;       Counter:
                ;
                ;         C clear = ~(Stage On ) : PC
                ;         C set   = ~(Stage Off) : PC++
                ; --------------------------------------
                COMR    R3
                MVII    #.KERNEL.DISP.TBL,      R4

                ; --------------------------------------
                ; Queue kernel stages
                ; --------------------------------------

        REPEAT  (.KERNEL.DISP.TBL.Size - 1)
                MVI@    R4,     R1                      ; Get next stage address
                SARC    R3                              ; \_ Skip if not set
                ADCR    PC                              ; /     next if (~vector & STAGE);

                PSHR    R1                              ; "Enqueue" stage by pushing it into the call chain
        ENDR

                ; We can optimize the last stage by
                ; avoiding contiguous push and pop.
                ; --------------------------------------
                SARC    R3                              ; \_ Skip if not set
                ADCR    PC                              ; /     last if (~vector & STAGE);
                JR@     R4                              ; Engage call chain

                RETURN                                  ; Nothing to do, return
                ENDP

Again, I do not know how clever that is, but I thought it would be useful since I wouldn't need to keep track of the vector and I can dedicate all registers to each routine. Also, I originally intended to use a buffer for the jump table until I realized that using the stack would already take care of much of the work needed to chain the calls, and do so in a more "natural" way.

 

The dispatch table contains only five routine pointers, so the call chain in the stack is never too deep.

 

-dZ.

 

 

P.S. JR@ is a synonym of MVI@ Rx, PC.
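The stack-based dispatch described above can be sketched in Python as follows. This is only a model under stated assumptions: routines are plain callables, the "stack" is a list, and the bit-to-stage mapping (bit 0 for the first table entry) is inferred from the shift direction rather than confirmed by the post.

```python
def dispatch(vector, table):
    """Run the routines in `table` whose bit in `vector` is set."""
    pending = (~vector) & 0xFFFF      # COMR: a set bit now means "skip"
    stack = []                        # stands in for the CPU stack
    for routine in table[:-1]:
        carry = pending & 1           # SARC moves the low bit into carry
        pending >>= 1
        if not carry:
            stack.append(routine)     # PSHR R1: enqueue the stage
    results = []
    if not (pending & 1):
        results.append(table[-1]())   # last stage is entered directly
    while stack:
        results.append(stack.pop()()) # each return pops the next stage
    return results


# Five hypothetical stages that just report their own name.
table = [lambda n=n: f"stage{n}" for n in range(5)]
```

In this sketch the stages run in reverse table order, which is simply a consequence of popping from a stack; how the real table is ordered to compensate is not stated in the post.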

Edited by DZ-Jay

Er... I think I found a bug. :(

 

The "NOT" operator does not complement assembler symbols anymore. I've compared it to the previous version I had, which probably predates "1.0."

 

OLD BEHAVIOUR:

	; SOURCE:
_foo	SET	$0F
_bar	SET	(NOT _foo)

	; OUTPUT:
0xF                     _foo    SET    $0F
0xFFFFFFF0              _bar    SET    (NOT _foo)

NEW BEHAVIOUR:

	; SOURCE:
_foo	SET	$0F
_bar	SET	(NOT _foo)

	; OUTPUT:
0xF                     _foo    SET    $0F
0x0                     _bar    SET    (NOT _foo)

 


Yes, that is a bug I fixed in AS1600. NOT is supposed to be logical NOT, not bitwise NOT, according to the Frankenstein Assembler specs. I had implemented it incorrectly previously and it's fixed now. Actually, it was fixed quite some time ago, but after Beta4. It's been in a few of the interim releases I've made since Beta4.

 

You'll need to use XOR $FFFF to get bitwise NOT.
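To make the difference concrete, here is a small Python sketch of the two operators on the `$0F` example from the report. The helper names are hypothetical, and the logical result for a zero operand (1) is an assumption, since the post only shows the non-zero case.

```python
MASK16 = 0xFFFF


def logical_not(value):
    """Logical NOT: 0 for any non-zero operand (1 for zero is assumed)."""
    return 0 if value else 1


def bitwise_not16(value):
    """Bitwise NOT restricted to 16 bits, equivalent to XOR with $FFFF."""
    return (value ^ MASK16) & MASK16


foo = 0x0F
```

Note that XOR with `$FFFF` flips only the low 16 bits, so `$0F` becomes `$FFF0` rather than the 32-bit `0xFFFFFFF0` shown in the old listing output.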

 

This change went in during Sep 2013, and so has been in the interim updates I've published since Beta4.

 

I wonder if this is worth changing back? I kinda grimace whenever I think of this particular issue....

Edited by intvnut

 

 


I can imagine how it sounds like a good thing, since we didn't have logical negation before (as a matter of fact, all my code had to compensate for that by comparing against zero instead of just testing a boolean expression).

 

However, it's a rather big change in code semantics. It also means I now have to re-write my game and framework code to fit, lest it fails in mysterious ways. :(

 

Personally, I would suggest reverting back and perhaps offering a different operator for logical NOT.

 

-dZ.

Unfortunately a Windows Update + reboot occurred and this is no longer a problem. I say "unfortunately" because I can't now reliably reproduce the issue or give you more to go on. If it occurs again I'll do more to get you something concrete. Thanks for jumping on this so quickly in the first place.

 

 

Can you run it from a cmd prompt without the -q flag and see what jzIntv reports for sound buffering? It would be good to know whether it's dropping sound buffers for some reason.

 

You should see a line like this:

Rate: [100.37% 100.75%]  Drop Gfx:[  0.16%      3] Snd:[  0.07%  1  2.960]

Here's a breakdown of what that line shows:

  • Execution rate as %age of real time. 100% is perfect. First number is long-term average, second number is short-term average. These should hover around 100%.
  • Dropped graphics frames, both as a %age and an absolute count. A small number of dropped frames is typical, especially if you're switching between windows.
  • Dropped audio frames, both as a %age and an absolute count, and also the current audio buffering depth. Again, a small number of dropped audio frames is typical, especially if you're switching between windows.
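If you want to log these numbers over time, a status line in the format shown above could be pulled apart with a short Python script like this. It is only a sketch: the regular expression is fitted to the single sample line in this post, and the dictionary keys are names invented here, not anything jzIntv defines.

```python
import re

STATUS_RE = re.compile(
    r"Rate:\s*\[\s*([\d.]+)%\s+([\d.]+)%\s*\]\s*"
    r"Drop Gfx:\[\s*([\d.]+)%\s+(\d+)\s*\]\s*"
    r"Snd:\[\s*([\d.]+)%\s+(\d+)\s+([\d.]+)\s*\]"
)


def parse_status(line):
    """Return the fields of one status line as a dict, or None if no match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return {
        "rate_long_pct": float(match.group(1)),     # long-term average rate
        "rate_short_pct": float(match.group(2)),    # short-term average rate
        "gfx_drop_pct": float(match.group(3)),      # dropped graphics frames, %
        "gfx_drop_count": int(match.group(4)),      # dropped graphics frames
        "snd_drop_pct": float(match.group(5)),      # dropped audio frames, %
        "snd_drop_count": int(match.group(6)),      # dropped audio frames
        "snd_buffer_depth": float(match.group(7)),  # audio buffering depth
    }


sample = "Rate: [100.37% 100.75%]  Drop Gfx:[  0.16%      3] Snd:[  0.07%  1  2.960]"
```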

 

There are some flags to control audio sample rate and buffering, but first let's see if it's dropping audio or if something else is up.

 

Have any other Windows users run into problems with this build? I only have WinXP and Win7 to try running on, and my WinXP box has no speakers.

 

 

EDIT: I'm running it on my aging Dell Win7 laptop, and no audio issues here. Does Win 8.1 have a 'mixer' that allows setting per-app volumes? Maybe that's the culprit. I've never used Windows 8 or 8.1 so I'll have to rely on the rest of you in the peanut gallery for this one.


Hopefully it was just something horked up in Win8.1 itself, and it's corrected now. If it does come up again, I'll be glad to hear about it.

 

 

 

Of course you can ask! :)

 

Oh, you want me to tell you the details too... Ok. ;) ;) ;)

 

It was a simple 'facepalm' of a bug. MVO@ R7, R6 (aka. PSHR R7) should push the new value of R7 (the address of the following instruction) rather than the old value. This bug was causing a piece of code like this to loop forever, rather than iterating twice and leaving:

    PSHR R5
    PSHR R7

    ... stuff ...

    PULR PC
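Here is a toy Python model of that four-instruction sequence, to show why the pushed value matters. The addresses, the return address, and the step limit are all invented for illustration; this is not jzIntv's implementation.

```python
def run(push_new_pc, max_steps=50):
    """Return how many times "stuff" ran, or None if the code never exits.

    Illustrative layout:
      0: PSHR R5    1: PSHR R7    2: ... stuff ...    3: PULR PC
    """
    return_addr = 100          # hypothetical caller address held in R5
    stack = []
    pc = 0
    iterations = 0
    for _ in range(max_steps):
        if pc == 0:
            stack.append(return_addr)   # PSHR R5
            pc = 1
        elif pc == 1:
            # Correct: push the address of the following instruction.
            # Buggy: push the address of the PSHR R7 itself.
            stack.append(pc + 1 if push_new_pc else pc)
            pc = 2
        elif pc == 2:
            iterations += 1             # the shared section of code
            pc = 3
        elif pc == 3:
            pc = stack.pop()            # PULR PC
        else:
            return iterations           # control left the routine
    return None                         # still looping at the step limit
```

With the correct push, the first `PULR PC` re-enters the shared section and the second one returns to the caller. With the old value, every `PULR PC` lands back on the `PSHR R7`, which pushes again.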

 

Hi, Joe,

 

I was just going through my P-Machinery code and I encountered a section where it installs an "exception handler" hook while in DEBUG mode that stores the entire system state in the stack prior to executing. In tracing the CPU history of this code, I was surprised to notice that pushing the PC into the stack does indeed work as expected--that is, the following address is stored.

F00D 0000 028E 8007 0300 FFFF 02F1 2111 ------i-  PSHR R7               64008 ; <-- Save PC into stack
F00D 0000 028E 8007 0300 FFFF 02F2 2112 --------  JD   DEBUG_HANDLER    64017
DEBUG_HANDLER:
F00D 0000 028E 8007 0300 FFFF 02F2 BA83 --------  PSHR R5               64030
F00D 0000 028E 8007 0300 FFFF 02F3 BA84 --------  PSHR R4               64039
F00D 0000 028E 8007 0300 FFFF 02F4 BA85 --------  PSHR R3               64048
F00D 0000 028E 8007 0300 FFFF 02F5 BA86 --------  PSHR R2               64057
F00D 0000 028E 8007 0300 FFFF 02F6 BA87 --------  PSHR R1               64066
F00D 0000 028E 8007 0300 FFFF 02F7 BA88 --------  PSHR R0               64075
F00D 0000 028E 8007 0300 FFFF 02F8 BA89 --------  MOVR R6,R1            64084
F00D 02F8 028E 8007 0300 FFFF 02F8 BA8A ------i-  SUBI #$0007,R1        64090
F00D 02F1 028E 8007 0300 FFFF 02F8 BA8C -C----i-  MVI@ R1,R1            64098 ; <-- Read PC from stack
F00D 2112 028E 8007 0300 FFFF 02F8 BA8D -C----i-  ADDI #$0004,R1        64106
     ^^^^
     PC = following instruction
 

This is with the older version of the assembler I've been using for years, so was this bug introduced in version 1.0?

 

EDIT:

Oooh! The plot thickens! It turns out that my code above, which worked during early Christmas Carol development, does not work any more because of an "off-by-one" error on the last instruction. It appears I was expecting the address on the stack to be that of the PSHR PC instruction itself, not the next one. Returning the next one changes the required offset to 3, causing the hook to jump into the weeds when adjusted by 4! :o

 

So it seems that the bug was there originally, then it was not, then at some point in time it came back; and finally it was squashed once and for all. Weird.

Edited by DZ-Jay

This isn't an assembler bug. It's a jzIntv bug.

 

 

What version do you mean when you say "1.0"? I've never released a version 1.0. There's 1.0 Beta 3 from 2006 and 1.0 Beta 4 from 2012, and a few builds with date codes, but no 1.0. In any case, I went back and looked at the broken code, and see no reason why it should ever have worked at least as far back as 2006. (That's as far back as my SVN history goes.)

 

Bonus: The bug seems to go back much, much further. I looked in a version on my hard drive from 2002.

 

So, I have no idea what version you have that fixes this bug, but if I had sent you a patched version apparently I never checked the fix into mainline.

Sorry, I meant jzIntv, not assembler. DOH! :dunce:

 

I've been using special builds you created to address problems during the development of Christmas Carol, so they could have been off the mainline.

 

It's entirely possible I have some un-checked-in fixes kicking around on my MBP, then. We may well have hit this during CC development, and I fixed it but failed to check in the fix.

No worries, that explains it. In any case, that code is only used during "DEBUG" mode, and is conditionally suppressed from assembly on production builds, so it shouldn't affect anybody. This is why I didn't notice it until now, when I started once more working on the Christmas Carol code for the 2014 CvW Special Edition.

 

I've updated the code to bypass the potential issue altogether by replacing the PSHR PC with the following sequence:

    PSHR  R0     ; Save R0
    MOVR  PC, R0 ; \_ Save Program Counter
    PSHR  R0     ; /
    ; ...
