
Programming tricks: Better bit flags


GroovyBee


Let's say you have four bit flags, and you want to perform an action whenever any one of them is set.

 

You might handle the decision code something like this :-

 

   mvi SomeFlags, r0
   movr r0, r1
   andi #$0001, r1 ; Bit 0 set?
   bneq HandleAction1
   movr r0, r1
   andi #$0002, r1 ; Bit 1 set?
   bneq HandleAction2
   movr r0, r1
   andi #$0004, r1 ; Bit 2 set?
   bneq HandleAction3
   movr r0, r1
   andi #$0008, r1 ; Bit 3 set?
   bneq HandleAction4

 

Or if you wanted to be more efficient you could use double shifts :-

 

   mvi SomeFlags, r0
   movr r0, r1
   rrc r1, 2
   bc HandleAction1
   bov HandleAction2
   rrc r1, 2
   bc HandleAction3
   bov HandleAction4

 

The problem with both of these solutions is that you need to find a temporary register to hold the original value should you want to clear the bit flags later.

 

For ultimate speed it's possible to use a technique often used in the ARM microcontroller world. The CP1600 allows the contents (arithmetic result flags only) of its status register to be directly read from and written to using the GSWD and RSWD instructions respectively. The status register's four arithmetic bit flags are Carry, Overflow, Zero and Sign.

 

The GSWD instruction writes the status register's contents to the topmost four bits of each byte in the destination register as follows (where x marks a bit that is written as zero) :-

 

 1 1 1 1 1 1
 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|Z|O|C|x|x|x|x|S|Z|O|C|x|x|x|x|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
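
As a rough illustration of that layout, here is a minimal Python sketch that packs four flag values the way the diagram shows. The function name `gswd` and the mask constants are made up for this example; it is a model of the bit positions only, not of the real instruction's timing or side effects.

```python
# Bit positions of the four arithmetic flags within one byte,
# taken from the diagram above (hypothetical constant names).
SIGN, ZERO, OVERFLOW, CARRY = 0x80, 0x40, 0x20, 0x10


def gswd(s, z, o, c):
    """Model of GSWD: build the 16-bit word written to the destination."""
    nibble = 0
    if s:
        nibble |= SIGN
    if z:
        nibble |= ZERO
    if o:
        nibble |= OVERFLOW
    if c:
        nibble |= CARRY
    # The same pattern appears in both bytes; the low four bits of each are zero.
    return (nibble << 8) | nibble
```

For example, Sign and Carry set with Zero and Overflow clear gives `0x9090`.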

 

The RSWD instruction reads the status register's new value from the least significant byte of the source register as follows (where x is don't care) :-

 

 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+
|S|Z|O|C|x|x|x|x|
+-+-+-+-+-+-+-+-+
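
The reverse direction can be sketched the same way. This is a small Python model, assuming only the bit layout in the diagram above; the name `rswd` and the dictionary it returns are illustrative choices, not anything defined by the CP1600 tools.

```python
def rswd(word):
    """Model of RSWD: derive the four flags from bits 7..4 of the source.

    Bits 3..0 and the whole upper byte are "don't care", so they are ignored.
    """
    return {
        "S": bool(word & 0x80),  # bit 7 -> Sign
        "Z": bool(word & 0x40),  # bit 6 -> Zero
        "O": bool(word & 0x20),  # bit 5 -> Overflow
        "C": bool(word & 0x10),  # bit 4 -> Carry
    }
```

So a register holding `$0050` would set Zero and Carry and clear Sign and Overflow, whatever is in the low nibble or the upper byte.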

 

Thus, if we keep our flags of interest in the upper four bits of the least significant byte and copy that register into the status register with RSWD, we set the arithmetic flags directly, and the code sequence can be optimised to the following :-

 

   mvi SomeFlags, r0
   rswd r0
   bc HandleAction1 ; Bit 4 set?
   bov HandleAction2 ; Bit 5 set?
   bze HandleAction3 ; Bit 6 set?
   bmi HandleAction4 ; Bit 7 set?

 

If you want to detect if a bit flag is zero then the "opposite sense" branch instructions of bnc, bnov, bnze and bpl can be used instead.

 

The need for a temporary register has gone too.
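
To make the branch order concrete, here is a minimal Python sketch of which handler the RSWD sequence above would reach for a given flags byte. It assumes the flags sit in bits 4 to 7 as described, and that the first taken branch wins; the function name `dispatch` and the string return values are purely illustrative.

```python
def dispatch(flags):
    """Return the label the RSWD branch chain would jump to, or None."""
    if flags & 0x10:  # bc  : Carry    <- bit 4
        return "HandleAction1"
    if flags & 0x20:  # bov : Overflow <- bit 5
        return "HandleAction2"
    if flags & 0x40:  # bze : Zero     <- bit 6
        return "HandleAction3"
    if flags & 0x80:  # bmi : Sign     <- bit 7
        return "HandleAction4"
    return None       # no branch taken: execution falls through
```

Note that when several flags are set at once, the lowest-numbered one is the one that gets handled, because its branch is tested first.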


Awesome! I now will go hunt through my current game code to change the double SARC shifts for this trick, where applicable. That's very cool.

 

-dZ.

Edited by DZ-Jay

Thus if we move our flags of interest to the upper four bits of the least significant byte and we move that register into the status register...

 

I just realized something: if the bit vector is right-justified in the word, you still need to shift it into the upper four bits of the least significant byte of the register. This removes any advantage over the second method you mentioned (shifting/rotating into Carry and Overflow), unless the vector was already there, of course.

 

-dZ.


Yep! The optimisation depends on the data given. If you are only interested in 2 to 4 bits it's a good solution. Preparing the data in advance is the key. For example where you'd do :-

 

   xori #$01, r0

 

Change it to :-

 

   xori #$10, r0

 

And you avoid the shifts because the bit flags are already where you need them.
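
One way to picture "preparing the data in advance" is to define the flag masks already shifted into bits 4 to 7, so every update uses the pre-positioned mask. The Python sketch below is only an illustration of that idea; `FLAG_SHIFT`, the `ACTIONn` names and `toggle` are hypothetical and do not come from any real code base.

```python
FLAG_SHIFT = 4  # assumption: flags are stored in bits 4..7 of the low byte

ACTION1 = 0x01 << FLAG_SHIFT  # 0x10, lands in Carry after RSWD
ACTION2 = 0x02 << FLAG_SHIFT  # 0x20, lands in Overflow
ACTION3 = 0x04 << FLAG_SHIFT  # 0x40, lands in Zero
ACTION4 = 0x08 << FLAG_SHIFT  # 0x80, lands in Sign


def toggle(flags, mask):
    """Equivalent of XORI #mask, R0: flip the bits selected by mask."""
    return flags ^ mask
```

Toggling with `ACTION1` gives the same result as toggling bit 0 and then shifting left by four, which is exactly the shift the pre-positioned mask saves at test time.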


Yes, that's true. I'll take that into consideration in the future when I'm preparing my vectors.

 

-dZ.


That is handy, although often a lookup table is faster (if larger), e.g.:

        ; Assumes R1 already holds the flags value (0..15)
        ADDI    #@@tbl, R1          ; R1 = address of the table entry
        MVI@    R1,     PC          ; jump to the handler stored there
@@tbl:  DECLE   NoAction            ; no bits set
        DECLE   HandleAction1       ; bit 0 is leftmost bit set
        DECLE   HandleAction2       ; bit 1 is leftmost bit set
        DECLE   HandleAction2       ; bit 1 is leftmost bit set
        DECLE   HandleAction3       ; bit 2 is leftmost bit set
        DECLE   HandleAction3       ; bit 2 is leftmost bit set
        DECLE   HandleAction3       ; bit 2 is leftmost bit set
        DECLE   HandleAction3       ; bit 2 is leftmost bit set
        DECLE   HandleAction4       ; bit 3 is leftmost bit set
        DECLE   HandleAction4       ; bit 3 is leftmost bit set
        DECLE   HandleAction4       ; bit 3 is leftmost bit set
        DECLE   HandleAction4       ; bit 3 is leftmost bit set
        DECLE   HandleAction4       ; bit 3 is leftmost bit set
        DECLE   HandleAction4       ; bit 3 is leftmost bit set
        DECLE   HandleAction4       ; bit 3 is leftmost bit set
        DECLE   HandleAction4       ; bit 3 is leftmost bit set

Like I said, larger (19 vs 11 words), but a little bit faster.
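
The sixteen table entries follow a simple rule: each index maps to the handler for its leftmost set bit. A short Python sketch of that rule is below; `build_table` is a hypothetical helper for illustration, and the handler names are just the labels used in the table above.

```python
def build_table():
    """Build the 16-entry dispatch table, indexed by the 4-bit flags value."""
    handlers = ["HandleAction1", "HandleAction2", "HandleAction3", "HandleAction4"]
    table = []
    for value in range(16):
        if value == 0:
            table.append("NoAction")
        else:
            # bit_length() - 1 is the index of the leftmost (highest) set bit
            table.append(handlers[value.bit_length() - 1])
    return table
```

One thing worth noticing: with this table the highest set bit decides the handler when several flags are set, whereas a chain of branches that tests bit 0 first favours the lowest one.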

 

You can get a little more bang-for-the-buck out of SLLC or SARC too, in case your flags are formatted differently. If your flags are in the four MSBs and the rest of the word is 0s, you can do:

   SLLC R0, 2
   BC   bit15_set
   BOV  bit14_set
   BMI  bit13_set
   BNEQ bit12_set

SARC is a little weirder, because it grabs the sign bit from bit 7 of the result, not bit 15.

   SARC R0, 2
   BC   bit0_set
   BOV  bit1_set
   BMI  bit9_set
   ; Z bit isn't very useful
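
The two double-shift cases can be modelled in a few lines of Python. This sketch is based only on the flag behaviour described in this post (Carry and Overflow take the two bits shifted out, SLLC takes Sign from bit 15 of the result, SARC takes Sign from bit 7 of the result); it is not checked against the CP1600 manual, and the function names are invented for the example.

```python
def sllc2(word):
    """Model of SLLC Rx, 2 on a 16-bit word, per the description above."""
    word &= 0xFFFF
    result = (word << 2) & 0xFFFF
    return {
        "C": bool(word & 0x8000),    # original bit 15
        "O": bool(word & 0x4000),    # original bit 14
        "S": bool(result & 0x8000),  # original bit 13
        "Z": result == 0,            # clear if anything is left, e.g. bit 12
    }


def sarc2(word):
    """Model of SARC Rx, 2 on a 16-bit word, per the description above."""
    word &= 0xFFFF
    fill = 0xC000 if word & 0x8000 else 0  # arithmetic shift keeps the sign
    result = (word >> 2) | fill
    return {
        "C": bool(word & 0x0001),    # original bit 0
        "O": bool(word & 0x0002),    # original bit 1
        "S": bool(result & 0x0080),  # bit 7 of the result = original bit 9
    }
```

With only bit 12 set, `sllc2` leaves Carry, Overflow and Sign clear and Zero clear too, which is why the final BNEQ in the SLLC sequence can stand in for "bit 12 set" when the rest of the word is zero.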

Edited by intvnut