
tools for cycle counting?



Since I know all cycle counts by heart I prefer doing it in the source itself, as close to the code as possible (i.e. by using comments) but I know SvOlli has written a small tool that according to him has helped him a lot planning cycle-exact kernels. It displays a grid, 76 cycles wide, and allows you to place and move instruction chunks of different widths (i.e. different amount of cycles needed) around. It also marks the start of the visible area, etc. You can find it here.


I reference the list BNE Jeff linked to. You'll learn the counts very quickly, as they follow a pattern, so you won't be looking them up for long. Start off with:

 

brief, 2 cycle instructions, are those that don't access memory (or use immediate mode for the value) such as:

  • lda #value, ldx #, ldy #
  • tax, tay, txa, tya, txs, tsx
  • dey, dex, iny, inx
  • adc #, sbc #
  • bne, beq, etc. but only when the branch is not taken
  • cmp #, cpx #, cpy #
  • eor #
  • asl, lsr, ror, rol (accumulator mode)
  • clc, sec

 

short, 3 cycle instructions, use zero page memory

  • lda zp, ldx zp, ldy zp, sta zp, stx zp, sty zp
  • cmp zp, cpx zp, cpy zp

 

medium, 4 cycle instructions, use non-zero-page memory

  • lda $ff00, sta $ff00
  • etc

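instructions that use zero page indexed addressing like lda zp,x add 1 extra cycle over zero page (4 total). Absolute indexed reads like lda value,x or ,y stay at 4 cycles unless the index crosses a page boundary, which adds 1 cycle; absolute indexed stores like sta value,x always take 5.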

 

instructions that use indirect addressing like lda (zp),y add 2 extra cycles over zero page (5 total), plus 1 more if a page boundary is crossed.
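
The pattern above can be written down as a small lookup table. This is only a sketch for read-type instructions (lda, adc, cmp and friends); the mode names and the `read_cycles` helper are made up for illustration, and not every instruction supports every mode.

```python
# Cycle counts for 6502 read instructions, keyed by addressing mode.
# The mode names are labels chosen for this sketch, not assembler syntax.
READ_CYCLES = {
    "immediate": 2,    # lda #value
    "zeropage": 3,     # lda zp
    "zeropage_x": 4,   # lda zp,x
    "absolute": 4,     # lda $ff00
    "absolute_x": 4,   # lda $ff00,x
    "absolute_y": 4,   # lda $ff00,y
    "indirect_x": 6,   # lda (zp,x)
    "indirect_y": 5,   # lda (zp),y
}

# Modes where a read costs 1 more cycle if indexing crosses a page boundary.
PAGE_CROSS_PENALTY = {"absolute_x", "absolute_y", "indirect_y"}


def read_cycles(mode, page_crossed=False):
    """Return the cycle count for a read instruction in the given mode."""
    cycles = READ_CYCLES[mode]
    if page_crossed and mode in PAGE_CROSS_PENALTY:
        cycles += 1
    return cycles
```

Stores differ slightly (sta abs,x is always 5 and sta (zp),y is always 6), so a real table would need a second dictionary for them.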

 

You can also use Stella, it lists all the cycles in the source display:

post-3056-0-79466000-1468162993_thumb.png


Once you know how many cycles each mode takes it becomes pretty easy. LDA #IMMEDIATE = 2 cycles, LDA ZP = 3 cycles, LDA ABSOLUTE = 4 cycles, etc...

 

What I do for the kernel code is:

1) Put in the number of cycles for the instruction after a semi-colon.

2) Copy-paste it into an Excel sheet I made that corrects the cycle counts and re-formats all the spacing, etc...

3) Copy-paste back into source code.

 

 

Here is a typical example. The code doesn't matter so much. What I'm doing here is rewriting a piece of code, and at this point I've moved stuff around and added to it. It's a mess.

    tay                          ;2  @6
    lda    PF1_Tab,Y             ;4  @10
    sta    PF1                   ;3  @13


    sta    currentRowShape   ;3
    ora    rowHighlightShape    ;3
    sta    rowHighlightShape     ;3

 
    lda    PF0_Tab,Y             ;4  @17
    sta    PF0                   ;3  @20
    lda    PF2_Tab,Y             ;4  @24
    sta    PF2                   ;3  @27

    ldy    #0                    ;2  @
    lda    (digitOne),Y          ;5  @2
    sta    tempDig1_B            ;3  @6
    lda    (digitTwo),Y          ;5  @11
    sta    tempDig2_B            ;3  @14
    lda    (digitThree),Y        ;5  @19
    sta    tempDig3_B            ;3  @22

    ldy    #6                    ;2  @
    lda    (digitOne),Y          ;5  @2

Copy-paste the mess into the excel sheet (in cell A2). In the yellow box at the top of the sheet (cell A1) put in what cycle the code segment is to start at. In this example I will arbitrarily start at cycle 7.

 

post-7074-0-22121900-1468165395_thumb.png

 

And voila! Now copy from cell B2 downward and paste back into the source code.

    tay                          ;2  @9
    lda    PF1_Tab,Y             ;4  @13
    sta    PF1                   ;3  @16


    sta    currentRowShape       ;3  @19
    ora    rowHighlightShape     ;3  @22
    sta    rowHighlightShape     ;3  @25

 
    lda    PF0_Tab,Y             ;4  @29
    sta    PF0                   ;3  @32
    lda    PF2_Tab,Y             ;4  @36
    sta    PF2                   ;3  @39

    ldy    #0                    ;2  @41
    lda    (digitOne),Y          ;5  @46
    sta    tempDig1_B            ;3  @49
    lda    (digitTwo),Y          ;5  @54
    sta    tempDig2_B            ;3  @57
    lda    (digitThree),Y        ;5  @62
    sta    tempDig3_B            ;3  @65

    ldy    #6                    ;2  @67
    lda    (digitOne),Y          ;5  @72
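
The same idea works outside of Excel. Below is a rough Python sketch of the renumbering step, not the spreadsheet's actual formulas: `annotate`, `start_cycle` and `column` are names invented here, and it assumes every instruction line already ends in a ";N" comment, optionally followed by a stale "@" total.

```python
import re

# A trailing ";N" cycle comment, optionally followed by an old "@total".
CYCLE_COMMENT = re.compile(r";(\d+)(\s*@\d*)?\s*$")


def annotate(lines, start_cycle=0, column=33):
    """Rewrite ';N' comments as ';N  @total' with a fresh running total."""
    total = start_cycle
    out = []
    for line in lines:
        match = CYCLE_COMMENT.search(line)
        if not match:
            out.append(line)  # blank lines, labels, etc. pass through
            continue
        cycles = int(match.group(1))
        total += cycles
        code = line[:match.start()].rstrip()
        # Pad the code so the comments line up in one column.
        out.append(f"{code.ljust(column)};{cycles}  @{total}")
    return out
```

Calling `annotate(source_lines, start_cycle=7)` on the messy listing reproduces the running totals shown above. Scanline rollover and the "2³" branch notation are left out of this sketch.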

A few things about this:

 

1) I only made this sheet for myself, for my style of coding. The spacing is how I prefer it, and there might be some dependencies on that. I never thought anyone else would use this sheet. There also might be a few bugs I've never fixed, as I know all the quirks and I guess I'm too lazy to fix them.

2) The same type of idea can also be used in other programs. It doesn't have to be excel. I just chose excel because it was very easy for me to build the program. Maybe an IDE can be set up with similar rules.

3) The excel sheet I made handles multiple scanlines. It will correct the cycles as you roll over from scanline to scanline. It also handles "2³" which is something I saw other programmers using for branches when I first started out.

    lda    (digitTwo),Y          ;5  @70
    sta    tempDig2_B            ;3  @73
    lda    (digitThree),Y        ;5  @2 <----- rolled over to next scanline, 3 cycles already used so 2 is correct.
    sta    tempDig3_B            ;3  @5

    ldy    #6                    ;2  @7
    lda    (digitOne),Y          ;5  @12
    bne    .storeRowHiLightShape ;2³ @14/15 <---- takes 2 or 3 cycles. Add 1 more cycle for page boundary crossings....
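
The rollover and the two-way branch count can be sketched like this. It is a minimal illustration, not the sheet's logic: the helper names are invented, and it assumes a position of exactly 76 is shown as 0 on the next scanline.

```python
SCANLINE_CYCLES = 76  # CPU cycles per scanline on the 2600


def advance(position, cycles):
    """Return (new_position, rolled_over) after spending `cycles`."""
    total = position + cycles
    return total % SCANLINE_CYCLES, total >= SCANLINE_CYCLES


def advance_branch(position, page_crossed=False):
    """Return the positions after a branch as (not_taken, taken)."""
    not_taken, _ = advance(position, 2)
    # A taken branch costs 3 cycles, or 4 if it crosses a page boundary.
    taken, _ = advance(position, 4 if page_crossed else 3)
    return not_taken, taken
```

With the numbers from the listing, `advance(73, 5)` lands on cycle 2 of the next scanline, and `advance_branch(12)` gives the 14/15 pair.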

Here is the sheet I use:

 

Cycle Counter(newest).zip


it makes me think that an app could be made where you paste in a snippet of kernel code, and a starting cycle (if wanted), and it would calculate the cycles, and update on a gantt graph the cycles for each instruction (or simply show the total)...

 

food for thought for later...for now, I _have_ to solve the dodgeball pong physics... (seriously, how in the hell did Larry Wagner manage to make a pong algorithm that doesn't just bounce back and forth?!)

 

-Thom


Quote: "it makes me think that an app could be made where you paste in a snippet of kernel code, and a starting cycle (if wanted), and it would calculate the cycles, and update on a gantt graph the cycles for each instruction (or simply show the total)..."

You can do that, but it is beneficial to also know how many cycles each instruction mode takes. I consider it as fundamental as knowing a basic multiplication table up to 12. I could pull out a calculator or my phone to figure out 7 x 6 = 42 (like most of the kids these days have to), or I could memorize the result. In the end knowing the cycles for each instruction mode is very easy as there really are just a few of them.

 

If you make an app that does it all without any cycles being written then it should be able to distinguish between labels and opcodes, do all the cycle counting, format everything while still keeping all comments, and so on. It shouldn't be too hard, but you need to account for all of the other things that might be thrown in that are not opcodes, or that qualify the opcode, like ".byte", ".w", ".wx", ".wy", ".ind" etc... Also it should probably have controls to adjust the spacing of the output format and have the ability to insert scanline breaks (i.e. ";-----------------------") to help separate each line.
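
A first pass at telling labels, directives and opcodes apart might look like the sketch below. It rests on assumptions that may not hold for every source file: labels start in column 0, instructions are indented, a leading dot on an indented word marks a directive, and a dot suffix on a mnemonic is a forced addressing mode. `parse_line` is a hypothetical helper, and a ";" inside a string operand would confuse it.

```python
def parse_line(line):
    """Split one source line into label, mnemonic, suffix, operand, comment."""
    code, _, comment = line.partition(";")
    tokens = code.split()
    label = None
    # Assumption: anything starting in column 0 is a label.
    if tokens and not code[0].isspace():
        label = tokens.pop(0).rstrip(":")
    kind, mnemonic, suffix = "empty", None, None
    if tokens:
        word = tokens.pop(0).lower()
        if word.startswith("."):
            kind, mnemonic = "directive", word   # e.g. ".byte"
        else:
            # e.g. "lda.w" becomes mnemonic "lda" with suffix "w"
            mnemonic, _, suffix = word.partition(".")
            kind, suffix = "opcode", suffix or None
    elif label:
        kind = "label"
    return {
        "kind": kind,
        "label": label,
        "mnemonic": mnemonic,
        "suffix": suffix,
        "operand": " ".join(tokens) or None,
        "comment": comment.strip() or None,
    }
```

Only lines of kind "opcode" would then go on to the cycle lookup; everything else passes through with its comment intact.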


That spreadsheet idea is really neat! :thumbsup:

 

Fortunately the complexity of 6502 instruction timing is on the lower end, and after doing cycle counting on a couple of kernels manually I no longer need to look up any details, as the fundamental rules are quite simple as Omegamatrix said. So in the end I think a dedicated tool would not be worth the effort.


What I really want is a 6502 IDE plugin for IntelliJ where it:

 

  1. automatically adds the cycle count comments for each line (sure, it's not hard to maintain by hand, but why should I do a computer's job?)
  2. Lets me specify a "start" point via a special comment, and will keep running counts from there (including all possible branches)
  3. Lets me highlight a block of code, and it will analyze just that, and tell me the min/max cycle counts for that block
  4. Auto-suggests obvious optimizations (i.e. converting a jmp to a beq or bne if the target is in range and the status of the Z flag is knowable)
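
For a straight-line block that already carries ";N" comments, the min/max idea from item 3 is short to sketch. This assumes branches are marked in the "2³" style used earlier in the thread; `block_range` is a made-up name, and it does not follow the branches or add page-crossing penalties.

```python
import re

# A ";N" cycle comment, with an optional "³" marking a branch (2 or 3 cycles).
CYCLES = re.compile(r";(\d+)(³?)")


def block_range(lines):
    """Return (min_cycles, max_cycles) for a block of annotated lines."""
    lo = hi = 0
    for line in lines:
        match = CYCLES.search(line)
        if not match:
            continue  # no cycle comment on this line
        base = int(match.group(1))
        lo += base
        hi += base + (1 if match.group(2) else 0)
    return lo, hi
```

Handling all possible paths through a block, as item 2 asks, would need real control-flow analysis on top of this.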

If you just give it snippets of code then the challenge becomes determining Zero Page or Absolute addressing. If you saw this code:

 

lda Value1

should it add this for the cycle count:

lda Value1 ; 3

or this:

lda Value1 ; 4

 

 

That's why it would have to be a full IDE and not just a simple editor. A smart IDE would analyze your whole codebase, and know what labels were where, and where page boundaries were (and in my dream world, would show a marker on a line where it takes an extra cycle because of a boundary, so that it would be obvious).
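
Once the label addresses are known, both decisions are small checks. A sketch, with a hand-written symbol table standing in for whatever the IDE or assembler would provide (the addresses in the usage note are invented):

```python
def load_cycles(symbols, label):
    """Cycles for a plain 'lda label': 3 in zero page, 4 otherwise."""
    address = symbols[label]
    return 3 if address < 0x100 else 4


def branch_crosses_page(branch_address, target_address):
    """True if a taken branch pays the extra page-crossing cycle.

    The comparison is against the address of the instruction after the
    2-byte branch, since that is where the offset is applied.
    """
    next_instruction = branch_address + 2
    return (next_instruction >> 8) != (target_address >> 8)
```

With `symbols = {"Value1": 0x80, "PF1_Tab": 0xF200}`, `load_cycles(symbols, "Value1")` is 3 and `load_cycles(symbols, "PF1_Tab")` is 4, which is the 3-or-4 question from the snippet above.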

 

For now, though, vim is good enough for me :-/

