42bs Posted August 24, 2018
One last note: add "stz $fda0,x" in the copy loop to zero out the palette. And yes, now the wheel is much much rounder ... :-)
sage Posted August 24, 2018
just a hint: the second stage in the microloader is not encrypted. thus there is no real speed difference to your approach. if you want to do a REAL optimization, you have to choose the filler bytes such that the multiplication is faster.
42bs Posted August 24, 2018
The challenge was to fit as much as possible in the first 50 bytes, since this is the minimum. But, oh, I get your point: if the first stage is only a few bytes that load the rest, and we fill the remainder with optimal values, decryption plus additional loading might be quicker. But to find this kind of optimized code one needs exact cycle counts. There is a guy in the 6502-FB group who made a simulator with lots of debugging features, but for the Apple ][. I do not trust handybug's cycle count, but the decryption could be run in another simulator ... Next challenge :-)
sage Posted August 24, 2018
actually it depends on how the multiplication is written, i.e. whether it cares if bits are set or not. anyway: already with the microloader + unpacker for the first game binary you are so fast that the "blob" from turning on is already disturbing the music
enthusi Posted August 28, 2018 (Author)
Yes. Could it be that the last byte of the 50 bytes has to be 0? I seem to run into a problem/bug and that would explain it.
+karri Posted August 28, 2018
If the last byte is 0 it stops decryption. Otherwise it continues with the next bootloader block.
enthusi Posted August 28, 2018 (Author)
Thank you. I noticed it when I checked the BIOS ROM. Got that byte saved now.
enthusi Posted September 14, 2018 (Author)
This is the final version that I now use, btw:

enthusi_one_shot_loader
        dex
        txs
        ldx #31
l2
        lda code,x
        pha
        stz $fda0,x
        dex
        bpl l2
        bmi code_start
code
        *=$1e0
code_start
        lda cart0
        sta blocks2load
load_a_full_block
        inc block       ;I use $03 which is initialised as 0 from BIOS
        lda block
        jsr $fe00
        tay
pageloop
        lda cart0
target
        sta $200,y
        iny
        bne pageloop
        inc target+2
        dex
        bne pageloop
        dec blocks2load
        bne load_a_full_block
ready
        ;we are at $0200 here where the main game starts
        ;alternatively I load two more bytes as start address (stored in stack) and use RTS in code to jump there
42bs Posted September 14, 2018
enthusi wrote: "This was the final version, that I now use, btw: enthusi_one_shot_loader <snip>"
You know that the 65C02 has "bra"? Also, you should init AUDIN (see Karri's comment).
enthusi Posted September 14, 2018 (Author)
Yeah. Right after BPL I still find BMI easier to read, but that's just what I am used to. AUDIN is for the main code then (this loader is currently fixed to $200 anyway and meant for my own projects).
42bs Posted September 14, 2018
enthusi wrote: "Yeah. Right after BPL I still find BMI easier to read, but that's just what I am used to. AUDIN is for the main code then (this loader is currently fixed to $200 anyway and meant for my own projects)."
The problem with AUDIN is relevant if you use bank-switching. The loader must be in both banks (!), but your "main" application is likely on bank 0. Therefore one needs to init AUDIN, else it will not work on all Lynxes.
42bs Posted September 14, 2018
Off topic: Loading with bank-switching
This is a thought and I write it down so it does not get lost :-)
If the code is split up between both banks (RAID0), loading could be even quicker, as the block needs to be selected only half as often. That is: the first 1K is loaded from block x, bank 0; switch AUDIN; the second 1K is loaded from block x, bank 1. There is no need to have a full 512K game, it should also work with smaller ones.
This trick might help playing a sample off the card: at least 0.256 s at 8 kHz (2K of sample data per block select).
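The 0.256 s figure can be checked with a little arithmetic. The sketch below is only an illustration, not code from the thread: it assumes 1K per bank per block (as in the post) and one byte per sample at 8 kHz, and the function names are made up for this example.

```python
# Assumptions (not from a real toolchain): 1K of data per bank per block,
# 8-bit samples, i.e. one byte per sample at 8 kHz.
BLOCK_BYTES = 1024
BANKS = 2
SAMPLE_RATE_HZ = 8000


def playback_seconds(total_bytes, rate_hz=SAMPLE_RATE_HZ):
    """Playback time of a sample stream with one byte per sample."""
    return total_bytes / rate_hz


def block_selects(total_bytes, bytes_per_select):
    """How many block selects are needed to stream total_bytes (ceiling division)."""
    return -(-total_bytes // bytes_per_select)


if __name__ == "__main__":
    # Two banks read back to back under a single block select: 2 * 1024 bytes.
    print(playback_seconds(BLOCK_BYTES * BANKS))
    # Streaming 16K: block selects with one bank versus both banks.
    print(block_selects(16 * 1024, BLOCK_BYTES))
    print(block_selects(16 * 1024, BLOCK_BYTES * BANKS))
```

With these assumptions the number of block selects halves, which is the "only half as often" point made in the post.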
enthusi Posted September 14, 2018 (Author)
Yes, but even block changing is fast enough for 8 kHz samples ;) It would be a bit of fun to use it to load twice as many small files without byte-offset seeking. If you don't change the block but just AUDIN, you need a proper interleave for the banks, since the ripple counter continues.
42bs Posted September 14, 2018
enthusi wrote: "Yes, but even block changing is fast enough for 8 kHz samples ;) It would be a bit of fun to use it to load twice as many small files without byte-offset seeking. If you don't change the block but just AUDIN, you need a proper interleave for the banks, since the ripple counter continues."
If you do not load full blocks, yes, the counter poses a challenge.
enthusi Posted September 14, 2018 (Author)
For my current game there is a notable speed difference (well, only notable when you launch them side by side) depending on whether I use my own loader or the one lynxdir implements. It mostly shows in the leftover green of the palette, though. But until I have a use for it, I tend to ignore AUDIN for now.
sage Posted September 14, 2018
you can check my code, which is "by accident" compatible with what lynxdir is writing to the roms
+karri Posted September 14, 2018
Great findings! There are so many new ideas: interleaved banks to speed up access, keeping offsets at zero to speed up loading files, and a single block loader.
sage Posted September 14, 2018
+karri wrote: "Great findings! There are so many new ideas: interleaved banks to speed up access, keeping offsets at zero to speed up loading files, and a single block loader."
That trick was already used in lynxer. you will find it in some of the old homebrew roms, but maybe by accident
42bs Posted September 15, 2018
Yepp, LYNXER had #ALIGN, but no interleaving of bank 0 and 1. At least none of the 68k versions did, and neither did the C lynxer (I think Matthias wrote it).
sage Posted September 15, 2018
I am still not sure that you win a lot. as there is only one counter reset, you cannot read one bank (sample from interrupt) while the other is loading sprites etc. I am not even sure if there is a second counter.
42bs Posted September 15, 2018
There is only one counter. Using interleaving for arbitrary data seems complicated, as the counter increases on every read. But reading a complete block works, as the lower bits are zero again after you have read the block on bank 0.
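The single shared counter can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of the real hardware: the class and method names are invented here, the block size is taken as 1K, a block select is assumed to reset the counter, and `set_bank` stands in for toggling AUDIN.

```python
class SharedCounterCart:
    """Toy model: two banks behind one block latch and one ripple counter."""

    def __init__(self, bank0, bank1, block_size=1024):
        self.banks = (bank0, bank1)
        self.block_size = block_size
        self.block = 0
        self.counter = 0
        self.bank = 0  # hypothetical stand-in for the AUDIN line

    def select_block(self, block):
        self.block = block
        self.counter = 0  # assumption: selecting a block resets the counter

    def set_bank(self, bank):
        self.bank = bank  # switching banks does not touch the counter

    def read(self):
        offset = self.block * self.block_size + self.counter
        value = self.banks[self.bank][offset]
        # The counter advances on every read and wraps at the block size.
        self.counter = (self.counter + 1) % self.block_size
        return value


def demo():
    size = 1024
    bank0 = bytes(i % 251 for i in range(2 * size))
    bank1 = bytes((i * 7) % 253 for i in range(2 * size))
    cart = SharedCounterCart(bank0, bank1, size)

    # Full block on bank 0, then bank 1, with a single block select.
    cart.select_block(1)
    first = bytes(cart.read() for _ in range(size))
    cart.set_bank(1)
    second = bytes(cart.read() for _ in range(size))
    full_ok = first == bank0[size:] and second == bank1[size:]

    # Partial read: after 100 bytes on bank 0, bank 1 continues at offset 100.
    cart.set_bank(0)
    cart.select_block(0)
    for _ in range(100):
        cart.read()
    cart.set_bank(1)
    partial_offset_ok = cart.read() == bank1[100]
    return full_ok, partial_offset_ok


if __name__ == "__main__":
    print(demo())
```

In this model the full-block case lines up because the counter has wrapped to zero at the bank switch, while a partial read leaves bank 1 starting mid-block, which is why arbitrary data would need a matching interleave.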
enthusi Posted September 15, 2018 (Author)
I see no problem with interleaving either. Imagine complete 256-byte pages:

        bank0   bank1
        -------------
        page1   X
        X       page2
        page3   X

You'd still have to set a new block after n pages, of course. And X doesn't have to be empty; it could just be the same data, starting in bank1 instead of bank0. You gain nothing except for causing some confusion/obfuscation. If you load full blocks in that fashion, you effectively double the size of a block, though.
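The page layout above can be sketched from the image-building side. This is a hypothetical helper, not part of lynxdir or lynxer: it places even pages in bank 0 and odd pages in bank 1 at the same offsets, fills the unused "X" slots with a filler byte, and ignores block boundaries (a real loader would still have to select a new block after n pages).

```python
PAGE = 256


def interleave_pages(data, filler=0xFF):
    """Place even pages in bank 0 and odd pages in bank 1, at identical offsets."""
    pad = (-len(data)) % PAGE
    data = bytes(data) + bytes([filler]) * pad
    bank0 = bytearray([filler]) * len(data)
    bank1 = bytearray([filler]) * len(data)
    for start in range(0, len(data), PAGE):
        target = bank0 if (start // PAGE) % 2 == 0 else bank1
        target[start:start + PAGE] = data[start:start + PAGE]
    return bytes(bank0), bytes(bank1)


def read_back(bank0, bank1):
    """Read with one shared counter, toggling the bank after every page."""
    banks = (bank0, bank1)
    out = bytearray()
    for counter in range(len(bank0)):
        out.append(banks[(counter // PAGE) % 2][counter])
    return bytes(out)


if __name__ == "__main__":
    payload = bytes(range(256)) * 3  # three pages of test data
    b0, b1 = interleave_pages(payload)
    print(read_back(b0, b1) == payload)
```

The filler slots are where the "X" entries sit; as the post notes, they could equally hold a second data stream that starts in bank 1.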
42bs Posted September 16, 2018
enthusi wrote: "This was the final version, that I now use, btw: enthusi_one_shot_loader <snip>"
Funny. I found this code, dated 2009 and called micro_loader, on my disc, but I am not sure who did it:

        .psc02                  ; turn on 65SC02 instruction set
RCART_0 = $fcb2                 ; cart data register
        .org $0200
        ldx #15
b0:     stz $fda0,x
        dex
        bpl b0
        ;; size in pages
        lda RCART_0
        sta 2
        ;; dst $233
        lda #2
        sta 4
        asl                     ; 4 pages per block
        sta 5
        ldy #51                 ; already 51 bytes loaded in 1st block
b1:     lda RCART_0
        sta (3),y
        iny
        bne b1
        inc 4                   ; next dst page
        dec 2
        beq done
        dec 5                   ; next block pages
        bne b1
        inc 0                   ; next block
        lda 0
        jsr $fe00               ; select
        bra b1
done:

It was written in ca65 syntax.
enthusi Posted September 17, 2018 (Author)
Oh, never seen this. Interesting. Clearly not by a C64 guy, since $00 is being used :-) Are 0, 1, 2 even initialized?
+karri Posted September 17, 2018
Interesting find. I remember seeing the file name at some point in time. Leaving out the directory is something I briefly discussed with Wookie when we were trying to understand the obfuscation and RSA encryption phases years ago.
Some consoles like the PSX stream in objects from the CD at run time. So you could move around in your world and automatically load in new objects in front of you and discard old objects behind you. This problem is very similar to displaying nautical charts using OpenGL. You need to have fast access to objects that are nearby in order to get decent drawing speeds on 4K displays.
Once I get the eJagfest and "Shaken, not stirred" out of my mind I could have a look at some experimental spatial engine that would allow creation of an RPG with dynamically loaded content. Perhaps Stardreamer could take a turn in that direction? To boldly go where no Atarian has ever gone before?