SaturnGoddex Posted May 26, 2023 Howdy all, I've made this thread just as a placeholder for now. I've been working on designing a new Apple IIGS accelerator. It'll run somewhere in the range of 14 MHz (though it may be possible to overclock for the bold of heart), and should be a reasonable price (especially compared to vintage ones). It will feature an integrated RAM expansion and a new approach to the cache that (fingers crossed it works) might make it faster than any other accelerator at the same frequency. I don't want to make any firm promises on a timeline, but a very good portion of the design is already done and I'd expect to have a prototype ready sometime soon. If you're interested, I did put together a little survey; filling it out would be helpful to me. I did add an option to leave an email address if you'd like any updates on it, but it's completely optional and any news will also be posted to this thread. Survey: https://forms.gle/C3QA4jtwn41cwDkAA Otherwise, enjoy, and fingers crossed for exciting things to come!
vespertillio Posted May 26, 2023 Sounds promising and fun. Survey submitted. 🙂
bikeguychicago Posted May 26, 2023 (edited) How do you envision this project compared to existing solutions like AppleSqueezer and Transwarp? Edited May 26, 2023 by bikeguychicago
magnusfalkirk Posted May 26, 2023 Sounds interesting. I already have an AppleSqueezer in my ROM 01 GS but am in need of one for the ROM 03. Survey submitted.
SaturnGoddex (Author) Posted May 27, 2023 15 hours ago, bikeguychicago said: "How do you envision this project compared to existing solutions like AppleSqueezer and Transwarp?" The first thing (which may or may not matter to anyone but me) is that this is all actual hardware, with no FPGA or emulation solutions on the board. There is going to be a CPLD handling some of the logic functions, but that's just because otherwise I'd have to fit around 15 random chips' worth of logic gates on a board that's already a little cramped. This tickles my fancy (I've personally just never felt that excited about a big black box of an FPGA doing everything; it's technically very impressive, but at that point I feel like I might as well just run an emulator). It also means that I can take out the single most expensive component from the AppleSqueezer and use cheaper parts that will also be more readily available. This is just speculation on my part, but I'd imagine component shortages are part of the reason other new accelerators are sold out and unavailable to me, so I've been making sure to avoid any parts that'll be difficult to get my hands on.
The second thing is the bottleneck the IIGS has. Writing to video RAM is the biggest obstacle for games, with writes always slowing down to the 1 MHz system bus regardless of how fast the CPU is. The specific idea I'm working with is using a queue for any writes to VRAM. It can be filled at the full accelerated 14 MHz speed, and it then saturates 100% of the bandwidth of the slow bus. Basically it'd work more like DMA, transferring data on every single clock cycle. As far as I know, the fastest way to move data into VRAM is using the PEA instruction, which takes 5 cycles to put 2 bytes into VRAM (assuming the stack has already been set to point there). Only 2 of those cycles actually need to touch slow memory, so the hypothetical best time for something like this would be 2000 ns for the actual slow writes, plus about another 214 ns for the remaining 3 cycles at the 14 MHz speed. But with the queue approach, as long as it isn't full, every cycle could be accelerated, meaning the whole process would take only 357 ns. 2214/357 ~= a 6x speed increase at the worst bottleneck in the system. Even if this is only true part of the time, it can be a significant performance increase. Of course, once the CPU fills the queue it'll be stuck waiting to do writes in the old-fashioned slow way. But since the queue is a decent length, as long as on average we're writing no more than 1 byte to VRAM every 14 cycles, it empties as quickly as it fills and we remain at max speed. Factoring in other code (most programs or games are not just writing a test pattern endlessly, after all), I'm willing to bet this is true more often than not. Of course, the proof is in the pudding, and I won't know if any of this actually works out until I have the thing in my hands and test it.
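The napkin math in the post above can be reproduced with a short Python sketch. The clock speeds and cycle counts come from the post; the queue depth and the very simple drain model (one queued byte leaves per slow-bus cycle) are assumptions made only for illustration, since the thread does not give the real queue length or the hardware's actual behaviour.

```python
# Back-of-the-envelope timing for one PEA pushing 2 bytes into video RAM.
FAST_HZ = 14_000_000   # accelerated CPU clock (from the post)
SLOW_HZ = 1_000_000    # slow system bus (from the post)
PEA_CYCLES = 5         # total cycles for PEA (from the post)
SLOW_WRITES = 2        # cycles that actually touch slow memory (from the post)


def ns_per_cycle(hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / hz


def pea_time_without_queue():
    """2 writes stretched to the slow bus, remaining 3 cycles at full speed."""
    slow = SLOW_WRITES * ns_per_cycle(SLOW_HZ)
    fast = (PEA_CYCLES - SLOW_WRITES) * ns_per_cycle(FAST_HZ)
    return slow + fast


def pea_time_with_queue():
    """All 5 cycles at full speed, valid only while the queue is not full."""
    return PEA_CYCLES * ns_per_cycle(FAST_HZ)


def stall_cycles(queue_depth, cycles_between_writes, total_fast_cycles):
    """Toy model: count fast cycles spent stalled on a full write queue.

    queue_depth is hypothetical; the thread only says "a decent length".
    """
    drain_period = FAST_HZ // SLOW_HZ  # one byte drains every 14 fast cycles
    occupancy = 0
    stalls = 0
    countdown = cycles_between_writes
    for t in range(total_fast_cycles):
        if t % drain_period == 0 and occupancy > 0:
            occupancy -= 1             # slow bus retires one queued byte
        countdown -= 1
        if countdown <= 0:             # CPU wants to write a byte now
            if occupancy < queue_depth:
                occupancy += 1
                countdown = cycles_between_writes
            else:
                stalls += 1            # queue full: CPU waits this cycle
    return stalls


if __name__ == "__main__":
    without = pea_time_without_queue()
    with_q = pea_time_with_queue()
    print(f"without queue: {without:.0f} ns, with queue: {with_q:.0f} ns")
    print(f"speedup: {without / with_q:.1f}x")
    print("stalls at 1 write per 14 cycles:", stall_cycles(16, 14, 14_000))
    print("stalls at 1 write per 2 cycles:", stall_cycles(16, 2, 14_000))
```

In this toy model the CPU never stalls when it averages one VRAM write every 14 fast cycles, and it starts stalling once it writes faster than the slow bus can drain, which is the break-even point the post describes.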
haightc Posted June 27, 2023 Not an option in the survey: since I already have an 8 MB card, I wouldn't need any more RAM. I think my priority in an accelerator would be compatibility and still feeling like a GS.
leech Posted June 29, 2023 One requirement that the AppleSqueezer messes up on... make it a vertical card. The AppleSqueezer requires that I remove most of my expansion cards (like I had a Grapple printer card in there). On 6/27/2023 at 12:21 PM, haightc said: "Not an option in the survey: since I already have an 8 MB card, I wouldn't need any more RAM. I think my priority in an accelerator would be compatibility and still feeling like a GS." I had that in mine too, but removed the extra RAM for the AppleSqueezer. If I had the courage to up the clock speed on my Transwarp to 14 or 16 MHz, I'd use that instead... not sure the 8 MB vs 14 MB is going to make all that much of a difference. Curious: doesn't the 65C816 go up to 20 MHz? Pretty sure that's what the Rapidus runs at for the Atari 8-bits. Would be interesting to get one that fast in the IIGS. Though if the bottleneck will still be the video display, that seems less useful.
Iamgroot Posted July 1, 2023 I believe the graphics bottleneck only applies when writing directly to graphics memory in bank $E1. The IIGS has shadowing, which is the Auxiliary bank, which allows graphics to be written at full speed. Then with shadowing turned on, the shadowing is only slowed to 1 MHz when graphics is copied/shadowed directly to graphics memory.
DeathAdderSF Posted July 5, 2023 On 6/29/2023 at 2:40 PM, leech said: "make it a vertical card." Seconded. Most "extreme" IIgs users like to make use of all those nifty slots. I mean, why not? They're there. Any accelerator that compromises slot usage is out of the question, IMO.
SaturnGoddex (Author) Posted January 27 Well, the prototype is away at the PCB house. Fingers crossed.
SaturnGoddex (Author) Posted January 27 On 6/29/2023 at 3:40 PM, leech said: "One requirement that the AppleSqueezer messes up on... make it a vertical card. The AppleSqueezer requires that I remove most of my expansion cards (like I had a Grapple printer card in there). I had that in mine too, but removed the extra RAM for the AppleSqueezer. If I had the courage to up the clock speed on my Transwarp to 14 or 16 MHz, I'd use that instead... not sure the 8 MB vs 14 MB is going to make all that much of a difference. Curious: doesn't the 65C816 go up to 20 MHz? Pretty sure that's what the Rapidus runs at for the Atari 8-bits. Would be interesting to get one that fast in the IIGS. Though if the bottleneck will still be the video display, that seems less useful." So I wasn't able to make the design a vertical card, but it does have more clearance than the AppleSqueezer. Hopefully enough to make most cards a non-issue (honestly, I didn't have a lot of cards to measure and compare it to). As far as clock speed goes, that's a bit of an interesting topic. Officially, according to WDC, all 65C816s are rated for 14 MHz. But anecdotally they can handle up to ~20 MHz consistently, especially the non-DIP packages. I went with the PLCC variant since it saved space, and supposedly it's also more tolerant of overclocking. I don't want to ship it as anything but the stock speed the parts are rated for, but the CPU speed will be configurable from a desk accessory for those so inclined to experiment. My main goal was getting GS/OS to run as smoothly as possible while maintaining good backwards compatibility, and napkin math says that after I added in the VRAM acceleration, the end clock speed (past a point) doesn't matter.
SaturnGoddex (Author) Posted January 27 On 7/1/2023 at 12:13 PM, Iamgroot said: "I believe the bottleneck for graphics is if only writing directly to graphics memory in bank $E1. The IIGS has shadowing, which is the Auxiliary bank, which allows graphics to be written at full speed. Then with shadowing turned on, the shadowing is only slowed to 1 MHz when graphics is copied/shadowed directly to graphics memory." From my understanding, the specific write cycles are stretched when shadowing is enabled, but I could be wrong. QuickDraw routines write directly to bank $E1, so I didn't focus on chasing down that piece of minutia.
+OLD CS1 Posted January 28 58 minutes ago, SaturnGoddex said: "Officially according to WDC all 65C816s are rated for 14 MHz. But anecdotally they can handle up to ~20 MHz consistently, especially the non-DIP packages." The CMD SuperCPU for the Commodore 64 runs a 65C816S at 20 MHz in a PLCC package.
leech Posted January 28 10 hours ago, OLD CS1 said: "The CMD SuperCPU for the Commodore 64 runs a 65C816S at 20 MHz in a PLCC package." The Rapidus for the Atari 8-bit also hits 20 MHz. Would be nice if all of these CPU upgrades got some software support on these platforms.
cathrynm Posted April 1 On 1/27/2024 at 2:59 PM, SaturnGoddex said: "As far as clock speed goes, that's a bit of an interesting topic. Officially, according to WDC, all 65C816s are rated for 14 MHz. But anecdotally they can handle up to ~20 MHz consistently, especially the non-DIP packages. I went with the PLCC variant since it saved space, and supposedly it's also more tolerant of overclocking. I don't want to ship it as anything but the stock speed the parts are rated for, but the CPU speed will be configurable from a desk accessory for those so inclined to experiment. My main goal was getting GS/OS to run as smoothly as possible while maintaining good backwards compatibility, and napkin math says that after I added in the VRAM acceleration, the end clock speed (past a point) doesn't matter." That sounds really good as is. It sounds totally plausible that GS/OS is spending all its time copying to video memory; I wonder now if emulators simulate this aspect accurately. I like the idea of a real 65816, though I might pick up an AppleSqueezer too if that shows up on the market. (I'm mostly commenting just so I get notified when this gets released, as I recently picked up a GS and have no acceleration at all.)
Modnarmai Posted April 7 This is a good approach. Adding another option for us users has to be good as well.
jltursan Posted April 9 Btw, any updates about this development?
cathrynm Posted April 14 Will this card work with DMA, specifically the Drive Turbo card? Or will it also have the same limitation that the AppleSqueezer has? It would be nice to have the ability to work with DMA, even if at reduced speed.
SaturnGoddex (Author) Posted April 24 On 4/9/2024 at 3:24 AM, jltursan said: "Btw, any updates about this development?" Still working away. I wasn't happy with the first prototype, so now I'm working on the second one. I'll post when there's any major news.
SaturnGoddex (Author) Posted April 24 On 4/14/2024 at 1:06 AM, cathrynm said: "Will this card work with DMA, specifically the Drive Turbo card? Or will it also have the same limitation that the AppleSqueezer has? It would be nice to have the ability to work with DMA, even if at reduced speed." I did research into it, but adding DMA isn't possible (at least not nicely) with how I'm handling addresses. It may be possible to implement in a later version, but I can't find any public specifications on how the Drive Turbo actually operates at a low level and I don't have one to test on, so for now I'm not worrying about DMA compatibility. Honestly, if I could tinker with it I could probably write a custom driver for the Drive Turbo that would hopefully perform as well as DMA (theoretically an 8 MHz block move instruction, at 7 cycles a byte, would be able to saturate the 1 MHz expansion bus anyway). I'm also considering whether it would be faster and simpler to add some sort of SD interface on the accelerator itself. TL;DR: I'm going to be looking into storage options more closely once the actual accelerator is up and running, but that's a project for another day. For now, it won't have any DMA capability.
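The block-move claim in the post above is easy to check with a few lines of Python. This is a rough sketch: the 8 MHz clock, 7 cycles per byte, and 1 MHz bus come from the post, while the assumption that the expansion bus carries at most one byte per 1 MHz cycle (and that no cycles get stretched during the copy) is a simplification added here.

```python
# Can an 8 MHz block move keep a 1 MHz bus busy?
CPU_HZ = 8_000_000      # accelerated CPU clock (from the post)
BUS_HZ = 1_000_000      # expansion bus clock (from the post)
CYCLES_PER_BYTE = 7     # block move cost per byte (from the post)


def block_move_bytes_per_second():
    """Bytes per second the CPU could issue if never slowed down."""
    return CPU_HZ / CYCLES_PER_BYTE


def bus_bytes_per_second():
    """Assumed bus ceiling: one byte per 1 MHz cycle."""
    return float(BUS_HZ)


def headroom():
    """Ratio of CPU issue rate to bus ceiling; above 1.0 means the bus is the limit."""
    return block_move_bytes_per_second() / bus_bytes_per_second()


if __name__ == "__main__":
    print(f"ns per byte at 8 MHz: {1e9 * CYCLES_PER_BYTE / CPU_HZ:.0f}")
    print(f"CPU rate / bus rate: {headroom():.3f}")
```

Under these assumptions the CPU needs 875 ns per byte against a 1000 ns bus cycle, so the bus rather than the CPU would be the limiting factor.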
cathrynm Posted April 24 2 hours ago, SaturnGoddex said: "I did research into it, but adding DMA isn't possible (at least not nicely) with how I'm handling addresses. It may be possible to implement in a later version, but I can't find any public specifications on how the Drive Turbo actually operates at a low level and I don't have one to test on, so for now I'm not worrying about DMA compatibility. Honestly, if I could tinker with it I could probably write a custom driver for the Drive Turbo that would hopefully perform as well as DMA (theoretically an 8 MHz block move instruction, at 7 cycles a byte, would be able to saturate the 1 MHz expansion bus anyway). I'm also considering whether it would be faster and simpler to add some sort of SD interface on the accelerator itself. TL;DR: I'm going to be looking into storage options more closely once the actual accelerator is up and running, but that's a project for another day. For now, it won't have any DMA capability." For me, personally, an SD card on the device would be as good as Drive Turbo if we can store a few partitions on it and can boot off of it to GS/OS or ProDOS. I think the main place an accelerator is handy is GS/OS, really. I've never used HFS partitions, but I imagine some people might want that.
SaturnGoddex (Author) Posted May 6 On 4/23/2024 at 10:56 PM, cathrynm said: "For me, personally, an SD card on the device would be as good as Drive Turbo if we can store a few partitions on it and can boot off of it to GS/OS or ProDOS. I think the main place an accelerator is handy is GS/OS, really. I've never used HFS partitions, but I imagine some people might want that." I took a closer look into it, and it actually seems like I could make things even simpler by just sticking a big flash chip on the board. ProDOS maxes out at a 16 MB partition size, so just getting a 16 or 32 MB NOR flash chip and writing up a little ProDOS driver would give you 1 very high speed device, which could be paired with whatever secondary storage you choose. Theoretically it should just be a matter of finding space on the board and writing that driver; then you'd have a storage device that operates at about 114% of the theoretical max speed of a DMA device (at least for reads) while only adding about $10 to the end price tag. That's probably the approach I'll look into for the next version. It's a little ironic: I converted the whole accelerator to 3.3 volts to switch to much more reasonably priced RAM (hence the 8 MHz CPU speed), but as a side benefit it also made adding a good storage solution easier. Of course, I don't want to put the cart in front of the horse, though.
SaturnGoddex (Author) Posted May 6 More fun research - I finally managed to find a good reference for making a ProDOS device driver. I also realized I mistakenly thought it used 256-byte blocks when it's actually 512. So the most likely storage option I'll do is 32 MB of on-board flash for a GS/OS install. I never really intended it to be the main way of getting data on and off the machine, but it's been a long-standing annoyance of mine that I can't have my Floppy Emu emulate a hard drive and a disk at the same time to do file transfers. In my opinion a good IIGS setup needs at least 2 drives, so I'm happy to make a relatively plain one and leave disk emulation to those who know more. Plus, it would be kind of annoying to have to open the case up to get to an SD card every time you wanted to add files.
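The jump from 16 MB to 32 MB follows directly from the block size. A minimal Python sketch of that arithmetic, assuming (as is commonly documented for ProDOS) that block numbers are 16 bits wide and a volume holds at most 65,535 blocks; the function name is made up for this example:

```python
# Volume size limit as a function of block size, with 16-bit block numbers.
MAX_BLOCKS = 2**16 - 1   # assumption: at most 65,535 blocks per volume


def max_volume_bytes(block_size):
    """Largest volume addressable with MAX_BLOCKS blocks of block_size bytes."""
    return MAX_BLOCKS * block_size


if __name__ == "__main__":
    for block_size in (256, 512):
        size = max_volume_bytes(block_size)
        print(f"{block_size}-byte blocks: {size} bytes (~{size / 2**20:.0f} MB)")
```

With 256-byte blocks the limit works out to roughly 16 MB, and with the correct 512-byte blocks it is roughly 32 MB, which is why a single 32 MB flash partition is the natural target.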
cathrynm Posted May 6 10 hours ago, SaturnGoddex said: "More fun research - I finally managed to find a good reference for making a ProDOS device driver. I also realized I mistakenly thought it used 256-byte blocks when it's actually 512. So the most likely storage option I'll do is 32 MB of on-board flash for a GS/OS install. I never really intended it to be the main way of getting data on and off the machine, but it's been a long-standing annoyance of mine that I can't have my Floppy Emu emulate a hard drive and a disk at the same time to do file transfers. In my opinion a good IIGS setup needs at least 2 drives, so I'm happy to make a relatively plain one and leave disk emulation to those who know more. Plus, it would be kind of annoying to have to open the case up to get to an SD card every time you wanted to add files." Anything you can do to get GS/OS to load faster would be nice. Really, load times are as important as CPU speed on this platform. I'm fine with flash ROM just soldered in. For me, 64 MB and two 32 MB partitions would be slightly more luxurious than one. But having one fast 32 MB partition would be quite nice, and you can get a nice set of apps installed on 32 MB. I use FujiNet for 5.25" drive emulation myself. As long as real physical copy-protected Apple floppies still work at 1 MHz, I think it's good.
SaturnGoddex (Author) Posted May 6 7 hours ago, cathrynm said: "Anything you can do to get GS/OS to load faster would be nice. Really, load times are as important as CPU speed on this platform. I'm fine with flash ROM just soldered in. For me, 64 MB and two 32 MB partitions would be slightly more luxurious than one. But having one fast 32 MB partition would be quite nice, and you can get a nice set of apps installed on 32 MB. I use FujiNet for 5.25" drive emulation myself. As long as real physical copy-protected Apple floppies still work at 1 MHz, I think it's good." Keeping compatibility with real floppies is a big priority. It's a bit of a headache, but I think dropping physical media is one plank too many for my ship of Theseus.