
New IIGS Accelerator In Development


SaturnGoddex


Howdy all, I've made this thread just as a placeholder for now. I've been working on designing a new Apple IIGS accelerator. It'll run somewhere in the range of 14 MHz (though it may be possible to overclock for the bold of heart), and should be a reasonable price (especially compared to vintage ones). It will feature an integrated RAM expansion and a new approach to the cache that (fingers crossed it works) might make it faster than any other accelerator at the same frequency. I don't want to make any firm promises on a timeline, but a very good portion of the design is already done and I'd expect to have a prototype ready sometime soon.

 

If you're interested, I did put together a little survey, filling it out would be helpful to me. I did add an option to leave an email address if you'd like any updates on it, but it's completely optional and any news will also be posted to this thread.

 

Survey: https://forms.gle/C3QA4jtwn41cwDkAA

 

Otherwise, enjoy, and fingers crossed for exciting things to come!


15 hours ago, bikeguychicago said:

How do you envision this project compared to existing solutions like AppleSqueezer and Transwarp?

The first thing (something that may or may not be something anyone cares about but me) is that this is all actual hardware, with no fpga or emulation solutions on the board. There is going to be a cpld handling some of the logic functions, but that's just because otherwise I'd have to fit around 15 random chips worth of logic gates on a board that's already a little cramped. This tickles my fancy (I've personally just never felt that excited about a big black box of an fpga doing everything, it's technically very impressive but at that point I feel like I might as well just run an emulator.) It also means that I can take out the single most expensive component from the apple squeezer and use cheaper parts that will also be more readily available. This is just speculation on my part but I'd imagine component shortages are part of the reason other new accelerators are sold out and unavailable to me, so I've been making sure to avoid any parts that'll be difficult to get my hands on.

 

The second thing is the bottleneck the IIGS has. Writing to video ram is the biggest obstacle for games, with writes always slowing down to the 1Mhz system bus regardless of how fast the CPU is. The specific idea I'm working with is using a queue for any writes to vram. It can be filled at the full accelerated 14mhz speed, with it then saturating 100% of the bandwidth of the slow bus. Basically it'd work more like a DMA, transferring data on every single clock cycle.

 

As far as I know, the fastest way to move data into vram is using the PEA instruction, which takes 5 cycles to put 2 bytes into vram (assuming the stack has already been set to point there). Only 2 of those cycles actually need to touch slow memory, so the hypothetical best time for something like this would be 2000 ns for the actual slow writes, and about another 214 ns for the remaining 3 cycles at the 14 MHz speed. But with the queue approach, as long as it isn't full, every cycle could be accelerated, meaning the whole process would take only 357 ns.
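The arithmetic in this paragraph can be sketched in a few lines of Python. This is only a back-of-envelope model using the nominal figures from the post (14 MHz fast clock, 1 MHz slow side, 5-cycle PEA with 2 slow writes); the function names are made up for illustration, and real hardware timing may differ.

```python
# Back-of-envelope timing for one PEA pushing 2 bytes into VRAM.
# All constants are the nominal figures quoted in the post.
FAST_HZ = 14_000_000   # accelerated CPU clock
SLOW_HZ = 1_000_000    # slow side of the machine
PEA_CYCLES = 5         # total cycles for one PEA
SLOW_WRITES = 2        # cycles that actually write to slow memory

FAST_NS = 1e9 / FAST_HZ   # about 71.4 ns per fast cycle
SLOW_NS = 1e9 / SLOW_HZ   # 1000 ns per slow cycle

def pea_ns_without_queue():
    """Two writes stretched to slow speed, the other cycles at fast speed."""
    return SLOW_WRITES * SLOW_NS + (PEA_CYCLES - SLOW_WRITES) * FAST_NS

def pea_ns_with_queue():
    """Queue not full: every cycle runs at the fast clock."""
    return PEA_CYCLES * FAST_NS

no_queue = pea_ns_without_queue()
with_queue = pea_ns_with_queue()
speedup = no_queue / with_queue

print(f"without queue: {no_queue:.0f} ns")
print(f"with queue:    {with_queue:.0f} ns")
print(f"speedup:       {speedup:.1f}x")
```

The speedup here only applies while the queue has room, which is the condition the rest of the post goes on to discuss.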

 

2214/357 ~= a 6x speed increase at the worst bottleneck in the system. Even if this is only true part of the time, this can be a significant performance increase. Of course, once the cpu fills the queue it'll be stuck waiting to do writes in the old fashioned slow way. But since the queue is a decent length, as long as we're writing on average only 1 byte to vram every 14 cycles, it empties as quickly as it fills and we remain at max speed. Factoring in other code (most programs or games are not just writing a test pattern endlessly after all), I'm willing to bet this is true more often than not. Of course though, the proof is in the pudding, and I won't know if any of this actually works out until I have the thing in my hands and test it :P
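To make the fill/drain argument concrete, here is a toy cycle-by-cycle model of a write queue in Python. It is a sketch under loud assumptions, not the actual design: the queue depths (16 and 64) are invented since the post only says "a decent length", the slow side is assumed to drain exactly one byte every 14 fast cycles, and `simulate` is a hypothetical helper name.

```python
def simulate(write_every, depth, total_bytes, drain_every=14):
    """Toy model of a VRAM write queue.

    write_every: the CPU tries to issue one VRAM write every N fast cycles
    depth:       queue capacity in bytes (assumed value, not from the post)
    drain_every: fast cycles per slow-bus write (14 for 14 MHz vs 1 MHz)
    Returns (fast cycles elapsed, cycles the CPU spent stalled).
    """
    queued = 0      # bytes currently waiting in the queue
    issued = 0      # bytes the CPU has handed off so far
    stalls = 0      # cycles where the CPU wanted to write but the queue was full
    cycle = 0
    next_write = 0  # earliest cycle at which the CPU wants to write again
    while issued < total_bytes:
        cycle += 1
        # The slow side retires one queued byte per slow-bus cycle.
        if cycle % drain_every == 0 and queued > 0:
            queued -= 1
        if cycle >= next_write:
            if queued < depth:
                queued += 1
                issued += 1
                next_write = cycle + write_every
            else:
                stalls += 1
    return cycle, stalls

# One write per 14 cycles on average: the queue never fills.
steady = simulate(write_every=14, depth=16, total_bytes=1000)
# A long back-to-back burst: the queue fills and the CPU falls back to slow speed.
long_burst = simulate(write_every=2, depth=16, total_bytes=1000)
# A burst shorter than the queue: absorbed entirely at full speed.
short_burst = simulate(write_every=2, depth=64, total_bytes=60)

print("steady:     ", steady)
print("long burst: ", long_burst)
print("short burst:", short_burst)
```

In this model the steady case and the short burst never stall, while the long burst does, which matches the intuition that the queue helps most when VRAM writes are interleaved with other work.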

 

 


  • 1 month later...

One requirement that the AppleSqueezer messes up on... make it a vertical card.  The AppleSqueezer requires that I remove most of my expansion cards (Like I had a Grapple printer card in there). 

On 6/27/2023 at 12:21 PM, haightc said:

not an option in the survey, since I already have 8mb card I wouldn't need anymore RAM.    I think my priority in an accelerator would be compatibility and still feeling like a GS.   

I had that in mine too, but removed the extra RAM for the AppleSqueezer.  If I had the courage to up the clock speed on my Transwarp to 14 or 16mhz, I'd use that instead... not sure the 8mb vs 14mb is going to make all of that much of a difference.

Curious; doesn't the 65c816 go up to 20mhz?  Pretty sure that's what the Rapidus runs at for the Atari 8bits.  Would be interesting to get one that fast in the IIGS.  Though if the bottleneck will still be the video display, that seems less useful.


I believe the bottleneck for graphics is if only writing directly to graphics memory in bank $E1.  The IIGS has shadowing, which is the Auxiliary bank, which allows graphics to be written at full speed.  Then with shadowing turned on, the shadowing is only slowed to 1 Mhz when graphics is copied/shadowed directly to graphics memory.


  • 6 months later...
On 6/29/2023 at 3:40 PM, leech said:

One requirement that the AppleSqueezer messes up on... make it a vertical card.  The AppleSqueezer requires that I remove most of my expansion cards (Like I had a Grapple printer card in there). 

I had that in mine too, but removed the extra RAM for the AppleSqueezer.  If I had the courage to up the clock speed on my Transwarp to 14 or 16mhz, I'd use that instead... not sure the 8mb vs 14mb is going to make all of that much of a difference.

Curious; doesn't the 65c816 go up to 20mhz?  Pretty sure that's what the Rapidus runs at for the Atari 8bits.  Would be interesting to get one that fast in the IIGS.  Though if the bottleneck will still be the video display, that seems less useful.

So I wasn't able to make the design a vertical card, but it does have more clearance than the apple squeezer. Hopefully enough to make most cards a non issue (honestly I didn't have a lot of cards to measure and compare it to).

 

As far as clock speed goes, that's a bit of an interesting topic. Officially, according to WDC, all 65c816's are rated for 14 mhz. But anecdotally they can handle up to ~20 mhz consistently, especially the non dip packages. I went with the PLCC variant since it saved space, and supposedly it's also more tolerant of overclocking. I don't want to ship it as anything but the stock speed the parts are rated for, but the cpu speed will be configurable from a desk accessory for those so inclined to experiment. My main goal was getting GSOS to run as smoothly as possible while maintaining good backwards compatibility, and napkin math says that after I added in the vram acceleration, the end clock speed (past a point) doesn't matter.


On 7/1/2023 at 12:13 PM, Iamgroot said:

I believe the bottleneck for graphics is if only writing directly to graphics memory in bank $E1.  The IIGS has shadowing, which is the Auxiliary bank, which allows graphics to be written at full speed.  Then with shadowing turned on, the shadowing is only slowed to 1 Mhz when graphics is copied/shadowed directly to graphics memory.

From my understanding the specific write cycles are stretched when shadowing is enabled, but I could be wrong. Quickdraw routines write directly to bank $E1, so I didn't focus on chasing down that piece of minutiae.


58 minutes ago, SaturnGoddex said:

Officially according to WDC all 65c816's are rated for 14 mhz. But anecdotally they can handle up to ~20 mhz consistently, especially the non dip packages.

The CMD SuperCPU for the Commodore 64 runs a 65C816S at 20MHz in a PLCC package.


10 hours ago, OLD CS1 said:

The CMD SuperCPU for the Commodore 64 runs a 65C816S at 20MHz in a PLCC package.

The Rapidus for the Atari 8bit also hits 20mhz.  Would be nice if all of these CPU upgrades got some software support on these platforms.


  • 2 months later...
On 1/27/2024 at 2:59 PM, SaturnGoddex said:

As far as clock speed goes, that's a bit of an interesting topic. Officially, according to WDC, all 65c816's are rated for 14 mhz. But anecdotally they can handle up to ~20 mhz consistently, especially the non dip packages. I went with the PLCC variant since it saved space, and supposedly it's also more tolerant of overclocking. I don't want to ship it as anything but the stock speed the parts are rated for, but the cpu speed will be configurable from a desk accessory for those so inclined to experiment. My main goal was getting GSOS to run as smoothly as possible while maintaining good backwards compatibility, and napkin math says that after I added in the vram acceleration, the end clock speed (past a point) doesn't matter.

That sounds really good as is. That sounds totally plausible, that GSOS is spending all its time copying to video memory. I wonder now if emulators simulate this aspect accurately. I like the idea of a real 65816, though I might pick up an applesqueezer too if that shows up on the market.  (I'm mostly commenting just so when this gets released I get notified, as I recently picked up a GS and have no acceleration at all.) 


  • 2 weeks later...
On 4/14/2024 at 1:06 AM, cathrynm said:

Will this card work with DMA, specifically the Drive Turbo card?  Or will it also have the same limitation that the AppleSqueezer has? It would be nice to have the ability to work with DMA, even if at reduced speed.

I did research into it, but adding DMA isn't possible (at least nicely) with how I'm handling addresses. It may be possible to implement in a later version, but I can't find any public specifications on how the Drive turbo actually operates at a low level and I don't have one to test on, so for now I'm not worrying about DMA compatibility. Honestly if I could tinker with it I could probably write a custom driver for the drive turbo that would hopefully perform as well as DMA (theoretically an 8mhz block move instruction would be able to saturate the 1mhz expansion bus anyways at 7 cycles a byte). I'm also considering if it would be faster and simpler to add some sort of SD interface on the accelerator itself.

TLDR I'm going to be looking into storage options more closely once the actual accelerator is up and running, but that's a project for another day. For now, it won't have any dma capability
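The "7 cycles a byte" claim above is easy to check with a quick calculation. This Python sketch assumes the figures from the post (an 8 MHz CPU, a block move costing 7 cycles per byte) plus one assumption of its own: that the 1 MHz expansion bus can move at most one byte per bus cycle. The variable names are illustrative only.

```python
# Can a CPU-driven block move keep a 1 MHz bus busy?
CPU_HZ = 8_000_000        # CPU speed quoted in the post
BUS_HZ = 1_000_000        # expansion bus speed quoted in the post
MOVE_CYCLES_PER_BYTE = 7  # block move cost per byte, as stated in the post

# Time the CPU needs to move one byte with a block move.
cpu_ns_per_byte = MOVE_CYCLES_PER_BYTE * 1e9 / CPU_HZ
# Time the bus needs per byte, assuming one byte per bus cycle.
bus_ns_per_byte = 1e9 / BUS_HZ

# If the CPU is ready sooner than the bus, the bus is the limiting factor.
cpu_keeps_up = cpu_ns_per_byte <= bus_ns_per_byte
headroom = bus_ns_per_byte / cpu_ns_per_byte

print(f"CPU: {cpu_ns_per_byte:.0f} ns/byte, bus: {bus_ns_per_byte:.0f} ns/byte")
print(f"CPU keeps up: {cpu_keeps_up}, headroom: {headroom:.2f}x")
```

Under these assumptions the CPU needs 875 ns per byte against 1000 ns for the bus, so a software copy loop would be limited by the bus rather than by the CPU.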


2 hours ago, SaturnGoddex said:

I did research into it, but adding DMA isn't possible (at least nicely) with how I'm handling addresses. It may be possible to implement in a later version, but I can't find any public specifications on how the Drive turbo actually operates at a low level and I don't have one to test on, so for now I'm not worrying about DMA compatibility. Honestly if I could tinker with it I could probably write a custom driver for the drive turbo that would hopefully perform as well as DMA (theoretically an 8mhz block move instruction would be able to saturate the 1mhz expansion bus anyways at 7 cycles a byte). I'm also considering if it would be faster and simpler to add some sort of SD interface on the accelerator itself.

TLDR I'm going to be looking into storage options more closely once the actual accelerator is up and running, but that's a project for another day. For now, it won't have any dma capability

For me, personally, an SD Card on device would be as good as Drive Turbo if we can store a few partitions on it, and can boot off of it to GS.OS or Prodos. I think the main place an accelerator is handy is GS.OS, really. I've never used HFS partitions, but I imagine some people might want that.


  • 2 weeks later...
On 4/23/2024 at 10:56 PM, cathrynm said:

For me, personally, an SD Card on device would be as good as Drive Turbo if we can store a few partitions on it, and can boot off of it to GS.OS or Prodos. I think the main place an accelerator is handy is GS.OS, really. I've never used HFS partitions, but I imagine some people might want that.

I took a closer look into it, actually it seems like I could make things even simpler by just sticking a big flash chip on the board. Prodos maxes out at a 16mb partition size, so just getting a 16 or 32 mb nor flash chip and writing up a little prodos driver would give you 1 very high speed device which could be paired with whatever secondary storage of your choice. Theoretically it should just be a matter of finding space on the board and writing that driver, then you'd have a storage device that operates at about 114% the theoretical max speed of a dma device (at least for reads) while only adding about $10 onto the end price tag. That's probably the approach I'll look into for the next version. It's a little ironic, I converted the whole accelerator to 3.3 volts to switch to use much more reasonably priced ram (hence the 8mhz cpu speed) but it did also curb-cut adding a good storage solution. Of course, I don't want to put the cart in front of the horse though :P


More fun research - I finally managed to find a good reference for making a prodos device driver. Also realized I mistakenly thought it used 256 byte blocks when it's actually 512. So the most likely storage option I'll do is 32 MB of on board flash for a gsos install. I never really intended it to be the main way of getting data on and off the machine, but it's been a long standing annoyance of mine that I can't have my floppy emu emulate a hard drive and a disk at the same time to do file transfers. In my opinion a good IIGS setup needs at least 2 drives, so I'm happy to make a relatively plain one and leave disk emulation to those who know more :) plus it would kinda be annoying to have to open the case up to get to an sd card every time you wanted to add files.
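The jump from 16 MB to 32 MB follows directly from the block size. A small Python sketch of that arithmetic, assuming the usual ProDOS limit of a 16-bit block count (at most 65,535 blocks per volume); `max_volume_bytes` is just an illustrative helper name:

```python
# ProDOS volume size as a function of block size.
MAX_BLOCKS = 0xFFFF  # 16-bit block count: at most 65,535 blocks per volume

def max_volume_bytes(block_size):
    """Largest volume addressable with MAX_BLOCKS blocks of the given size."""
    return MAX_BLOCKS * block_size

with_256 = max_volume_bytes(256)  # the mistaken assumption
with_512 = max_volume_bytes(512)  # the actual ProDOS block size

print(f"256-byte blocks: {with_256:,} bytes ({with_256 / 2**20:.1f} MiB)")
print(f"512-byte blocks: {with_512:,} bytes ({with_512 / 2**20:.1f} MiB)")
```

With 512-byte blocks the ceiling lands just under 32 MB, which is why a single 32 MB flash chip lines up with one full-size partition.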


10 hours ago, SaturnGoddex said:

More fun research - I finally managed to find a good reference for making a prodos device driver. Also realized I mistakenly thought it used 256 byte blocks when it's actually 512. So the most likely storage option I'll do is 32 MB of on board flash for a gsos install. I never really intended it to be the main way of getting data on and off the machine, but it's been a long standing annoyance of mine that I can't have my floppy emu emulate a hard drive and a disk at the same time to do file transfers. In my opinion a good IIGS setup needs at least 2 drives, so I'm happy to make a relatively plain one and leave disk emulation to those who know more :) plus it would kinda be annoying to have to open the case up to get to an sd card every time you wanted to add files.

Anything you can do to get GS.OS to load faster would be nice. Really, load times are as important as CPU speed on this platform. I'm fine with flash rom just soldered in. For me, 64MB and two 32MB partitions would be slightly more luxurious than one.  But having one fast 32MB partition would be quite nice and you can get a nice set of apps installed on 32MB.  I use Fujinet for 5.25" drive emulation myself. As long as real physical copy-protected Apple floppies still work at 1Mhz, I think it's good.


7 hours ago, cathrynm said:

Anything you can do to get GS.OS to load faster would be nice. Really, load times are as important as CPU speed on this platform. I'm fine with flash rom just soldered in. For me, 64MB and two 32MB partitions would be slightly more luxurious than one.  But having one fast 32MB partition would be quite nice and you can get a nice set of apps installed on 32MB.  I use Fujinet for 5.25" drive emulation myself. As long as real physical copy-protected Apple floppies still work at 1Mhz, I think it's good.

Keeping compatibility with real floppies is a big priority. It's a bit of a headache but I think dropping physical media is one plank too many for my ship of Theseus

