
Math FPU



On 5/28/2017 at 2:55 AM, Matej said:

The beauty and simplicity of the (excellent) Apple II architecture.

 

Like no other, on this class, for sure.

 


On 3/9/2021 at 10:12 PM, Mazzspeed said:

The NTSC Plus4 is a bit unique in that it runs its processor at 2.2 MHz when the screen is blanked, making it one of the fastest 6502-compatible machines released.

 

Retrospectively, the Apple II/c, with its compact, portable and elegant design, reaches 4.0 MHz on a 6502-compatible CPU, and up to 1.1+ MB of RAM on the CPU bus.

 

It was also the very last of its kind, and its generation. Talk about ending an era, with some class and dignity... ;-) 

 

I guess that is why they command the price$$$ they do...


6 hours ago, Faicuai said:

 

Retrospectively, the Apple II/c, with its compact, portable and elegant design, reaches 4.0 MHz on a 6502-compatible CPU, and up to 1.1+ MB of RAM on the CPU bus.

 

It was also the very last of its kind, and its generation. Talk about ending an era, with some class and dignity... ;-) 

 

I guess that is why they command the price$$$ they do...

Which also came with its share of issues...

 

As far as I'm aware, most Apple users opted to run the machine in 1 MHz mode for compatibility reasons.

 

Having said that, I don't think the device sold well, especially outside the US, as I've never actually seen one. IMO the IIGS was the better machine, one that I'm sure Steve Jobs would have preferred never saw the light of day, in favor of his (non-preemptively multitasking) Macintosh.

 

I owned a PowerBook 170 back in the day, but I never really used it. TBH, Apple products have never really inspired me in the slightest. However, I was close to pulling the trigger on a nice IIGS once; that's probably the only Apple product I really like.

 

When it comes to accelerating 8-bit machines, I think the Ultimate 64, with its 6502 implemented in FPGA running at up to 48 MHz, highlights that the future is in FPGA - not second-hand or surplus 6502 variants that introduce compatibility issues. People class the Amiga as the true successor to the A8 line, and the fact is that an FPU on an Amiga is basically never used; it's essentially a complete waste of PCB space.

Edited by Mazzspeed

25 minutes ago, Mazzspeed said:

the fact is that an FPU on an Amiga is basically never used, it's essentially a complete waste of PCB space.

Well, there is a reason it may be a "waste of PCB space" on the Amiga, and not on (say) the PC and all of its iterations.

 

The Apple II/c is the very LAST link in the Apple II line (8-bit). It sold little because it was already at the very end of the product life cycle. By that time, 16-bit computing had already taken off in full force. I am even surprised they sold as many as they did, and at how MUCH they command on eBay... they run circles around almost all 8-bits out there.

 

I fully agree with FPGA-based re-implementations of the 6502... but 48 MHz is just a stop-gap. Only when we get to 200-400 MHz (while hooked up to the local bus) will we go places... and pretty far, I'm sure.

 

 

Edited by Faicuai

4 minutes ago, Faicuai said:

I fully agree with FPGA-based re-implementations of the 6502... but 48 MHz is just a stop-gap. Only when we get to 200-400 MHz (while hooked up to the local bus) will we go places... and pretty far, I'm sure.

How though - the ANTIC & GTIA work in lockstep with the 1.79 MHz signal on a pixel-by-pixel basis. They will not run at higher speeds. If we only get the higher clock speeds during vblank, why stop at 200 MHz - go for 2 GHz to make up for the 90% of the time we're still locked to 1.79 MHz.
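
To put rough numbers on that, here is a minimal sketch of the time-weighted average clock. It assumes the 90% locked figure quoted above, that throughput scales linearly with clock speed, and that switching between clocks costs nothing; the function name and defaults are only illustrative.

```python
def effective_mhz(fast_mhz, locked_fraction=0.9, base_mhz=1.79):
    """Time-weighted average clock when only part of each frame runs fast.

    locked_fraction is the share of wall-clock time spent at base_mhz
    (the 90% figure from the post); the remainder runs at fast_mhz.
    """
    if not 0.0 <= locked_fraction <= 1.0:
        raise ValueError("locked_fraction must be between 0 and 1")
    return locked_fraction * base_mhz + (1.0 - locked_fraction) * fast_mhz


if __name__ == "__main__":
    # Compare a few candidate turbo clocks (in MHz).
    for fast in (48, 200, 2000):
        print(f"{fast:>5} MHz turbo -> ~{effective_mhz(fast):.1f} MHz average")
```

Under those assumptions a 200 MHz turbo clock averages out to roughly 21.6 MHz, and 2 GHz to roughly 201.6 MHz, which is the point: the locked 90% dominates unless the fast clock is very fast indeed.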


2 minutes ago, Faicuai said:

Well, there is a reason it may be a "waste of PCB space" on the Amiga, and not on (say) the PC and all of its iterations.

Both machines were designed for different markets.

 

I remember back in the day, the IBM PC was commanding stupid prices few home users could initially afford, and it didn't play games well - at all. The Amigas/STs and other 16/32-bit machines were the ones most of the younger generation lusted after.

 

Then the clones started slowly trickling into the market, prices began to drop and Windows 95 was released. Once Gabe Newell (who worked for MS at the time) started pushing for gaming on the PC platform after the release of DOOM, and DirectX was released, that was the beginning of the end for Motorola 68k machines.

 

For better or worse...

 

 


5 minutes ago, Stephen said:

How though - the ANTIC & GTIA work in lockstep with the 1.79 MHz signal on a pixel-by-pixel basis. They will not run at higher speeds. If we only get the higher clock speeds during vblank, why stop at 200 MHz - go for 2 GHz to make up for the 90% of the time we're still locked to 1.79 MHz.

Well, either that or...

 

...an ENTIRE CPU-board re-implemented on FPGA (we already have Sophia), and we're missing the CPU and ANTIC... drop-in replacement on the 800...

 

8-)

Edited by Faicuai
(missing part)

Just now, Faicuai said:

Well, either that or...

 

...an ENTIRE CPU-board re-implemented on FPGA (we already have Sophia), and we're missing the CPU and ANTIC...

 

8-)

Now you're talking. Imagine the possibilities if we had the full complement of chips working at crazy high speeds. We've seen what ANTIC can pump out (60fps video) when being driven by a high-speed DMA source. Unlike just running an emulator (or Eclaire XL) in turbo mode, we'd get the increased resolution and colour depth as well.


Just now, Faicuai said:

Well, either that or...

 

...an ENTIRE CPU-board re-implemented on FPGA (we already have Sophia), and we're missing the CPU and ANTIC...

 

8-)

I think that's what they do with the Ultimate 64, which is good as it's not emulation as every logic gate is still perfectly recreated in hardware using FPGA.

 

However, as stated, it's effectively useless, as software designed to run at ~1 MHz is off the charts at 48 MHz. Realistically, you run 4 MHz tops, and even then it's too fast.


Just now, Stephen said:

Now you're talking. Imagine the possibilities if we had the full complement of chips working at crazy high speeds. We've seen what ANTIC can pump out (60fps video) when being driven by a high-speed DMA source. Unlike just running an emulator (or Eclaire XL) in turbo mode, we'd get the increased resolution and colour depth as well.

As I said... then we'll go places... ;-)

 

 


6 minutes ago, Faicuai said:

As I said... then we'll go places... ;-)

 

 

You have to be careful, as at what point do you look back and say "It's no longer an A8"...

 

Case in point: the Vampire range of accelerators for the Amiga. Effectively the Amiga is just a keyboard and IO; even the custom chipset is ignored in favor of the logic built into the Vampire. Hence the reason there's a standalone Vampire.


19 minutes ago, Mazzspeed said:

You have to be careful, as at what point do you look back and say "It's no longer an A8"...

 

Case in point: the Vampire range of accelerators for the Amiga. Effectively the Amiga is just a keyboard and IO; even the custom chipset is ignored in favor of the logic built into the Vampire. Hence the reason there's a standalone Vampire.

Everything has its use and purpose. Gotta love that Vampire, though! 

 

Running a compact, practical GUI, even on Antic's existing high-res mode, but with plenty of memory, fast GFX updates, and connectivity (and code-wise done with 6502-compatible code, and using the host's I/O resources) is not bad at all... I am a fan of productivity.

 

Kiddie-gaming is not the only gaming in town... except an arcade, buttery-smooth port of Galaga with tons of color and shooting a gazillion sprites flying around... 8-))

Edited by Faicuai

On 5/28/2017 at 8:55 AM, Matej said:

The Number Cruncher is a copied version with all the bugs ironed out
http://www.asic.cc/NumberCruncher.html


 

who knows whether that design could be adapted for the Atari?

 



1 hour ago, Mazzspeed said:
2 hours ago, _The Doctor__ said:

Give ANTIC its own memory and you won't need to halt the CPU... the architecture was heading that way... why do you think there are different ANTIC access modes regarding memory?

Now this would be more beneficial than an FPU.

This is also another reason to love VBXE. It can be used with the ANTIC turned completely off. Full-time 1.79 MHz to the CPU and a stock display if required, or a much better than stock one. What's not to like?


Need to build an interface for that Apple 2 circuit. 

 

http://retro64.altervista.org/blog/assembly-math-6502-8-bit-fast-multiply-routine-16-bit-multiply-without-bit-shifting/

Another workaround for fast integer multiply and divide is to use table-driven routines, as seen on a Commodore 64 website. If your program still has space remaining, they should not be difficult to use. Programmers even have fast ways to do trigonometric functions.
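
For anyone unfamiliar with the idea, one well-known table technique is quarter-square multiplication, which replaces a multiply with two table lookups and a subtraction. The sketch below is in Python purely to show the arithmetic; I have not confirmed that the linked article uses exactly this method, and the names `QUARTER_SQUARES` and `multiply_u8` are my own.

```python
# Quarter-square identity: a*b = floor((a+b)^2 / 4) - floor((a-b)^2 / 4).
# (a+b) and (a-b) always share parity, so the two floors drop the same
# fractional part and the difference is exact.

# One entry per possible sum of two 8-bit values: n = 0..510.
QUARTER_SQUARES = [(n * n) // 4 for n in range(511)]


def multiply_u8(a, b):
    """Multiply two unsigned 8-bit values using only lookups and a subtract."""
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("operands must fit in 8 bits")
    return QUARTER_SQUARES[a + b] - QUARTER_SQUARES[abs(a - b)]


if __name__ == "__main__":
    print(multiply_u8(12, 34))
```

On a real 6502 the table would typically be split into low-byte and high-byte pages so each lookup is a plain indexed load; the cost is the memory the tables occupy, which is the "if your program still has space" caveat above.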

Edited by CuloMajia

55 minutes ago, Stephen said:

This is also another reason to love VBXE. It can be used with the ANTIC turned completely off. Full-time 1.79 MHz to the CPU and a stock display if required, or a much better than stock one. What's not to like?

VBXE is actually really impressive. Apart from hardware 80-column support, even the titles/demos that have been coded to make use of the additional features VBXE provides look amazing.

 

If you could just get people to code for it...


12 hours ago, Mazzspeed said:

I think that's what they do with the Ultimate 64, which is good as it's not emulation as every logic gate is still perfectly recreated in hardware using FPGA.

The results of the logic gates are represented in the FPGA as truth tables. It isn't the exact circuit replication. Far from it. Just the functionality as seen on paper.
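
A small sketch of what "truth tables instead of gates" means in practice. This is generic lookup-table (LUT) logic modelled in Python, not the Ultimate 64's actual design; the full adder and the helper names `build_lut` and `eval_lut` are just my illustration.

```python
def build_lut(func, n_inputs):
    """Pack func's truth table into an int, one bit per input combination."""
    lut = 0
    for index in range(1 << n_inputs):
        # Input k is bit k of the table index.
        bits = [(index >> k) & 1 for k in range(n_inputs)]
        if func(*bits):
            lut |= 1 << index
    return lut


def eval_lut(lut, *bits):
    """Look up the output for the given inputs; no gates are evaluated."""
    index = sum(bit << k for k, bit in enumerate(bits))
    return (lut >> index) & 1


# Gate-level description of a full adder (what the original chip "is").
def full_adder_sum(a, b, cin):
    return a ^ b ^ cin


def full_adder_carry(a, b, cin):
    return (a & b) | (cin & (a ^ b))


# Table form of the same functions (closer to what an FPGA stores).
SUM_LUT = build_lut(full_adder_sum, 3)
CARRY_LUT = build_lut(full_adder_carry, 3)

if __name__ == "__main__":
    print(f"sum LUT = {SUM_LUT:#04x}, carry LUT = {CARRY_LUT:#04x}")
```

The outputs match the gate version for every input, but the internal structure (gate count, propagation paths, analog quirks) is gone, which is the distinction being drawn here.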

 

12 hours ago, Mazzspeed said:

However, as stated, it's effectively useless, as software designed to run at ~1 MHz is off the charts at 48 MHz. Realistically, you run 4 MHz tops, and even then it's too fast.

When I want speed on a classic system I turn to emulation. I can have a 1GHz 6502 in Applewin on an i9. 500MHz on an older i5/i7. A beautiful thing to render 3D spirographs, lissajous spirals, bessels, and ripples in realtime with Applesoft BASIC. Things we only imagined as kids.

 

12 hours ago, Mazzspeed said:

You have to be careful, as at what point do you look back and say "It's no longer an A8"...

Absolutely right. This incongruity arises at many points on many systems. While I'm all for speed and such, too much simply changes your original rig into something too far removed from what it was. I like classic systems for what they were.

 

12 hours ago, Mazzspeed said:

Case in point: the Vampire range of accelerators for the Amiga. Effectively the Amiga is just a keyboard and IO; even the custom chipset is ignored in favor of the logic built into the Vampire. Hence the reason there's a standalone Vampire.

This state of affairs is common with accelerators. Even in the Apple II. The accel cards of the day took over and ignored the main memory and cpu. They just slowed down to interface with the expansion cards and other existing peripherals. It really isn't the original console doing any work anymore.


1 hour ago, Keatah said:

The results of the logic gates are represented in the FPGA as truth tables. It isn't the exact circuit replication. Far from it. Just the functionality as seen on paper.

The data sheets for any hardware logic gate will provide truth tables, they're the very basic functionality of the logic gate in question.

 

However, from the perspective of retro computing, FPGA is a wonderful thing. Not only because specialized chips used in the manufacture of our beloved machines are no longer available and FPGA provides us with a means to recreate such often fragile chips, but because FPGA allows us to recreate the chips in question and run them faster than they were ever intended to run.

 

1 hour ago, Keatah said:

This state of affairs is common with accelerators. Even in the Apple II. The accel cards of the day took over and ignored the main memory and cpu. They just slowed down to interface with the expansion cards and other existing peripherals. It really isn't the original console doing any work anymore.

The difference is that the Amiga was designed from the outset to have separate fast RAM dedicated to the processor and chip RAM for the custom chipset, so the CPU doesn't have to halt during DMA - a totally different arrangement to the fairly simple IIe. The point being: you still used the custom chipset for all graphics and sound duties as well as IO.

 

With a Vampire fitted, the custom chipset can be bypassed entirely, as the Vampire has its own HDMI output. So in certain cases regarding the OCS/ECS chipset the Vampire isn't slowing down for the peripherals at all, it's completely replacing them! It's at this point that the original machine has effectively become a keyboard and a casing, and one has to wonder if it's really the actual Amiga anymore or just a re-implementation compatible with AmigaOS. It's the same with the Turbo Chameleon for the C64: you can actually run the cart as a full C64 independent of the host machine.

 

I don't really class myself as a purist, but once you reach that point, I think you've negated what you were really trying to achieve in the first place - And that is to improve the existing design, not to replace it entirely.

 

I guess it's a slippery slope...

Edited by Mazzspeed

Yes I can see that.

 

These days I'm not really a purist or datecode chaser. I'm interested in capturing the experience like I had back in the day and bringing it forward - sometimes with reliability enhancements and perhaps in altogether new formfactors via emulation on SFF PC. Like the childhood dream of an All-In-One machine.

 

The TurboChameleon cart would have interested me back in the day. But not so much today. BITD my buddies and I would've been looking for ways to practically use it to bring the platform into the modern era. Today I just use modern hardware and enjoy my Apple II (and other vintage material) for what it is.

 

On FPGA. Yes I believe they will be necessary at some point and integral to keeping the old machines in working order. Especially for replacement parts. For daily use I still lean toward software emulation and all the side goodies it offers.


  • 2 weeks later...
On 3/9/2021 at 7:12 PM, Mazzspeed said:

The SFD1001 was similar to the Plus4 as it used the cartridge port to facilitate the connection, however I don't believe the Plus4 used the same IEEE-488 parallel protocol the SFD1001 used. The NTSC Plus4 is a bit unique in that it runs its processor at 2.2 MHz when the screen is blanked, making it one of the fastest 6502-compatible machines released. Using a bus expander you could use the IEEE-488 interface cartridge as well as any other cartridge on the C64, provided address spaces didn't conflict.

 

However, as stated, the 1541 achieves transfer speeds of a little under 6000 bytes per second using modern S-JiffyDOS drive ROMs, which is actually slightly faster than the SFD1001. However, the SFD1001 had the advantage of capacity, being able to store up to 1MB of data.

 

Personally, I think it would be easier and more beneficial (and compatible) to implement an FPGA 6502 able to run at far higher speeds than to add any off-the-shelf or recycled FPU, wouldn't it? The Ultimate 64 uses this method and can run at speeds of up to 48 MHz, which is silly fast for any practical use, but more compatible than even aftermarket (and rare) accelerators using 6502 variants, and likely more beneficial than an FPU, which would require dedicated code in order to be utilized correctly.

 

Once again, please no 'us vs them'. I'm just mentioning the other product as there are some interesting implementations that as far as I'm aware have never been attempted on the A8 line.

 

Thanks for that indepth post. Good stuff.

 

I guess a secondary 650x could also work but probably wouldn't outperform an actual FPU except for the fact that modern CPU variants and FPGAs can be a lot faster based upon clock speed. You seem to be a knowledgeable Commodore user so you probably would know which existing titles already have the ability to off-load certain processes to the 650x in the 1541 and other Commodore disk drives.

 

Using multiple 650x CPUs would have the advantage of using compatible code. And it would be similar to Nintendo's approach with the SNES using faster [cartridge-based] custom 65816s as co-processors. But I think that's already been mentioned and possibly by myself.  :) 

 


On 3/12/2021 at 2:34 PM, Spancho said:

The Number Cruncher is a copied version with all the bugs ironed out
http://www.asic.cc/NumberCruncher.html


 

who knows whether that design could be adapted for the Atari?

 


 

That's pretty insane - and cool - that there were/are boards to add Motorola 68881/2 FPUs to the Apple II line. It could've been done for A8 as well had the 1090 been released and caught on. Then again, Apple did have better success at selling the A2 line to businesses and the education market than Atari did with the A8. The 68881/2 didn't really catch on with the majority of Atari ST/Amiga owners since the FPUs were so damn expensive at the time, most of the entry level STs and Amigas didn't have sockets for them on their motherboards, and there was a lack of games that supported them. By the time games on the PC debuted that took advantage of FPUs, the ST and the Amiga were basically on their last commercial legs.


  • 3 months later...
On 3/23/2021 at 6:22 PM, Lynxpro said:

 

That's pretty insane - and cool - that there were/are boards to add Motorola 68881/2 FPUs to the Apple II line. It could've been done for A8 as well had the 1090 been released and caught on. Then again, Apple did have better success at selling the A2 line to businesses and the education market than Atari did with the A8. The 68881/2 didn't really catch on with the majority of Atari ST/Amiga owners since the FPUs were so damn expensive at the time, most of the entry level STs and Amigas didn't have sockets for them on their motherboards, and there was a lack of games that supported them. By the time games on the PC debuted that took advantage of FPUs, the ST and the Amiga were basically on their last commercial legs.

Just in case...
http://www.geekdot.com/numbercruncher-reloaded/


  • 1 year later...

I've wondered what role the Parallax Propeller 2 could play here. It has been shown to be able to emulate the 6502 at up to 14 MHz (board side). That is a bit slow for a modern microcontroller, but for a good reason. It has 1 meg of 8-channel concurrent DMA RAM. If you put the 6502 and Antic cores on this, you wouldn't need a halt line, but you'd use the /RDY line in its place, since you wouldn't need exclusive access due to hub arbitration, but you'd probably want to emulate CPU gating. That would be the simplest way to emulate bus-mastering DMA on a concurrent DMA setup. You'd only need to use hardware means to prevent software races (like overwriting the display list buffer in mid-use) since the hardware races are already covered.

 

I mention this here because one could use one of the other P2 cogs as a space to implement a math chip. Just use the P2 features: hardware RNG, 32/32/64 multiplication, 64/32/32 division with modulus, some of the transcendental functions, etc. It also has the CORDIC solver. That is much slower (58+ P2 cycles) than the other math, which takes 2 P2 cycles, but that said, if you map that into 1.79 MHz time, that could still be done in 1 Atari cycle, depending on other overhead.
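
For anyone who hasn't met CORDIC, here is a rough sketch of the rotation-mode algorithm for sine and cosine. It is a generic floating-point illustration in Python, not the P2's actual solver; `ITERATIONS` and `cordic_sin_cos` are names I made up. In hardware the multiplications by 2^-i are plain bit shifts, which is why it suits a chip without a big multiplier.

```python
import math

ITERATIONS = 24  # assumed precision; roughly one extra bit per iteration

# Precomputed rotation angles atan(2^-i); in hardware this is a small ROM.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector slightly; start with the
# inverse of the accumulated gain so the result comes out at unit length.
K = 1.0
for i in range(ITERATIONS):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_sin_cos(theta):
    """Return (sin, cos) of theta in radians, for |theta| up to about 1.74."""
    x, y, z = K, 0.0, theta
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0.0 else -1.0  # rotate toward the remaining angle
        shift = 2.0 ** -i              # a right shift by i in fixed point
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * ANGLES[i]
    return y, x


if __name__ == "__main__":
    s, c = cordic_sin_cos(0.5)
    print(s, c, math.sin(0.5), math.cos(0.5))
```

Angles outside that range would need to be folded into it first (quadrant reduction), which is left out here to keep the sketch short.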

 

I had various ideas on how to use the P2 to emulate or accelerate an A800. A more elegant approach might be to emulate just the A800 system board and keep the rest of the machine original. And yeah, emulating RAM/ROM would be fine too. The old motherboard and the power brick would thank you too since the P2 would take less power than several RAM cards, Sally, ANTIC, and GTIA.

 

An idea I had was that, if emulating the original 6502, one could implement the most used illegal opcodes and mod the P2 6502 core to add new instructions like Mult, Div, RND, SIN, etc. Then the question that arises is "How could the system use them?" You could get some modest gains by rewriting the ROM (and BASIC) to add in the custom opcodes where applicable. Others in the Parallax forum said it would likely be better to just emulate the '816. And that is a thought. I'm not sure how one would use COP, for instance, but that seems like a way to pull in FPU support.

 

Another acceleration idea when using a microcontroller could be to find a way to rewrite the ROM to use the native instructions of the microcontroller and for interrupts and syscalls, use the native instructions (maybe pause the CPU if necessary for timing issues and race prevention) but still include the original ROM in case it is used in non-standard ways by software. So any LD/ST in that region would use the 6502 core and the original ROM, while interrupts and calls would use the incompatible core to run routines optimized to the native microcontroller code. And really, I've considered that for BASIC too. Just write a P2 Atari BASIC, use native opcodes for math and floating point, and keep the same memory map.

 

But if making a custom P2 math coprocessor core, I wouldn't know what would be the best way for it to communicate with the CPU core. Obviously, using a P2 would mean you have the hub memory, plus cog and LUT RAM. So I don't know how one might use the COP instruction (if emulating the '816), the custom instruction that has Bill Mensch's initials (WDM? Apparently meant to control a coprocessor of some sort), or where to put a place or means to pass operands to a custom FPU.

Edited by Spotted Lady