
SYNERTEC 6516 - Pseudo 16bit CPU for A400/800?





I found an article by Randall Hyde in the April 1980 issue of "MICRO - The 6502 Journal" about a pseudo 16-bit upgrade chip to the 6502 called the 6516. The article says that this chip was designed for Atari Inc. for the Atari 400/800 computers, but never made it. The article is attached as scans.



Another article by the same author sheds a little light on the history of the 6502 chips and the ill-fated 6516.














An interesting read :)

I'd never heard of that before, but I do like the idea of (some of) the extensions, though the design seems to have acquired some additional fluff. Just having 16-bit registers with the existing instruction set, plus a single 2-byte opcode to switch between 8-bit and 16-bit register sizes (with memory accesses using the selected register size), would have been more than enough to give it a very much needed kick up the arse.


There's something about the simplicity of the 6502 that I love, and the 65816 doesn't have that attraction for me. The 6516, with all the other instructions, feels like it's trying to be something like an 8088. But it would have been a fun chip at the time anyway :)


If they had gotten them into the XL line of computers, they would have had a serious leg up on Commodore. Looking at some of those instructions, someone could write more efficient code and get more done within the VBI and DLI time. There is a single instruction to push or pull all the registers to or from the stack, which would be very useful for DLIs and would let someone change all the color registers on every line or write a powerful player/missile multiplexer.
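For a rough sense of what the push/pull-all instruction would replace, here is a small Python tally (not 6502 code) of the usual register save and restore sequence in a stock 6502 interrupt handler. The per-instruction cycle counts are the documented NMOS 6502 values; the 114 cycles per scan line is the nominal Atari figure before any DMA is taken out, and the 6516's own timing is not known, so it is not modeled here.

```python
# Documented NMOS 6502 cycle counts for the instructions involved.
CYCLES = {"PHA": 3, "PLA": 4, "TXA": 2, "TYA": 2, "TAX": 2, "TAY": 2}

# Typical 6502 handler prologue and epilogue to preserve A, X and Y.
SAVE = ["PHA", "TXA", "PHA", "TYA", "PHA"]
RESTORE = ["PLA", "TAY", "PLA", "TAX", "PLA"]

# Nominal CPU cycles per scan line on the Atari 8-bit, before DMA losses
# (an assumption for illustration; the real budget is smaller).
CYCLES_PER_SCANLINE = 114


def sequence_cycles(sequence):
    """Sum the cycle cost of a list of mnemonics."""
    return sum(CYCLES[op] for op in sequence)


overhead = sequence_cycles(SAVE) + sequence_cycles(RESTORE)
share = overhead / CYCLES_PER_SCANLINE

if __name__ == "__main__":
    print(f"save: {sequence_cycles(SAVE)} cycles")
    print(f"restore: {sequence_cycles(RESTORE)} cycles")
    print(f"total: {overhead} cycles ({share:.0%} of a nominal scan line)")
```

Whatever the single-instruction versions would have cost, anything shaved off that overhead goes straight back into the time available for register changes on each line.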


Well, internally Atari had in fact been working since 1981 on its own 16-bit upgrade to the 6502. The Atari 6502 was codenamed "Sally". The 16-bit CPU was codenamed "Lynda". Here is an email from the former head of Atari's ASG (Advanced Semiconductor Group):


"When I joined Atari in 1981 one of my first assignments was to work on a true 16-bit upgrade, software compatible version of the 6502, as one did not exist. The chip manufactured by Synertek and Rockwell at the time was not acceptable as it would not allow older Atari games to play with the new system, i.e. it was not upward compatible with the older 6502. This then gave rise to my development work on the "Lynda" chip, which was an upward compatible version of the older 6502 family and had true 16-bit features, not "pseudo" 16-bit. I hired Mr. Fox, again as a consultant to this project as he was familiar with the older architecture. During this time frame, i.e. 1982 I was promoted to Vice President and General manager of Atari's Semiconductor Group and subsequently turned the "Lynda" project over the Atari's Home Computer group. As you can imagine, the NIH factor entered into the equation and I never heard about the project again. When I left Atari in July 1984, it was my understanding from Mr. Tramiel that his intention was to sell all of the previous chip development projects which were not completed to outside sources. If this is indeed what happened then perhaps Mensche bought the rights to "Lynda"-who knows. It sounds as though the 65816 was another design, perhaps Synertek's, licensed to Atari, as it was my understanding that "Lynda" was never finished."






That's the CPU I mentioned in a couple of topics.

If you look around you will find some comments from the author of that article. I'm not sure if it was on Usenet or somewhere else that he posted, but he was active in the Apple II community if I remember right.


The company contacted the author and denied ever saying anything about the CPU.

I'm not sure if their contract as a 2nd source for the 6502 prevented them from releasing it or if they were just testing the waters before attempting to build it.


Since it was never released and may never have even reached the design phase, you can design opcodes to match the instruction/feature list yourself if you want to emulate it. It's pretty much a 65816 without support for additional memory, so you could try to be compatible with the 65816 instructions where possible.


I don't see any value in emulating it; 65816 emulation would be much more practical given that such an upgrade has been done.


On the other hand, if the CPU, or at least some enhanced 6502, were to be done in a PGA which could replace the 6502 on a real machine...




Has there been no work on using the 65816 as a drop-in chip (same clock etc.), just using an adapter to make the thing pin compatible?


James... As far as I recall from reading Commodore sites, one of the 6502 designers, Bill Mensch, inked a deal with Apple to co-design a 16-bit 6502 (the 65816) as used in the IIgs. Apparently Apple didn't want to do any of the development/designing and only wanted the processor.


Additionally, I do recall seeing on some Atari-related site (one that linked to Mathy van Nisselroy's site) that there exists a version of the atari800 emulator (I think it was the DOS version) that emulates the 65816 as well as the 6502. It also had a link to a 65816-compatible OS (I can't remember if it was the original less buggy 'Turbo OS' or the Drac 816 Polish OS).




Altirra has the option to emulate the 65816.






I don't think there are any patches to utilize the processor with the standard OS, and I don't think Altirra allows you to select alternate OSes like a800win does.

Edited by carmel_andrews

Well, let's see... 6516 vs 65816.

The 6516 added pretty much all of what the 65C02 did and several features over the 65C02.

Pretty much everything the 65816 has minus the extended address bus features.

Direct Page register (Z)

The D register.

16 bit mode for X, Y and SP (A becomes D in 16 bit mode)

User defined flag and supporting instructions

The ability to directly push/pull all registers to/from the stack.

Some additional addressing modes.

You could reassemble 6502 code with minor changes to take advantage of any of those features.

Taking advantage of the 65816's additional features beyond the 6516 doesn't look so simple to me but I'm no expert on the 65816 by any means.



Most of the additions to the 6516 are borrowed from the 6809 but it falls a little short of the 6809.

B register is missing.

No multiply instruction.

No 2nd stack pointer.

6516 has to toggle between 8 and 16 bit modes where the 6809 has 8 and 16 bit instructions.

6809 uses a single instruction to PUSH/PULL multiple registers to/from the stack.


While the article says the missing items are no big deal, that's not quite so in reality.

MPY is definitely faster and as you do larger number multiplies where you may do several smaller multiplies to get the result it really adds up.

A later MICRO article said it was faster to transfer some math operations to a 6809 daughter card for the Apple II than to do them natively.

It can also make code smaller.
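To put some numbers on the "several smaller multiplies" point, here is a rough Python sketch (not 6502 or 6809 code) of a 16x16 multiply built from 8x8 partial products. `mul8` is a hypothetical stand-in for a hardware 8x8 multiply such as the 6809's MUL; the function names are made up for illustration.

```python
def mul8(a: int, b: int) -> int:
    """Stand-in for a hardware 8x8 -> 16-bit unsigned multiply."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    return a * b


def mul16(x: int, y: int) -> int:
    """16x16 -> 32-bit unsigned product from four 8x8 partial products."""
    xl, xh = x & 0xFF, (x >> 8) & 0xFF
    yl, yh = y & 0xFF, (y >> 8) & 0xFF
    return (
        mul8(xl, yl)            # low  * low,  weight 2^0
        + (mul8(xl, yh) << 8)   # low  * high, weight 2^8
        + (mul8(xh, yl) << 8)   # high * low,  weight 2^8
        + (mul8(xh, yh) << 16)  # high * high, weight 2^16
    )


if __name__ == "__main__":
    print(hex(mul16(0x1234, 0xABCD)))
```

A 16-bit product needs four of the small multiplies, and a 32-bit one done the same way needs sixteen, so the per-multiply saving is paid back many times over as the operands get wider.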

The 2nd stack pointer on the 6809 is regularly used as another index register and simulating additional stack pointers is slower.

Toggling modes clearly requires extra instructions, and the code is slower as a result.


While it may not have equaled the 6809, I think the 16 bit registers and direct page register alone would have allowed programmers to write much faster code, certainly eliminating much of the 6809's speed advantage.

The article's author suggests 30% faster code than a 6502 and I'd have to say at least that much.

Just as important, you can write smaller code allowing you to fit more in memory.

The gap between 6809 and 6516 would certainly have been much smaller than between 6502 and 6809.

The 6516 would probably have been cheaper as well.

I don't think the difference between 6516 and 65816 would have been more than a clock cycle here or there. The two chips appear very similar if you stay within 64K.


I ported some code from the Z80 (A simple music player for the AY sound chips) to the 6803, 6809 and 6502.

The 6803 code was much smaller than the 6502 even though the '03 only has one index register.

The 16 bit index register cut out a lot of the code the 6502 required and I think that is where the biggest gain would come.

The 6809 code was obviously the smallest and fastest.


Ultimately, there are times when having a B register is an advantage, times when having a Y register is an advantage, and even times when having the A and B registers combine into the D register is an advantage... but having 16 bit registers almost always offers an advantage.

Even the 65C02's added features made the 6502 code a few instructions smaller and faster.

16 bit index registers eliminate the need for a lot of page zero use, freeing it for other things.

The 6516's larger stack and 16 bit registers would also support high level languages like C better.

I think supporting the added memory of the 65816 from C would be a little more complex, but if you don't, you have something similar to a 6809 C compiler, just with added 8/16 bit mode instructions and no B register.



If Atari could have used the 6516 when the 8 bit computer was introduced it would have had a HUGE advantage in processing power over competitors at the time.

Programmers could have easily ported code to it and optimized crucial sections of code for more speed.

They could have also ported code from it, working backwards from the faster code to straight 6502 code while running both on the same machine.

Larger programs could have fit on carts without paging.

Faster built in math and as a result, faster Atari BASIC.

You name it, the Atari would have benefited.


Now, does the 6516 offer any advantage over the 65816? Given the comments from the IIgs community on 65816 compatibility I'd say no. Most IIgs compatibility issues were due to a change in the drive controller, not the 65816. Adding an old controller and drives let you run almost all the old software. Only a handful didn't run due to the 65816 (that was the word from the IIgs community anyway).


If you are implementing the 6516 in an FPGA you could possibly throw in some optimizations that would make it worthwhile. Maybe cache the current memory page and direct page to cut cycle times when cache is enabled. Otherwise the 65816 pretty much does what the 6516 does and more.

I think I would examine the instructions and address modes very carefully and only support those on the 65816. That way the 65816 would still be in the migration path.


  • 9 years later...

Just came back from VCF East (2019), and Joe Decuir referenced this processor in his keynote. He said they (Atari) bid out the work to create a 16-bit version of the 6502, but decided it created too much risk for the release schedule. This was sent out for bid in 1977 (!), not later, so at least at some point during the design phase of the A8 they were thinking 16-bit.


I plan to email him and ask a bit about this. He referred to it as the 6509, though he admitted he had had less than 3 hours of sleep after the flight over to NJ the night (morning) before.


How would this 6516 compare with the released 65802?




The 65802 just removes the 24 bit bus features vs the 65816.

You still have 16 bit register modes, new instructions, new address modes, and 16 bit stack pointer.


After looking over the 6516 opcodes a bit more, it appears as if it doesn't require 8/16 bit mode switching for all 16 bit additions like the 65802/65816 does.

With mixed 8/16 bit code, this could speed up some code, and it might make it easier to convert code from 8 bit to 16 bit.

It would depend on what you are doing though.


The single instructions to push/pull all registers would make the 6516 smaller/faster for some interpreters, compiled languages, as well as for some interrupt handlers.


The biggest difference is the faster instructions on the 6516.

I think the 65816/65802 keep the original instruction timing, where the 6516 was to speed up several opcodes.

The speedup is going to be program specific at times (some will benefit more than others), but overall *at least* 20% on the same code *should* have been possible.

The faster instructions look similar to the HD6303 vs the MC6803, so the part was probably going to be fully microcoded, and pipelined.


The thing the 6516 seems to lack that the 65802/65816 have are the memory move instructions.

That can make code smaller, and faster.

The code isn't always faster, as the memory move opcodes take 7 clock cycles per byte.

A partially unrolled loop of 16 bit moves using a fixed address range takes around 5 clock cycles per byte if I remember right.

Had they even shaved off 2 clock cycles, it would always be better to use the new opcodes.

The HD6309 memory move instructions only require 3 clock cycles per byte, so there was a lot of room for improvement.

Had the 6516 included memory move instructions, it probably would have required 3 clock cycles per byte. That may have made it into a final part.
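As a back-of-the-envelope comparison of those figures, here is a small Python calculation. The cycles-per-byte values are the ones quoted above (the unrolled-loop number is from memory), setup and loop overhead are ignored, and the 8 KB block size and roughly 1.79 MHz clock are assumptions picked purely for illustration.

```python
# Cycles per byte as quoted in the post (unrolled-loop figure is approximate).
CYCLES_PER_BYTE = {
    "65816 block move": 7,
    "unrolled 16-bit loop": 5,
    "HD6309 block move": 3,
}


def copy_cycles(num_bytes: int, cycles_per_byte: int) -> int:
    """Total cycles to move num_bytes, ignoring setup and loop overhead."""
    return num_bytes * cycles_per_byte


def copy_time_ms(num_bytes: int, cycles_per_byte: int, clock_hz: float) -> float:
    """Copy time in milliseconds at the given clock rate."""
    return copy_cycles(num_bytes, cycles_per_byte) / clock_hz * 1000.0


if __name__ == "__main__":
    block = 8 * 1024    # hypothetical 8 KB copy
    clock = 1_790_000   # assumed ~1.79 MHz clock, for illustration only
    for name, cpb in CYCLES_PER_BYTE.items():
        print(f"{name}: {copy_cycles(block, cpb)} cycles, "
              f"{copy_time_ms(block, cpb, clock):.1f} ms")
```

At 7 cycles per byte the block move opcodes lose to the unrolled loop on raw speed and only win on code size, which is why shaving 2 cycles would have changed the picture.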


I now strongly disagree with his statement about the 6516 multiply executing fast enough that the hardware multiply shouldn't make that big of a difference.

For a 16 bit number, that may be somewhat true, but the author didn't seem to have the benefit of using the hardware multiply for floating point math, where the number of bytes involved is greater.

After converting the multiply on the MC-10 to use the hardware multiply, it makes a huge difference for the floating point math.

Multiply, SIN(), COS(), etc... are all significantly faster.

If you run a 3D plot (which uses multiply, SIN, and COS) it adds up to over a 40% speedup.

For a 3D plot, the MC-10 with the optimized BASIC running at 0.89 MHz isn't much slower than the CoCo 3 running at 1.77 MHz with the standard BASIC.

Ahl's Benchmark is around 43% faster.

It just depends on the program.

FWIW, the HD6303 offers a 30% speedup for the hardware multiply, which would make the difference even more significant.

An MC-10 with a 6303 and the faster BASIC should be able to do the 3D plot in almost the same time as the CoCo 3 running at double speed.

FWIW, Rockwell had a multiply opcode on a few of its 6502 based microcontrollers. Too bad they can't just drop into a 6502 socket.


Edited by JamesD

Something completely different... I've never managed to catch a 65802, despite looking on eBay for years. Does anybody have a hint for a source?

Someone mentioned having a few on here, and they might sell them.

But I never saw them actually say they were selling them.

Maybe they sold them to people that commented via PM.


Other than that, the last time I saw them for sale was the 90s, and I've had that on an eBay search notification for as long as I can remember.

You could pick one up by purchasing something that has one, but you'd be taking it out of something that is even rarer than the 65802.




Has anybody tried this 65C816 to 65C802 converter yet?

You mean this?



The 65802 was supposedly a 65816 adapted to the 6502 pinout internally, though that info was 2nd hand if it was originally from Western Design.

If you want to put it in an Atari, you are better off getting the 4 MB board (Antonia?) as it doesn't cost much more.

This would make sense if you want to drop it in a different machine. I'm considering one for my IIc Plus as I want to make the MHz upgrade mod.



Some comments on the listed instructions...

LDS - saving and restoring the stack pointer directly to/from memory could make multitasking easier. I don't see an STS instruction though.

The LHA, LHX, etc... instructions have the potential to make code a little faster than a 16 bit LDA. You could also access data aligned on 256 byte blocks without changing the offset in the block, not that I can think of a use for that off the top of my head.

LAX, LAY, SAY would speed up indexing some data?

ADD, SUB lets you ditch a clear carry instruction. It's a small improvement, but nice to have. Motorola has the opposite problem with 16 bit registers, no add with carry. I think the add with carry requiring you to clear the carry is the better option since dealing with large numbers requires multiple 8 bit adds which are slower than a single clear carry.
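To illustrate why the carry chain matters for large numbers, here is a minimal Python model (binary mode only, no decimal mode, not real 6502 code) of a 16-bit add done as two 8-bit add-with-carry steps. The function names are made up for the example.

```python
def adc(a: int, b: int, carry_in: int):
    """8-bit add with carry-in; returns (result, carry_out)."""
    total = (a & 0xFF) + (b & 0xFF) + (carry_in & 1)
    return total & 0xFF, total >> 8


def add16(x: int, y: int):
    """16-bit add as two chained 8-bit adds; returns (result, carry_out)."""
    # The low byte starts with carry clear: CLC + ADC on a 6502,
    # or a single plain ADD where one exists.
    lo, carry = adc(x & 0xFF, y & 0xFF, 0)
    # The high byte must consume the carry from the low byte.
    hi, carry = adc(x >> 8, y >> 8, carry)
    return (hi << 8) | lo, carry


if __name__ == "__main__":
    print(add16(0x12FF, 0x0001))
```

Only the first step can use a plain ADD. Every later byte needs the add-with-carry form, which is why having only ADC (plus a clear carry) is the less painful of the two omissions.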

AXA, AYA lets you save some clock cycles when you need to add the same value repeatedly in a loop and X or Y aren't needed.

AAY, AAX lets you load up a step value in the accumulator and just add it each pass in a loop. The similar ABX instruction on Motorola gets used a lot on the 6803.

AMX, AMY, if your registers are all used, this might save you some additional register shuffling/manipulation.

NEG not sure how often you'd need it, but it's faster & smaller than the 6502 equivalent operation.

The shifts should help with code size & speed in places.

TZA (TAZ?) This is borrowed from the 6809. Opcodes to set page zero's high byte are a big deal. You can leave the interpreter/OS's page zero alone. Individual tasks can have their own page 0, and even individual functions that would benefit from a lot of direct page space will benefit.
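A small Python sketch of what a page register buys you, assuming the register simply supplies the high byte of every "page zero" address (my reading of the description; the article's exact behaviour may differ). The task names and page values are hypothetical.

```python
def direct_page_address(page_register: int, offset: int) -> int:
    """Effective address of a direct-page access: page register is the high byte."""
    return ((page_register & 0xFF) << 8) | (offset & 0xFF)


# Hypothetical per-task direct pages, so the OS's real page zero is left alone.
TASK_PAGES = {"os": 0x00, "task_a": 0x40, "task_b": 0x41}

if __name__ == "__main__":
    for task, page in TASK_PAGES.items():
        # The same one-byte operand ($80) lands in a different page per task.
        print(f"{task}: ${direct_page_address(page, 0x80):04X}")
```

Switching tasks then means reloading one register instead of copying page zero contents in and out.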

XHA, XHX, XHY let you switch between big and little endian numbers easily. Great for dealing with files exchanged between systems, but taking 3 opcodes for that?
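For reference, the operation being described is just a swap of the two halves of a 16-bit value. A quick Python sketch, assuming that is all the XHx opcodes do (the helper name is made up):

```python
def swap_bytes16(value: int) -> int:
    """Exchange the high and low bytes of a 16-bit value."""
    value &= 0xFFFF
    return ((value & 0xFF) << 8) | (value >> 8)


if __name__ == "__main__":
    word = 0x1234
    print(hex(swap_bytes16(word)))
    # Same result as reinterpreting the big-endian bytes as little-endian.
    print(hex(int.from_bytes(word.to_bytes(2, "big"), "little")))
```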

The user definable flag should be faster than using page 0 to hold a flag, and saves you having to find space on page zero.

If you have 16 bit registers, it only makes sense to have 16 bit stack instructions to deal with them.

BR1, BR2, BR3 seem like an attempt to recreate a feature of the Z80. You could place OS function pointers at those addresses, or replace calls that are used a lot to shrink some code. This may be more useful in a microcontroller with very little ROM space. They should take the same number of clocks as a standard JSR, but they wouldn't be as fast as a BSR would have been (BSR being a relative branch to a subroutine with a byte offset). It could have been BSR0, BSR1, BSR2, where the number supplies the 8th and 9th bits of the offset, and the following byte is the other 8 bits including sign. This gives the BSR a ± range of almost 512 bytes. It also implies a BSR3 since you are dealing with 2 bits, which gives you almost a 1K jump range for subroutines, while only using a single post byte for the address info. Just a thought... kinda pointless unless you design a new part.
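Here is one possible reading of that BSR0-BSR3 idea as a Python sketch. It is entirely hypothetical, since no such opcodes exist: it treats the opcode number as the top two bits of a 10-bit two's complement offset and the post byte as the low eight bits, which is only one way to interpret the description.

```python
def bsr_offset(opcode_number: int, post_byte: int) -> int:
    """Decode a signed 10-bit offset from a 2-bit opcode number and a post byte."""
    raw = ((opcode_number & 0x3) << 8) | (post_byte & 0xFF)
    # Bit 9 is the sign bit of the 10-bit two's complement value.
    return raw - 0x400 if raw & 0x200 else raw


def bsr_target(next_pc: int, opcode_number: int, post_byte: int) -> int:
    """Subroutine address relative to the address after the 2-byte instruction."""
    return (next_pc + bsr_offset(opcode_number, post_byte)) & 0xFFFF


if __name__ == "__main__":
    for op, byte in [(0, 0x10), (1, 0xFF), (2, 0x00), (3, 0xFF)]:
        print(f"BSR{op} ${byte:02X} -> offset {bsr_offset(op, byte)}")
```

Under this reading, BSR0/BSR1 reach forward and BSR2/BSR3 reach backward, for a total span of 1K around the call site using only two bytes per call.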
