
BYTE sieve, some results



I spent a fun couple of days running the BYTE sieve benchmark on emulated computers, with SIZE=2048. The results chart is attached.

Some interesting notes:
The Model II was a very fast computer for its time.
Tandy's BASIC compiler was okay but not amazing. Microsoft's BASIC compiler did pretty darn well.
The Model 4 is a decent speed-up over the Model II. I'm a little surprised, since the Model II had some very solid engineering and fast support chips. But the difference is noticeable even in general use.
The IBM PC 5150 was very fast for what it was designed to do: business calculations using text applications.
COMAL 2.0 was a decent speed-up for a C64. Too bad it wasn't more popular!

 

Byte Sieve results.png

Sieve.bas


10 hours ago, carlsson said:

So the 20 MHz 65816 SuperCPU on the C64 performs about the same as a 19 MHz 6502 on the BBC Master. I would have thought that BBC BASIC would push the Tube co-pro even further compared to the SuperCPU.

I think there is a bottleneck in the program somewhere. I ran a theoretical "max speed" 6502 (400 MHz or so) via PiTube on my Master and the time was still just under 5 seconds.

 

I think the requirement to print each output in line 165 slows things down because it can only go as fast as the video system.

 

I'm going to try it without the per-result print, just the raw computation, and also run the original program with video output going through the new PiTube VDU driver.

 

 

 

 


Removing the requirement to print each time a computation is done makes quite a difference on the tube (I expect it would on other platforms too).

 

  • BBC Master no Tube = 22 secs
  • 3 MHz Tube 65C02 = 14.23 secs
  • 4 MHz Tube 65C102 = 10.7 secs
  • 14.7 MHz 65816 = 3.8 secs
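
For anyone who wants to reproduce the effect off the Beeb, here is a minimal Python sketch of the two variants. It assumes line 165 of the attached Sieve.bas prints each prime as it is found (the listing is not reproduced in this thread), and the names `byte_sieve` and `timed` are made up for illustration. Flag `i` stands for the odd number 2*i + 3, as in the published BYTE listing.

```python
import io
import sys
import time


def byte_sieve(size, out=None):
    """Count primes BYTE-style; optionally print each prime to `out`."""
    flags = [True] * (size + 1)
    count = 0
    for i in range(size + 1):
        if flags[i]:
            prime = i + i + 3  # flag i represents the odd number 2*i + 3
            for k in range(i + prime, size + 1, prime):
                flags[k] = False  # strike out multiples of this prime
            count += 1
            if out is not None:
                print(prime, file=out)  # the per-result output under discussion
    return count


def timed(size, out=None):
    """Return (prime count, elapsed seconds) for one run."""
    start = time.perf_counter()
    count = byte_sieve(size, out)
    return count, time.perf_counter() - start


if __name__ == "__main__":
    variants = (
        ("no print", None),
        ("print to memory", io.StringIO()),
        ("print to screen", sys.stdout),
    )
    for label, out in variants:
        count, secs = timed(2048, out)
        print(f"{label}: {count} primes in {secs:.4f} s", file=sys.stderr)
```

The computation is identical in all three runs; only the output device changes, so any difference in the reported times comes from the output path rather than the sieve itself.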

 

I'll run another set to see how they land with the print for each computation reinstated and the video running via the PiTube driver.


The BYTE benchmark was designed to exercise the system. You should NOT optimize the program by removing the print statements. That's cheating!

 

The PicoMite system's BASIC I tested doesn't need line numbers - removing them would make the program run quicker. I did NOT remove the line numbers when I posted my time.


11 hours ago, Forrest said:

The BYTE benchmark was designed to exercise the system. You should NOT optimize the program by removing the print statements. That's cheating!

Actually the BYTE benchmark was designed to test the performance of different languages (BASIC, PASCAL, COBOL, C, etc) doing the same test, not the system. At least that's what I took from the original article. (https://archive.org/details/byte-magazine-1981-09/page/n189/mode/2up). 

 

Also, before you YELL at people for optimising things :) , you should maybe check the original published benchmark listing (link above) and note the lack of PRINT after each computation is performed. 

 

The original published version doesn't have a requirement to PRINT after each computation. That looks like it was an addition to the version posted above. I understand your point though about adjusting a benchmark or "cheating" to favour the feature(s) of one platform or another, but my adjusted version was actually the same as the original, by complete fluke. 

 

I ran the OP benchmark in BBC BASIC on a 1 GHz ARM processor using the Tube interface to the BBC Master. The time was 4.95 seconds because of the video bottleneck, the same as it was with a 19 MHz 6502. That's fine as long as we accept that the way the OP benchmark is written introduces an artificial "floor".

 

There is nothing wrong with running the benchmark in the OP as long as we accept that it has some limitations that might show up on some platforms (like bottlenecking on PRINT speed). I think this only becomes a limiting factor when you have systems like the C64 + SCPU and BBC + Tube CoPro that can complete the test quicker than the artificial floor introduced by the PRINT after each calculation. I guess it comes down to what you are trying to benchmark: computational performance of the processor and language, or "the system".
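
A rough way to picture the floor is to split the run into a part that scales with CPU speed and a part that does not. The Python sketch below assumes the output cost is fixed regardless of the processor, which is a simplification, and the 20 s / 5 s split is a hypothetical baseline chosen for illustration, not a measurement from any of the machines above.

```python
def total_time(compute_s, output_s, speedup):
    """Run time when only the compute part scales with CPU speed."""
    return compute_s / speedup + output_s


if __name__ == "__main__":
    compute_s, output_s = 20.0, 5.0  # hypothetical baseline split, not measured
    for speedup in (1, 5, 50, 500):
        secs = total_time(compute_s, output_s, speedup)
        print(f"{speedup:>4}x CPU: {secs:.2f} s")
```

However large the speed-up, the total never drops below the output cost, which is the kind of plateau you would expect if the video path is the limit.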

 

There is another set of benchmarks that are quite interesting: https://en.wikipedia.org/wiki/Rugg/Feldman_benchmarks#Benchmark_8. I did some testing with these a while ago and the results were quite fascinating, especially when comparing interpreted and compiled BASICs. I think I tested on 41 platforms and combinations of different versions of BASIC. I'll dig out the results if there is any interest.


My original January 1983 BYTE is tattered, taped and tainted by 39 years of scribbled notes. 

 

Given that the original intent was to test the efficiency of various languages, it makes no sense to insist that identical instructions be employed. The assembly version for a given processor isn’t going to much resemble the same algorithm expressed in BASIC running on the same hardware, except at the most abstract level.

 

Rather, what needs to be conserved from implementation to implementation is the underlying algorithm. In turn, in a given language, it makes sense to utilize all of the tools made available by the language to accomplish the algorithm as efficiently as possible.

 

So, for example, my BASIC sieve runs more quickly if I use multiple statements per line where possible. Is that “cheating?” I don’t think so, because the underlying algorithm is preserved. However, if I omit even numbers from consideration my result is no longer comparable, because the underlying (overarching?) algorithm has been changed.
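
The same distinction can be sketched in Python: two functions that strike out exactly the same flags in the same order, one with an explicit inner loop and one using a feature the language happens to offer (extended slice assignment). The function names are hypothetical, and this is only an illustration of "same algorithm, different expression", not anyone's benchmark entry.

```python
def sieve_loop(size):
    """Explicit inner loop, close to how the BASIC listing reads."""
    flags = [True] * (size + 1)
    count = 0
    for i in range(size + 1):
        if flags[i]:
            prime = i + i + 3  # flag i represents 2*i + 3
            k = i + prime
            while k <= size:
                flags[k] = False
                k += prime
            count += 1
    return count


def sieve_slices(size):
    """Same strike-out pattern, written with slice assignment."""
    flags = [True] * (size + 1)
    count = 0
    for i in range(size + 1):
        if flags[i]:
            prime = i + i + 3
            start = i + prime
            # Number of flags at start, start + prime, ... up to size
            n = len(range(start, size + 1, prime))
            flags[start::prime] = [False] * n
            count += 1
    return count


if __name__ == "__main__":
    for size in (10, 2048, 8190):
        print(size, sieve_loop(size), sieve_slices(size))
```

Both versions touch the same flags and return the same count, so their timings compare how the language expresses the work rather than how much work is done.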

 

As a practical matter, the January 1983 BYTE article reported so many languages running on so many platforms that it did become an interesting way to compare not just languages, but computers. For example, results for three or four FORTH implementations on the Apple II are reported. There are several FORTHs available for my platform of retro-obsessive choice. Some platform-to-platform comparison becomes possible, at least for those platforms with a FORTH running the sieve algorithm. It’s sort of retro-interesting.


  • 3 months later...

I was perusing the wider world of AtariAge and found this topic.

We were doing something similar over on the TI-99 side of the universe.

It is a bit masochistic to try it in TI-BASIC. :)

 

However, the hand-coded version and the GCC version are pretty impressive, doing 10 iterations in 10 and 13 seconds respectively.

 

 

 


I think there was another thread on here dealing with sieve.
The screen scroll is going to impact results.  Z80 machines can use a memory move instruction. 
Even though the Tandy CoCo and MC-10 CPUs support 16 bits, the screen scroll is only 8 bit code.
The CoCo 3 does it in around 50 seconds.

I'll never understand why someone ALWAYS has to run these benchmarks on a modern machine.
Never fails. 

 
