
jzintv, macho mode, and PAL


(I swear there was a relevant thread around here somewhere, but oh well. Forum's been quiet since the contest anyway.)

 

I'm noticing that jzintv doesn't seem to work very well when combining PAL mode with any remotely high macho setting. In NTSC, I can set macho=100 and get a pretty decent/accurate speedup. Much higher and things get... wonky, but fortunately 100x speed is adequate for most things. In PAL mode, macho definitely speeds things up, but only by a factor of maybe 5-10x at most, no matter how high I set it.

 

Is there something weird about how the clocking is implemented that would trigger this? And yes, I have a valid use case here, I'm not just benchmarking ridiculous emulator states. :P

 

 

On the bright side, I can confirm that IntyBASIC's NTSC/PAL check is working perfectly. I just can't fully test out specific things in PAL.


Hrm... This old Win7 laptop really, really wheezes along. It's got a Core2 Duo P8700, which PassMark claims gets a single-thread rating of 999. My Linux box has a Phenom II X4 965, whose single-thread rating is only 1189, but it rebuilds jzIntv in just seconds. Sure, the Linux box has 4 cores to the Windows box's 2 cores, but that's not nearly enough to explain the difference. (For the record, I have no idea how good PassMark is; it was literally just the first CPU benchmark I could find that compared these two archived online.)

 

Seriously... It takes 40 seconds to rebuild from scratch on my Linux box. In contrast, it's taken over a half hour to rebuild from scratch on my Windows box, and it's still going. That just doesn't seem right. I'd understand a factor of 2 or 3. But 50x? And both CPUs are pegged on the Windows box, so it's not like I'm waiting for the hard drive or something. (Memory isn't the issue either; I'm sitting at 1.2GB of 4GB used during the compile.)

 

I remember Windows being slow... but not this slow.

 

Anyway, once this is done compiling, I'll upload it. I may get to it in the morning; I need to head to bed soon.

Edited by intvnut

Hmm... Nothing about the Mac version there... :ponder:

 

;)


Well, that's because I'm doing everything from my Mac now while in LTO's temporary west coast headquarters, and didn't happen to time its build because it was built first. Turns out that that's 45 seconds. :-) (2.3GHz Core i7; single-thread CPU Mark of 1634 on that same PassMark benchmark.)

 

 

It's been updated alongside the other two.

 

 

 

Edited by intvnut

Figured out what was up with the laptop. I had to unplug it and plug it back in. It was stuck in some low power state. Instead of taking over 50 minutes to build, it now took 66 seconds. That sounds a bit more reasonable. Sheesh.

 

Anyway, in case you missed it up-thread, the latest builds are on my webserver. A nice Pi Day release.

 


My hero :D Can't wait to check these out!

 

Edit: you've changed your linking. I'm sure I can figure this out, but I noticed that a lot of files changed size dramatically.

./jzintv: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by ./jzintv)
./jzintv: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by ./jzintv)
./jzintv: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./jzintv)

Figured it out. It needs GCC 5.1.

 

Sometimes I envy the dead Windows world. At least you don't have to keep updating every toolchain just to use stuff. My kingdom for statically linked 2MB binaries! :lol:

 

Kvetching aside, I can happily report that PAL is fixed across the board. Works properly in IntyBASIC and jzintv in macho mode. AND I've been lucky enough to do some extensive testing and can report that near as I can tell, macho mode at ludicrous speed does in fact keep proper time, at least on the order of 0.003% or so. I think we can safely ignore any deviation that small (and that may well be an artifact in my coding). I've benchmarked it over some pretty extended periods, possibly more than any sane person would ever have.

 

Now to figure out why a) certain parts of my game run slower in general, and b) everything overall runs slower in PAL, but not at the 16% rate that I'd expect. It's more like 9-10% slower. I'm a bit confused, but this is the first time I've ever had to deal with this, so it ought to be interesting.


I generally try to link everything statically, but IIRC there are some things that don't link statically properly in Linux these days (such as the resolver library, which nothing in jzIntv should really need), and I had some issues with missing static libraries under Ubuntu. I've started moving new code to C++ (in particular C++14 as I become more comfortable with it), which means newer tools, and bigger executables.

 

BTW, on PAL the CPU is actually faster (1MHz instead of 895kHz), so if you have some code sections that are longer than 16ms on NTSC, they're very likely still faster than 20ms on PAL. So, the same WAIT statement might chew 2 frames on NTSC and 1 frame on PAL. It doesn't take many of those to give the sort of result you're seeing.
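To put rough numbers on that, here's a minimal sketch in Python. It uses the approximate figures from this post (895kHz at about 60 frames/s for NTSC, 1MHz at 50 frames/s for PAL), and it assumes every CPU cycle in a frame is available to your code, which isn't quite true on real hardware. The names and the 16,000-cycle section are made up for illustration.

```python
import math

# Approximate figures from the post; real hardware differs slightly, and
# this ignores cycles lost to interrupts or the video hardware.
NTSC = {"cpu_hz": 895_000, "frame_hz": 60}
PAL = {"cpu_hz": 1_000_000, "frame_hz": 50}


def cycles_per_frame(system):
    """CPU cycles that elapse during one video frame."""
    return system["cpu_hz"] / system["frame_hz"]


def frames_needed(cycles, system):
    """Whole frames consumed by a code section that ends in a WAIT."""
    return max(1, math.ceil(cycles / cycles_per_frame(system)))


def wall_time_ms(cycles, system):
    """Real time for that section, rounded up to whole frames."""
    return frames_needed(cycles, system) * 1000 / system["frame_hz"]


if __name__ == "__main__":
    section = 16_000  # hypothetical cycle count for one game-loop pass
    for name, system in (("NTSC", NTSC), ("PAL", PAL)):
        print(name, frames_needed(section, system), "frame(s),",
              round(wall_time_ms(section, system), 1), "ms")
```

With those assumptions, a 16,000-cycle section overruns one NTSC frame (about 14,900 cycles) and costs two frames, but fits inside one PAL frame (20,000 cycles). Sections like that actually run faster in real time on PAL, which pulls the overall slowdown below the naive 16%.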

 

You might consider playing with jzIntv's debugger (and its support for source-level debugging IntyBASIC programs) to cycle count stretches of your program.

Edited by intvnut

Ah, happy to know that there's an explanation. I already know that I need to start cycle-counting a few spots, and this is more evidence for it. Optimization time, hooray!


If your game is so tight that you need to do low-level optimizations, then perhaps IntyBASIC is not the proper platform for it... :ponder:

 

 

 

The as1600 assembler says hi! ;)

 

 


Post anything you need optimising and we can see if there is a better way to do the same thing in IntyBASIC first.


Here's a quick and dirty tutorial on how to use jzIntv's debugger for simple cycle counting.

 

First, an example program:

.

' Example of simple cycle counting w/ jzIntv's debugger

        CLS

        ' Goal:  Cycle count the FOR loop below. Disables/enables interrupts
        ' to avoid including the interrupt handler in the cycle count.

        ASM cycle_start: DIS
        FOR I = 1 to 100
            X = X + 42
            Y = Y - 17
        NEXT I
        ASM cycle_end: EIS

here:   GOTO here

.

To compile this, follow the usual procedure, adding one more flag (-s) to the as1600 line:

  • intybasic cycle.bas cycle.asm
  • as1600 -o cycle.bin -s cycle.sym cycle.asm

This creates an additional file, cycle.sym, that has the symbols for the executable.

 

Now start up jzintv with the debugger, and ask it to load the symbol file:

 

  • jzintv -d --sym-file=cycle.sym cycle.bin
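If you end up repeating those three steps a lot, they can be wrapped in a small script. This is only a sketch: it assumes intybasic, as1600, and jzintv are all on your PATH, it uses exactly the flags shown above and nothing else, and the function names are made up for this example.

```python
import subprocess
from pathlib import Path


def build_commands(source):
    """Return the three command lines from this post for a given .bas file."""
    src = Path(source)
    asm = str(src.with_suffix(".asm"))
    rom = str(src.with_suffix(".bin"))
    sym = str(src.with_suffix(".sym"))
    return [
        ["intybasic", str(src), asm],            # BASIC -> assembly
        ["as1600", "-o", rom, "-s", sym, asm],   # assemble, also write symbols
        ["jzintv", "-d", f"--sym-file={sym}", rom],  # run under the debugger
    ]


def run_all(source):
    """Run each step in order, stopping at the first one that fails."""
    for cmd in build_commands(source):
        subprocess.run(cmd, check=True)
```

Calling run_all("cycle.bas") would compile, assemble, and then drop you into the debugger, provided the tools are installed where the script can find them.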

 

Once inside the debugger, we'll set a couple of breakpoints, one on cycle_start and one on cycle_end. These two symbols, added by the ASM statements in the BASIC source, bracket the region we want to measure.

.

> r
Starting jzIntv...
Hit breakpoint at $5095
cycle_start:
 0200 0000 01BC 8007 02F0 5091 02F0 5095 ----I-i-  DIS                  127546
> r
Hit breakpoint at $50AF
cycle_end:
 0065 0000 01BC 8007 02F0 5091 02F0 50AF -C----i-  EIS                  138466
> q

.

 

The two numbers at the far right, 127546 and 138466, are the current Intellivision cycle numbers on reaching each of those statements. The difference between them, 138466 - 127546 = 10920 cycles, is how long the code took.
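If you take a lot of these measurements, the subtraction is easy to script. Here's a rough Python sketch that pulls the cycle count off the end of the two status lines from the session above. It assumes the count is always the last whitespace-separated field, and it uses the approximate 895kHz NTSC clock to turn cycles into time; the helper name is made up.

```python
NTSC_CPU_HZ = 895_000  # approximate NTSC CPU clock; an assumption for the time estimate


def cycle_count(status_line):
    """Return the cycle counter, taken as the last field of a status line."""
    return int(status_line.split()[-1])


# The two status lines copied from the debugger session above.
start_line = " 0200 0000 01BC 8007 02F0 5091 02F0 5095 ----I-i-  DIS                  127546"
end_line = " 0065 0000 01BC 8007 02F0 5091 02F0 50AF -C----i-  EIS                  138466"

elapsed = cycle_count(end_line) - cycle_count(start_line)
per_iteration = elapsed / 100          # the FOR loop runs 100 times
elapsed_ms = elapsed * 1000 / NTSC_CPU_HZ

print(elapsed, "cycles total")
print(per_iteration, "cycles per loop iteration (including loop overhead)")
print(round(elapsed_ms, 1), "ms at the assumed NTSC clock")
```

Dividing by the iteration count gives a rough per-pass cost for the loop body, which is usually the number you want when deciding what to optimize.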

 

That's all you need to do for simple cycle count measurements. In the example above, I disable interrupts, to avoid them messing with the measurement. It may make your game look odd in the short run, but this code only needs to be there to take measurements.

 

You don't even need to set breakpoints, necessarily. The 'f' command means "run forward until you get to this symbol." So, I could have done this instead:

 

.

> f cycle_start
Starting jzIntv...
Fast forwarded to $5095
cycle_start:
 0200 0000 01BC 8007 02F0 5091 02F0 5095 ----I-i-  DIS                  127546
> f cycle_end
Fast forwarded to $50AF
cycle_end:
 0065 0000 01BC 8007 02F0 5091 02F0 50AF -C----i-  EIS                  138466
> 

.

That can be useful if the stretch of code you're trying to measure gets hit many times, but you only want to measure it once in the middle of your game. Use the 'r' command to start jzIntv, and then 'break out' to the debugger with F4 or Cmd-C. Then issue the 'f cycle_start' and 'f cycle_end' commands to measure.
