Geneve low-level putpixel routine


vol

I am trying to write some assembly code that shows a picture, but I can't find a way to put a pixel where it needs to go.  There is no problem when a screen mode uses less than 16 KB, but what do I do if it uses more?  For instance, GRAPHIC4 (256x212, 16 colors) uses 26.5 KB, GRAPHIC6 (512x212, 16 colors) uses twice as much, etc.  So I need to set a graphic page number and I can't figure out how to do this. :(
I tried the following code:

       li r0,>018e      *set page 1 via VDP register 14
       li r1,>40        *set the address to 0 on this page
       limi 0
       movb r0,@>f102   *R0l - >8e, R0h - video page
       swpb r0
       movb r0,@>f102
       nop
       movb r1,@>f102   *R1l - hi addr OR >40, R1h - lo addr
       swpb r1
       movb r1,@>f102
       limi 2
       li 0,>f500       *>f5 defines 2 pixels (GRAPHIC4)
       movb 0,@>f100    *puts the 2 pixels at VRAM address >4000

It works for page 0 but not for page 1.
I am aware of the XOP @six call, but it can't handle interlaced graphics and it is rather slow.
Could somebody help me a bit?  Many thanks in advance.
EDIT. I use the MAME/MESS emulator for my project.

Edited by vol

I am out of the house at the moment, but if you can find the GEME source code and look in the SUBx files, there is all kinds of drawing routines Michael Riccio wrote. None of the code uses interlaced mode, but I think that may get you started. If you have not found it by the time I get home, I will locate the source code in a few hours. 
 

beery


@vol   

 

First, don't confuse Page# with address.  I think you meant the next 16K page of RAM.   Use page number to refer to the different possible base addresses for the screen.

 

The value in VR#14 is the next address bits A16-A14.  If you want to write to the second 16K, put a 1 there. I see you did that, so I'm kinda stumped.

 

Maybe... maybe... the order of doing this matters? But all the references mention high bits first, then low bits. 

 

 

Some additional observations:

 

2. If you are setting the page# in VR#2 (Pattern name table base), don't forget to pad it with 1s.  Bad stuff happens when you forget the 1s. (Cool stuff?)

 

In G4 and G5 it's 0pp11111 (pp = 00, 01, 10, 11)

In G6 and G7 it's 00p11111 (p = 0,1) (yep, "the position of A16 differs", see pages 38, 47 of V9938 manual)

 

3. Don't set LIMI 4 until you are done writing data to >F100.  Guard all VDP access addresses from interrupts. MDOS uses LIMI 4, not 2. 


4. To access the higher addresses, you can also write data past the 16K limit. In G4-G7, VR#14 will autoincrement when address bit 13 carries over.  (see page 7 of V9938 manual)

 

5. Change video mode before writing any data.  If you change to G6 or G7,  the 9938 uses the memory chips differently: consecutive addresses "stripe" across the two chips.  (technically, your A15 and A0 are switched.) 
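
As a small illustration of observation 2 above, here is a hedged Python sketch (not from the thread) that builds the VR#2 value for a given display page. The bit layouts are the ones quoted in that observation; the function name `vr2_for_page` and the `PAGE_COUNT` table are made up for this example.

```python
# Sketch only: VR#2 (pattern name table base) value for a display page.
# Layouts assumed from the post: G4/G5 = 0pp11111, G6/G7 = 00p11111.
PAGE_COUNT = {"G4": 4, "G5": 4, "G6": 2, "G7": 2}  # hypothetical helper table


def vr2_for_page(mode, page):
    """Return the byte to write to VR#2 so that `page` is displayed."""
    if mode not in PAGE_COUNT:
        raise ValueError("unknown mode %r" % (mode,))
    if not 0 <= page < PAGE_COUNT[mode]:
        raise ValueError("%s has no page %d" % (mode, page))
    # In both layouts the page bits start at bit 5; the low five bits
    # are the mandatory padding of 1s.
    return (page << 5) | 0x1F
```

Both layouts reduce to the same shift; what differs is how many pages exist, which is why the range check depends on the mode.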

 

  ----> 2x resolution, 1/2 color, same memory
|
|          G4       G5
|       256x212  512x212
|        4bpp     2bpp
|        26.5K    26.5K
| 2x         /---\
| color           }
|memory       <--/
|          G7       G6
|       256x212  512x212
|        8bpp     4bpp
\/       53K      53K





G4 and G5 pages memory range:

0  00000-06A00   26.5K
1  08000-0EA00   26.5K
2  10000-16A00   26.5K
3  18000-1EA00   26.5K

G6 and G7 pages memory range:

0  00000-0D400   53K
1  10000-1D400   53K

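
The page ranges follow from 212 lines times the bytes per line, with pages starting every 32K (G4/G5) or 64K (G6/G7). A minimal Python sketch of that arithmetic, not taken from the thread; the table and function names are hypothetical, and the bytes-per-line values are derived from the resolutions and bits per pixel shown in the diagram above.

```python
# Sketch only: page extents and pixel byte addresses for the bitmap modes.
BYTES_PER_LINE = {"G4": 128, "G5": 128, "G6": 256, "G7": 256}
PIXELS_PER_BYTE = {"G4": 2, "G5": 4, "G6": 2, "G7": 1}
PAGE_STRIDE = {"G4": 0x8000, "G5": 0x8000, "G6": 0x10000, "G7": 0x10000}
LINES = 212


def page_range(mode, page):
    """Return (start, end) of a page; `end` is one past the last byte."""
    start = page * PAGE_STRIDE[mode]
    return start, start + LINES * BYTES_PER_LINE[mode]


def pixel_address(mode, page, x, y):
    """Return the VRAM address of the byte that holds pixel (x, y)."""
    return (page * PAGE_STRIDE[mode]
            + y * BYTES_PER_LINE[mode]
            + x // PIXELS_PER_BYTE[mode])
```

For example, `pixel_address("G4", 1, 0, 0)` gives the first byte of page 1, which is already past the 16K boundary and therefore needs VR#14 set.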
 

 


3 hours ago, 9640News said:

Here is the GEME source code.  Look at the SUB1 file for a lot of handling routines.  The routine that draws characters reads characters first from the Text 80 pattern table that is captured, so it may contain the routine(s) you need to write pixels.

 

 

GEME.zip 65.39 kB · 2 downloads

Thank you very much.  I am going to dig into it soon.  BTW is the source code of FRACTALS v2.0 available anywhere?  It must contain very good graphics routines too.

Edited by vol

16 hours ago, FarmerPotato said:

@vol   

 

The value in VR#14 is the next address bits A16-A14.  If you want to write to the second 16K, put a 1 there. I see you did that, so I'm kinda stumped.

 

I have just converted the official MSX2 code...  The MSX2 uses the same VDP as the Geneve.  Maybe it is an emulator issue?  I prepared a small program which should plot several dots somewhere around the middle of the screen.  It plots nothing when I run it under MAME/MESS, but it may work as expected on real hardware.  It would be great if somebody could try this.

       DEF T1
T1:     li 0,0
        li 1,6  ;graphic mode 6, 256x212, 16 colors
        xop @six,0

        li 0,>35
        li 1,2    *sets R2 = $1f
        li 2,>1f  *1f - page 0
        xop @six,0

       li 0,>18e
       li 1,>40
       bl @sva
       li 0,>f5e3
       movb 0,@>f100
       swpb 0
       movb 0,@>f100  ;write 4 sequential dots

       bl @getkey

       li 0,0
       li 1,1  ;text mode 2
       xop @six,0
       blwp @0

sva:   limi 0       ;set an address in VRAM
       movb 0,@>f102   ;R0l - >8e, R0h - video page
       swpb 0
       movb 0,@>f102
       nop
       movb 1,@>f102   ;R1l - hi addr OR >40, R1h - lo addr
       swpb 1
       movb 1,@>f102
       limi 4
       b *11

getkey: li 0,4
        li 1,>ff00
        xop @five,0
        jne getkey
        b *11

five data 5
six  data 6
     END

 


1 hour ago, vol said:

I have just converted the official MSX2 code...  The MSX2 uses the same VDP as the Geneve.  Maybe it is an issue of the emulator?  I prepared a small program which must plot several dots somewhere around the screen middle.  It plots nothing when I ran it under MAME/MESS but it may work as expected on real hardware.  It would be great if somebody could try this.


Nothing plotted on the screen on a real Geneve.


15 hours ago, FarmerPotato said:

@vol

 

I looked at the most recent 9958 code that I wrote to test hardware. I had set VR#14 after setting VDPWA. Dunno if that makes any difference, it seems like no documentation mentions any order. 

IMHO this order is not important, but routines for the MSX2 always set R14 first.  Does your code work?  Can you share it?
BTW I was surprised to find that XOP @six,0 uses R6, not R4.  This means that R4 and R5 are just skipped in the call - what an oddity!  And this call is astonishingly slow.

Edited by vol

On 11/24/2022 at 6:06 PM, FarmerPotato said:

@vol

 

I looked at the most recent 9958 code that I wrote to test hardware. I had set VR#14 after setting VDPWA. Dunno if that makes any difference, it seems like no documentation mentions any order. 

 

The 9938/58 documentation is horrible, however there is documentation for this in the 9918A datasheet.  Since the 38/58 need to be software compatible, that means the hardware interface needs to have the same quirks as the 9918A.  Well, it *should have* the same quirks, anyway.

 

On 11/25/2022 at 9:27 AM, vol said:

IMHO this order is not important but routines for the MSX2 always set R14 at first. ...

 

Opinions matter little when it comes to how hardware does, or does not, work.  In the 9918A the order matters, so by extension it probably matters for the 38/58.

 

The 9918A datasheet does specify on pg. 2-1: "NOTE The CPU address is destroyed by writing to the VDP register."

 

The 9938 datasheet provides a specific sequence for accessing memory on pg. 2:

"

Accessing Memory

To access memory, follow the procedure below.

1. Switch banks (VRAM to Expansion RAM)

2. Set the address counter (A16 to A14)

3. Set the address counter (A7 to A0)

4. Set the address counter (A13 to A8), and specify read or write

5. Read or write the data

"

 

It does not explain the subtle detail behind that sequence, but there are technical reasons.  Basically, write all your VDP registers, then set the VDP address counter.  Any time a VDP register is written, the VDP address counter needs to be reset.  Break that rule at your own peril.  If you want to know why, keep reading. 
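
Before the explanation, here is a minimal Python sketch (not from the datasheet or from anyone's posted code) of steps 2 to 4 of that sequence as the four bytes sent, in order, to the control port (>F102 on the Geneve). The function name `vram_write_setup` is hypothetical, and step 1 (bank switching) is left out.

```python
# Sketch only: control-port bytes that point the VDP at a 17-bit VRAM
# address for writing, following the order quoted from the datasheet.
def vram_write_setup(address):
    """Return the four control-port bytes for a write to `address`."""
    if not 0 <= address <= 0x1FFFF:
        raise ValueError("VRAM address must fit in 17 bits")
    return [
        (address >> 14) & 0x07,          # A16..A14: data byte for VR14
        0x80 | 14,                       # >8E: "write VDP register 14"
        address & 0xFF,                  # A7..A0
        0x40 | ((address >> 8) & 0x3F),  # A13..A8 plus the write flag >40
    ]
```

For VRAM address >4000 this yields >01, >8E, >00, >40, the same four bytes that the code in the first post sends.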

 

 

NOTE: All mention of "register" and "address" below refer to the VDP (9918A and 9938), not the host CPU (9900, Z80, or whatever).

 

The VDPs have an address counter that determines where bytes are read/written in VRAM.  On the 9918A the address is 14 bits; on the 9938 it is 17 bits.  For the 9938 the extra 3 bits are held in VR14 (any register above VR7 is an expansion over the 9918A, which only has 8 VDP registers, VR0 to VR7).

 

The physical MODE pin input to the VDP determines if a data write to the VDP represents register/address data, or data to be written to VRAM.  When the MODE pin input is low, reads and writes go to VRAM with the location determined by the current VDP address counter.  When the MODE pin input is high, writes are to a VDP register, or setting the VDP address counter.

 

Note: The MODE input is where the name "port" comes from in the 9938 datasheet, and the 9938 actually has two MODE pins (ports 0, 1, 2, 3), but that is not relevant for this discussion.

 

The value of the MODE pin is tied to specific memory addresses in the host CPU address space, which determines what VDP "ports" are used (from a programmer's perspective) to read/write the VDP.  On the 99/4A, writing to a VDP register/address counter (MODE input is high) is wired to address >8C02, usually named with a macro as VDPWA or similar.  Some systems wire the VDP to the CPU's I/O ports rather than memory-mapping, but it does not matter to the VDP.

 

Hardware is expensive, and gates inside a chip cost money and die-space, so things get reused.  Writing to a VDP register takes two writes to the VDP, as does setting the 14-bit VDP address counter.

 

The first byte written to the VDP when MODE=1 (port 1) will be either the data for a VDP register, or the low 8-bits of the VDP address counter.  However, the VDP does not know what the first data byte is until it receives the next byte from the host CPU.  Thus it has to save this first byte somewhere, and that location is the low 8-bits of the VDP address counter.

 

The second byte written to the VDP when MODE=1 will specify either the VDP register to write, or the rest of the bits (high 6-bits) of the VDP address counter.  The VDP uses the two MS-bits of the second byte to decide:

 

1. 10xx xrrr : Write to the VDP register specified in the low 3-bits (9918A has 8 VDP registers, VR0 to VR7; the 9938 uses the low 6-bits for a possible 64 VDP registers).

2. 01aa aaaa: Set the VDP address counter as a "write" address (inhibits pre-fetch), using the low 6-bits as the high-part of the 14-bit VDP address counter.

3. 00aa aaaa: Set the VDP address counter as a "read" address (perform pre-fetch), using the low 6-bits as the high-part of the 14-bit VDP address counter.

 

  | VDP 14-bit      |
  | address counter |
  +-----------------+
10|XX XRRR|DDDD DDDD|
01|AA AAAA|AAAA AAAA|
00|AA AAAA|AAAA AAAA|
----------+----------
 2nd byte | 1st byte
 written  | written
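
The three cases can be modelled in a few lines. This is a hedged Python sketch of the decode described above, not a description of the actual silicon; the function name and the `reg_bits` parameter (3 for the 9918A, 6 for the 9938) are assumptions made for the example, and the `11` prefix is reported as not covered because the text above does not describe it.

```python
# Sketch only: classify a pair of bytes written with MODE=1.
def decode_control_pair(first, second, reg_bits=3):
    """Return a tuple describing what the two control writes mean."""
    top = (second >> 6) & 0x03
    if top == 0b10:
        register = second & ((1 << reg_bits) - 1)
        return ("register", register, first)
    # Low 6 bits of the second byte are the high part of the counter,
    # the first byte is the low 8 bits.
    address = ((second & 0x3F) << 8) | first
    if top == 0b01:
        return ("write_address", address)
    if top == 0b00:
        return ("read_address", address)
    return ("not_covered", None)
```

With `reg_bits=6`, the pair >01, >8E decodes as a write of >01 to register 14, which is how the page bits get set on the 9938.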

 

In all cases it is important to realize that the VDP address counter is always written.  If a VDP register is being written, the address counter's value is most likely not a VRAM location that a programmer would want to read/write VRAM data.  The difference between setting a "read" or "write" address is whether the VDP does a pre-fetch of the data at a newly set address.

 

The 9938 has 3 additional most-significant-bits for its address counter, which come from VR14 (VDP register), which are *probably* not affected (speculating here) when writing a VDP register or address.

 

To know for sure how to properly use the chip (because the datasheet does not specify the detail):

 

1. Follow the 9918A and 9938 rules, i.e. always set the VDP address counter *after* writing any VDP register, and realize that reading after setting a write address, or writing after setting a read address, has side-effects.

 

2. Write specific tests to characterize the hell out of the 38/58 to determine its "rules" for when/how the VDP address counter register is updated.

 

3. Decap the 38 or 58 to look how the VDP address counter is implemented inside the chip.

 

It *is* possible that the 9938 does not destroy the VDP address counter when writing to a VDP register; it *might* have a separate register for storing the first byte written, but that would make it incompatible with the 9918A, so I suspect it does not have the extra gates in the silicon.

 


On 11/22/2022 at 9:24 PM, 9640News said:

Nothing plotted on the screen on a real Geneve.

I have just found the explanation for this mystery.  My routine should draw a pixel, and it actually does draw it, but something else prevents us from seeing it.  That something is the VDP, which works in parallel with the CPU.  When we call the XOP to set a screen mode, we get the new screen mode and a cleared screen.  The latter is the result of a VDP command that clears VRAM.  So when my routine puts a pixel, this command is still in progress and it erases that pixel.  We need to wait until the command finishes and only then do our other tricks with the VDP.  I use the following code to wait until the VDP is free.

VDP0 equ >F100
VDP1 equ VDP0+2
VDP2 equ VDP0+4
VDP3 equ VDP0+6
wait:
    li 0,>28f   *>8f = >80 + 15
    limi 0
    movb 0,@VDP1
    swpb 0
    movb 0,@VDP1
    mov 1,0     *delay
    movb @VDP1,1
    li 0,>8f   *>8f = >80 + 15
    movb 0,@VDP1
    swpb 0
    movb 0,@VDP1
    limi 4
    andi 1,>100
    jne wait
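
The same polling loop, restated as a hedged Python sketch for readers who want to follow the logic without the register juggling. It assumes the CE (command executing) flag is bit 0 of status register 2, which is the bit the `andi 1,>100` above tests. `FakeVdp` is a made-up stand-in for the control port, not an emulator of the real chip.

```python
# Sketch only: poll status register 2 until CE (bit 0) clears.
class FakeVdp:
    """Hypothetical port model: reports CE=1 for `busy_reads` status reads."""

    def __init__(self, busy_reads):
        self.busy_reads = busy_reads
        self.selected_status = 0
        self.latch = None
        self.log = []

    def write_control(self, byte):
        self.log.append(byte)
        if self.latch is None:
            self.latch = byte            # first byte of the pair
        else:
            if byte == 0x8F:             # second byte: write VR15
                self.selected_status = self.latch & 0x0F
            self.latch = None

    def read_status(self):
        self.latch = None
        if self.selected_status != 2:
            return 0x00
        if self.busy_reads > 0:
            self.busy_reads -= 1
            return 0x01                  # CE set: command still running
        return 0x00


def wait_command_end(vdp):
    """Poll until CE clears; return the number of status reads needed."""
    polls = 0
    while True:
        vdp.write_control(0x02)          # VR15 = 2: select status register 2
        vdp.write_control(0x8F)
        status = vdp.read_status()
        vdp.write_control(0x00)          # VR15 = 0: restore the default
        vdp.write_control(0x8F)
        polls += 1
        if not status & 0x01:
            return polls
```

As in the assembly, VR15 is put back to 0 after every read so that anything else reading the status port sees status register 0 again.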

 

Edited by vol

@vol Wow. I'm glad you solved it. I wonder what other MDOS calls would run into this trap. Or if there is another XOP that would force a wait for the CE status bit. Maybe another VDP command XOP, like a trivial block copy?
 

I never relied on the video XOPs. My code was written for GPL environment first; later I just used that under MDOS. 
 


 


When mixing XOP routines and direct reads/writes, I would expect to need extra routines to catch conditions like this.  The question that comes to my mind is whether the other xop opcodes would wait for the command to complete?  Or is the issue related only to using a mix of routines.

 

I use the XOP routines and direct reads/writes for text 2 (80 column mode) without any issues. I have never mixed routines when using a graphics mode or 9938 commands; I only use either the XOP routines or direct IO in those modes.


5 hours ago, mizapf said:

... although everyone swears to have done everything according to the specs.

 

None of the 9938 datasheets that I have seen explain the internal details well enough to make a valid emulation without a lot of characterization testing.  I'm not sure if the 9938 has been decapped, I have not checked.  However, since the 9938 was an expansion of the 9918A family, and hardware is difficult, it would be a reasonable assumption that Yamaha started with the 9918A layout and added on to it, rather than starting from 100% scratch.  That means the 9938 would have inherited a lot, if not all, of the 9918A's interface quirks.

 

I'm not a fan of MAME due to what I have seen in the code, as well as the overall architecture.  Some parts are good, and I have used it to help figure some things out, but it does not capture how the hardware actually works.  In my experience, software people do not understand hardware and the result is more simulation, rather than hardware emulation.


Thanks for all the help.
It seems all my graphics routines work perfectly now.  But I am still curious about whether delays are necessary when I use the VDP data port.  IMHO a plain MOVB from a register needs at least 5 memory accesses and this provides all the required delay, right?
BTW thanks to @9640News I have briefly examined the GEME graphics routines.  I found that it mostly doesn't set VRAM addresses directly; instead it uses the VDP commands and the VDP coordinate system.  I also checked the XHI sources (thanks to @Torrax) - they contain a procedure to set an address in VRAM, and my procedure is similar to it.

 

On 11/27/2022 at 9:43 PM, mizapf said:

You're always welcome to join with your expertise and improve the emulation. 🙂

IMHO MAME/MESS has a really very good emulation of the Geneve but this emu has several irritating "features":
1) it can't release/grab the mouse - you need to use your host OS features to switch between tasks;
2) it can't show interlaced graphics properly.

These are not issues of the Geneve emulation but general MAME/MESS shortcomings.

Edited by vol

5 hours ago, vol said:

IMHO MAME/MESS has a really very good emulation of the Geneve but this emu has several irritating "features":
1) it can't release/grab the mouse - you need to use your host OS features to switch between tasks;
2) it can't show interlaced graphics properly.

These are not issues of the Geneve emulation but general MAME/MESS shortcomings.

1) Yes. The only thing I can recommend is that if you know in advance that you don't need the mouse, just leave out the colorbus switch. This is a shortcoming that may be due to the cross-platform nature of MAME (host platforms, i.e. Windows, Linux, macOS) and the general scope of emulations, but also because the people who work on the user interface may have saved effort in the wrong place. I could well imagine that you should be able to click in the window to capture the mouse, and hit a "release key" to get it back. But I don't know enough details of the UI architecture.

 

2) Yes, interlaced is not implemented in the 9938 emulation; I'm not the maintainer of the video chip emulations. Anyway, I somehow associate interlace more with headaches than with doubled resolution.


6 hours ago, vol said:

... But I am still curious about the necessity of delays when I use the VDP data port.  ...

 

Overrunning the VDP has been discussed at length in the forums; just search a little and I'm sure you will find answers.

 

6 hours ago, vol said:

... IMHO a plain MOVB from a register needs at least 5 memory accesses and this provides all the required delay, right? ...

 

Memory access timing is not subject to opinion; the information is in the datasheets, and the actual performance has been tested.  On the 99/4A it has been shown that the CPU cannot overrun the VDP on a stock system.  The 9995 has the same memory cycle timing as the 9900 (~333ns), and the 9938 looks to give slightly more margin (#CSR / #CSW min pulse width) than the 9918A.

 

The best thing you can do is write a routine to hammer the VDP and see what happens, then you will know for sure.

 

6 hours ago, vol said:

2) it can't show interlaced graphics properly.

 

Emulated interlace, or *actual* interlace?  Video is not as simple as people think it is, especially when trying to support all the legacy timings of hundreds of systems.  The analog CRTs were very flexible, unlike the all-digital displays we have today.  Interlace was a hack, no one wanted it BITD, and it complicates everything.  I suspect it will not happen.


3 hours ago, matthew180 said:

Emulated interlace, or *actual* interlace?  Video is not as simple as people think it is, especially when trying to support all the legacy timings of hundreds of systems.  The analog CRTs were very flexible, unlike the all-digital displays we have today.  Interlace was a hack, no one wanted it BITD, and it complicates everything.  I suspect it will not happen.

Several good emulators for computers that are capable of displaying interlaced graphics (I know about the MSX2, Amiga, Commodore +4) show these graphics flawlessly.  PAL systems can even get more colors using the PAL inversion trick.
