
Luma Enhancement Module Development


ClausB


It occurred to me that we don't need an 8x clock, just 4x. Here's why: Our highest resolution mode sends out 8 pixels per Phi2 cycle. Instead of clocking them out of a shift register 8 times per cycle, we can multiplex them out with 8 states per cycle. We can get 8 states by decoding 3 bits - the 1x, 2x, and 4x clocks. So we just need two clock doublers.

 

This is exactly how GTIA does its hi-res mode (ANTIC mode F). Those pixels come out at 7.2 MHz but GTIA only has a 3.6 MHz clock. It uses half a cycle to display one pixel and the other half to display the next pixel.
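To make the 3-bit decode concrete, here is a rough Python sketch. It assumes ideal square-wave clocks that all go high at the start of the Phi2 cycle and a nominal 560 ns cycle; the function names are made up for illustration and the real phase relationships would depend on how the doublers are built.

```python
def clock_level(t_ns, period_ns):
    """Ideal square wave: high during the first half of each period."""
    return 1 if (t_ns % period_ns) < period_ns / 2 else 0


def mux_state(t_ns, phi2_period_ns=560.0):
    """Decode the 1x, 2x and 4x clock levels into a mux select value 0..7."""
    c1 = clock_level(t_ns, phi2_period_ns)
    c2 = clock_level(t_ns, phi2_period_ns / 2)
    c4 = clock_level(t_ns, phi2_period_ns / 4)
    # All three clocks start high, so invert each bit to count upward.
    return ((1 - c1) << 2) | ((1 - c2) << 1) | (1 - c4)


def states_in_cycle(phi2_period_ns=560.0):
    """Sample the decoded state in the middle of each of the 8 pixel slots."""
    slot = phi2_period_ns / 8
    return [mux_state((i + 0.5) * slot, phi2_period_ns) for i in range(8)]


print(states_in_cycle())
```

Each of the eight 70 ns slots gets its own select value, so one pixel can be routed out per slot without anything running faster than the 4x clock.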


So we just need two clock doublers.

A very simple clock doubler is merely a delay line and an XOR gate. You delay the input clock by 1/4 cycle and XOR both signals to get twice the frequency and 50% duty. Put two of those in series and you also get the 4x clock we need. So the first doubler needs 140 ns delay and the second needs 70 ns. I have found a triple 70 ns delay line for $9 from DDD. A bit pricey but we likely won't be mass producing!
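A quick way to convince yourself of the delay-plus-XOR trick is to simulate it. This is only a sketch in Python with 1 ns time steps, an ideal delay line, and a nominal 560 ns Phi2 period; the helper names are my own and nothing here models gate delays or delay tolerance.

```python
def square_wave(period, n_samples):
    """Ideal 50% duty clock sampled once per time step, starting high."""
    return [1 if (t % period) < period // 2 else 0 for t in range(n_samples)]


def delay(signal, steps):
    """Delay by `steps` samples; the circular shift treats the signal as periodic."""
    return signal[-steps:] + signal[:-steps]


def double(signal, period):
    """XOR the clock with a copy delayed by a quarter of its period."""
    delayed = delay(signal, period // 4)
    return [a ^ b for a, b in zip(signal, delayed)]


def count_rising_edges(signal):
    """Count 0 -> 1 transitions, wrapping around the end of the buffer."""
    previous = signal[-1:] + signal[:-1]
    return sum(1 for p, c in zip(previous, signal) if p == 0 and c == 1)


PHI2_NS = 560                          # nominal Phi2 period, assumed
clk1 = square_wave(PHI2_NS, 4 * PHI2_NS)
clk2 = double(clk1, PHI2_NS)           # 140 ns delay
clk4 = double(clk2, PHI2_NS // 2)      # 70 ns delay

for name, clk in (("1x", clk1), ("2x", clk2), ("4x", clk4)):
    print(name, count_rising_edges(clk), sum(clk) / len(clk))
```

Over four Phi2 cycles the edge count doubles at each stage and the duty stays at one half, which holds only as long as each delay is exactly a quarter of its input period.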


Yep - we could do that. Which IC were you looking at? They seem to have the ability to make the delays anything we want - that can't be a low volume project! Can it? If 70ns is a standard (low volume part) value, that should work. The duty cycle won't quite be 50% but that won't matter so much.

 

 

Bob

 

 

 

 


This is the email quote I got yesterday:

 

MOQ is 10 pieces

3D7323Z-70 $8.68 each 1 week to ship

MDU3C-70 $11.55 each 4-6 weeks

 

This is the part:

http://www.datadelay.com/datasheets/3d7323.pdf

 

The delay tolerance is 2%. The ideal delays are 69.8 ns for NTSC and 70.5 ns for PAL. They differ by less than the tolerance.
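For anyone who wants to check those numbers, here is the arithmetic as a small Python sketch. The Phi2 frequencies are the commonly cited values for NTSC and PAL machines and are my assumption, not something taken from the datasheet or the quote above.

```python
# Commonly cited Phi2 frequencies in Hz (assumed, not from the datasheet).
PHI2_HZ = {"NTSC": 1_789_773, "PAL": 1_773_447}


def ideal_delay_ns(phi2_hz, slots=8):
    """One eighth of the Phi2 period, in nanoseconds."""
    return 1e9 / phi2_hz / slots


def within_tolerance(ideal_ns, nominal_ns=70.0, tolerance=0.02):
    """True if the ideal delay falls inside the part's tolerance band."""
    return abs(ideal_ns - nominal_ns) <= nominal_ns * tolerance


for system, hz in PHI2_HZ.items():
    d = ideal_delay_ns(hz)
    print(system, round(d, 1), within_tolerance(d))
```

A 2% band on a 70 ns part is plus or minus 1.4 ns, so one part value should cover both systems.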


Can't you do this in the CPLD?

Certainly the XORs will be in the CPLD. What about the delays?

 

I researched a bit on the Web and saw some things about the Xilinx Digital Clock Manager core and about Digital Locked Loops, but I could not find enough details to see if such a thing would fit into our smallish CPLD. Do you have details to share?


You might be right. I'm old-school so I'm trying to design hardware, not software-on-a-chip. That's fine for large, complex designs like VBXE, but I don't think LEM needs it. I'll try to keep an open mind, though.

 

"I can change, if I have to." - Red Green


So, they make these custom at 70ns? wow....

 

Instead of spending $100 on delay lines, how about I just tweak a clock into the circuit and simulate 70ns? I would hate to want a 35ns delay down the road.

 

We're going to do at least three iterations of the boards, I expect. Maybe more.

 

Bob

 

 

 


I was thinking of a gated oscillator setup, actually. Phi2 would gate a series of clock pulses that would load registers from SRAM, or whatever. It would have to be manually adjusted on the prototypes, while the finished boards could use delay lines that wouldn't need adjusting ('tweaking').

 

Bob

 

 

 

how about I just tweak a clock into the circuit and simulate 70ns?

Not sure what you mean. How will you sync it to Phi2?


So every 4 pixels would be a bit distorted, but within a controllable range.

It may be a good idea to use a higher frequency than necessary and then scale it down with a clock divider.

It might reduce pixel skew in those 4-pixel chunks. If the falling edge of Phi2 activated the clocking circuit, it would stay in phase with Phi2 at all times. Even if not, starting from a higher frequency would give a smaller skew.


If I make the initial delay and the data-to-data delay variable, I can adjust the pixels for best fit, can't I?

 

I'm not sure... it still isn't entirely clear what the sequence is for the process.

 

* Phi2 clock falls, indicating the start of a new cycle.

* S4 falls, indicating a $8000-$9FFF data access. (It had better be ANTIC, because that's our only clue.)

* After an adjustable delay (perhaps 0), SRAM is accessed for the first data byte/bits.

* SRAM data is latched into the CPLD data reg.

* Data is clocked out of the register at an adjustable clock rate (after an adjustable delay?). **When does this happen? Do we need two sets of data regs?**

 

Is that about right?

 

Would it be worthwhile to have a line counter and start/stop without requiring DLIs? We have the vertical and horizontal sync pulses in the LUMA input. Maybe implement two-line modes?

 

Bob

 

 

 


Couldn't we somehow automate the enable/disable process?

 

Just reserve an address which, if accessed, will enable the LEM, another for disable.

 

Since we're talking custom Display Lists anyway, we could have something like a dummy graphics line before the real display.

 

e.g.

2 x 8 Blank

1 x 7 Blank

LMS $BE00 Mode D - tell the LEM to enable itself. (read to page $BE00 will return zeros, any access to $BE00 enables LEM mode)

LMS $9C40 Mode 2

23 x Mode 2

LMS $BE80 Mode D - tell the LEM to disable itself. (any access to $BE80 shuts off LEM mode)
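The decode logic behind that display list could be as small as a set/reset latch. Here is a rough Python model of the idea; the class and method names are hypothetical, the two addresses are simply the ones from the example above passed in as parameters, and it matches only the exact LMS address rather than a whole page or block.

```python
class LemEnableLatch:
    """Set/reset latch driven by bus accesses to two reserved addresses."""

    def __init__(self, enable_addr=0xBE00, disable_addr=0xBE80):
        # Addresses are parameters; the defaults come from the example only.
        self.enable_addr = enable_addr
        self.disable_addr = disable_addr
        self.enabled = False

    def on_bus_access(self, addr):
        """Update the latch for one bus access and return the new state."""
        if addr == self.enable_addr:
            self.enabled = True
        elif addr == self.disable_addr:
            self.enabled = False
        return self.enabled


latch = LemEnableLatch()
for addr in (0x9C40, 0xBE00, 0x9C40, 0x9C41, 0xBE80, 0x9C42):
    print(hex(addr), latch.on_bus_access(addr))
```

Any other address leaves the state alone, so the mode stays on for every line between the two dummy lines.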


I'm not sure... it still isn't entirely clear what the sequence is for the process.

It's been bouncing around in my head for a year, so it's pretty clear to me:

 

As far as the SRAM goes, the sequence is laid out in the timing diagrams I posted at the top of this thread. At the rising edge of Phi2, 8 bits of SRAM data get clocked into the first data register. 140 ns later, 8 bits from another bank go into the second register. (That's one reason why a 140 ns delay line on Phi2 would be ideal.)

 

As for the luma output, we must divide each 560 ns bus cycle into 8, 4, or 2 equal parts and select 1, 2, or 4 bits at a time per pixel using a variable width, variable period multiplexer. (A 70 ns delay helps generate the counter to address the mux).
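The variable-width mux can be pictured with a few lines of Python. This is a sketch under my own assumptions: it handles a single 8-bit register, sends the most significant bits first, and uses a nominal 560 ns cycle. How the two registers interleave is not modeled, and the function names are invented for illustration.

```python
def split_pixels(byte, bits_per_pixel):
    """Split one byte into pixels of 1, 2 or 4 bits, most significant first."""
    if bits_per_pixel not in (1, 2, 4):
        raise ValueError("bits_per_pixel must be 1, 2 or 4")
    count = 8 // bits_per_pixel
    mask = (1 << bits_per_pixel) - 1
    return [(byte >> (8 - bits_per_pixel * (i + 1))) & mask for i in range(count)]


def pixel_period_ns(bits_per_pixel, cycle_ns=560.0):
    """Time each pixel stays on the output: the cycle divided into 8, 4 or 2."""
    return cycle_ns / (8 // bits_per_pixel)


for width in (1, 2, 4):
    print(width, split_pixels(0b10110100, width), pixel_period_ns(width))
```

The pixel count and the pixel width always multiply to 8 bits per register, which is why the same mux can serve all three modes just by changing which counter bits address it.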



Very interesting idea! A few details in your example need correcting:

 

Page $BE is outside the range we've selected, but $9E would work.

 

ANTIC mode 2 would not be useful. The design only works with single-line modes which use DMA on every line.

 

But the clever idea of using ANTIC to enable and disable the luma is worth considering.


ANTIC mode 2 would not be useful. The design only works with single-line modes which use DMA on every line.

Wait a minute! Why shouldn't single-line character modes work? As long as the character set data are stored in the SRAM and the character codes are stored elsewhere, it should work. You would still get 40 or 20 characters across the screen but each character would be 16 bits wide and have the same hi-res luma options as the graphics modes. One complication however is the 8 color-clock delay between luma video and GTIA video. In graphics modes that is easily corrected by offsetting the plotting locations, but in character modes it gets more restrictive.


The timing diagram shows the SRAM but not the LUMA timing. (does it?) While we are loading one register, are we reading LUMA from the other? In the next/same cycle?

 

Bob

 

 

 

 

