
rb+ tutorial #2: A scary trip through the Object Processor and raptor lists


ggn


Since the rb+ installation article had some good response (i.e. people even responded to it!) I figure I'll start this one as a WIP post and fill/extend it gradually. If people want to contribute with questions or corrections/suggestions/some paragraphs of text, feel free to!

 

So, without further ado...

 

0. Introduction

 

The Jaguar's Object Processor (OP) is a great piece of hardware. It's quite capable and flexible, and the raw power it gives makes one forgive its shortcomings and bugs. Which is of course a shame as most of the surrounding hardware is so bad :P (well, excluding the 68000, that's great of course :)).

 

It's also one of the main things one has to master in order to program the console, independent of the language or library used. As the title says, it's scary. It introduces many concepts that can baffle newcomers and/or inexperienced people in general. So what this post aims to do is demystify and explain the chip as much as possible.

 

1. The Object Processor

1.a. Why it's called like that anyway?

 

A big part of the OP puzzle is hidden in its name. Object Processor. Something that processes objects. An object is a bit of an abstract term though. What constitutes an object in the console's terms?

 

Instead of using graphs, bar/pie charts, let's go with an example instead. This is the title screen of Downfall, a game which a few of you might be aware of.

 

post-10979-0-60380300-1489163616_thumb.png

Click here to see a video in action if you're not aware of the game and come back to read the rest of this. What you see is the game's logo, some info text and some parallax thing in the background. You see that all those things are overlapping more or less. In order to achieve this in a typical bit-mapped screen of 80s/early 90s computers you'd have to do something like this:

 

a) Clear the screen

b) Draw the "farthest" layer of the parallax scroll

c) Draw the 2nd "farthest" layer of the parallax scroll, but since there are overlapping pixels with the layer drawn in (b), clear those first

d) Repeat (c) for layers 3-5

e) Draw the score/hiscore texts, again erasing all overlapping pixels first

f) Draw the logo (erasing again)

g) Draw the text (erasing again)

 

As you probably realise by now the CPU (or specialised hardware like the blitter) will spend a considerable time drawing and clearing pixels (also called masking) every time the screen is refreshed. Also important is the fact that a lot of specialised code has to be written in order to perform all these steps fast enough.

 

Could we do better? Enter the OP.

 

In contrast with bitmapped displays, i.e. you get a block of RAM, you fill it with bits and pixels are rendered on screen, the OP doesn't have a display. Instead you have to instruct it to render rectangles of graphics around the screen. Their widths, heights, colour depths, transparency and many more parameters are highly customisable. To put it in another way: to the OP everything is a sprite - player graphics, bullets, backgrounds, you name it. If it's not an object then it doesn't appear on screen.

 

So, to get back to the above example, we could define an object as big as the screen we're rendering and do all that busy work there. But (and this is what makes more sense) we can instruct the OP to render each different layer as a separate object and then combine them by itself. It will then be the OP's responsibility to compose the screen out of the parts we tell it to use.
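
As a rough mental model only (this is not how the hardware is wired), composing one scanline out of several objects could be sketched in Python like this. The function name `compose_line`, the tuple layout and the choice of colour 0 as the transparent colour are assumptions made for the illustration.

```python
def compose_line(width, objects, background=0):
    """Compose one scanline from a list of objects, back to front.

    Each object is a hypothetical (xpos, pixels, transparent) tuple:
    xpos is the screen x of its first pixel, pixels is a list of colour
    values for this line, transparent says whether colour 0 is skipped.
    """
    line = [background] * width          # stand-in for the OP's line buffer
    for xpos, pixels, transparent in objects:
        for i, colour in enumerate(pixels):
            x = xpos + i
            if not 0 <= x < width:
                continue                 # clipped off screen
            if transparent and colour == 0:
                continue                 # let what is underneath show through
            line[x] = colour
    return line


far_layer = (0, [1, 1, 1, 1, 1, 1, 1, 1], False)
logo = (2, [0, 7, 7, 0], True)
print(compose_line(8, [far_layer, logo]))
```

Notice that nothing gets erased first: later objects simply land on top of earlier ones, and transparent pixels are skipped.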

 

1.b. Object lists

 

What we learned so far is that the Jaguar can render various boxes of data onto the screen. This of course raises a few questions like "how do we tell it how many boxes to render?" and "how do we describe these boxes so the hardware can understand them?". The answer to both these is Object Lists (OL).

 

An OL is really a forward linked list (if you already know what a linked list is then skip this paragraph). Very briefly put, imagine having an array where we want to store a few parameters per object - say we want to store 10 sprites' x, y positions, widths and heights. We could go right ahead and dimension an array in BASIC like

DIM stuff[4][10]
and then use that array to store everything. But what happens if we don't know beforehand how many sprites we'll use? One solution is to just overdimension and hope that limit never gets crossed, but that wastes RAM and we don't want that. A more RAM-friendly way is to add a 5th field that tells us where in RAM the next entry actually is. Our modified array will store x,y,w,h,adr_of_next_index. So in order to traverse the array we have to know the address of the first entry, go there, do what we need to do, then fetch adr_of_next_index, jump to that address, and so on until we reach the list's end (let's say that the list ends when we read a 0 in adr_of_next_index).

 

post-10979-0-97390600-1489427621_thumb.png

post-10979-0-70680900-1489427875_thumb.png

(No shame if you didn't digest all that info in the first read - go back and read it again if you're not sure how it works. Or reply to this post with a question. Ultimately it doesn't matter but it helps understanding some of the stuff you'll see later).

 

So by using this arrangement we can store as many objects as we require with the minimum RAM waste.
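
The traversal described above can be sketched in a few lines of Python. The `ram` dictionary, its addresses and the `walk` function are made up for the illustration; a real OL lives in the Jaguar's RAM and is packed into phrases, as explained later.

```python
# Hypothetical "RAM": address -> record. Each record holds x, y, w, h and
# the address of the next record (0 marks the end of the list).
ram = {
    0x1000: {"x": 10, "y": 20, "w": 16, "h": 16, "next": 0x2040},
    0x2040: {"x": 50, "y": 20, "w": 32, "h": 8, "next": 0x1800},
    0x1800: {"x": 90, "y": 64, "w": 16, "h": 16, "next": 0},
}


def walk(ram, first_address):
    """Follow the 'next' addresses from first_address until we read a 0."""
    visited = []
    address = first_address
    while address != 0:
        record = ram[address]
        visited.append((address, record["x"], record["y"]))
        address = record["next"]
    return visited


for address, x, y in walk(ram, 0x1000):
    print(hex(address), x, y)
```

The records do not need to sit next to each other or in address order; only the "next" field decides the order in which they are visited.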

 

1.c. Anatomy of an object

 

Let's see exactly how flexible the OP is by looking at how an object is defined.

 

There are five kinds of objects:

  • Bitmap object
  • Scaled bitmap object
  • Graphics Processor object
  • Branch object
  • Stop object
The first two are the most used objects. The last one, the stop object, is just a terminator for the list and is placed at the end of the OL so it's only used once per list. Branch objects are also specialised and allow for skipping parts of the list - this doesn't sound very practical at first, but since there are a few conditionals that can be executed it becomes quite useful. GPU objects... we'll get to that eventually.

 

Each object is at least one phrase long (one phrase is 64 bits, or 8 bytes). So what probably happens is that the OP reads a phrase off the OL for the current object, and if the type demands it then it reads extra phrase(s). The first 3 bits of each object's first phrase contain the TYPE field. That's 0 for bitmapped object, 1 for scaled, 2 for GPU object, 3 for branch object and 4 for stop object.
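
A small Python sketch of reading the TYPE field, assuming the phrase is held as one 64-bit integer and that bits 0-2 are the least significant ones (the numbering the reference manual uses). The names `OBJECT_TYPES`, `object_type` and `describe` are invented here.

```python
OBJECT_TYPES = {0: "bitmap", 1: "scaled bitmap", 2: "GPU", 3: "branch", 4: "stop"}


def object_type(first_phrase):
    """Return the TYPE field (lowest 3 bits) of a 64-bit phrase."""
    return first_phrase & 0b111


def describe(first_phrase):
    """Name the object type, or 'unknown' for values 5-7."""
    return OBJECT_TYPES.get(object_type(first_phrase), "unknown")


print(describe(0x0000000000000004))   # a stop object: TYPE 4, rest zero
```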

 

1.c.1. Bitmapped object

 

Let's have a quick look at the fields for a normal bitmapped object. Don't pay too much attention to the descriptions, especially if you don't understand something - we'll explain everything eventually.

 

TYPE - object type (hardcoded to 0)

YPOS - object y position in screen

HEIGHT - object height

LINK - address of next object in OL

DATA - address of graphics data for the object

XPOS - object x position in screen

DEPTH - object's bit depth

PITCH - how far apart (in phrases) successive phrases of the object's data are in memory; 1 means the data is contiguous

INDEX - For bitmapped objects, choose which palette to use

DWIDTH - how many phrases wide the object is in memory (can be different than PITCH!)

IWIDTH - how many phrases wide the object is on screen (can be different than PITCH and DWIDTH!)

REFLECT - flag that controls if the object will be drawn mirrored horizontally

RMW - Adds the current pixel value to what's already on the line buffer (advanced topic for now!)

TRANS - flag that enables transparency. Colour 0 will not be drawn on screen

RELEASE - flag that tells the OP to yield the bus to other chips

FIRSTPIX - tells the OP which is to be considered the first pixel to be drawn per scanline

 

Now that's a lot of things, right? If you were to implement those in software you'd consume a lot of bytes. And since CPUs are usually comfortable with 8, 16 or 32 bit values (byte/word/longword) you'd need a word per parameter (even for a flag) and a longword per address to be safe. There are 16 parameters in that list, two of them are addresses. So we're looking at 14*2+2*4=36 bytes=288 bits! How did they squeeze it down to 128? (if you wonder "why 128?" then remember: a bitmapped object is 2 phrases!) Well, for starters some fields are on/off (0/1) so they just need a single bit to represent them. Similarly other fields like YPOS/XPOS don't need the full 16 bits that we allocated because of physical constraints - no reason to have an X value of 65535 for example! So most of the fields are chopped down in a similar fashion. Finally, the address fields are always aligned to 8 bytes due to hardware constraints (burst read access if I'm not mistaken). This means that the address values will be a multiple of 8 - so 8,16,24,32 etc. If you look at these numbers in binary or hex notation you'll notice that the last 3 bits are zero. So no need to store them at all!
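
To make the squeezing concrete, here is a hedged Python sketch that packs the first phrase of a bitmapped object. The bit positions are taken from the reference manual quote in 1.c.6, with bit 0 assumed to be the least significant bit; the function names and the example addresses are invented for the illustration and this is not code from rb+ or raptor.

```python
def pack_first_phrase(ypos, height, link_address, data_address, obj_type=0):
    """Pack the first phrase of a bitmap object into one 64-bit integer."""
    if link_address % 8 or data_address % 8:
        raise ValueError("addresses must be phrase (8 byte) aligned")
    phrase = obj_type & 0x7                            # bits 0-2
    phrase |= (ypos & 0x7FF) << 3                      # bits 3-13
    phrase |= (height & 0x3FF) << 14                   # bits 14-23
    # The 3 low address bits are always zero, so they are not stored.
    phrase |= ((link_address >> 3) & 0x7FFFF) << 24    # bits 24-42
    phrase |= ((data_address >> 3) & 0x1FFFFF) << 43   # bits 43-63
    return phrase


def unpack_first_phrase(phrase):
    """Reverse of pack_first_phrase, restoring the dropped address bits."""
    return {
        "type": phrase & 0x7,
        "ypos": (phrase >> 3) & 0x7FF,
        "height": (phrase >> 14) & 0x3FF,
        "link": ((phrase >> 24) & 0x7FFFF) << 3,
        "data": ((phrase >> 43) & 0x1FFFFF) << 3,
    }


phrase = pack_first_phrase(ypos=40, height=100,
                           link_address=0x4010, data_address=0x12340)
print(f"{phrase:016x}")
print(unpack_first_phrase(phrase))
```

Five values that would take 14 bytes as words and longwords fit in a single 8 byte phrase this way. The masks and shifts are exactly the kind of work that is slow on the 68000.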

 

All the above tricks do save a lot of RAM and bandwidth for the OP but they come at a cost. Namely, it's not very easy to construct and update an OL from a CPU standpoint. Inserting and updating bitfields requires a few instructions per operation and since the 68000 is a bit slow when it tries to shift values (an essential part unless you can avoid it), it can become quite demanding to update even basic values like screen x,y coordinates. So when a lot of objects are used it is recommended to use the GPU (Tom) to update the lists.

 

One final thing worth mentioning here is that during screen composing the OP will trash the OL partially. Going by the reference manual quoted in 1.c.6, HEIGHT and DATA are written back to the object as lines are displayed (plus REMAINDER for scaled objects). If the OP is allowed to run on the next frame with the list trashed you will definitely see the screen do weird things, from blanking out completely to displaying garbage and then crashing the machine! So it is necessary to update the OL every vertical blank (VBL). I know of two methods here: one is to update the actual list and the other is to keep a second copy of the OL and copy it to the live list during the VBL. The latter is what raptor uses and it is much less complicated: you can do the processing while the frame is displayed and not have to struggle modifying things at the last moment.
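
The second method boils down to something like the Python sketch below. This is only the idea, not raptor's actual code: the lists stand in for RAM, the phrase values are made up and `vbl_refresh` is a hypothetical name.

```python
def vbl_refresh(live_list, master_list):
    """Restore the live object list from the untouched master copy.

    Both lists are plain Python lists of 64-bit phrases standing in for RAM.
    The copy is done in place so the live list keeps its "address".
    """
    live_list[:] = master_list


master = [0x1111, 0x2222, 0x4]   # hypothetical phrases, prepared during the frame
live = list(master)              # the list the OP actually reads

live[0] = 0xDEAD                 # pretend the OP wrote fields back while drawing
vbl_refresh(live, master)        # at the next VBL the live list is good again
print(live == master)
```

All game logic only ever touches the master copy, so it can run at any point during the frame.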

 

1.c.2. Scaled bitmapped object

 

The scaled object is exactly the same as the bitmapped one, except that the type is of value 1 here, with the addition of a third phrase that contains some extra fields.

 

HSCALE - Horizontal scale factor in 3.5 unsigned fixed point format.

VSCALE - Vertical scale factor in 3.5 unsigned fixed point format.

REMAINDER - Internal state kept by the OP: how many display lines are left to be drawn from the current source line (see the reference manual quote in 1.c.6)

 

So what's all this 3.5 fixed point format thing then? First of all, fixed point is a way for a CPU that does not support numbers with fractions to support them. If you consider numbers in the decimal system you have the integer part followed by the fractional. They are separated using a delimiter (comma or dot, depending on your region). If we remove the delimiter from the number then a number like "123.456" becomes "123456". There is no way we can figure out where the fractional part begins unless someone tells us that it's after the 3rd digit. So if we all agree that the integer part is the first 3 digits and the fractional part the last 3 then we've made a fixed point system. We can now encode all numbers from 000.000 to 999.999 using 6 digits. (if you wonder why 000.000 and not -999.999, then notice that the sign is also a digit and we'll have to spend an extra digit to encode it). We call that "3.3 unsigned fixed point format". If we require negative numbers too we have to allocate an extra digit so our numbers can be "-999.999" to "+999.999" - that's a 4.3 signed format.

 

Now, let's switch to binary representation. It's pretty much the same thing as the above only with two available digits (0 and 1). So to our example above a 3.3 fixed point binary format would be able to store numbers from 000.000 to 111.111. So, our 3.5 unsigned format would of course hold numbers from 000.00000 to 111.11111.

 

So much for the definitions. But that doesn't help us much in the way of knowing how much we zoom our object, right? After all, what's 0.11111 binary when converted to decimal system where we're more comfortable? Let's begin with the easy stuff - integers. Since we use 3 bits for integer part we can store decimal numbers 0 to 2^3-1=7.

 

Moving on to the scary part: fractions! So ask yourself: what is 0.1 in the decimal system? It's 1/10th, right? And 0.01? That's 1/100th. 0.001 is 1/1000th and so on. So what happens is that we divide 1 by the base as many times as we have decimal places. If we formulate this, it's something like "1/(10^fraction_digit)" to represent a fraction digit (10 raised to the power of the number of fraction digits is the same as dividing with 10 as many times as the number of fraction digits). It just so happens though that 10 is the decimal system's base. So we can change the formula to "1/(base^fraction_digit)". Finally, "1" is used because our examples had 1 in them. So the final transformation we do to the formula is "number/(base^fraction_digit)". This lets us represent any digit in the fractional part. I hope you've got it by now, but if not... Let's switch to binary number system. Our base is 2 here and the range for number here is 0 to 1 so we can write our generic formula as "0 (or 1)/(2^fraction_digit)".

 

Let's write some examples then: %0.00001 (notice that numbers prefixed with % are considered binary by assemblers like rmac) is actually 1/2^5 in decimal, so 1/32 or 0.03125. %0.01 is 1/2^2=1/4=0.25. %0.1 is 1/2=0.5.

 

So what numbers can we represent on a 3.5 unsigned binary format? I would expect %000.00000 to produce nothing as it's a scale factor of exactly 0 so let's leave that out for now. %000.00001 would be the smallest number and %111.11111 the largest. %000.00001 as we wrote above is 0.03125. So that's actually our increment - all scale factors will be an integer multiple of this. For example, the next number in sequence, %000.00010, is 0.0625 which is, true enough, double of 0.03125. %001.00000 is obviously 1, so that's the number we need to put in order to have no scaling at all. And so on and so forth until we reach %111.11111, which is 7+1/2+1/4+1/8+1/16+1/32=7.96875 - that's the largest scale we can have from the OP.
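
If you'd rather let a computer do the conversions, here is a small Python sketch of the 3.5 format. The function names are made up; the only facts used are the ones above: 8 bits in total, 5 of them fraction bits, so one step is 1/32.

```python
def to_scale_3_5(factor):
    """Encode a scale factor as an 8-bit 3.5 unsigned fixed point value."""
    value = round(factor * 32)          # 5 fraction bits, so one step is 1/32
    if not 0 <= value <= 0xFF:
        raise ValueError("factor must be between 0 and 7.96875")
    return value


def from_scale_3_5(value):
    """Decode an 8-bit 3.5 unsigned fixed point value to a float."""
    return (value & 0xFF) / 32


for factor in (0.03125, 0.25, 0.5, 1.0, 7.96875):
    encoded = to_scale_3_5(factor)
    print(f"{factor:<8} -> %{encoded >> 5:03b}.{encoded & 31:05b}")
```

Factors that are not a multiple of 1/32 get rounded to the nearest step, so what you ask for and what you get can differ slightly.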

 

1.c.3. Graphics Processor object

 

Scary stuff - let's leave that out for now!

 

1.c.4. Branch object

 

This object enables the OP to skip parts of the OL or even create loops if used carefully.

 

TYPE - object type (hardcoded to 3)

YPOS - if a comparison is performed, this is the value to compare against.

CC - Condition Code

LINK - If a branch is taken, this is the address to branch to.

 

First of all, if we just want to branch to a different point in the OL we can set CC to 0, YPOS to $7ff (YPOS is an 11 bit field) and fill the LINK field with the address to branch to. This can be used to remove objects that are unused at the time of display (for example, say you have 30 objects that display bullets and you only have 10 active. You could add a branch object before the bullet objects and branch so the OP will skip 20 objects and display the last 10).

 

The other three cases compare the Video Counter (VC) against the value YPOS contains: branch if they are equal (CC=0), if YPOS is larger than VC (CC=1) or if YPOS is smaller than VC (CC=2), following the reference manual quoted in 1.c.6. This can lighten the OP's load greatly. For example, consider the following playfield:

 

433200-moto-racer-playstation-screenshot

There's no reason for the OP to render the lower parts of the screen while it's rendering the upper half. So we can set YPOS to half the screen height and save tons of bandwidth.

 

Also, using comparison branches you can effectively create loops (i.e. render the same object for the first 50 scanlines) but I'm not sure if there's any value to doing this - most likely the objects will become trashed!

 

There are also two other branch types but I'll leave them alone for now as they're more specialised.

 

One final note (careful readers will probably wonder about this): If the branch is not taken, then the OP expects the next object to be on the next phrase from the branch object! If you violate this, then funky things will happen :)!
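
Here is a hedged Python model of how a branch object picks the next address. The conditions follow the reference manual quote in 1.c.6 (CC 0, 1 and 2 only); the function and constant names are invented, and real hardware may have quirks this sketch knows nothing about.

```python
CC_EQUAL, CC_YPOS_GREATER, CC_YPOS_LESS = 0, 1, 2
ALWAYS = 0x7FF   # YPOS value that makes CC_EQUAL branch unconditionally


def next_object(branch_address, ypos, cc, link, vc):
    """Return the address the OP would continue from after a branch object."""
    if cc == CC_EQUAL:
        taken = ypos == vc or ypos == ALWAYS
    elif cc == CC_YPOS_GREATER:
        taken = ypos > vc
    elif cc == CC_YPOS_LESS:
        taken = ypos < vc
    else:
        raise ValueError("only the three VC comparisons are modelled here")
    # Not taken: the next object must sit in the very next phrase (8 bytes on).
    return link if taken else branch_address + 8


# Unconditional branch: skip whatever follows and continue at $2000.
print(hex(next_object(0x1000, ALWAYS, CC_EQUAL, 0x2000, vc=123)))
# Conditional branch not taken: carry on with the phrase right after.
print(hex(next_object(0x1000, 100, CC_YPOS_GREATER, 0x2000, vc=150)))
```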

 

1.c.5. Stop object

 

Pretty straightforward stuff, just stick a 4 in the TYPE field and fill the rest of the phrase with zeros. The OP will stop processing more objects after this. You're done!

 

1.c.6. Reference: the reference manual on Objects.

 

Here's a direct quote from the jaguar reference manual.

 

Bit Mapped Object
This object displays an unscaled bit mapped object. The object must be on a 16 byte boundary in 64 bit RAM.

First Phrase
Bits Field Description
0-2 TYPE Bit mapped object is type zero
3-13 YPOS This field gives the value in the vertical counter (in half lines) for the first (top) line of the object. The vertical counter is latched when the Object Processor starts so it has the same value across the whole line. If the display is interlaced the number is even for even lines and odd for odd lines. If the display is non-interlaced the number is always even. The object will be active while the vertical counter >= YPOS and HEIGHT > 0.
14-23 HEIGHT This field gives the number of data lines in the object. As each line is displayed the height is reduced by one for non-interlaced displays or by two for interlaced displays. (The height becomes zero if this would result in a negative value.) The new value is written back to the object.
24-42 LINK This defines the address of the next object. These nineteen bits replace bits 3 to 21 in the register OLP. This allows an object to link to another object within the same 4 Mbytes.
43-63 DATA This defines where the pixel data can be found. Like LINK this is a phrase address. These twenty-one bits define bits 3 to 23 of the data address. This allows object data to be positioned anywhere in memory. After a line is displayed the new data address is written back to the object.

Second Phrase
Bits Field Description
0-11 XPOS This defines the X position of the first pixel to be plotted. This 12 bit field defines start positions in the range -2048 to +2047. Address 0 refers to the left-most pixel in the line buffer.
12-14 DEPTH This defines the number of bits per pixel as follows:
0 1 bit/pixel
1 2 bits/pixel
2 4 bits/pixel
3 8 bits/pixel
4 16 bits/pixel
5 24 bits/pixel
15-17 PITCH This value defines how much data, embedded in the image data, must be skipped. For instance two screens and their common Z buffer could be arranged in memory in successive phrases (in order that access to the Z buffer does not cause a page fault). The value 8 * PITCH is added to the data address when a new phrase must be fetched. A pitch value of one is used when the pixel data is contiguous - a value of zero will cause the same phrase to be repeated.
18-27 DWIDTH This is the data width in phrases. i.e. Data for the next line of pixels can be found at 8 * (DATA + DWIDTH)
28-37 IWIDTH This is the image width in phrases (must be non zero), and may be used for clipping.
38-44 INDEX For images with 1 to 4 bits/pixel the top 7 to 4 bits of the index provide the most significant bits of the palette address.
45 REFLECT Flag to draw object from right to left.
46 RMW Flag to add object to data in line buffer. The values are then signed offsets for intensity and the two colour vectors.
47 TRANS Flag to make logical colour zero and reserved physical colours transparent.
48 RELEASE This bit forces the Object Processor to release the bus between data fetches. This should typically be set for low colour resolution objects because there is time for another bus master to use the bus between data fetches. For high colour resolution objects the bus should be held by the Object Processor because there is very little time between data fetches and other bus masters would probably cause DRAM page faults thereby slowing the system. External bus masters, the refresh mechanism and graphics processor DMA mechanism all have higher bus priorities and are unaffected by this bit.
49-54 FIRSTPIX This field identifies the first pixel to be displayed. This can be used to clip an image. The significance of the bits depends on the colour resolution of the object and whether the object is scaled. The least significant bit is only significant for scaled objects where the pixels are written into the line buffer one at a time. The remaining bits define the first pair of pixels to be displayed. In 1 bit per pixel mode all five bits are significant, In 2 bits per pixel mode only the top four bits are significant. Writing zeroes to this field displays the whole phrase.
55-63 Unused write zeroes.

Scaled Bit Mapped Object
This object displays a scaled bit mapped object. The object must be on a 32 byte boundary in 64 bit RAM. The first 128 bits are identical to the bit mapped object except that TYPE is one. An extra phrase is appended to the object.
Bits Field Description
0-7 HSCALE This eight bit field contains a three bit integer part and a five bit fractional part. The number determines how many pixels are written into the line buffer for each source pixel.
8-15 VSCALE This eight bit field contains a three bit integer part and a five bit fractional part. The number determines how many display lines are drawn for each source line. This value equals HSCALE for an object to maintain its aspect ratio.
16-23 REMAINDER This eight bit field contains a three bit integer part and a five bit fractional part. The number determines how many display lines are left to be drawn from the current source line. After each display line is drawn this value is decremented by one. If it becomes negative then VSCALE is added to the remainder until it becomes positive. HEIGHT is decremented every time VSCALE is added to the remainder. The new REMAINDER is written back to the object.
24-63 Unused write zeroes.

Graphics Processor Object
This object interrupts the graphics processor, which may act on behalf of the Object Processor. The Object Processor resumes when the graphics processor writes to the object flag register.
Bits Field Description
0-2 TYPE GPU object is type two
3-13 YPOS This object is active when the vertical count matches YPOS unless YPOS = 07FF in which case it is active for all values of vertical count.
14-63 DATA These bits may be used by the GPU interrupt service routine. They are memory mapped as the object code registers OB0-3, so the GPU can use them as data or as a pointer to additional parameters.

Execution continues with the object in the next phrase. The GPU may set or clear the (memory mapped) Object Processor flag and this can be used to redirect the Object Processor using the following object.

Branch Object

This object directs object processing either to the LINK address or to the object in the following phrase.
Bits Field Description
0-2 TYPE Branch object is type three
3-13 YPOS This value may be used to determine whether the LINK address is used.
14-15 CC These bits specify what condition is used to determine whether to branch as follows:
0 Branch if YPOS == VC or YPOS == 7FF
1 Branch if YPOS > VC
2 Branch if YPOS < VC
3 Branch if Object Processor flag is set
4 Branch if on second half of display line (HC10 = 1)
16-23 unused
24-42 LINK This defines the address of the next object if the branch is taken. The address is defined as described for the bit mapped object.
43-63 unused

Stop Object
This object stops object processing and interrupts the host.
Bits Field Description
0-2 TYPE Stop object is type four
3-63 DATA These bits may be used by the CPU interrupt service routine. They are memory mapped so the CPU can use them as data or as a pointer to additional parameters.
1.d. Bit depths - bandwidth

 

After digesting the basics of objects, one of the most confusing aspects for newcomers (especially to rb+) is bit depth. "Since I draw some graphic on my desktop/laptop computer, it should just appear on the screen, right?". Well, yes and no.

 

In an ideal world we'd draw everything in as many colours as we like and give it to the hardware to cope with. Unfortunately the OP is quite fast but it cannot cope with this idea. It might appear so at first but as you start piling up objects one on top of the other it simply runs out of juice. When the OP is composing the screen, it more or less does the following for each line:

  • Goes through the whole OL until it reaches a stop object (branch objects are of course evaluated)
  • For each object the OP has to
      • parse object coordinates
      • parse screen coordinates
      • translate screen coordinates to object coordinates
      • figure out where it should read from the object's graphics address
      • fetch pixels
      • combine pixels with the pixels of the previous objects (transparency, ordering etc) if any
A big bottleneck in the above list is fetching pixels. If we draw uncontrollably and export everything to 16 or 24 bits per pixel (bpp), the OP has to read 2 or 3 bytes per pixel. As said above, this eats up the chip's read bandwidth pretty fast. So if you keep piling up objects like these you're going to end up with garbage on screen - it's simply not possible to read all that data in the time frame allocated.

That's why the OP's designers added bit depths in the chip. If you know that your character sprite won't use more than 16 colours on screen (which translates to 4 bits, 2^4=16) then why waste 12 (or 20 in 24bpp mode) more? Multiply that with the number of objects you would like to have on screen (say 50?) and you end up with a lot of saved bandwidth. And that's for 16 colours, if you want to use even less then you can save much more.
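
Some back-of-the-envelope arithmetic in Python. The scenario (50 objects, each 32 pixels wide, all crossing the same scanline) is made up, and the sketch ignores that the OP really fetches whole phrases, so treat the numbers as a comparison between bit depths and not as measurements.

```python
def bytes_per_line(width_pixels, bits_per_pixel, object_count=1):
    """Bytes of pixel data fetched for one scanline of identical objects."""
    bits = width_pixels * bits_per_pixel * object_count
    return (bits + 7) // 8      # round up to whole bytes


for bpp in (4, 8, 16):
    print(bpp, "bpp:", bytes_per_line(32, bpp, object_count=50), "bytes per line")
```

Going from 16bpp to 4bpp cuts the pixel data read per line to a quarter in this scenario.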

 

Combining is also costly, especially when transparency comes into place. There's potentially a lot of read data thrown away just because a lot of objects are piling up on top of the others. Also, because the whole list is parsed per line, the OP also has to parse objects that might not apply in all scanlines, thus waste even more bandwidth. The use of branch objects can help massively here.

 

In conclusion, it makes good sense to plan ahead what you want to do and be bandwidth considerate. (After all, drawing the screen is only part of the problem - you also need logic, audio, inputs, and many more things)

 

1.e. Bitmapped objects, palettes and pixel formats

 

For 16, 24 and CRY modes it is easy to store colour information. Since each pixel is so many bits, we can encode the intensities for Red, Green and Blue directly on the pixel data. Very briefly, for 16bpp the format is:

Bit 0123456789abcdef
    RRRRRBBBBBGGGGGG
so, 5 bits for Red and Blue (0-31), and 6 for Green (0-63).
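
A Python sketch of packing a 16bpp pixel in that layout. It assumes the leftmost column of the diagram is the most significant bit of the 16-bit word; `rgb16` is a made-up helper name.

```python
def rgb16(red, green, blue):
    """Pack red (0-31), green (0-63) and blue (0-31) into a 16-bit pixel."""
    if not (0 <= red <= 31 and 0 <= blue <= 31 and 0 <= green <= 63):
        raise ValueError("component out of range")
    # Red in the top 5 bits, blue in the middle 5, green in the low 6.
    return (red << 11) | (blue << 6) | green


print(hex(rgb16(31, 0, 0)))   # pure red
print(hex(rgb16(0, 63, 0)))   # pure green
print(hex(rgb16(0, 0, 31)))   # pure blue
```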

 

For 24bpp objects we have 8 bits (0-255) for Red, Blue and Green and 8 unused.

Bit 0123456789abcdef0123456789abcdef
    RRRRRRRRBBBBBBBBGGGGGGGG00000000
(note that the RBG order is intentional, that's how the OP expects values to be written)

 

CRY mode - one byte for colour and one for intensity - let's leave that for later.

 

Let's go back to <16 bpp modes. Since in these modes we don't have enough bits to store component intensities, the solution is to store the intensities in a designated memory area separately and for the object itself just mark down the index to the intensities table. That memory area is called a Palette or CLUT (Colour Look-Up Table). The OP's CLUT table holds 256 entries and it uses the 16bpp format described above. So for example if we use 4bpp mode and our first 4 pixels are 9,2,5,8, the object's first two bytes will look like this:

 

Pixel     0    1    2    3
Value     9    2    5    8
Binary 1001 0010 0101 1000
Notice that each pixel has all its bits packed one after the other. This is true for all bit depths and is called chunky format.
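
The same packing as a Python sketch, for the depths that divide a byte evenly. `pack_chunky` is an invented name, and padding a short last byte with zero pixels is a choice made for this sketch.

```python
def pack_chunky(pixels, bits_per_pixel):
    """Pack palette indices into bytes, first pixel in the highest bits."""
    if bits_per_pixel not in (1, 2, 4, 8):
        raise ValueError("this sketch handles 1, 2, 4 and 8 bpp only")
    per_byte = 8 // bits_per_pixel
    out = bytearray()
    for start in range(0, len(pixels), per_byte):
        byte = 0
        group = pixels[start:start + per_byte]
        for index in group:
            if not 0 <= index < (1 << bits_per_pixel):
                raise ValueError("palette index does not fit the bit depth")
            byte = (byte << bits_per_pixel) | index
        # Pad a short final group with zero pixels on the right.
        byte <<= bits_per_pixel * (per_byte - len(group))
        out.append(byte)
    return bytes(out)


print(pack_chunky([9, 2, 5, 8], 4).hex())
```

For the four pixels of the example this gives the two bytes $92 and $58.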

 

1.f. Let's display an object on screen

1.g. Advanced topics (CRY, RMW etc)

2. Raptor lists

 

Here's an object as defined in a raptor list. Values in red are identical (or almost identical) to OL fields. (well, code snippets can't be coloured it seems, so I'll get back to this)

 

(REPEAT COUNTER) - Create this many objects of this type (or 1 for a single object)

sprite_active - sprite active flag

sprite_x - 16.16 x value to position at

sprite_y - 16.16 y value to position at

sprite_xadd - 16.16 x addition for sprite movement

sprite_yadd - 16.16 y addition for sprite movement

sprite_width - width of sprite (in pixels)

sprite_height - height of sprite (in pixels)

sprite_flip - flag for mirroring data left<>right

sprite_coffx - x offset from center for collision box center

sprite_coffy - y offset from center for collision box center

sprite_hbox - width of collision box

sprite_vbox - height of collision box

sprite_gfxbase - start of bitmap data

(BIT DEPTH) - bitmap depth (1/2/4/8/16/24)

(CRY/RGB) - bitmap GFX type

(TRANSPARENCY) - bitmap TRANS flag

sprite_framesz - size per frame in bytes of sprite data

sprite_bytewid - width in bytes of one line of sprite data

sprite_animspd - frame delay between animation changes

sprite_maxframe - number of frames in animation chain

sprite_animloop - repeat or play once

sprite_wrap - wrap on screen exit, or remove

sprite_timer - frames sprite is active for (or spr_inf)

sprite_track - use 16.16 xadd/yadd or point to 16.16 x/y table

sprite_tracktop - pointer to loop point in track table (if used)

sprite_scaled - flag for scaleable object

sprite_scale_x - x scale factor (if scaled)

sprite_scale_y - y scale factor (if scaled)

sprite_was_hit - initially flagged as not hit

sprite_CLUT - no_CLUT (8/16/24 bit) or CLUT (1/2/4 bit)

sprite_colchk - if sprite can collide with another

sprite_remhit - flag to remove (or keep) on collision

sprite_bboxlink - single for normal bounding box, else pointer to table

sprite_hitpoint - Hitpoints before death

sprite_damage - Hitpoints deducted from target

sprite_gwidth - GFX width (of data)

 

So it's evident that raptor lists try to be close to the OP's object definitions while adding extra fields to help the processing of sprites (animation, hitpoints, collision etc.).

 

3. Wrapping up

 

Hopefully everyone reading this post got something out of it. It's nothing more than re-stating what the hardware manual says with as many explanations to the newcomer as possible. Also it shows how much stuff rb+ and raptor do behind your back (constructing OLs, calculating object parameters, aligning graphics data so it will be processed ok and so much more).

 

Thanks for your patience while reading this! Let me know if there's something not clear, if I omitted something or if there's an error somewhere.

Edited by ggn

I see the love (likes) but I don't see the feedback :). Any problems with what I wrote above? Any suggestions? Should I rewrite something? Did I miss anything? Speak up!

Sorry - no feedback needed from me. Everything made sense on the first go through. With this and your tutorial for installing RB from scratch, I may force myself to make some free time to try things out. Had my Jag since 94 - coded for a lot of different systems. Last dedicated system I did anything for was the Nuon. I'm usually doing 8-bit stuff.


Here's an easier way to think about scaled objects.

 

Think of the first part (the integer) as being 0-7, with 1 being 'normal size' so... 001.00000 = normal size

 

Now, 11111 (binary) is the largest value you can make with 5 bits (it's 31 decimal). So, for each integer, we can have 32 steps in size, if we start from zero (which we do!)

 

Knowing this, then:

  • 000 01111 (0.15) is 0.5 times normal size (32/2 = 16, start at 0 = 15)
  • 001 01111 (1.15) is 1.5 times normal size.
  • 010 00000 (2.00) is 2.0 times normal size.
  • 010 01111 (2.15) is 2.5 times normal size.
  • 111 00000 (7.00) is 7.0 times normal size.

Say you want it 1/4 size? well 32/4 = 8. Starting at 0 = 7

  • 000 00111 (0.7)

Special note, don't go above 7.00000 - it doesn't work.


Say you want it 1/4 size? well 32/4 = 8. Starting at 0 = 7

  • 000 00111 (0.7)

Let's do some math. Say we want to scale the decimal number 100 by 1/4. 1/4 is exactly 1/(2^2). So our scale factor would be %000.01000.

 

Proof: let's keep the fraction digits (%01000) and multiply that by 100. %01000 is 8 in decimal, so 100*8=800. 800 in binary is %1100100000. Let's chop the 5 fraction bits to the right - we're left with %11001, which is 25.

 

So unless I'm misunderstanding something, %01000 is a better approximation than %00111.
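
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the fixed point multiply described in this post, not a claim about how the OP computes things internally, and `scale_pixels` is a made-up helper name.

```python
def scale_pixels(size, scale_byte):
    """Multiply a size by a 3.5 fixed point scale byte, then drop the 5 fraction bits."""
    return (size * scale_byte) >> 5

quarter_exact = 0b00001000   # %000.01000 = 8/32 = 0.25
quarter_minus = 0b00000111   # %000.00111 = 7/32 = 0.21875

print(scale_pixels(100, quarter_exact))  # 25
print(scale_pixels(100, quarter_minus))  # 21
```

So with %00111 a 100 pixel object would come out a few pixels short of a true quarter.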



There's not much in it at this resolution. You are correct because 32 = integer +1, reset to 0, but thinking about it this way is easier to digest.


Some of this I guess will be handled by rb+

 

As the Jaguar has to traverse the whole object list every scan line, the longer the list the less time the rest of the system has for memory accesses.
From the programming manual

 

The following list gives the priorities of all bus masters.

Highest priority

1. Higher priority daisy-chained bus master

2. Refresh

3. DSP at DMA priority

4. GPU at DMA priority

5. Blitter at high priority

6. Object Processor

7. DSP at normal priority

8. CPU under interrupt

9. GPU at normal priority

10. Blitter at normal priority

11. CPU

Lowest priority

 

While the OP is running it will hog memory access, so the 68K cannot read its next instruction, and DSP, GPU and Blitter accesses to main memory can be stalled.
Branch objects allow you to group objects so that they need not be processed - for example a HUD at the top or bottom of the screen and the gameplay for the rest. If I am right the OP runs on all scan lines and not just the ones you see. So it is wise to add a couple of branch objects to do an early out to a stop object when you know it should do nothing. ( That's what Atari *recommended* back in the day )
The worst thing about the OP is that as it processes an object it can overwrite it with updated information, so every frame you have to rebuild / copy the object list.

Edited by Seedy1812

Good stuff - I've always been confused by the object list (mainly due to overcomplicating it in my own head probably).

 

Keep up the good work, looking forward to how this is applied in RB+.

Thanks! I'll keep plugging at it till it makes sense to people.

 

Would it be worth including a simple block diagram of all the processors, memory and bus(es) in the Jag at the start of your doc just as a reference ??

Well I agree that the post is very text laden at the moment. I'll try to do something about it soon :).


I gave up using binary for scaled objects because it was too easy for my stupid builder fingers to arse it up and my old man eyes found it difficult to spot the mistakes, so used integers instead and had a very happy, successful and fruitful life ever since.

That last part might not be 100% true or directly related.

 


Or just use this table:


Value    Scale factor | Value    Scale factor | Value    Scale factor | Value    Scale factor | Value    Scale factor | Value    Scale factor | Value    Scale factor | Value    Scale factor
0        0            | 32       1            | 64       2            | 96       3            | 128      4            | 160      5            | 192      6            | 224      7
1        0.03125      | 33       1.03125      | 65       2.03125      | 97       3.03125      | 129      4.03125      | 161      5.03125      | 193      6.03125      | 225      7.03125
2        0.0625       | 34       1.0625       | 66       2.0625       | 98       3.0625       | 130      4.0625       | 162      5.0625       | 194      6.0625       | 226      7.0625
3        0.09375      | 35       1.09375      | 67       2.09375      | 99       3.09375      | 131      4.09375      | 163      5.09375      | 195      6.09375      | 227      7.09375
4        0.125        | 36       1.125        | 68       2.125        | 100      3.125        | 132      4.125        | 164      5.125        | 196      6.125        | 228      7.125
5        0.15625      | 37       1.15625      | 69       2.15625      | 101      3.15625      | 133      4.15625      | 165      5.15625      | 197      6.15625      | 229      7.15625
6        0.1875       | 38       1.1875       | 70       2.1875       | 102      3.1875       | 134      4.1875       | 166      5.1875       | 198      6.1875       | 230      7.1875
7        0.21875      | 39       1.21875      | 71       2.21875      | 103      3.21875      | 135      4.21875      | 167      5.21875      | 199      6.21875      | 231      7.21875
8        0.25         | 40       1.25         | 72       2.25         | 104      3.25         | 136      4.25         | 168      5.25         | 200      6.25         | 232      7.25
9        0.28125      | 41       1.28125      | 73       2.28125      | 105      3.28125      | 137      4.28125      | 169      5.28125      | 201      6.28125      | 233      7.28125
10       0.3125       | 42       1.3125       | 74       2.3125       | 106      3.3125       | 138      4.3125       | 170      5.3125       | 202      6.3125       | 234      7.3125
11       0.34375      | 43       1.34375      | 75       2.34375      | 107      3.34375      | 139      4.34375      | 171      5.34375      | 203      6.34375      | 235      7.34375
12       0.375        | 44       1.375        | 76       2.375        | 108      3.375        | 140      4.375        | 172      5.375        | 204      6.375        | 236      7.375
13       0.40625      | 45       1.40625      | 77       2.40625      | 109      3.40625      | 141      4.40625      | 173      5.40625      | 205      6.40625      | 237      7.40625
14       0.4375       | 46       1.4375       | 78       2.4375       | 110      3.4375       | 142      4.4375       | 174      5.4375       | 206      6.4375       | 238      7.4375
15       0.46875      | 47       1.46875      | 79       2.46875      | 111      3.46875      | 143      4.46875      | 175      5.46875      | 207      6.46875      | 239      7.46875
16       0.5          | 48       1.5          | 80       2.5          | 112      3.5          | 144      4.5          | 176      5.5          | 208      6.5          | 240      7.5
17       0.53125      | 49       1.53125      | 81       2.53125      | 113      3.53125      | 145      4.53125      | 177      5.53125      | 209      6.53125      | 241      7.53125
18       0.5625       | 50       1.5625       | 82       2.5625       | 114      3.5625       | 146      4.5625       | 178      5.5625       | 210      6.5625       | 242      7.5625
19       0.59375      | 51       1.59375      | 83       2.59375      | 115      3.59375      | 147      4.59375      | 179      5.59375      | 211      6.59375      | 243      7.59375
20       0.625        | 52       1.625        | 84       2.625        | 116      3.625        | 148      4.625        | 180      5.625        | 212      6.625        | 244      7.625
21       0.65625      | 53       1.65625      | 85       2.65625      | 117      3.65625      | 149      4.65625      | 181      5.65625      | 213      6.65625      | 245      7.65625
22       0.6875       | 54       1.6875       | 86       2.6875       | 118      3.6875       | 150      4.6875       | 182      5.6875       | 214      6.6875       | 246      7.6875
23       0.71875      | 55       1.71875      | 87       2.71875      | 119      3.71875      | 151      4.71875      | 183      5.71875      | 215      6.71875      | 247      7.71875
24       0.75         | 56       1.75         | 88       2.75         | 120      3.75         | 152      4.75         | 184      5.75         | 216      6.75         | 248      7.75
25       0.78125      | 57       1.78125      | 89       2.78125      | 121      3.78125      | 153      4.78125      | 185      5.78125      | 217      6.78125      | 249      7.78125
26       0.8125       | 58       1.8125       | 90       2.8125       | 122      3.8125       | 154      4.8125       | 186      5.8125       | 218      6.8125       | 250      7.8125
27       0.84375      | 59       1.84375      | 91       2.84375      | 123      3.84375      | 155      4.84375      | 187      5.84375      | 219      6.84375      | 251      7.84375
28       0.875        | 60       1.875        | 92       2.875        | 124      3.875        | 156      4.875        | 188      5.875        | 220      6.875        | 252      7.875
29       0.90625      | 61       1.90625      | 93       2.90625      | 125      3.90625      | 157      4.90625      | 189      5.90625      | 221      6.90625      | 253      7.90625
30       0.9375       | 62       1.9375       | 94       2.9375       | 126      3.9375       | 158      4.9375       | 190      5.9375       | 222      6.9375       | 254      7.9375
31       0.96875      | 63       1.96875      | 95       2.96875      | 127      3.96875      | 159      4.96875      | 191      5.96875      | 223      6.96875      | 255      7.96875
Edited by ggn

  • 1 year later...

Since the rb+ installation article had some good response (i.e. people even responded to it!) I figure I'll start this one as a WIP post and fill/extend it gradually. If people want to contribute with questions or corrections/suggestions/some paragraphs of text, feel free to!

 

So, without further ado...

 

0. Introduction

 

The Jaguar's Object Processor (OP) is a great piece of hardware. It's quite capable and flexible, and the raw power it gives makes one forgive its shortcomings and bugs. Which is of course a shame as most of the surrounding hardware is so bad :P (well, excluding the 68000, that's great of course :)).

 

It's also one of the main things one has to master in order to program the console, independent of the language or library used. As the title says, it's scary. It introduces many concepts that can baffle newcomers and/or inexperienced people in general. So what this post aims to do is demystify and explain the chip as much as possible.

 

1. The Object Processor

1.a. Why is it called that anyway?

 

A big part of the OP puzzle is hidden in its name. Object Processor. Something that processes objects. An object is a bit of an abstract term though. What constitutes an object in the console's terms?

 

Instead of using graphs, bar/pie charts, let's go with an example instead. This is the title screen of Downfall, a game which a few of you might be aware of.

 

[Attached image: downfall.png]

Click here to see a video in action if you're not aware of the game and come back to read the rest of this. What you see is the game's logo, some info text and some parallax thing in the background. You see that all those things are overlapping more or less. In order to achieve this in a typical bit-mapped screen of 80s/early 90s computers you'd have to do something like this:

 

a) Clear the screen

b) Draw the "farthest" layer of the parallax scroll

c) Draw the 2nd "farthest" layer of the parallax scroll, but since there are overlapping pixels with the layer drawn in (b), clear those first

d) Repeat (c) for layers 3-5

e) Draw the score/hiscore texts, again erasing all overlapping pixels first

f) Draw the logo (erasing again)

g) Draw the text (erasing again)

 

As you probably realise by now the CPU (or specialised hardware like the blitter) will spend a considerable time drawing and clearing pixels (also called masking) every time the screen is refreshed. Also important is the fact that a lot of specialised code has to be written in order to perform all these steps fast enough.

 

Could we do better? Enter the OP.

 

In contrast with bitmapped displays, i.e. you get a block of RAM, you fill it with bits and pixels are rendered on screen, the OP doesn't have a display. Instead you have to instruct it to render rectangles of graphics around the screen. Their widths, heights, colour depths, transparency and many more parameters are highly customisable. To put it in another way: to the OP everything is a sprite - player graphics, bullets, backgrounds, you name it. If it's not an object then it doesn't appear on screen.

 

So, to get back to the above example, we could define an object as big as the screen we're rendering and do all that busy work there. But (and this is what makes more sense) we can instruct the OP to render each different layer as a separate object and then combine them by itself. It will then be the OP's responsibility to compose the screen out of the parts we tell it to use.

 

1.b. Object lists

 

What we learned so far is that the Jaguar can render various boxes of data onto the screen. This of course raises a few questions like "how do we tell it how many boxes to render?" and "how do we describe these boxes so the hardware can understand them?". The answer to both these is Object Lists (OL).

 

An OL is really a forward linked list (if you already know what a linked list is then skip this paragraph). Very briefly put, imagine having an array where we want to store a few parameters per object - say we want to store 10 sprites' x, y positions, width and height. We could go right ahead and dimension an array in BASIC like

DIM stuff[4][10]

and then use that array to store everything. But what happens if we don't know beforehand how many sprites we'll use? One solution is to just overdimension and hope that limit never gets crossed, but that's wasting RAM and we don't want to. A more RAM optimal way is to add a 5th field that will tell us where in RAM the next entry actually is. Our modified array will store x,y,w,h,adr_of_next_index. So in order to traverse the array we have to know the address of the first entry, go there, do what we need to do, then fetch adr_of_next_index, jump to that address, and so on until we reach the list's end (let's say that the list ends when we read a 0 in adr_of_next_index).

 

[Attached image: array.png]

[Attached image: linkedlists.png]

(No shame if you didn't digest all that info in the first read - go back and read it again if you're not sure how it works. Or reply to this post with a question. Ultimately it doesn't matter but it helps understanding some of the stuff you'll see later).

 

So by using this arrangement we can store as many objects as we require with the minimum RAM waste.
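
For those who prefer code to diagrams, here is a small Python sketch of the idea. It is only an illustration: the "RAM" is a dictionary, the addresses are made up, and `walk_list` is a hypothetical helper, not anything rb+ or raptor provides.

```python
# "RAM" modelled as a dictionary: address -> (x, y, w, h, adr_of_next_index).
# The addresses are arbitrary made-up values; 0 marks the end of the list.
ram = {
    0x1000: (10, 20, 16, 16, 0x1040),
    0x1040: (50, 60, 32, 8, 0x1020),
    0x1020: (0, 0, 320, 200, 0),
}

def walk_list(ram, first_address):
    """Follow adr_of_next_index from entry to entry until we read a 0."""
    entries = []
    address = first_address
    while address != 0:
        x, y, w, h, next_address = ram[address]
        entries.append((x, y, w, h))
        address = next_address
    return entries

sprites = walk_list(ram, 0x1000)
print(len(sprites))  # 3
```

Notice that the entries don't have to sit in address order: the list is visited in link order, which is exactly what lets us add or skip entries without moving anything around.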

 

1.c. Anatomy of an object

 

Let's see exactly how flexible the OP is by looking at how an object is defined.

 

There are five kinds of objects:

  • Bitmap object
  • Scaled bitmap object
  • Graphics Processor object
  • Branch object
  • Stop object
The first two are the most used objects. The fifth (the stop object) is just a terminator for the list and is placed at the end of the OL so it's only used once per list. Branch objects are also specialised and allow for skipping parts of the list - this doesn't sound very practical at first, but since there are a few conditionals that can be executed it becomes quite useful. GPU objects... we'll get to that eventually.

 

Each object is at least one phrase long (one phrase is 64 bits, or 8 bytes). So what probably happens is that the OP reads a phrase off the OL for the current object, and if the type demands it then read extra phrase(s). The first 3 bits of each object's first phrase contain the TYPE field. That's 0 for bitmapped object, 1 for scaled, 2 for GPU object, 3 for branch object and 4 for stop object.

 

1.c.1. Bitmapped object

 

Let's have a quick look at the fields for a normal bitmapped object. Don't pay too much attention to the descriptions, especially if you don't understand something - we'll explain everything eventually.

 

TYPE - object type (hardcoded to 0)

YPOS - object y position in screen

HEIGHT - object height

LINK - address of next object in OL

DATA - address of graphics data for the object

XPOS - object x position in screen

DEPTH - object's bit depth

PITCH - how many phrases to step forward for each new phrase of graphics data fetched (1 when the data is contiguous in memory)

INDEX - For bitmapped objects, choose which palette to use

DWIDTH - how many phrases wide the object is in memory (can be different than PITCH!)

IWIDTH - how many phrases wide the object is on screen (can be different than PITCH and DWIDTH!)

REFLECT - flag that controls if the object will be drawn mirrored horizontally

RMW - Adds the current pixel value to what's already on the line buffer (advanced topic for now!)

TRANS - flag that enables transparency. Colour 0 will not be drawn on screen

RELEASE - flag that tells the OP to yield the bus to other chips

FIRSTPIX - tells the OP which is to be considered the first pixel to be drawn per scanline

 

Now that's a lot of things, right? If you were to implement those in software you'd consume a lot of bytes. And since CPUs are usually comfortable with 8, 16 or 32 bit values (byte/word/longword) you'd need a word per parameter (even for a flag) and a longword per address to be safe. There are 16 parameters in that list, two of them are addresses. So we're looking at 14*2+2*4=36 bytes=288 bits! How did they squeeze it down to 128? (if you wonder "why 128?" then remember: a bitmapped object is 2 phrases!) Well, for starters some fields are on/off (0/1) so they just need a single bit to represent them. Similarly other fields like YPOS/XPOS don't need the full 16 bits that we allocated because of physical constraints - no reason to have an X value of 65535 for example! So most of the fields are chopped down in a similar fashion. Finally, the address fields are always aligned to 8 bytes due to hardware constraints (burst read access if I'm not mistaken). This means that the address values will be a multiple of 8 - so 8,16,24,32 etc. If you look at these numbers in binary or hex notation you'll notice that the last 3 bits are zero. So no need to store them at all!
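
The "don't store the last 3 bits" trick is easy to try out. The following Python sketch only illustrates the idea; the function names and the example address are made up.

```python
def pack_phrase_address(address):
    """Drop the 3 always-zero low bits of an 8-byte aligned address."""
    if address % 8 != 0:
        raise ValueError("address must be a multiple of 8")
    return address >> 3

def unpack_phrase_address(field):
    """Restore the full address by putting the 3 zero bits back."""
    return field << 3

addr = 0x1F0008                                # hypothetical phrase aligned address
field = pack_phrase_address(addr)
print(hex(field))                              # 0x3e001
print(unpack_phrase_address(field) == addr)    # True
```

Three bits per address doesn't sound like much, but with two addresses per object it's part of what lets everything fit in two phrases.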

 

All the above tricks do save a lot of RAM and bandwidth for the OP but they come at a cost. Namely, it's not very easy to construct and update an OL from a CPU standpoint. Inserting and updating bitfields requires a few instructions per operation, and since the 68000 is a bit slow when it tries to shift values (an essential part unless you can avoid it), it can become quite demanding to update even basic values like screen x,y coordinates. So when a lot of objects are used it is recommended to use the GPU (Tom) to update the lists.

 

One final thing worth mentioning here is that during screen composing the OP will trash the OL partially (I am not sure which fields are modified at the moment, perhaps someone can help?). If the OP is allowed to run on the next frame with the list trashed you will definitely see the screen do weird things, from blanking out completely to displaying garbage and then crashing the machine! So it is necessary to update the OL every vertical blank (VBL). I know of two methods here: one is to update the actual list and the other is to keep a second copy of the OL and copy it to the live list during the VBL. The latter is what raptor uses and it is much less complicated: you can do the processing while the frame is displayed and not have to struggle modifying things at the last moment.
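
Here is a toy Python model of the second method (keep a clean copy, refresh the live list every VBL). Everything in it is an assumption made for illustration: dictionaries stand in for packed phrases, `fake_op_frame` only imitates the write-back of HEIGHT and DATA that the reference manual quoted further down mentions, and none of this is raptor's actual code.

```python
import copy

# Clean ("shadow") copy of a tiny object list.
shadow_list = [
    {"name": "logo", "height": 40, "data": 0x20000},
    {"name": "text", "height": 8, "data": 0x30000},
]

def fake_op_frame(live_list):
    """Stand-in for the OP: uses up HEIGHT and advances DATA while drawing."""
    for obj in live_list:
        obj["data"] += obj["height"] * 8   # pretend each line is one phrase wide
        obj["height"] = 0

def vbl_handler(shadow, live):
    """Called once per vertical blank: refresh the live list from the shadow copy."""
    live[:] = copy.deepcopy(shadow)

live_list = copy.deepcopy(shadow_list)
fake_op_frame(live_list)             # the live list is now trashed
vbl_handler(shadow_list, live_list)  # ...and restored before the next frame
print(live_list == shadow_list)      # True
```

The game logic only ever touches the shadow copy, so it can do so at any point during the frame.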

 

1.c.2. Scaled bitmapped object

 

The scaled object is exactly the same as the bitmapped one, except that the type has a value of 1 here, with the addition of a third phrase that contains some extra fields.

 

HSCALE - Horizontal scale factor in 3.5 unsigned fixed point format.

VSCALE - Vertical scale factor in 3.5 unsigned fixed point format.

REMAINDER - This seems to be internal state kept by the OP (anyone can help here?)

 

So what's all this 3.5 fixed point format thing then? First of all, fixed point is a way for a CPU that does not support numbers with fractions to support them. If you consider numbers in the decimal system you have the integer part followed by the fractional. They are separated using a delimiter (comma or dot, depending on your region). If we remove the delimiter from the number then a number like "123.456" becomes "123456". There is no way we can figure out where the fractional part begins unless someone tells us that it's after the 3rd digit. So if we all agree that the integer part is the first 3 digits and the fractional part the last 3 then we've made a fixed point system. We can now encode all numbers from 000.000 to 999.999 using 6 digits. (if you wonder why 000.000 and not -999.999, then notice that the sign is also a digit and we'll have to spend an extra digit to encode it). We call that "3.3 unsigned fixed point format". If we require negative numbers too we have to allocate an extra digit so our numbers can be "-999.999" to "+999.999" - that's a 4.3 signed format.

 

Now, let's switch to binary representation. It's pretty much the same thing as the above only with two available digits (0 and 1). So to our example above a 3.3 fixed point binary format would be able to store numbers from 000.000 to 111.111. So, our 3.5 unsigned format would of course hold numbers from 000.00000 to 111.11111.

 

So much for the definitions. But that doesn't help us much in the way of knowing how much we zoom our object, right? After all, what's 0.11111 binary when converted to decimal system where we're more comfortable? Let's begin with the easy stuff - integers. Since we use 3 bits for integer part we can store decimal numbers 0 to 2^3-1=7.

 

Moving on to the scary part: fractions! So ask yourself: what is 0.1 in the decimal system? It's 1/10th, right? And 0.01? That's 1/100th. 0.001 is 1/1000th and so on. So what happens is that we divide 1 by the base as many times as we have decimal places. If we formulate this, it's something like "1/(10^fraction_digit)" to represent a fraction digit (10 raised to the power of the number of fraction digits is the same as dividing by 10 as many times as the number of fraction digits). It just so happens though that 10 is the decimal system's base. So we can change the formula to "1/(base^fraction_digit)". Finally, "1" is used because our examples had 1 in them. So the final transformation we do to the formula is "number/(base^fraction_digit)". This lets us represent any digit in the fractional part. I hope you've got it by now, but if not... Let's switch to the binary number system. Our base is 2 here and the range for number here is 0 to 1 so we can write our generic formula as "0 (or 1)/(2^fraction_digit)".

 

Let's write some examples then: %0.00001 (notice that numbers prefixed with % are considered binary by assemblers like rmac) is actually 1/2^5 in decimal, so 1/32 or 0.03125. %0.01 is 1/2^2=1/4=0.25. %0.1 is 1/2=0.5.

 

So what numbers can we represent on a 3.5 unsigned binary format? I would expect %000.00000 to produce nothing as it's a scale factor of exactly 0 so let's leave that out for now. %000.00001 would be the smallest number and %111.11111 the largest. %000.00001 as we wrote above is 0.03125. So that's actually our increment - all scale factors will be an integer multiple of this. For example, the next number in sequence, %000.00010, is 0.0625 which is, true enough, double of 0.03125. %001.00000 is obviously 1, so that's the number we need to put in order to have no scaling at all. And so on and so forth until we reach %111.11111, which is 7+1/2+1/4+1/8+1/16+1/32=7.96875 - that's the largest scale we can have from the OP.
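
If you'd rather let the computer do the conversion, the whole thing boils down to multiplying or dividing by 32 (2^5, one step per fraction bit combination). A minimal Python sketch, with made-up function names:

```python
def encode_scale(factor):
    """Convert a decimal scale factor to a 3.5 unsigned fixed point byte."""
    value = round(factor * 32)          # 32 = 2^5 steps per integer
    if not 0 <= value <= 255:
        raise ValueError("scale factor out of range for 3.5 format")
    return value

def decode_scale(value):
    """Convert a 3.5 unsigned fixed point byte back to a decimal scale factor."""
    return value / 32

print(encode_scale(1.0))                    # 32
print(encode_scale(0.25))                   # 8
print(decode_scale(0b11111111))             # 7.96875
print(format(encode_scale(2.5), "08b"))     # 01010000
```

Factors that aren't a multiple of 0.03125 get rounded to the nearest step, so what you ask for and what you get can differ slightly.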

 

1.c.3. Graphics Processor object

 

Scary stuff - let's leave that out for now!

 

1.c.4. Branch object

 

This object enables the OP to skip parts of the OL or even create loops if used carefully.

 

TYPE - object type (hardcoded to 3)

YPOS - if a comparison is performed, this is the value to compare against.

CC - Condition Code

LINK - If a branch is taken, this is the address to branch to.

 

First of all, if we simply want to branch to a different point in the OL we can set CC to 0, YPOS to $7ff and fill the LINK field with the address to branch to. This can be used to remove objects that are unused at the time of display (for example, say you have 30 objects that display bullets and you only have 10 active. You could add a branch object before the bullet objects and branch so the OP will skip 20 objects and display the last 10).

 

The other three cases can branch if the Video Counter (VC) is equal (CC=0), smaller (CC=2) or larger (CC=1) compared to the value YPOS contains. This can lighten the OP's load greatly. For example, consider the following playfield:

 

433200-moto-racer-playstation-screenshot

There's no reason for the OP to render the lower parts of the screen while it's rendering the upper half. So we can set YPOS to half the screen height and save tons of bandwidth.

 

Also, using comparison branches you can effectively create loops (i.e. render the same object for the first 50 scanlines) but I'm not sure if there's any value in doing this - most likely the objects will become trashed!

 

There are also two other branch types but I'll leave them alone for now as they're more specialised.

 

One final note (careful readers will probably wonder about this): If the branch is not taken, then the OP expects the next object to be on the next phrase from the branch object! If you violate this, then funky things will happen :)!
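
To make the field list a bit more concrete, here is a Python sketch that packs a branch object into a single 64 bit phrase, using the bit positions from the reference manual quote further down (TYPE 0-2, YPOS 3-13, CC 14-15, LINK 24-42). It assumes bit 0 is the least significant bit of the phrase, and `make_branch` is a made-up name; the quoted manual gives CC two bits, so this sketch only accepts 0 to 3.

```python
BRANCH_TYPE = 3

def make_branch(ypos, cc, link_address):
    """Pack a branch object into one 64 bit phrase (bit 0 = least significant)."""
    if link_address % 8 != 0:
        raise ValueError("LINK must point to a phrase aligned address")
    if not 0 <= ypos <= 0x7FF:
        raise ValueError("YPOS is an 11 bit field")
    if not 0 <= cc <= 3:
        raise ValueError("CC is treated as a 2 bit field here")
    link = (link_address >> 3) & 0x7FFFF      # 19 bits, low 3 address bits dropped
    return BRANCH_TYPE | (ypos << 3) | (cc << 14) | (link << 24)

# "Always branch": CC = 0 and YPOS = $7ff, jumping to a hypothetical address.
always = make_branch(0x7FF, 0, 0x4000)
print(always & 7)                                # 3
print((always >> 3) & 0x7FF == 0x7FF)            # True
print(hex(((always >> 24) & 0x7FFFF) << 3))      # 0x4000
```

A stop object is even simpler under the same assumption: the phrase is just the number 4.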

 

1.c.5. Stop object

 

Pretty straightforward stuff, just stick a 4 in the TYPE field and fill the rest of the phrase with zeros. The OP will stop processing more objects after this. You're done!

 

1.c.6. Reference: the reference manual on Objects.

 

Here's a direct quote from the Jaguar reference manual.

 

Bit Mapped Object
This object displays an unscaled bit mapped object. The object must be on a 16 byte boundary in 64 bit RAM.

First Phrase
Bits Field Description
0-2 TYPE Bit mapped object is type zero
3-13 YPOS This field gives the value in the vertical counter (in half lines) for the first (top) line of the object. The vertical counter is latched when the Object Processor starts so it has the same value across the whole line. If the display is interlaced the number is even for even lines and odd for odd lines. If the display is non-interlaced the number is always even. The object will be active while the vertical counter >= YPOS and HEIGHT > 0.
14-23 HEIGHT This field gives the number of data lines in the object. As each line is displayed the height is reduced by one for non-interlaced displays or by two for interlaced displays. (The height becomes zero if this would result in a negative value.) The new value is written back to the object.
24-42 LINK This defines the address of the next object. These nineteen bits replace bits 3 to 21 in the register OLP. This allows an object to link to another object within the same 4 Mbytes.
43-63 DATA This defines where the pixel data can be found. Like LINK this is a phrase address. These twenty-one bits define bits 3 to 23 of the data address. This allows object data to be positioned anywhere in memory. After a line is displayed the new data address is written back to the object.

Second Phrase
Bits Field Description
0-11 XPOS This defines the X position of the first pixel to be plotted. This 12 bit field defines start positions in the range -2048 to +2047. Address 0 refers to the left-most pixel in the line buffer.
12-14 DEPTH This defines the number of bits per pixel as follows:
0 1 bit/pixel
1 2 bits/pixel
2 4 bits/pixel
3 8 bits/pixel
4 16 bits/pixel
5 24 bits/pixel
15-17 PITCH This value defines how much data, embedded in the image data, must be skipped. For instance two screens and their common Z buffer could be arranged in memory in successive phrases (in order that access to the Z buffer does not cause a page fault). The value 8 * PITCH is added to the data address when a new phrase must be fetched. A pitch value of one is used when the pixel data is contiguous - a value of zero will cause the same phrase to be repeated.
18-27 DWIDTH This is the data width in phrases. i.e. Data for the next line of pixels can be found at 8 * (DATA + DWIDTH)
28-37 IWIDTH This is the image width in phrases (must be non zero), and may be used for clipping.
38-44 INDEX For images with 1 to 4 bits/pixel the top 7 to 4 bits of the index provide the most significant bits of the palette address.
45 REFLECT Flag to draw object from right to left.
46 RMW Flag to add object to data in line buffer. The values are then signed offsets for intensity and the two colour vectors.
47 TRANS Flag to make logical colour zero and reserved physical colours transparent.
48 RELEASE This bit forces the Object Processor to release the bus between data fetches. This should typically be set for low colour resolution objects because there is time for another bus master to use the bus between data fetches. For high colour resolution objects the bus should be held by the Object Processor because there is very little time between data fetches and other bus masters would probably cause DRAM page faults thereby slowing the system. External bus masters, the refresh mechanism and graphics processor DMA mechanism all have higher bus priorities and are unaffected by this bit.
49-54 FIRSTPIX This field identifies the first pixel to be displayed. This can be used to clip an image. The significance of the bits depends on the colour resolution of the object and whether the object is scaled. The least significant bit is only significant for scaled objects where the pixels are written into the line buffer one at a time. The remaining bits define the first pair of pixels to be displayed. In 1 bit per pixel mode all five bits are significant, In 2 bits per pixel mode only the top four bits are significant. Writing zeroes to this field displays the whole phrase.
55-63 Unused write zeroes.

Scaled Bit Mapped Object
This object displays a scaled bit mapped object. The object must be on a 32 byte boundary in 64 bit RAM. The first 128 bits are identical to the bit mapped object except that TYPE is one. An extra phrase is appended to the object.
Bits Field Description
0-7 HSCALE This eight bit field contains a three bit integer part and a five bit fractional part. The number determines how many pixels are written into the line buffer for each source pixel.
8-15 VSCALE This eight bit field contains a three bit integer part and a five bit fractional part. The number determines how many display lines are drawn for each source line. This value equals HSCALE for an object to maintain its aspect ratio.
16-23 REMAINDER This eight bit field contains a three bit integer part and a five bit fractional part. The number determines how many display lines are left to be drawn from the current source line. After each display line is drawn this value is decremented by one. If it becomes negative then VSCALE is added to the remainder until it becomes positive. HEIGHT is decremented every time VSCALE is added to the remainder. The new REMAINDER is written back to the object.
24-63 Unused write zeroes.

Graphics Processor Object
This object interrupts the graphics processor, which may act on behalf of the Object Processor. The Object Processor resumes when the graphics processor writes to the object flag register.
Bits Field Description
0-2 TYPE GPU object is type two
3-13 YPOS This object is active when the vertical count matches YPOS unless YPOS = 07FF in which case it is active for all values of vertical count.
14-63 DATA These bits may be used by the GPU interrupt service routine. They are memory mapped as the object code registers OB0-3, so the GPU can use them as data or as a pointer to additional parameters.

Execution continues with the object in the next phrase. The GPU may set or clear the (memory mapped) Object Processor flag and this can be used to redirect the Object Processor using the following object.

Branch Object

This object directs object processing either to the LINK address or to the object in the following phrase.
Bits Field Description
0-2 TYPE Branch object is type three
3-13 YPOS This value may be used to determine whether the LINK address is used.
14-16 CC These bits specify what condition is used to determine whether to branch as follows:
0 Branch if YPOS == VC or YPOS == 7FF
1 Branch if YPOS > VC
2 Branch if YPOS < VC
3 Branch if Object Processor flag is set
4 Branch if on second half of display line (HC10 = 1)
17-23 unused
24-42 LINK This defines the address of the next object if the branch is taken. The address is defined as described for the bit mapped object.
43-63 unused
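
To make the condition codes a bit more concrete, here is a small Python model of the decision the OP makes for a branch object. It is only a sketch of the table above; the function and parameter names are invented for illustration.

```python
def branch_taken(cc: int, ypos: int, vc: int,
                 op_flag: bool = False, hc10: bool = False) -> bool:
    """Return True if a branch object would jump to its LINK address.

    cc      - condition code from the branch object
    ypos    - YPOS field of the branch object
    vc      - current vertical count
    op_flag - state of the Object Processor flag
    hc10    - True on the second half of the display line
    """
    if cc == 0:
        return ypos == vc or ypos == 0x7FF
    if cc == 1:
        return ypos > vc
    if cc == 2:
        return ypos < vc
    if cc == 3:
        return op_flag
    if cc == 4:
        return hc10
    raise ValueError("condition code not listed in the table")


# A branch with CC=1 and YPOS=100 is taken on every line above line 100
print(branch_taken(1, ypos=100, vc=40))   # True
print(branch_taken(1, ypos=100, vc=180))  # False
```

When the function returns False, processing simply continues with the object in the following phrase.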

Stop Object
This object stops object processing and interrupts the host.
Bits Field Description
0-2 TYPE Stop object is type four
3-63 DATA These bits may be used by the CPU interrupt service routine. They are memory mapped so the CPU can use them as data or as a pointer to additional parameters.
1.d. Bit depths - bandwidth

 

After digesting the basics of objects, one of the most confusing aspects for newcomers (especially rb+ users) is bit depth. "Since I draw some graphic on my desktop/laptop computer, it should just appear on the screen, right?". Well, yes and no.

 

In an ideal world we'd draw everything in as many colours as we like and leave it to the hardware to cope. Unfortunately, although the OP is quite fast, it cannot keep up with this idea. It might appear to at first, but as you start piling up objects one on top of the other it simply runs out of juice. When the OP is composing the screen, it more or less does the following for each line:

  • Goes through the whole OL until it reaches a stop object (branch objects are of course evaluated)
  • For each object the OP has to
      • parse object coordinates
      • parse screen coordinates
      • translate screen coordinates to object coordinates
      • figure out where it should read from the object's graphics address
      • fetch pixels
      • combine pixels with the pixels of the previous objects (transparency, ordering etc) if any

A big bottleneck in the above list is the "fetch pixels" step. If we draw without restraint and export everything to 16 or 24 bits per pixel (bpp), the OP has to read 2 or 3 bytes of colour data for every single pixel. As mentioned above, this eats up the chip's read bandwidth pretty fast. So if you keep piling up objects like these you're going to end up with garbage on screen - it's simply not possible to read all that data in the time available for each line.

That's why the OP's designers added bit depths to the chip. If you know that your character sprite won't use more than 16 colours on screen (which translates to 4 bits, 2^4=16) then why waste 12 (or 20 in 24bpp mode) more? Multiply that by the number of objects you would like to have on screen (say 50?) and you end up with a lot of saved bandwidth. And that's for 16 colours; if you can get away with even fewer, you save even more.
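
To put some rough numbers on this, here is a back-of-the-envelope calculation in Python. The 32 pixel object width is an assumption picked for illustration, and the sketch only counts raw pixel data: it ignores the reads of the object headers themselves and the fact that the OP fetches whole phrases at a time.

```python
def pixel_bytes_per_scanline(num_objects: int, width_pixels: int, bpp: int) -> int:
    """Raw pixel data fetched on one scanline if every object covers that line."""
    bits = num_objects * width_pixels * bpp
    return (bits + 7) // 8  # round up to whole bytes


# 50 objects, each assumed to be 32 pixels wide
for bpp in (1, 2, 4, 8, 16):
    print(bpp, "bpp:", pixel_bytes_per_scanline(50, 32, bpp), "bytes per line")
```

Going from 16bpp down to 4bpp cuts the pixel data for this made-up scene from 3200 to 800 bytes per line, and that saving applies to every single line of every frame.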

 

Combining is also costly, especially when transparency comes into play. There's potentially a lot of read data thrown away just because a lot of objects are piled up on top of each other. Also, because the whole list is parsed per line, the OP has to parse objects that might not appear on the current scanline at all, thus wasting even more bandwidth. The use of branch objects can help massively here.

 

In conclusion, it makes good sense to plan ahead what you want to do and be bandwidth considerate. (After all, drawing the screen is only part of the problem - you also need logic, audio, inputs, and many more things)

 

1.e. Bitmapped objects, palettes and pixel formats

 

For 16, 24 and CRY modes it is easy to store colour information. Since each pixel is so many bits, we can encode the intensities for Red, Green and Blue directly on the pixel data. Very briefly, for 16bpp the format is:

Bit 0123456789abcdef
    RRRRRBBBBBGGGGGG
so, 5 bits for Red and Blue (0-31), and 6 for Green (0-63).
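
As a quick illustration of that layout, here is a sketch in Python that packs three components into one 16bpp value. The function name is made up, and it assumes that "bit 0" in the diagram above is the leftmost (most significant) bit of the 16 bit word.

```python
def pack_rgb16(red: int, green: int, blue: int) -> int:
    """Pack components into the RRRRRBBBBBGGGGGG layout shown above.

    red and blue are 0-31, green is 0-63.
    """
    if not (0 <= red <= 31 and 0 <= blue <= 31 and 0 <= green <= 63):
        raise ValueError("component out of range")
    return (red << 11) | (blue << 6) | green


print(hex(pack_rgb16(31, 0, 0)))    # pure red
print(hex(pack_rgb16(0, 63, 0)))    # pure green
print(hex(pack_rgb16(0, 0, 31)))    # pure blue
print(hex(pack_rgb16(31, 63, 31)))  # white
```

Note that green ends up in the lowest bits, not in the middle as in the more common RGB565 layout, so values copied straight from a PC tool will show the wrong colours.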

 

For 24bpp objects we have 8 bits (0-255) for Red, Blue and Green and 8 unused.

Bit 0123456789abcdef0123456789abcdef
    RRRRRRRRBBBBBBBBGGGGGGGG00000000
(note that the RBG order is intentional, that's how the OP expects values to be written)

 

CRY mode - one byte for RGB and one for intensity - let's leave that for later.

 

Let's go back to the modes below 16bpp. Since in these modes we don't have enough bits to store component intensities, the solution is to store the intensities separately in a designated memory area, and have each pixel of the object hold just an index into that table. That memory area is called a Palette or CLUT (Colour Look-Up Table). The OP's CLUT holds 256 entries, each in the 16bpp format described above. So for example if we use 4bpp mode and our first 4 pixels are 9, 2, 5 and 8, the object's first two bytes will look like this:

 

Pixel     0    1    2    3
Value     9    2    5    8
Binary 1001 0010 0101 1000
Notice that each pixel has all its bits packed one after the other. This is true for all bit depths and is called chunky format.
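
The packing above can be sketched in a few lines of Python. This is just an illustration of chunky packing for the CLUT modes (the function name is invented); it puts the first pixel in the top bits of each byte, matching the example, and pads an incomplete last byte with zero bits.

```python
def pack_chunky(pixels, bpp: int) -> bytes:
    """Pack palette indices into bytes, first pixel in the highest bits."""
    if bpp not in (1, 2, 4, 8):
        raise ValueError("this sketch only handles the CLUT bit depths")
    per_byte = 8 // bpp
    out = bytearray()
    for i in range(0, len(pixels), per_byte):
        group = pixels[i:i + per_byte]
        byte = 0
        for p in group:
            if not 0 <= p < (1 << bpp):
                raise ValueError("pixel value does not fit in the bit depth")
            byte = (byte << bpp) | p
        # pad an incomplete final byte with zero bits on the right
        byte <<= bpp * (per_byte - len(group))
        out.append(byte)
    return bytes(out)


print(pack_chunky([9, 2, 5, 8], 4).hex())  # the example above: two bytes, 92 and 58
```

The same four pixels at 8bpp would take four bytes instead of two, which is exactly the bandwidth trade-off discussed in 1.d.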

 

1.f. Let's display an object on screen

1.g. Advanced topics (CRY, RMW etc)

2. Raptor lists

 

Here's an object as defined in a raptor list. Values in red are identical (or almost identical) to OL fields. (well, code snippets can't be coloured it seems, so I'll get back to this)

 

(REPEAT COUNTER) - Create this many objects of this type (or 1 for a single object)

sprite_active - sprite active flag

sprite_x - 16.16 x value to position at

sprite_y - 16.16 y value to position at

sprite_xadd - 16.16 x addition for sprite movement

sprite_yadd - 16.16 y addition for sprite movement

sprite_width - width of sprite (in pixels)

sprite_height - height of sprite (in pixels)

sprite_flip - flag for mirroring data left<>right

sprite_coffx - x offset from center for collision box center

sprite_coffy - y offset from center for collision box center

sprite_hbox - width of collision box

sprite_vbox - height of collision box

sprite_gfxbase - start of bitmap data

(BIT DEPTH) - bitmap depth (1/2/4/8/16/24)

(CRY/RGB) - bitmap GFX type

(TRANSPARENCY) - bitmap TRANS flag

sprite_framesz - size per frame in bytes of sprite data

sprite_bytewid - width in bytes of one line of sprite data

sprite_animspd - frame delay between animation changes

sprite_maxframe - number of frames in animation chain

sprite_animloop - repeat or play once

sprite_wrap - wrap on screen exit, or remove

sprite_timer - frames sprite is active for (or spr_inf)

sprite_track - use 16.16 xadd/yadd or point to 16.16 x/y table

sprite_tracktop - pointer to loop point in track table (if used)

sprite_scaled - flag for scaleable object

sprite_scale_x - x scale factor (if scaled)

sprite_scale_y - y scale factor (if scaled)

sprite_was_hit - initially flagged as not hit

sprite_CLUT - no_CLUT (8/16/24 bit) or CLUT (1/2/4 bit)

sprite_colchk - if sprite can collide with another

sprite_remhit - flag to remove (or keep) on collision

sprite_bboxlink - single for normal bounding box, else pointer to table

sprite_hitpoint - Hitpoints before death

sprite_damage - Hitpoints deducted from target

sprite_gwidth - GFX width (of data)

 

So it's evident that raptor lists try to be close to the OP's object definitions while adding extra fields to help the processing of sprites (animation, hitpoints, collision etc.).
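
Several of the fields above are "16.16" values, i.e. 32 bit numbers with 16 integer bits and 16 fractional bits. Here is a small Python sketch of what that means for sprite_x and sprite_xadd, plus how the two size fields could be derived. All function names are made up, and the two size formulas are my reading of the field descriptions, not something taken from raptor's source.

```python
MASK32 = 0xFFFFFFFF


def to_fixed_16_16(value: float) -> int:
    """Encode a number as 16.16 fixed point (negatives in two's complement)."""
    return round(value * 65536) & MASK32


def from_fixed_16_16(raw: int) -> float:
    """Decode a 32 bit 16.16 value back to a float."""
    if raw & 0x80000000:  # sign bit set -> negative value
        raw -= 1 << 32
    return raw / 65536


def move(position: int, add: int) -> int:
    """One frame of movement: position += add, both in 16.16."""
    return (position + add) & MASK32


def bytes_per_line(gfx_width: int, bpp: int) -> int:
    """Assumed meaning of sprite_bytewid: bytes in one line of sprite data."""
    return gfx_width * bpp // 8


def frame_size(gfx_width: int, height: int, bpp: int) -> int:
    """Assumed meaning of sprite_framesz: bytes in one animation frame."""
    return bytes_per_line(gfx_width, bpp) * height


x = to_fixed_16_16(100)      # start at x = 100
xadd = to_fixed_16_16(1.5)   # move 1.5 pixels per frame
for _ in range(4):
    x = move(x, xadd)
print(from_fixed_16_16(x))   # position after 4 frames
print(frame_size(32, 32, 4)) # bytes in one 32x32 frame at 4bpp
```

The fractional part is what lets a sprite move at speeds like 1.5 pixels per frame without the code having to do anything clever.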

 

3. Wrapping up

 

Hopefully everyone reading this post got something out of it. It's nothing more than re-stating what the hardware manual says with as many explanations to the newcomer as possible. Also it shows how much stuff rb+ and raptor do behind your back (constructing OLs, calculating object parameters, aligning graphics data so it will be processed ok and so much more).

 

Thanks for your patience while reading this! Let me know if there's something not clear, if I omitted something or if there's an error somewhere.

 

 

This is the document that should have been included with the Jaguar Developer's manual. Thank you so much for making sense of the Object Processor, a device which has been mysterious and elusive for many, many years.
