Raycaster


Asmusr

I decided to see how much speed I could squeeze out of a character-based raycaster with only 32 columns. The result can be seen here:

 

https://youtu.be/Y_WR974T_Ak

 

You can also run it yourself using one of the attached files.

 

The frame rate is 15-20 FPS, which is well within the limits of a playable game. The question is whether it's too ugly.

 

The algorithm is a simple raycaster where reality is sacrificed for speed.

 

The world is divided into 128 directions, and for each direction I pre-calculate a unit vector (cos(a), sin(a)) in Java and store it as two 8.8 fixed-point numbers, i.e. multiply the floating-point number by 256, round, and store the result in a 16-bit word. So the output from the Java program is a list of 128 x values and 128 y values as TI-99/4A assembly DATA statements.
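
A minimal sketch of such a table generator, written in Python rather than the Java the post mentions. The mapping of direction i to the angle 2*pi*i/128, the labels `x_dirs`/`y_dirs`, and the `>XXXX` hex formatting are assumptions for illustration, not taken from the original program.

```python
import math

NUM_DIRECTIONS = 128  # from the post: the world is divided into 128 directions

def to_fixed_8_8(value):
    """Convert a float to a 16-bit two's-complement 8.8 fixed-point word."""
    return round(value * 256) & 0xFFFF

def build_direction_tables():
    """Return (x_words, y_words), one unit vector per direction."""
    xs, ys = [], []
    for i in range(NUM_DIRECTIONS):
        a = 2 * math.pi * i / NUM_DIRECTIONS  # assumed angle for direction i
        xs.append(to_fixed_8_8(math.cos(a)))
        ys.append(to_fixed_8_8(math.sin(a)))
    return xs, ys

def as_data_statements(label, words, per_line=8):
    """Format words as assembly DATA lines (label and layout are hypothetical)."""
    lines = [f"{label}:"]
    for i in range(0, len(words), per_line):
        chunk = ",".join(f">{w:04X}" for w in words[i:i + per_line])
        lines.append(f"       data {chunk}")
    return "\n".join(lines)

x_dirs, y_dirs = build_direction_tables()
print(as_data_statements("x_dirs", x_dirs))
print(as_data_statements("y_dirs", y_dirs))
```

Negative components come out as two's-complement words, so adding them to an 8.8 position with a 16-bit add moves the ray backwards without any special casing.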

 

The player's position is also stored as two 8.8 fixed point numbers (x and y) and the player's direction is stored as one of the 128 directions.

 

For each frame we cast 32 rays from the player's position, each in a different direction, with the player's current direction at the center, so from direction - 16 to direction + 15.

 

Each ray starts at the player's position, and we keep adding the unit vector in the given direction until we hit a wall or give up. So for each iteration we need to check what the map contains at the given position. To do this as quickly as possible we make the map width 256 so the map address can be calculated as 256 * y + x + base, which is very fast to do in assembly.
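
The stepping and addressing described here can be modelled in a few lines of Python. This is only a sketch for reasoning about the algorithm, assuming a 256 x 256 byte map held in a `bytearray`; the function names and the test world are made up for illustration.

```python
MAP_WIDTH = 256  # from the post: a 256-wide map gives the offset 256 * y + x

def map_offset(x_fixed, y_fixed):
    """Offset into the map from two 8.8 fixed-point coordinates.

    The integer part of y lands in the high byte and the integer part of x
    in the low byte, which is the same as 256 * y + x.
    """
    return (y_fixed & 0xFF00) | ((x_fixed >> 8) & 0xFF)

def cast_ray(world, x, y, dx, dy, max_steps):
    """Step along (dx, dy) until a non-zero map cell is found.

    Returns (steps_taken, cell). cell is 0 if the ray gave up.
    """
    for step in range(1, max_steps + 1):
        x = (x + dx) & 0xFFFF  # 16-bit wrap-around, as in a register add
        y = (y + dy) & 0xFFFF
        cell = world[map_offset(x, y)]
        if cell != 0:
            return step, cell
    return max_steps, 0

# Hypothetical world: a single wall cell at x=10, y=5.
world = bytearray(MAP_WIDTH * 256)
world[5 * MAP_WIDTH + 10] = 1

# Start at (5.5, 5.5) and look along +x with a unit vector of (1.0, 0.0).
print(cast_ray(world, 0x0580, 0x0580, 0x0100, 0x0000, 32))
```

Because the offset is just the two integer bytes placed side by side, no multiplication or shift is needed on the real machine, only two byte moves.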

 

[Image: map]

 

So the central loop of the raycaster looks like this, where (r0,r1) contains the current (x,y) position, (r3,r4) contains the unit vector for the ray's direction, and r8 is a counter for max iterations:

cast_ray_1:
       a    r3,r0                      ; x += xdir
       a    r4,r1                      ; y += ydir
       movb r1,r6                      ; y -> r6 msb
       movb r0,*r5                     ; x -> r6 lsb (for a bit of speed, r5 contains the address of r6 lsb)
       movb @map(r6),r7                ; Get map entry
       jne  cast_ray_2                 ; Not zero is a hit
       dec  r8                         ; Distance count down
       jne  cast_ray_1                 ; Loop until we give up

The value of r8 after the loop determines the distance we have traveled, and the value of r7 determines what we have hit. From the distance we can calculate the height of the wall at the given screen column. The formula is something like max_height / distance, but I use a look-up table.
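
As a rough sketch of what such a look-up table could contain: in the loop as posted the counter is only decremented after a miss, so a hit on step k leaves r8 at max - k + 1, and a counter of 0 means the ray gave up. The table below is indexed directly by that counter value. The values 24 and 64 and the clamping to a minimum height of 1 are assumptions, not the values used in the demo.

```python
MAX_HEIGHT = 24  # assumption: wall height in character rows at distance 1
MAX_STEPS = 64   # assumption: hypothetical iteration limit loaded into r8

def build_height_table(max_height=MAX_HEIGHT, max_steps=MAX_STEPS):
    """Wall height for every possible counter value after the loop."""
    table = [0] * (max_steps + 1)  # index 0: counter ran out, no wall hit
    for counter in range(1, max_steps + 1):
        distance = max_steps - counter + 1  # steps taken when the hit occurred
        height = max_height // distance     # "something like max_height / distance"
        table[counter] = max(1, min(max_height, height))
    return table

heights = build_height_table()
print(heights[MAX_STEPS], heights[MAX_STEPS - 1], heights[1], heights[0])
```

Indexing by the raw counter avoids a subtraction per column at run time, since the conversion from counter to distance is folded into the table.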

 

Calculating distances from a single point results in a fish-eye view of the world, as you can see in the demo. A more sophisticated algorithm would calculate the perpendicular distance to the view plane instead. I think it should be possible to fix this without sacrificing speed by adding a correction look-up table.
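
One possible shape for such a correction table, sketched in Python: one cosine factor per screen column, applied to the ray distance to approximate the perpendicular distance. This is the textbook correction, not something taken from the demo, and with integer step counts at this resolution it may well not look right in practice. The 0.8 scaling by 256 and the function names are assumptions.

```python
import math

NUM_DIRECTIONS = 128  # from the post
NUM_COLUMNS = 32      # from the post: rays from direction - 16 to direction + 15

def build_correction_table():
    """cos(angle between ray and view direction) per column, scaled by 256."""
    table = []
    for col in range(NUM_COLUMNS):
        offset = col - NUM_COLUMNS // 2  # -16 .. +15
        angle = 2 * math.pi * offset / NUM_DIRECTIONS
        table.append(round(math.cos(angle) * 256))
    return table

def corrected_distance(distance, col, table):
    """Approximate perpendicular distance: ray distance * cos(angle)."""
    return (distance * table[col]) >> 8

correction = build_correction_table()
print(correction[16], correction[0])
print(corrected_distance(10, 16, correction), corrected_distance(10, 0, correction))
```

Since the table depends only on the column and not on the player's direction, 32 entries are enough. The factor shortens the distance at the screen edges, so walls there are drawn taller than without the correction.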

 

We now have a height for the walls for each of the 32 screen columns. For each column we want to draw a strip of sky, then a strip of wall, and finally a strip of floor. To do this most efficiently I have pre-drawn the columns at different wall heights in Magellan. That also makes it easy to add the primitive 'textures' you see in the demo.

 

Drawing columns one by one is bad for performance because you have to set up the VDP write address for each row, so instead we set up the write address once and then draw one byte from each column in turn. To fetch the bytes to draw we set up a pointer for each column that points to the right column data. The drawing loop looks like this:

upload_screen_loop:
       li   r1,column_ptrs
       li   r2,screen_width
upload_screen_loop_1:
       mov  *r1,r0                     ; Get column pointer
       movb *r0+,*r15                  ; Write byte to VDP (r15 contains VDPWD)
       mov  r0,*r1+                    ; Write pointer back
       dec  r2
       jne  upload_screen_loop_1       ; Next column
       dec  r3
       jne  upload_screen_loop         ; Next row
       rt
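
For readers who do not know TMS9900 assembly, the same row-by-row interleaving can be modelled in Python. This is a sketch only; `columns` stands in for the pre-drawn column data and the returned bytes stand in for what is written to the VDP.

```python
def upload_screen(columns, screen_height):
    """Emit one byte from each column in turn, row after row.

    columns: one byte sequence per screen column, at least screen_height long.
    Returns the bytes in the order they would be written to the VDP.
    """
    pointers = [0] * len(columns)  # per-column read position (column_ptrs)
    out = bytearray()
    for _row in range(screen_height):
        for c, data in enumerate(columns):
            out.append(data[pointers[c]])
            pointers[c] += 1  # post-increment, then write the pointer back
    return bytes(out)

# Three hypothetical columns, two rows high.
print(list(upload_screen([bytes([1, 2]), bytes([3, 4]), bytes([5, 6])], 2)))
```

The output order is row-major, which matches how the name table is laid out, so the write address only has to be set once per frame.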

Finally, to push the last bit of performance out of the TI-99/4A, I run the two central loops from scratch pad RAM.    
   
The code is on GitHub:
https://github.com/Rasmus-M/raycaster

raycaster.dsk raycaster.rpk raycaster8.bin

 


Edited by Asmusr

I did not expect this to be finished so quickly! And it runs nearly as fast as my engine, if not faster at this point.

 

Shows I have no right to dictate what can and cannot be done.

 

How big of a pain would it be to implement entities like in Wolf3D?


2 hours ago, Gip-Gip said:

How big of a pain would it be to implement entities like in Wolf3D?

Like enemies and treasure? I think it might be possible to draw something, but the scaling would be a problem, and you would soon run out of characters.


According to this page http://www.permadi.com/tutorial/raycast/rayc8.html fisheye correction should be done by multiplying the height with the cosine of the angle between the ray and the player's direction, which makes a lot of sense, but if I do that the walls bend up at the edges instead of down like they do now. I can't think of any explanation except that my algorithm is just too inaccurate and/or the resolution too low.


4 minutes ago, mizapf said:

Please use the current release. I cannot support ancient releases.

That sounds good, but if you knew the "long story" then you'd know that really isn't an option.  For better or worse, the computer is going to have to stay as is for the time being (i.e., no updates can be made now). 


That may be, but you'll still have to expect that lots of things just won't work, and there is likely no way to get them going.

 

I'd really love to give as much support to any MAME user as possible, but this is an evolving system, and on https://www.mizapf.de/en/ti99/mame/changes you can see how much has been done since then. So I hope you (and others) understand why it is virtually impossible for me to help with old releases.


5 hours ago, Asmusr said:

According to this page http://www.permadi.com/tutorial/raycast/rayc8.html fisheye correction should be done by multiplying the height with the cosine of the angle between the ray and the player's direction, which makes a lot of sense, but if I do that the walls bend up at the edges instead of down like they do now. I can't think of any explanation except that my algorithm is just too inaccurate and/or the resolution too low.

You only have 32 columns - maybe you can just have a fixed correction table? ;)


20 hours ago, Asmusr said:

According to this page http://www.permadi.com/tutorial/raycast/rayc8.html fisheye correction should be done by multiplying the height with the cosine of the angle between the ray and the player's direction, which makes a lot of sense, but if I do that the walls bend up at the edges instead of down like they do now. I can't think of any explanation except that my algorithm is just too inaccurate and/or the resolution too low.

 

Try this other DDA algorithm to prevent fisheye effect.

It is able to compute directly the wall distance perpendicular to the screen plane without multiplications. 

 

https://lodev.org/cgtutor/raycasting.html#The_Basic_Idea_


2 hours ago, artrag said:

 

Try this other DDA algorithm to prevent fisheye effect.

It is able to compute directly the wall distance perpendicular to the screen plane without multiplications. 

 

https://lodev.org/cgtutor/raycasting.html#The_Basic_Idea_

I'm familiar with that algorithm from when I made a raycaster for the F18A, but for this project I want something simpler and faster.


On 4/26/2020 at 2:04 AM, Asmusr said:

Drawing columns one by one is bad for performance because you have to set up the VDP write address for each row, so instead we set up the write address once and then draw one byte from each column in turn. To fetch the bytes to draw we set up a pointer for each column that point to the right column data. The drawing loop looks like this:


upload_screen_loop:
       li   r1,column_ptrs
       li   r2,screen_width
upload_screen_loop_1:
       mov  *r1,r0                     ; Get column pointer
       movb *r0+,*r15                  ; Write byte to VDP (r15 contains VDPWD)
       mov  r0,*r1+                    ; Write pointer back
       dec  r2
       jne  upload_screen_loop_1       ; Next column
       dec  r3
       jne  upload_screen_loop         ; Next row
       rt

 

Hi Rasmus, thanks for sharing your code for this.  I have made a small performance improvement in the screen drawing code, saving 20 cycles per byte written to the VDP, by removing the column pointer increment and write back, instead using self-modifying code to update the column offset for each row.  The @0(r1) part of the instruction is modified by the "clr @self_modifying_offset" and "inc @self_modifying_offset".

*********************************************************************
*
* Upload screen
*
upload_screen:
       .proc
       mov  @dbl_buffer_flag,r0
       andi r0,>0001
       sla  r0,10
       ai   r0,nametb
       bl   @vwad

       clr  @self_modifying_offset

       li   r3,screen_height
       bl   @upload_screen_loop_pad
       .endproc
*// upload_screen

*********************************************************************
*
* Upload screen loop
*
upload_screen_loop:
       li   r0,column_ptrs
       li   r2,screen_width
       
upload_screen_loop_1:
       mov  *r0+,r1                    ; Get column pointer
upload_screen_offset:
       movb @0(r1),*r15                ; Write byte to VDP (r15 contains VDPWD)

       dec  r2
       jne  upload_screen_loop_1       ; Next column

       inc  @self_modifying_offset

       dec  r3
       jne  upload_screen_loop       ; Next row
       rt
upload_screen_loop_end:
       equ  $
*// upload_screen_loop

self_modifying_offset:
	equ upload_screen_offset+2-upload_screen_loop+upload_screen_loop_pad
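
The idea behind this change can be modelled in Python as a sketch: the column pointers are read but never advanced, and a single row offset, which stands in for the patched operand of `movb @0(r1),*r15`, is incremented once per row instead. The function name and test data are made up for illustration.

```python
def upload_screen_with_offset(columns, screen_height):
    """Row-by-row upload using fixed column pointers plus one shared offset.

    columns: one byte sequence per screen column, at least screen_height long.
    Returns the bytes in the order they would be written to the VDP.
    """
    out = bytearray()
    offset = 0  # models the self-modified displacement, cleared before the loop
    for _row in range(screen_height):
        for data in columns:
            out.append(data[offset])  # indexed read, pointer itself untouched
        offset += 1  # one increment per row instead of one write-back per byte
    return bytes(out)

# Three hypothetical columns, two rows high.
print(list(upload_screen_with_offset([bytes([1, 2]), bytes([3, 4]), bytes([5, 6])], 2)))
```

The bytes come out in the same row-major order as with per-column post-increment, but the per-byte work shrinks to a pointer fetch and an indexed read.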

 


I think you are all talking about Graphics 1 mode?

In bitmap mode, I like to arrange the screen table as chars 0:8:16:24 ... 1:9:17:25.

That way I set the VDPWA once and stripe down 8 chars in a column, then the next 8, and so on.

Repeat for the other 3rds.

I worked on this when I was trying a rectangle fill.
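
A small Python sketch of this arrangement for one third of the screen, assuming 32 columns by 8 rows per third and 8 pattern bytes per character. The function names are hypothetical. It shows why a whole column becomes one contiguous run in the pattern table.

```python
ROWS_PER_THIRD = 8   # character rows in one third of the bitmap screen
COLUMNS = 32         # character columns

def build_name_table_third():
    """Name table for one third: cell (row r, column c) holds char c * 8 + r."""
    return [[c * ROWS_PER_THIRD + r for c in range(COLUMNS)]
            for r in range(ROWS_PER_THIRD)]

def pattern_offset(row, col, line):
    """Pattern table offset (within the third) of pixel line 0-7 of a cell."""
    char = col * ROWS_PER_THIRD + row
    return char * 8 + line

table = build_name_table_third()
print(table[0][:4], table[1][:4])

# All 64 pixel lines of column 2, top to bottom, are consecutive bytes.
print([pattern_offset(r, 2, l) for r in range(8) for l in range(8)][:10])
```

With this layout a vertical strip can be written with a single write-address setup per column and third, relying on the VDP's auto-incrementing address.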

 

 

