+Pat Brady Posted March 11, 2022 Share Posted March 11, 2022 A post about how to organize your RAM and build your DLs. I don't know whether this idea is novel, but I have not seen it described anywhere. I share it in the hopes that it is useful to some other 7800 developer. Z means number of zones, B means number of background (static) headers per zone, and S means number of logical sprites. Each logical sprite has one header if it fits in one zone, or two headers if it crosses zones using holey DMA. I believe the conventional approach is to reserve enough RAM for each DL to contain every sprite that might ever appear in the corresponding zone. Since most games have multiple sprites that may or may not be in the same zone, this requires Z*(B+S) headers. This can waste a lot of RAM! For 7iX, that wasn't feasible due to how much RAM is used for the playfield and enemy. Instead, I have one block of memory that stores all the headers contiguously. I only need space for Z*B+2*S headers, the same as the number of objects on-screen. The obvious way to do this is a nested loop of zones and objects, which is O(S*Z), unacceptably slow for many cases. I developed this algorithm which is O(S+Z). The algorithm is: Set aside Z+1 bytes of temp RAM. Call this space C. Loop over zones. Initialize C[z] to B. (CPU cost: 8 cycles per zone — a bit more if B is variable.) Loop over logical sprites. Calculate zone z based on the y position of sprite s, using a lookup table or power-of-2 division. Increment C[z]. If the sprite crosses zones using holey DMA, also increment C[z+1]. (CPU cost: ~40 cycles per logical sprite.) Loop over zones, keeping a running offset from the start of my header block. Emplace headers for static objects and 2-byte DL terminators. Update the DLL entry. Skip over enough space for sprite headers for zone z. Replace C[z] with the offset for zone z. (CPU cost: ~105 cycles per zone that can contain sprites, assuming 2 static objects per zone.) Loop over logical sprites. Calculate zone z as in step 3. Emplace headers for sprite s, using C[z]. Increment C[z] by header size. Do the same for zone z+1 if the sprite crosses zones. (CPU cost: ~200 cycles per logical sprite, less if it doesn't cross zones, should be about the same as for static allocation.) So suppose Z=12 and S=16 with 4 byte headers, keeping the assumption of 2 static objects per zone. This algorithm will take about 48 scanlines of CPU time, too slow to fit entirely in VBLANK, but fast enough to finish in the total offscreen with time to spare (unless you go way over 192 lines displayed). In this use case, the conventional approach needs 888 bytes for headers. This algorithm only needs 248, a savings of 640 bytes. I believe you could reduce the off-screen CPU time by doing steps 1-2 on-screen. They don't touch anything MARIA looks at. (I have not tried this.) Or, if you double buffer, then no need to finish the algorithm in off-screen time, and the RAM savings increase to 1280 bytes. In fact the double buffer with this algorithm uses significantly less RAM than single-buffered conventional allocation. (I have not tried this either.) If anybody finds this useful enough to use, I'd love to know! 4 Quote Link to comment Share on other sites More sharing options...
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.