TMS9900 is the coolest CPU for multi-tasking

+TheBF · May 21, 2017

Redid the DENILE program with little tasks running and posted it over there.

http://atariage.com/forums/topic/265231-in-denile/page-2

apersson850 · May 23, 2017

In my independent multi tasking scheduler (not when using the p-system), I used the TMS 9901 as the timer to invoke the scheduler. But since I have a console with 64 K RAM internally, I can overlay console ROM when I want to. Thus I can change the interrupt vector in the system to point to my own interrupt service routine. That allows me to create the most efficient task scheduler possible, since the TMS 9901 hardware will do the counting and the interrupt will invoke the scheduler when it triggers.

Then it's just the standard issue of keeping track of running and waiting tasks.

In the p-system I used a different trick. Since the interrupt service routine there partially runs in RAM, you can easily modify it. The p-system is also cartridge independent, since the p-system operating system is on a card in the PEB, not in a cartridge in the cartridge slot. Thus you can for example plug in Mini Memory when running the p-system. So I inserted a wedge in the standard interrupt service in RAM, went to a routine in Mini Memory and then returned to the standard interrupt service. The part in the Mini Memory RAM does the task switching. There are also some support routines to implement monitors and some other stuff, that's convenient to have in a pre-emptive multitasking system.

Edited May 23, 2017 by apersson850

+TheBF · May 23, 2017

Yes having a hardware timer running is critical even in this cooperative version.

It means that even with 5 tasks running, my BEEP and HONK sounds still sound correct. :-)

Without the 9901 in my delay mechanism the sound duration really varied by the amount

of load I put on the computer.

Can you shows the code for the interrupt handler that does the task switch?

I would love to see it.

Here is a conventional (non workspace changing) Forth Task switcher

that I wrote first:

\ Conventional Forth Pause
CODE: PAUSE  ( -- )                  \ this is the context switcher
              SP RPUSH,              \ 28
              IP RPUSH,              \ 28
              RP  4 (UP) MOV,        \ 22 save my return stack pointer in RSAVE user-var
              BEGIN,
                 2 (UP) UP MOV,      \ 22 load the next task's UP into CPU UP  (context switch)
                 *UP R0 MOV,         \ 18 test the tlag for zero
              NE UNTIL,              \ 10 loop until it's not zero
              4 (UP) RP MOV,         \ 22 restore local Return stack pointer so I can retrieve IP and SP
              IP RPOP,               \ 22 load this task's IP
              SP RPOP,               \ 22  load this task's SP
              NEXT,               \ = 194 * .333 = 64.6uS context switch
              END-CODE

The really fascinating thing about Chuck Moore's method is that there is no separate scheduler.

Which means the machine has less work to do right out of the gate.

It sound like it can't possibly work when you are use to premptive systems and yet it does.

(as long as you create a little routine that incorporates a timer, if you need real time)

The amazing thing about it is the granularity of the task switch is very small and even

in conventional processors the task switch only needs to save 3 registers.

(that's not unlike the TMS9900)

How it works:

Emit a character, switch

Check for a key press, switch

Read a key, switch

Waiting for a timer to expire, switch

Write a block of memory, switch.

The only thing you have to be aware of as a designer is that your code does not try to do too much at one time.

But if you are importing code from other environments then of course you would need a pre-emptive tasker.

Then you have all the problems of multiple tasks calling shared resources with pre-emption.

That goes away with Chuck's system and if you have a task that needs to steal the CPU for something

really important, you have it. No record locking needed.

Everything in the system runs to some form of completion by design.

apersson850 · May 24, 2017

Yes, I'm familiar with the two different principles for concurrent processing. Real time processing was kind of my speciality at university, and I still do it for work.

The problem with volountary task switching is of course when something enters a loop that's longer than expected, maybe even eternal. Then you are dead on the water.

The UCSD p-system for the 99/4A supports volountary task switching (you wait on a semaphore, or something like that, and get a switch). What I did there was expand it to pre-emptive switching. Unfortunately, they had made some shortcuts in the heap management code, now when they expected volountary switching only, so heap operations crashed when I tried to use them with pre-emptive switching. Without the source code to unit HEAPOPS, it's virtually impossible to figure out what's going on, and even if you do, by disassembling the p-code, there's no real way to create a corrected unit HEAPOPS without access to the source for unit KERNEL.

Since dynamic memory allocation is a key feature in such a concurrent system, I gave up there. But as long as I stay away from dynamic memory allocation, it does work.

Edited February 25, 2022 by apersson850

Willsy · May 24, 2017

The problem with volountary task switching is of course when something enters a loop that's longer than expected, maybe even eternal. Then you are dead on the water.

You just yield in the loop, or every n iterations of the loop. If all the tasks are cooperative then execution will come back to your loop.

You also have the option to NOT yield if you are doing something time critical.

apersson850 · May 24, 2017

Can you shows the code for the interrupt handler that does the task switch?

I would love to see it.

It's rather funny. Tonight I went up to my 99/4A (the one that's set up and ready to run), turned it on and there it is. Live and kicking. It hasn't been running for several years. What's even more astonishing is that I feel at home almost immediately, loading the disk manager, catalog a few disks to find the program, look at it and then print it from E/A (in Maximem). I print to a PC capturing the file with Hyper Terminal.

Anyway, here it is. This simple example runs two processes, which both print on the screen, but with different delay loops in between. Thus when running it, I see output like "BBBBABBBBABBBBABBBBA" on the screen, just to prove that the whole thing works.

As you can see, there are 12 instructions in the scheduler, so it's not too wasteful of resources. It runs in 79.3 us. Of course, to handle a more complex system, with an arbitrary number of tasks (maybe with different priorities), which could be ready to run or waiting for events to happen, it would become more complex.

In such a case, you need a more complex Task information block (TIB) for each task. It will have to contain the priority and slots for links to other TIB, so that you can link any number of tasks in the ready queue, and also link one or more TIB that are waiting on for example a counting semaphore or some event to happen.

Note that this example will not run without the 64 K RAM expansion of my design, where RAM can replace ROM under software control (CRU bits based at >0400) in the console.

By the way, i did notice the comment says it's written in November 1988. They last long, these floppies.

*******************************                                                 
*                                                                               
* Multitasking with time-sharing on the 99/4A                                   
* A-DATA 881101                                                                 
*                                                                               
                                                                                
       REF  VDPWA,VDPWD,VWTR,KSCAN                                              
       DEF  START                                                               
                                                                                
MEMCON EQU  >400         CRU address of memory control                          
IVECT1 EQU  4            Interrupt vector location for level 1 interrupt        
SCSIZE EQU  960          Screen size                                            
SCRPOS DATA 0            Current screen location                                
ENDSCR DATA SCSIZE                                                              
WRMASK DATA >4000                                                               
                                                                                
* VDP character write routine                                                   
* Assumes character to write is MSBy of R1                                      
* Avoids use of any other register in WS of calling routine                     
* CHOUT is a non-splittable operation, since the VDP address load operation and 
* the data byte write must be allowed to be completed. Otherwise, characters    
* will be misplaced on the screen.                                              
                                                                                
CHOUT                                                                           
       LIMI 0            This operation can't be splitted                       
       MOVB @SCRPOS+1,@VDPWA  Load VDP address                                  
       SOC  @WRMASK,@SCRPOS                                                     
       MOVB @SCRPOS,@VDPWA                                                      
       SZC  @WRMASK,@SCRPOS                                                     
       MOVB R1,@VDPWD         Load VDP data                                     
       INC  @SCRPOS                                                             
       C    @SCRPOS,@ENDSCR   Outside screen?                                   
       JL   ONSCR                                                               
       CLR  @SCRPOS           Begin at screen top again                         
ONSCR  LIMI 1                                                                   
       B    *R11                                                                
                                                                                
* Screen clearing routine                                                       
                                                                                
CLRSCR                                                                          
       LI   R0,>4000                                                            
       SWPB R0                                                                  
       MOVB R0,@VDPWA                                                           
       SWPB R0                                                                  
       MOVB R0,@VDPWA                                                           
       LI   R0,'  '                                                             
       LI   R1,SCSIZE                                                           
CLRLOP MOVB R0,@VDPWD                                                           
       DEC  R1                                                                  
       JNE  CLRLOP                                                              
       B    *R11                                                                
                                                                                
********                                                                        
*                                                                               
* Starting point of program                                                     
*                                                                               
                                                                                
START                                                                           
       LIMI 0                                                                   
       LWPI SCHDWS                                                              
                                                                                
* Initiate VDP chip                                                             
       LI   R0,>1F0      Text mode                                              
       BLWP @VWTR                                                               
       SWPB R0                                                                  
       MOVB R0,@>83D4    VDP R1 copy                                            
       LI   R0,>717      Text mode color register                               
       BLWP @VWTR                                                               
       BL   @CLRSCR                                                             
                                                                                
* Move monitor ROM to expansion RAM.                                            
* This allows changing of any byte in the normal OS, here the interrupt vector. 
                                                                                
       LI   R12,MEMCON   CRU address of expansion memory control                
       SBO  1            Enable RAM @>4000->5FFF                                
       CLR  R0           Source pointer                                         
       LI   R1,>4000     Destination pointer                                    
       LI   R2,>2000     Byte counter                                           
MVLOP1 MOV  *R0+,*R1+                                                           
       DECT R2                                                                  
       JNE  MVLOP1                                                              
       SBO  0            Enable RAM @>0000->1FFF                                
       LI   R0,>4000     Source pointer                                         
       CLR  R1           Destination pointer                                    
       LI   R2,>2000     Byte counter                                           
MVLOP2 MOV  *R0+,*R1+                                                           
       DECT R2                                                                  
       JNE  MVLOP2                                                              
       SBZ  1                                                                   
                                                                                
       LI   R0,SCHDWS    Address of scheduler WS                                
       MOV  R0,@IVECT1                                                          
       LI   R0,SCHED     Address of scheduler routine                           
       MOV  R0,@IVECT1+2                                                        
                                                                                
* Set up PSI for internal clock interrupt, and nothing else.                    
                                                                                
       CLR  R12                                                                 
       SBZ  0            Interrupt mode                                         
       SBZ  2            Disable VDP interrupt                                  
       SBZ  1            Disable external interrupt                             
       LI   R1,>03AB     Counter value for 10ms interrupt                       
       LDCR R1,15        Automatically enters clock mode with this data         
       SBZ  0            Out of clock mode                                      
       SBO  3            Enable clock interrupt                                 
                                                                                
* To start the whole thing up, some task, here task A, must be "restored" from  
* the beginning, although it hasn't yet executed. Both save areas are initiated 
* with values that will allow the tasks to start from their own starting points.
                                                                                
       MOV  @CURTSK,R1   Load values to restore for first task                  
       MOV  *R1+,R13     WP                                                     
       MOV  *R1+,R14     PC                                                     
       MOV  *R1+,R15     ST                                                     
       RTWP              Faked restore                                          
                                                                                
*****                                                                           
*                                                                               
* Task A                                                                        
*                                                                               
                                                                                
WSA    BSS  32           Task A workspace                                       
TASKA                                                                           
       LI   R1,'A '                                                             
LOOPA2 LI   R0,>8000     Delay count                                            
LOOPA1 DEC  R0                                                                  
       JNE  LOOPA1                                                              
       BL   @CHOUT                                                              
       JMP  LOOPA2                                                              
                                                                                
* Task B                                                                        
                                                                                
WSB    BSS  32           Task B workspace                                       
TASKB                                                                           
       LI   R1,'B '                                                             
LOOPB2 LI   R0,>2000     Delay count                                            
LOOPB1 DEC  R0                                                                  
       JNE  LOOPB1                                                              
       BL   @CHOUT                                                              
       JMP  LOOPB2                                                              
                                                                                
********                                                                        
*                                                                               
* Scheduler routine                                                             
                                                                                
SAVEA  DATA WSA,TASKA,1  Task A savearea. Interrupt mask is 1                   
SAVEB  DATA WSB,TASKB,1  Task B savearea. Interrupt mask is 1                   
                                                                                
CURTSK DATA SAVEA        Pointer to save area for currently running task        
READYQ DATA SAVEB        Pointer to save area for task ready to run             
                                                                                
SCHDWS BSS  32           Scheduler workspace                                    
SCHED                                                                           
       SBO  3            Clear interrupt condition                              
       MOV  @CURTSK,R1   Swap off current task                                   
       MOV  R13,*R1      WP                                                     
       MOV  R14,@2(R1)   PC                                                     
       MOV  R15,@4(R1)   ST                                                     
                                                                                
       MOV  @READYQ,R2   Swap on ready task                                     
       MOV  *R2,R13      WP                                                     
       MOV  @2(R2),R14   PC                                                     
       MOV  @4(R2),R15   ST                                                     
                                                                                
       MOV  R1,@READYQ   Alter task bookkeeping                                 
       MOV  R2,@CURTSK                                                          
       RTWP              Return from scheduler                                  
                                                                                
       END

Edited May 24, 2017 by apersson850

apersson850 · May 24, 2017

You just yield in the loop, or every n iterations of the loop. If all the tasks are cooperative then execution will come back to your loop.

You also have the option to NOT yield if you are doing something time critical.

Yes, that's true. In this case you have to think about when you need to do something special in the code to make sure it's cooperative. With pre-emptive multitasking, you have to make sure you do things that require that they aren't interrupted using procedures that take care of that (like CHOUT in my example above) or disable interrupts when doing them. The latter usually isn't recommended, since you may lock up the whole thing that way.

+TheBF · May 25, 2017

By the way, i did notice the comment says it's written in November 1988. They last long, these floppies.

*******************************                                                 
*                                                                               
* Multitasking with time-sharing on the 99/4A                                   
* A-DATA 881101                                                                 
*                                                                               
<SNIP>                                                                                
              
SCHDWS BSS  32           Scheduler workspace                                    
SCHED                                                                           
       SBO  3            Clear interrupt condition                              
       MOV  @CURTSK,R1   Swap off current task                                   
       MOV  R13,*R1      WP                                                     
       MOV  R14,@2(R1)   PC                                                     
       MOV  R15,@4(R1)   ST                                                     
                                                                                
       MOV  @READYQ,R2   Swap on ready task                                     
       MOV  *R2,R13      WP                                                     
       MOV  @2(R2),R14   PC                                                     
       MOV  @4(R2),R15   ST                                                     
                                                                                
       MOV  R1,@READYQ   Alter task bookkeeping                                 
       MOV  R2,@CURTSK                                                          
       RTWP              Return from scheduler                                  
                                                                                
       END

Thanks for sharing this.

It looks to me like the scheduler is doing something like BLWP but manually.

Any why you chose to do it this way?

apersson850 · May 25, 2017

Well. considering that I figured this out 29 years ago, I may not remember correctly. But as far as I do, it's because when you get an interrupt, it effectively makes a BLWP into the interrupt service routine. When the interrupt service is ready, it will RTWP back to the interrupted program. What I do here is that I RTWP back to a different program, thus accomplishing the task switch. This is not a normal interrupt service routine, since usually such a routine runs and do what it should do (move sprites or whatever) and then return in a stealth manner. Except for the intended work, like moving sprites, no other program should really be aware that it ever ran. But this is completely opposite. The interrupt service in a case like this, when it's a scheduler, literally has the command over life and death for the other processes. The trick here is instead that the different processes running all think they own the computer, but instead are unaware about that the CPU now and then is given to some other process, of which they (in this simple example) are completely unaware that it even exists.

So if you look at it again, you see that I don't do a "manual BLWP", because I restore the WP, PC and ST into the current WS. When using BLWP, the CPU stores the same registers in the called WS.

In a more complex concurrent environment, the different processes running are frequently aware of each other, since they may communicate with other processes through semaphores or monitors or whatever you've implemented for that task.

+FarmerPotato · February 25, 2022

@apersson850 Jeg vil gerne prøve din kode!

I plan to implement preemptive multitasking in Geneve 2020 BIOS. Mutex I can accomplish with ABS since it is indivisible.

@Tursi posted about using SETO and INC for Mutex. I imagine this pattern:

SETO @LOCK * release a lock


TRY
INC @LOCK * attempt to acquire
JEQ LOCKED
BL @YIELD * else sleep for a while 
JMP TRY

LOCKED
* we have the lock
* do stuff like access VDP 
SETO @LOCK * release it

But the counter might roll over. Maybe I missed something in that post.
ABS is safer (I can’t find the original poster here who pointed out ABS years ago. I remembered that post when I saw again that ABS is indivisible, in the 99105 book.)

Im thinking of ways to implement Conditions. A Condition is where a task waits for a resource to unlock, or a message to be broadcast.

Say a task tries to acquire a resource by

TRY:

LI R0,LOCK

ABS *R0

JLT LOCKED

BL @WAITR0

JMP TRY * we probably will get it

Im stuck on how it might yield and then how a scheduler could keep a queue of who to wake up when the mutex is free.

A process that holds the lock would have to notify the scheduler that’s it’s freed it.

Or when the scheduler preempts, it checks all mutexes , and runs a process who’s waiting on it.

How to enter a Condition?

I guess it could be a SYS call , ie XOP. I’d rather avoid using XOP for mutexes. However there might be no other solution since Mutexes must exist in some shared memory.

The process could add itself to a queue, instead of doing it with a SYS all. Implementing a lockless queue:

Suppose an array of 32 locks corresponding to places in the queue. A waiter tries to lock each of them with ABS until it gets one. Then, in other corresponding arrays, it places its PID and the address of the Mutex it wants (maybe use a queue for each separate Mutex?) then it yields , hoping to get awakened when the mutex is unlocked.

The scheduler, while looking for a runnable process, evaluates each spot in the queue. If the wanted Mutex is now available, it clears the spot in the queue and runs the PID, which should then lock the Mutex it wanted.

in this scheme, a Condition ensures that a process doesn’t waste its whole CPU time slice in checking for the lock (spin lock— this would be stupid in a single CPU system!)

But Another process can still hog the resource. Worst case, it locks , unlocks, then locks again just as it’s time slice is up.

And this scheme requires that a process cooperate to avoid creating a deadlock, where it locks A abs waits for B while someone else has B abs waits for A.

Threading is hard.

+TheBF · February 25, 2022

44 minutes ago, FarmerPotato said:

@apersson850 Jeg vil gerne prøve din kode!

Threading is hard.

Yup.

My choice would be both:

cooperative tasking for non-real-time stuff. If your O/S primitives are built to fit this and yield on I/O it just works.
Context switching is ridiculously efficient.
interrupt driven queue of time critical tasks ( you can fight with all that as you see fit)

apersson850 · February 25, 2022

3 hours ago, FarmerPotato said:

@apersson850 Jeg vil gerne prøve din kode!

In Danish? It's probably in Swedish!

In more complex systems, where a scheduler controls which tasks to run, you typically have a readyque, where all tasks ready to run are linked. Or rather you link their task information blocks.

You also have a queue of all tasks waiting for semaphores.

When a semaphore is signalled free, you move the tasks waiting for it to the readyque.

If you hit a semaphore that's locked, you line up in the queue.

The code associated with a semaphore call (signal and wait) contains several instructions to handle this.

TMS9900 is the coolest CPU for multi-tasking

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation

Recently Browsing 0 members