Fastest MEMSET


nanochess

Hi guys.

 

I was wondering whether CLRSCR in IntyBASIC could be made faster, so I unrolled the inner loop 4 times and added some extra code to handle the 1 to 3 leftover words when the length is not a multiple of 4.

 

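        ; In:  R0 = fill value, R1 = length in words, R4 = destination
        ; Out: R4 = one past the last word written
MEMSET:
        SARC R1,2    ; R1 = length / 4; bit 1 ends in OV, bit 0 ends in C
        BNOV $+4     ; bit 1 clear: skip the two extra stores
        MVO@ R0,R4
        MVO@ R0,R4
        BNC $+3      ; bit 0 clear: skip the one extra store
        MVO@ R0,R4
        BEQ $+7      ; length / 4 is zero: skip the loop (lands on BNE, which falls through)
        MVO@ R0,R4   ; unrolled loop: four stores per pass
        MVO@ R0,R4
        MVO@ R0,R4
        MVO@ R0,R4
        DECR R1
        BNE $-5      ; back to the first of the four stores
        JR R5        ; return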
This could be the fastest MEMSET for the Intellivision (except, of course, for lengths of 1 to 3 words, where the unrolled loop never runs).
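To check the leftover handling, here is a rough Python model of the routine's control flow (not CP1610 code). The function name `memset_4x`, the list standing in for memory, and the restriction to lengths below $8000 (so the arithmetic shift never sign-extends) are my assumptions for illustration.

```python
def memset_4x(mem, dest, value, count):
    """Model of the 4x-unrolled routine: R4 = dest, R0 = value, R1 = count.

    Assumes 0 <= count < 0x8000, so SARC does not sign-extend.
    Returns the final R4 (one past the last word written).
    """
    r4 = dest
    carry = count & 1              # SARC R1,2 leaves bit 0 in C
    overflow = (count >> 1) & 1    # ... and bit 1 in OV
    r1 = count >> 2                # R1 = count / 4; Z is set if this is 0

    def store():
        nonlocal r4
        mem[r4] = value            # MVO@ R0,R4 stores, then increments R4
        r4 += 1

    if overflow:                   # BNOV skips these two stores
        store()
        store()
    if carry:                      # BNC skips this store
        store()
    while r1 != 0:                 # BEQ guards the loop, BNE repeats it
        for _ in range(4):         # the four unrolled MVO@ instructions
            store()
        r1 -= 1                    # DECR R1
    return r4


# Hypothetical check: every length from 0 to 299 writes exactly that many words.
for n in range(300):
    mem = [0xFFFF] * 512
    end = memset_4x(mem, 16, 0, n)
    assert end == 16 + n
    assert mem[16:16 + n] == [0] * n
```

The leftover words are written first and the multiples of four afterwards, so the order of stores differs from a plain loop, but the final contents are the same.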

Just for the record, examples/library/memset.asm in Joe's SDK uses the same approach, except that it's unrolled 8 times and handles the leftover words with a one-word-at-a-time loop.

;; ======================================================================== ;;
;;  MEMSET    Fill array with value                                         ;;
;;  MEMSET.1  Alternate entry point                                         ;;
;;                                                                          ;;
;;  AUTHOR                                                                  ;;
;;      Joseph Zbiciak <intvnut AT gmail.com>                               ;;
;;                                                                          ;;
;;  REVISION HISTORY                                                        ;;
;;      08-Sep-2001 Initial Revision                                        ;;
;;                                                                          ;;
;;  INPUTS for MEMSET                                                       ;;
;;      R5    Pointer to invocation record, followed by return address.     ;;
;;            Pointer to destination       1 DECLE                          ;;
;;            Value to fill with           1 DECLE                          ;;
;;            Length                       1 DECLE                          ;;
;;                                                                          ;;
;;  INPUTS for MEMSET.1                                                     ;;
;;      R5    Return address                                                ;;
;;      R4    Pointer to destination                                        ;;
;;      R1    Value to fill with                                            ;;
;;      R0    Length                                                        ;;
;;                                                                          ;;
;;  OUTPUTS                                                                 ;;
;;      R0    Zeroed                                                        ;;
;;      R1    Fill value (unmodified)                                       ;;
;;      R4    Points one element beyond destination array                   ;;
;;      R5    Zeroed                                                        ;;
;;                                                                          ;;
;;  TECHNIQUES                                                              ;;
;;      Unrolled 8x for speed.                                              ;;
;;      Not-unrolled loop handles length % 8 != 0.                          ;;
;;                                                                          ;;
;;  CODESIZE                                                                ;;
;;      29 words                                                            ;;
;;                                                                          ;;
;;  CYCLES                                                                  ;;
;;      Not yet characterized.                                              ;;
;; ======================================================================== ;;
MEMSET      PROC

            MVI@    R5,     R4      ;   8   Destination array
            MVI@    R5,     R1      ;   8   Fill value       
            MVI@    R5,     R0      ;   8   Length

@@1:        PSHR    R5              ;   9   Alternate entry point

            MOVR    R0,     R5      ;   6   \
            ANDI    #7,     R5      ;   8    |-- Handle length % 8 iters,
            BEQ     @@l8_init       ;  7/9  /    if there are any.
                                    ;----
                                    ;  54   Fallthru case
                                    ;  56   Branch taken    

@@loop_1:   MVO@    R1,     R4      ;   9   \
            DECR    R5              ;   6    |-- Store one value at a time.
            BNEQ    @@loop_1        ;  9/7  /
                                    ;----
                                    ;  24*k - 2

@@l8_init:  SLR     R0,     2       ;   8   \
            SLR     R0,     1       ;   6    |-- Divide trip count by 8.
            BEQ     @@done          ;  7/9  /    Abort if it goes to 0.
                                    ;----
                                    ;  21   Fallthru case

@@loop_8:   MVO@    R1,     R4      ;   9   \
            MVO@    R1,     R4      ;   9    |__ Store four elements
            MVO@    R1,     R4      ;   9    |
            MVO@    R1,     R4      ;   9   /
            DECR    R0              ;   6   (Interruptible)
            MVO@    R1,     R4      ;   9   \
            MVO@    R1,     R4      ;   9    |__ Store four more elements
            MVO@    R1,     R4      ;   9    |
            MVO@    R1,     R4      ;   9   /
            BNEQ    @@loop_8        ;  9/7  Iterate length/8 times
                                    ;----
                                    ;  87*k - 2

@@done:     PULR    PC              ;  11   Return
            ENDP    
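The header leaves CYCLES uncharacterized, so here is a rough Python estimate built from the per-instruction counts annotated in the listing, next to the 4x routine from the first post. The function names are hypothetical. The timings for SARC, BNOV, BNC and JR do not appear in the thread and are my assumptions (same as SLR, the other branches, and 7 cycles respectively). Call overhead and any cycles lost to the display hardware are ignored.

```python
# Cycle counts per instruction. "listing" = annotated in the MEMSET listing.
MVO = 9            # MVO@               (listing)
DECR = 6           # DECR               (listing)
BR_TAKEN = 9       # branch, taken      (listing)
BR_NOT = 7         # branch, not taken  (listing)
SHIFT2 = 8         # SLR Rx,2 (listing); SARC Rx,2 assumed to cost the same
SHIFT1 = 6         # SLR Rx,1           (listing)
PSHR, MOVR, ANDI, PULR = 9, 6, 8, 11    # (listing)
JR = 7             # JR R5              (assumption)


def cycles_memset1(n):
    """Estimated cycles for MEMSET.1 (8x unrolled) filling n words."""
    rem, blocks = n % 8, n // 8
    total = PSHR + MOVR + ANDI
    if rem:
        # BEQ falls through, then 24 cycles per word, last BNEQ not taken
        total += BR_NOT + rem * (MVO + DECR + BR_TAKEN) - 2
    else:
        total += BR_TAKEN
    total += SHIFT2 + SHIFT1
    if blocks:
        # BEQ falls through, then 87 cycles per block, last BNEQ not taken
        total += BR_NOT + blocks * (8 * MVO + DECR + BR_TAKEN) - 2
    else:
        total += BR_TAKEN
    return total + PULR


def cycles_4x(n):
    """Estimated cycles for the 4x-unrolled routine filling n words."""
    blocks = n // 4
    total = SHIFT2
    total += (BR_NOT + 2 * MVO) if n & 2 else BR_TAKEN
    total += (BR_NOT + MVO) if n & 1 else BR_TAKEN
    if blocks:
        total += BR_NOT + blocks * (4 * MVO + DECR + BR_TAKEN) - 2
    else:
        total += BR_TAKEN + BR_NOT   # BEQ lands on BNE, which falls through
    return total + JR


print(cycles_memset1(240), cycles_4x(240))   # prints "2672 3098"
```

Under these assumptions, a 240-word fill (one full screen) favors the 8x version, while for very short fills the 4x routine comes out ahead because it has less fixed overhead (72 versus 134 cycles for 3 words).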
