
Assembly: Alternative Approach

Open brubrz opened this issue 3 years ago • 1 comment

    mov  rax, -100  ; replace 100 with the array size
loop:
    add  edx, DWORD PTR [rax + 100 + rcx]
    add  rax, 4
    jnz  loop       ; checks if the result is zero

https://github.com/algorithmica-org/algorithmica/blob/master/content/english/hpc/architecture/loops.md

I don't understand why the + rcx is needed. Isn't it enough to add back the 100?

brubrz, Feb 07 '23 10:02

rcx is the static array pointer and rax is the index. Without rcx you'd be accessing absolute memory addresses 0 to 99, i.e. a null pointer dereference. I personally use a slightly different but ultimately equivalent method that also works for arrays of arbitrary length (array pointer in rdi, array length in rsi as a count of 32-bit ints):

sumarray:
    lea   rdi, [rdi + rsi * 4]   ;; convert the base pointer to an end pointer
    neg   rsi                    ;; we'll loop until rsi is no longer negative
    xor   eax, eax               ;; clear the summing register
.loop:
    add   eax, [rdi + rsi * 4]   ;; dword ptr is implicit from using eax as the target register; no immediate constant in the instruction stream
    add   rsi, 1                 ;; we're counting whole ints. inc leaves CF untouched, which creates a false dependency on the prior flags
    jl    .loop                  ;; add + jl macro-fuse into a single µop
    ret
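
For readers who find the addressing easier to follow in C, here is a minimal sketch of the same idea, not a definitive implementation. The function name sum_array and the early return for an empty array are assumptions added for illustration. The assembly above has no such guard and would read one element when rsi is 0. The sum is kept unsigned so that wraparound is well defined in C, matching what add does on the register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (the name is an assumption, not from the thread):
   sums n 32-bit ints using a negative index that counts up towards zero. */
int32_t sum_array(const int32_t *a, size_t n) {
    assert(a != NULL || n == 0);
    if (n == 0) return 0;            /* guard added here; the assembly assumes n > 0 */

    const int32_t *end = a + n;      /* lea rdi, [rdi + rsi * 4]  (end pointer) */
    ptrdiff_t i = -(ptrdiff_t)n;     /* neg rsi                   (index runs -n .. -1) */
    uint32_t sum = 0;                /* xor eax, eax */

    do {
        sum += (uint32_t)end[i];     /* add eax, [rdi + rsi * 4]  (base + index, both needed) */
        i += 1;                      /* add rsi, 1 */
    } while (i < 0);                 /* jl .loop                  (stop once i reaches 0) */

    return (int32_t)sum;
}
```

The expression end[i] is the C counterpart of [rdi + rsi * 4]: the index alone is only an offset, so dropping the base pointer would address memory near zero instead of the array. Because the index ends at exactly zero, the flags set by the increment double as the loop test and no separate compare is needed.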

IAmAThousandTrees, Dec 03 '24 13:12