algorithmica
Assembly: Alternative Approach
mov rax, -100 ; replace 100 with the array size in bytes
loop:
add edx, DWORD PTR [rax + 100 + rcx]
add rax, 4
jnz loop ; keep looping until the add leaves zero in rax
https://github.com/algorithmica-org/algorithmica/blob/master/content/english/hpc/architecture/loops.md
I don't understand why the + rcx is needed. Isn't it enough to add back the 100?
rcx is the static array pointer, rax is the index. rax runs from -100 up to 0, so rax + 100 is only the byte offset into the array. Without rcx you'd be accessing absolute memory addresses 0 to 99, i.e. a null pointer dereference. I personally use a slightly different but ultimately equivalent method that also works for arbitrary-length arrays (array pointer in rdi, array length in rsi as a number of 32-bit ints):
sumarray:
lea rdi, [rdi + rsi * 4] ;; convert to an end pointer
neg rsi ;; we'll loop till rsi is no longer negative (assumes a non-empty array)
xor eax, eax ;; clear summing register
.loop:
add eax, [rdi + rsi * 4] ;; dword ptr is implied by the eax destination; no constant offset encoded in the instruction
add rsi, 1 ;; we're counting whole ints. inc leaves CF untouched, which can create a false dependency on the prior flags on some CPUs
jl .loop ;; add + jl can macro-fuse into a single µop on CPUs that support it
ret
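
If the address arithmetic is easier to follow in C, here is a rough sketch of the same end-pointer idea. The name sum_array is made up for illustration, the guard for an empty array is my own addition (the assembly above would read one element past the end in that case), and it assumes the total fits in a 32-bit int. A compiler will not necessarily emit the exact instructions above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C rendering of the sumarray routine; not the exact codegen. */
int32_t sum_array(const int32_t *a, size_t n) {
    if (n == 0) {
        return 0;                   /* added guard: the asm version assumes n > 0 */
    }
    const int32_t *end = a + n;     /* lea rdi, [rdi + rsi * 4]: one past the last element */
    ptrdiff_t i = -(ptrdiff_t)n;    /* neg rsi: index runs from -n up to -1 */
    int32_t sum = 0;                /* xor eax, eax */
    do {
        sum += end[i];              /* add eax, [rdi + rsi * 4]: base pointer plus negative index */
    } while (++i < 0);              /* add rsi, 1 / jl .loop: stop once the index reaches zero */
    return sum;
}
```

Here end plays the role of rcx + 100 in the first snippet: the negative index alone is just an offset, and it only becomes a valid address once it is added to a real pointer into the array.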