Auto increment register TR3200
It'd be nice to have a register that increments after each read.
I think this kind of functionality could be implemented in two ways:
- Add LOADI/STOREI instructions that do the same thing as LOAD/STORE but also auto-increment the address register by the data size (LOADI increments by 4, LOADIW by 2 and LOADIB by 1) before the memory I/O. These instructions could take one extra cycle, as they perform an extra ALU operation.
- Make %r10 a special-purpose register with the alias %index. When it is used as the address register of a LOAD/STORE, it is incremented by the data size before the memory I/O.
- Optionally, add a one-parameter instruction to select whether the register is incremented or decremented.
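To make the first option concrete, here is a rough Python model of the proposed LOADI family. It is only a sketch: the `load_autoinc` helper, the little-endian byte order and the reading of "before the memory I/O" as "the load uses the already-incremented address" are all assumptions, not existing TR3200 behaviour.

```python
# Toy model of the proposed LOADI family. Everything here is illustrative:
# the helper name, the little-endian layout and the pre-increment reading of
# "before the memory I/O" are assumptions, not TR3200 behaviour.

SIZES = {"LOADI": 4, "LOADIW": 2, "LOADIB": 1}  # data size in bytes per mnemonic


def load_autoinc(op, regs, addr_reg, mem):
    """Increment regs[addr_reg] by the data size, then read from the new address."""
    size = SIZES[op]
    regs[addr_reg] = (regs[addr_reg] + size) & 0xFFFFFFFF  # 32-bit wrap-around
    addr = regs[addr_reg]
    return int.from_bytes(mem[addr:addr + size], "little")


mem = bytes(range(16))   # memory holding 0x00, 0x01, ... 0x0F
regs = {"r1": 0}         # hypothetical address register

first = load_autoinc("LOADIB", regs, "r1", mem)   # r1 becomes 1, reads mem[1]
second = load_autoinc("LOADIW", regs, "r1", mem)  # r1 becomes 3, reads mem[3:5]
third = load_autoinc("LOADI", regs, "r1", mem)    # r1 becomes 7, reads mem[7:11]
```

Note that with a pre-increment the address register has to start one element before the first item you want to read.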
Currently the increment/decrement has to be done manually with ADD/SUB instructions, so this feature would reduce code size and save a few clock cycles per iteration in code that walks arrays of data.
A special register like %index is better for optimisation. Maybe the first write to it sets the increment step and the second write sets where it should start?
Would the increment step be a signed value?
Or the step could be implicit, derived from startaddr % 4.
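One possible reading of the "first write sets the step, second write sets the start" idea, sketched in Python. The `IndexRegister` class and its method names are hypothetical, and I'm assuming a signed step, a pre-increment, and that every write after the first one just resets the address.

```python
class IndexRegister:
    """Hypothetical %index: first write sets a signed step, later writes set the address."""

    def __init__(self):
        self.step = None
        self.addr = None

    def write(self, value):
        if self.step is None:
            self.step = value   # first write: signed increment step
        else:
            self.addr = value   # later writes: (re)set the start address

    def next_address(self):
        """Address used by a LOAD/STORE through %index (step applied first)."""
        self.addr = (self.addr + self.step) & 0xFFFFFFFF  # 32-bit wrap-around
        return self.addr


idx = IndexRegister()
idx.write(-4)       # step: walk downwards one 32-bit word at a time
idx.write(0x100)    # start address
addrs = [idx.next_address() for _ in range(3)]
```

With these assumptions the three accesses land on 0xFC, 0xF8 and 0xF4. A negative step would make the optional increment/decrement instruction unnecessary.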
A lot more interesting than a register that increments after every read would be the ability to do indexed reads, like you can on PowerPC.
You mean lmw-like instructions?
Ahh... OK, I just read up on it. You mean things like this:
lwzu r3,4(r4) ; r4 = r4 + 4 ; r3 = *(r4)
lwzux r3,r4,r5 ; r4 = r4 + r5 ; r3 = *(r4)
In our case, this would translate to LOADU[b|w]. Indeed, it's a very powerful indexed read.
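For reference, a small Python model of what those two PowerPC instructions do. The function names mirror the mnemonics, but the register dictionary and flat `bytes` memory are just a sketch; big-endian byte order is assumed (the PowerPC default) and the invalid forms (rA = 0, rA = rD) are not checked.

```python
def lwzu(regs, mem, rd, disp, ra):
    """Load word and zero with update: EA = rA + disp, rD = mem[EA], rA = EA."""
    ea = (regs[ra] + disp) & 0xFFFFFFFF
    regs[rd] = int.from_bytes(mem[ea:ea + 4], "big")
    regs[ra] = ea  # the "update": the effective address is written back to rA


def lwzux(regs, mem, rd, ra, rb):
    """Load word and zero with update, indexed: EA = rA + rB, rD = mem[EA], rA = EA."""
    ea = (regs[ra] + regs[rb]) & 0xFFFFFFFF
    regs[rd] = int.from_bytes(mem[ea:ea + 4], "big")
    regs[ra] = ea


mem = bytes(range(32))                  # 0x00, 0x01, ... 0x1F
regs = {"r3": 0, "r4": 0, "r5": 8}

lwzu(regs, mem, "r3", 4, "r4")          # r4 = 4, r3 = word at address 4
after_lwzu = (regs["r3"], regs["r4"])

lwzux(regs, mem, "r3", "r4", "r5")      # r4 = 4 + 8 = 12, r3 = word at address 12
after_lwzux = (regs["r3"], regs["r4"])
```

A LOADU on the TR3200 would presumably follow the same pattern, with the TR3200's own byte order.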
PowerPC has these four instructions, amongst others:
lbz - load byte and zero
lbzu - load byte and zero with update
lbzx - load byte and zero, indexed
lbzux - load byte and zero with update, indexed
These allow all sorts of nice implicit behaviour, like easily turning this:
; GPR 1 is where we are putting the data, assume we do something with it in the loop
; GPR 2 is the "start pointer"
; GPR 3 is the "end pointer" (one past the last byte)
loop:
lbz 1, 0(2)
addi 2, 2, 1 ; add immediate means the final 1 is treated as 1, not GPR 1
cmplw 2, 3 ; unsigned compare, since these are addresses
blt loop
into this:
; GPR 1 is where we are putting the data, assume we do something with it in the loop
; GPR 2 is the "start pointer" minus 1, because the update happens before the load
; GPR 3 points at the last byte
; GPR 4 holds the stride, 1
loop:
lbzux 1, 2, 4
cmplw 2, 3 ; unsigned compare, since these are addresses
blt loop
In the second example, lbzux sets r2 to r2+r4 as part of the load. Hence 'with update'.
This code could probably be improved, but you get the point.
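To put a number on the saving, here is a quick Python sketch that walks the same buffer both ways and counts executed instructions. The instruction counts are simply the number of lines in each loop body, the function names are made up, and the update form is assumed to load from the already-updated address (so its pointer starts one byte before the data). Like the assembly, both loops assume at least one byte.

```python
def manual_loop(data):
    """Load + add + compare + branch: four instructions per byte."""
    ptr, end = 0, len(data)           # end is one past the last byte
    out, executed = [], 0
    while True:
        out.append(data[ptr])         # load byte
        ptr += 1                      # explicit pointer increment
        executed += 4                 # load, add, compare, branch
        if not ptr < end:             # compare + branch-if-less
            break
    return out, executed


def update_loop(data):
    """Load-with-update + compare + branch: three instructions per byte."""
    ptr, last = -1, len(data) - 1     # pointer starts one byte before the data
    out, executed = [], 0
    while True:
        ptr += 1                      # "update" half of the load
        out.append(data[ptr])         # "load" half of the load
        executed += 3                 # load-with-update, compare, branch
        if not ptr < last:            # compare + branch-if-less
            break
    return out, executed


data = b"TR3200"
manual = manual_loop(data)
updated = update_loop(data)
```

Both versions read the same six bytes; the manual loop executes 24 instructions and the update loop 18, i.e. one instruction saved per iteration.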