ScratchABlock
ScratchABlock copied to clipboard
Rewriting of induction variables
Induction variables are widely emitted by compilers during program optimization. They greatly assist in strength reduction.
Consider the following code:
int arrayOfInts[10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
for (i = 0; i < 10; i++)
printf("Next int %d", arrayOfInts[i]);
One possible transformation of it could look like that:
int *p = intPtr;
int arrayOfInts[10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
p = arrayOfInts;
for (i = 0; i < 10; i++) {
printf("Next int %d", *p);
p++;
}
The expression arrayOfInts[i]
has been transformed to the equivalent pointer dereference introducing the induction variable "p".
This is surely a very simple example. Real-world programs will usually contain more complex cases. A well-optimized PowerPC program I used to work on contained up to 4 induction variables per loop.
The ability to (partially) undo this kind of optimization in the decompiler would greatly improve the readability of its output. Such a transformation would be applied at a later decompilation stage and could be human-assisted.
The great book "Advanced compiler design and implementation" by Steven S. Muchnick already scratched an algorithm for induction variable identification and removal (see chapter 14).
A good starting point would be simplifications of pointers that depend on the loop counter.