ScratchABlock icon indicating copy to clipboard operation
ScratchABlock copied to clipboard

Rewriting of induction variables

Open maximumspatium opened this issue 6 years ago • 0 comments

Induction variables are widely emitted by compilers during program optimization. They greatly assist in strength reduction.

Consider the following code:

int arrayOfInts[10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

for (i = 0; i < 10; i++)
    printf("Next int %d", arrayOfInts[i]);

One possible transformation of it could look like that:

int *p = intPtr;
int arrayOfInts[10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

p = arrayOfInts;

for (i = 0; i < 10; i++) {
    printf("Next int %d", *p);
    p++;
}

The expression arrayOfInts[i] has been transformed to the equivalent pointer dereference introducing the induction variable "p".

This is surely a very simple example. Real-world programs will usually contain more complex cases. A well-optimized PowerPC program I used to work on contained up to 4 induction variables per loop.

The ability to (partially) undo this kind of optimization in the decompiler would greatly improve the readability of its output. Such a transformation would be applied at a later decompilation stage and could be human-assisted.

The great book "Advanced compiler design and implementation" by Steven S. Muchnick already scratched an algorithm for induction variable identification and removal (see chapter 14).

A good starting point would be simplifications of pointers that depend on the loop counter.

maximumspatium avatar Jan 22 '18 13:01 maximumspatium