dewolf

[Array Access Detection] Implement extra checks

Open mari-mari opened this issue 3 years ago • 0 comments

Proposal

There are a couple of checks we could add to infer the array element type when the base pointer is void *. In non-aggressive mode (the default) we need the element type in order to mark an expression as an array element access.

E.g., in non-aggressive mode, *(a + 4 * i) where a has type int * is recognized as a[i] // int a[]; whereas if a has type void *, we generate something like *(a + i * 4)/*a[i]*/, since a could be basically anything.
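To illustrate why the element type matters here, the following is a minimal Python sketch, not dewolf code. The helper name index_from_byte_offset is hypothetical, and ctypes.c_int32 stands in for a 4-byte int. It maps a byte offset back to an array index, which is only possible once the element size is known:

```python
import ctypes
from typing import Optional


def index_from_byte_offset(byte_offset: int, element_type) -> Optional[int]:
    """Map a byte offset to an array index for a known element type.

    Hypothetical helper for illustration only. Returns None when the offset
    is not a multiple of the element size, i.e. it cannot be a[i].
    """
    size = ctypes.sizeof(element_type)
    if byte_offset % size != 0:
        return None
    return byte_offset // size


# With a 4-byte element type, byte offset 12 corresponds to index 3.
print(index_from_byte_offset(12, ctypes.c_int32))
# An offset that is not a multiple of the element size has no index.
print(index_from_byte_offset(6, ctypes.c_int32))
```

For void * there is no element size to divide by, which is why the access can only be annotated with a comment instead of being rewritten.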

When the source code contains some index arithmetic, it is possible to infer the original type. Consider the following example: array_checks.zip

unsigned long func3(void * arg1, int arg2, long arg3, long arg4) {
    int i;
    int var_1;
    for (i = 0; i < (unsigned int)(arg2 / 2 - 1); i++) {
        var_1 = i * 2;
        printf(/* format */ "%l\n", *(arg1 + var_1 * 8)/*arg1[var_1]*/, var_1 * 8, arg4);
        printf(/* format */ "%l\n", *(arg1 + (var_1 + 1L) * 8), (var_1 + 1L) * 8);
        arg4 = var_1 + 1L << 3;
    }
    return (unsigned int)(arg2 / 2 - 1);
}

Here the multiplier 8 hints that arg1 points to an array of long elements. This holds for the expressions a[2*i] and a[2*i+1]. I am not sure whether it also holds for other kinds of index expressions.
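A rough sketch of this idea in Python, under stated assumptions: the function infer_element_size and the table SIZE_TO_CANDIDATE_TYPES are hypothetical names, not part of dewolf, and the sizes assume an LP64 target. It only checks that all dereferences of the same base agree on one multiplier. Note that a size of 8 alone does not distinguish long from double or a pointer, so further evidence would be needed to pick the type.

```python
from typing import Iterable, Optional

# Assumed element sizes for an LP64 target (hypothetical table).
# A size of 8 is ambiguous: long, double and pointers all match.
SIZE_TO_CANDIDATE_TYPES = {
    1: ["char"],
    2: ["short"],
    4: ["int", "float"],
    8: ["long", "double", "pointer"],
}


def infer_element_size(multipliers: Iterable[int]) -> Optional[int]:
    """Return the shared multiplier if all accesses agree on a known size.

    Returns None when there are no accesses, the multipliers disagree,
    or the shared multiplier is not a known element size.
    """
    sizes = set(multipliers)
    if len(sizes) != 1:
        return None
    (size,) = sizes
    return size if size in SIZE_TO_CANDIDATE_TYPES else None


# Multipliers read off the two dereferences in func3:
#   *(arg1 + var_1 * 8)  and  *(arg1 + (var_1 + 1L) * 8)
func3_multipliers = [8, 8]
size = infer_element_size(func3_multipliers)
candidates = SIZE_TO_CANDIDATE_TYPES.get(size, [])
print(size, candidates)
```

Requiring all accesses to agree is a deliberately conservative choice: a single conflicting multiplier makes the sketch give up instead of guessing.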

Approach

Experiment to see whether this assumption is general enough: try different array types and different index expressions. If the trend holds, i.e. we can reliably recognize the type for different index expressions, then update ArrayAccessDetection to enrich the array type information.

If such inference is possible only for a limited subset of expressions, and/or requires complex and time-intensive heuristics, I suggest closing this issue.

mari-mari, Feb 01 '22 15:02