llvm-project icon indicating copy to clipboard operation
llvm-project copied to clipboard

Missed optimization: malloc + memcpy + free => realloc

Open Kmeakin opened this issue 10 months ago • 8 comments

https://godbolt.org/z/YWxW7h8sP

#include <cstdlib>
#include <cstring>
#include <memory>

extern "C" {
auto src1(void* p1, size_t old_len, size_t new_len) -> void* {
    void* p2 = malloc(new_len);
    memcpy(p2, p1, old_len);
    free(p1);
    return p2;
}

auto tgt1(void* p1, size_t old_len, size_t new_len) -> void* {
    return realloc(p1, new_len);
}

}

alive proof: https://alive2.llvm.org/ce/z/RwPKVk

Kmeakin avatar Feb 24 '25 22:02 Kmeakin

Are those really equivalent? I wonder about things like

void foo() {
    void* p1 = malloc(15);
    auto i1 = (uintptr_t)p1;
    void* p2 = src1(p1, 15, 16);
    assert(i1 != (uintptr_t)p2);
}

tavianator avatar Feb 25 '25 15:02 tavianator

Are those really equivalent? I wonder about things like

void foo() { void* p1 = malloc(15); auto i1 = (uintptr_t)p1; void* p2 = src1(p1, 15, 16); assert(i1 != (uintptr_t)p2); }

realloc invalidates the original pointer so using it after realloc returns is UB. I'm not sure if casting to uintptr_t counts as a use in this sense though.

In any case, the optimization should still be applicable in cases where the integer value of the original pointer is not used

Kmeakin avatar Feb 25 '25 22:02 Kmeakin

extern "C" {
auto src1(void* p1, size_t old_len, size_t new_len) -> void* {
    void* p2 = malloc(new_len);
    memcpy(p2, p1, old_len);
    free(p1);
    return p2;
}
}

What if old_len is greater than new_len? You would get access violation for this, no?

Explorer09 avatar Mar 11 '25 10:03 Explorer09

extern "C" {
auto src1(void* p1, size_t old_len, size_t new_len) -> void* {
    void* p2 = malloc(new_len);
    memcpy(p2, p1, old_len);
    free(p1);
    return p2;
}
}

What if old_len is greater than new_len? You would get access violation for this, no?

If old_len > new_len, src1 exhibits undefined behaviour. So we can assume this case cannot happen

Kmeakin avatar Mar 12 '25 23:03 Kmeakin

extern "C" { auto src1(void* p1, size_t old_len, size_t new_len) -> void* { void* p2 = malloc(new_len); memcpy(p2, p1, old_len); free(p1); return p2; } }

What if old_len is greater than new_len? You would get access violation for this, no?

If old_len > new_len, src1 exhibits undefined behaviour. So we can assume this case cannot happen

There is no indication in source code that the behaviour is undefined if old_len > new_len. I personally consider this optimization is unsafe. If LLVM optimizes this, it would hide a bug. You should fix your code or just use realloc without this pattern.

Explorer09 avatar Mar 13 '25 00:03 Explorer09

extern "C" { auto src1(void* p1, size_t old_len, size_t new_len) -> void* { void* p2 = malloc(new_len); memcpy(p2, p1, old_len); free(p1); return p2; } }

What if old_len is greater than new_len? You would get access violation for this, no?

If old_len > new_len, src1 exhibits undefined behaviour. So we can assume this case cannot happen

There is no indication in source code that the behaviour is undefined if old_len > new_len. I personally consider this optimization is unsafe. If LLVM optimizes this, it would hide a bug. You should fix your code or just use realloc without this pattern.

The behaviour is absolutely undefined. p2 points to an allocation of size new_len but memcpy performs an access of size old_len. The proposed implementation does not introduce UB, the UB is already present in the original src1

Kmeakin avatar Mar 13 '25 00:03 Kmeakin

How about another case? Suppose you call src1(p1, 0, 8), and that the allocation of p2 fails within the src1 function. The realloc function is supposed to keep the original buffer if the allocation of new buffer fails. But with src1, p1 gets free'd instead. This makes the behaviour different from realloc.

Explorer09 avatar Mar 13 '25 01:03 Explorer09

extern "C" {
auto src1(void* p1, size_t old_len, size_t new_len) -> void* {
    void* p2 = malloc(new_len);
    memcpy(p2, p1, old_len);
    free(p1);
    return p2;
}

When I review this code again, I found out there is no handling when p2 allocation fails (p2 == NULL).

Even if you think allocation failure is possible (or you want it to be UB anyway), adding a conditional like if (p2 == NULL) unreachable(); would convey your intent more clearly.

Anyway I suggest this proposed optimization is incorrect. Devil is in the details and so the transform to realloc() by compiler could break things.

Explorer09 avatar Jun 30 '25 11:06 Explorer09