Missed optimization: malloc + memcpy + free => realloc
https://godbolt.org/z/YWxW7h8sP
#include <cstdlib>
#include <cstring>
#include <memory>
extern "C" {
auto src1(void* p1, size_t old_len, size_t new_len) -> void* {
void* p2 = malloc(new_len);
memcpy(p2, p1, old_len);
free(p1);
return p2;
}
auto tgt1(void* p1, size_t old_len, size_t new_len) -> void* {
return realloc(p1, new_len);
}
}
alive proof: https://alive2.llvm.org/ce/z/RwPKVk
Are those really equivalent? I wonder about things like
void foo() {
void* p1 = malloc(15);
auto i1 = (uintptr_t)p1;
void* p2 = src1(p1, 15, 16);
assert(i1 != (uintptr_t)p2);
}
Are those really equivalent? I wonder about things like
void foo() { void* p1 = malloc(15); auto i1 = (uintptr_t)p1; void* p2 = src1(p1, 15, 16); assert(i1 != (uintptr_t)p2); }
realloc invalidates the original pointer so using it after realloc returns is UB. I'm not sure if casting to uintptr_t counts as a use in this sense though.
In any case, the optimization should still be applicable in cases where the integer value of the original pointer is not used
extern "C" { auto src1(void* p1, size_t old_len, size_t new_len) -> void* { void* p2 = malloc(new_len); memcpy(p2, p1, old_len); free(p1); return p2; } }
What if old_len is greater than new_len? You would get access violation for this, no?
extern "C" { auto src1(void* p1, size_t old_len, size_t new_len) -> void* { void* p2 = malloc(new_len); memcpy(p2, p1, old_len); free(p1); return p2; } }What if
old_lenis greater thannew_len? You would get access violation for this, no?
If old_len > new_len, src1 exhibits undefined behaviour. So we can assume this case cannot happen
extern "C" { auto src1(void* p1, size_t old_len, size_t new_len) -> void* { void* p2 = malloc(new_len); memcpy(p2, p1, old_len); free(p1); return p2; } }
What if
old_lenis greater thannew_len? You would get access violation for this, no?If
old_len > new_len,src1exhibits undefined behaviour. So we can assume this case cannot happen
There is no indication in source code that the behaviour is undefined if old_len > new_len. I personally consider this optimization is unsafe. If LLVM optimizes this, it would hide a bug. You should fix your code or just use realloc without this pattern.
extern "C" { auto src1(void* p1, size_t old_len, size_t new_len) -> void* { void* p2 = malloc(new_len); memcpy(p2, p1, old_len); free(p1); return p2; } }
What if
old_lenis greater thannew_len? You would get access violation for this, no?If
old_len > new_len,src1exhibits undefined behaviour. So we can assume this case cannot happenThere is no indication in source code that the behaviour is undefined if
old_len > new_len. I personally consider this optimization is unsafe. If LLVM optimizes this, it would hide a bug. You should fix your code or just usereallocwithout this pattern.
The behaviour is absolutely undefined. p2 points to an allocation of size new_len but memcpy performs an access of size old_len. The proposed implementation does not introduce UB, the UB is already present in the original src1
How about another case? Suppose you call src1(p1, 0, 8), and that the allocation of p2 fails within the src1 function. The realloc function is supposed to keep the original buffer if the allocation of new buffer fails. But with src1, p1 gets free'd instead. This makes the behaviour different from realloc.
extern "C" { auto src1(void* p1, size_t old_len, size_t new_len) -> void* { void* p2 = malloc(new_len); memcpy(p2, p1, old_len); free(p1); return p2; }
When I review this code again, I found out there is no handling when p2 allocation fails (p2 == NULL).
Even if you think allocation failure is possible (or you want it to be UB anyway), adding a conditional like if (p2 == NULL) unreachable(); would convey your intent more clearly.
Anyway I suggest this proposed optimization is incorrect. Devil is in the details and so the transform to realloc() by compiler could break things.