[REG 2.111.0] assumeSafeAppend fails after slicing past length
void main()
{
ulong[] newData;
newData.reserve(2);
newData = newData.ptr[0 .. 1];
newData.assumeSafeAppend();
assert(newData.capacity >= 2);
}
The above program works in DMD 2.110 but not 2.111.
The reason for slicing past length is to support types which are neither copyable nor have a .init; after extending the array by slicing its .ptr, we can use moveEmplace to populate it. Currently, we don't really have something like moveAppend, and I can't immediately see how to write one.
There are functions to do this, but they aren't exposed via core.memory.
Check out https://github.com/dlang/dmd/blob/73846450e54832ae0d40b9bf505fb165124cd9cd/druntime/src/core/gc/gcinterface.d#L232
You would do it like:
import core.gc.interface : GC;
extern(C) GC gc_getProxy(); // awkward.
void main() {
ulong[] newData;
newData.reserve(2);
assert(gc_getProxy.expandArrayUsed(newData, ulong.sizeof)); // not as nice, requires extern(C) decl
newData = newData.ptr[0 .. 1];
assert(newData.capacity >= 2);
}
In other words, the primitives are there, but not the higher level functions. We just need to add them.
To be honest, assumeSafeAppend was not meant as a mechanism to expand array size. The primitive was called _d_arrayshrinkfit, and stated purpose was to 'Shrink the "allocated" length of an array to be the exact size of the array.'
The new way gives more semantic meaning to the operation. Growing is different from shrinking. This is also the same primitive used for appending. In the case you are posing, you basically want appending but without initialization.
Note that this expansion also covers resizing in place when you can add on more contiguous memory. This covers all the mechanics of typeless expansion in a way that was awkward with the old interface.
Now, the new GC interface functions are not exposed via core.memory API, because I was unsure if this would prove to be the best interface. Thoughts welcome, we have a chance to change this if there is a better idea.
Thanks. Is it possible to fix the regression?
You would do it like:
I'm trying this patch:
diff --git a/utils/vec.d b/utils/vec.d
index 337e530c..04087f47 100644
--- a/utils/vec.d
+++ b/utils/vec.d
@@ -118,7 +118,13 @@ struct Vec(T)
newData.reserve(newCapacity);
assert(newData.capacity >= newCapacity);
newData = newData.ptr[0 .. data.length];
- newData.assumeSafeAppend();
+ static if (__traits(hasMember, GC, "expandArrayUsed"))
+ {
+ auto res = gc_getProxy.expandArrayUsed(newData, T.sizeof);
+ assert(res);
+ }
+ else
+ newData.assumeSafeAppend();
foreach (i; 0 .. data.length)
moveEmplace(data[i], newData[i]);
data = newData[0 .. data.length];
@@ -227,6 +233,10 @@ private:
T[] data;
}
+// https://github.com/dlang/dmd/issues/21826#issuecomment-3264427803
+import core.gc.gcinterface : GC;
+private extern(C) GC gc_getProxy();
+
// Test object lifetime
debug(ae_unittest) unittest
{
but assert(res) fails (i.e. expandArrayUsed returns false).
You have the ordering of the operations backwards. Again, this is why I didn't expose directly, because I'm not sure of the best API. It's awkward to say the least.
You pass in the slice of what you have, and then tell it how much you want it expanded to. If you pass in a slice that doesn't match up to the current length, it's going to give up.
My thought process on this was, I don't want people constructing an invalid slice to find out it didn't work. First try and expand a valid slice (for which you likely already have one), and if that works, then re-slice it.
So for your code:
static if (__traits(hasMember, GC, "expandArrayUsed"))
{
auto res = gc_getProxy.expandArrayUsed(newData, data.length * T.sizeof);
assert(res);
newData = newData.ptr[0 .. data.length]; // has to go after expansion
}
else
newData = newData.ptr[0 .. data.length].assumeSafeAppend();
Could think about different API for the expansion. Essentially, we need the pointer, the expected "current" length, and the desired expanded length.
Maybe instead of returning a bool, it can return the expanded slice which you then cast? Or maybe a higher-level function can wrap this and be more type-safe?
Is it possible to fix the regression?
Yeah, it's possible. It already has to fetch the existing slice, so it could check, is this going to be bigger, and if so, try the expand function.
Code has changed a bit, as the GSOC student has been moving things.
You have the ordering of the operations backwards. Again, this is why I didn't expose directly, because I'm not sure of the best API. It's awkward to say the least.
Ah, sorry, that's on me - I should have read the documentation more closely.
So for your code:
Works great, thank you!
Maybe instead of returning a bool, it can return the expanded slice which you then cast? Or maybe a higher-level function can wrap this and be more type-safe?
Well, the higher-level function that was missing for me here was moveAppend, but maybe that's too high-level and situational.
I'm guessing it would be adjacent to uninitializedArray / minimallyInitializedArray? In which case accepting a length (in elements rather than bytes), and the array by ref (or returning the new array) makes sense.
My understanding of the original intention of assumeSafeAppend was to get D1-style array extending (look up any talk about migrating from D1 to D2 given by sociomantic way back when).
IIRC, sociomantic's use was that the array never grew beyond its initial allocated capacity, so acted as a reusable buffer.
char[] array = new char[64];
auto ptr = array.ptr;
array = array[0 .. 1];
array ~= "foo";
array = array[0 .. 2];
array ~= "bar";
assert(array.ptr == ptr);
Well, the higher-level function that was missing for me here was
moveAppend, but maybe that's too high-level and situational.
Yeah, it's a bit interesting, because normally, you would just use an append operation! But in this case, you need to move the data in, so the lifetime requirements are different. reminds me of std::vector push_back vs. emplace_back. One thing to consider here, is that reserve is going to make copies of existing data, so something that can't be copied, I don't know what happens here. reserve might break in the future (or appending in general) if we are being honest about what the runtime is doing.
C++ can do this because it's moving data, not copying. But D array extending makes a copy always on reallocation, because there is no ownership of data from the calling slice. Your wrapper type may be different. But it might not make sense in terms of D's runtime library to provide.
My understanding of the original intention of
assumeSafeAppendwas to get D1-style array extending (look up any talk about migrating from D1 to D2 given by sociomantic way back when).
Yes, the point was to be able to mimic the reuse-buffer behavior of D1. There was a discussion on it in the NG when first released. https://forum.dlang.org/post/[email protected] and https://forum.dlang.org/post/[email protected]