godot Use `LocalVector` in `Array`

Due to the fact that Array behaves like RefCount (if we have two references to the object and change one array, the other also changes).

var arr:Array[int]
for i in 5:
	arr.push_back(i)
print(arr)
var arr2:Array[int] = arr
arr2[0] = 1231
print(arr, arr2)

Output:

[0, 1, 2, 3, 4]
[1231, 1, 2, 3, 4][1231, 1, 2, 3, 4]

Array has a built-in SafeRefcount for this. So, using COWVector provides overhead.

Oct 04 '24 09:10 Nazarwadim

This makes calls to Array::assign not a constant time operation when types are compatible, some benchmarking is in order for different cases

Oct 04 '24 10:10 AThousandShips

@AThousandShips This is copying. Immediately after copying, the user will want to make some changes to the array, which in any case will make a copy on write.

Oct 04 '24 11:10 Nazarwadim

That's not granted though, you can't assume that just because someone makes a copy they want to mutate it, it might be passing an argument

Oct 04 '24 11:10 AThousandShips

When passing arguments, we use this constructor:

Array::Array(const Array &p_from) {
	_p = nullptr;
	_ref(p_from);
}

Where no copying occurs.

Oct 04 '24 11:10 Nazarwadim

That's not the only case though, for example constructing a TypedArray from any other array with non-identical typed, or using assign directly, this might be used in various places internally as well so it'd be good to have a benchmark to consider, and to weigh against the benefits of this change

For example:

var my_base_array : Array[Node]
var my_derived_array : Array[Node2D]
...
my_base_array.assign(my_derived_array)

Will not invoke any copying currently, but will with this

Or any use of assign, it doesn't use _ref at all, and is a standard way to handle various cases

This is also used in VariantParser::parse_value where it creates a second array, parses into it, and then uses array.assign with the temporary value: https://github.com/godotengine/godot/blob/f032af74536b317b23c7fca3bc7318ced5537344/core/variant/variant_parser.cpp#L1351-L1365

This will now always perform a full copy of the data instead of just referencing it

Oct 04 '24 11:10 AThousandShips

There are two performance aspects here we need to benchmark and assess:

The gain from removing any overhead from Vector
The loss from losing COW, as Array does not do COW itself

Oct 04 '24 11:10 AThousandShips

Aha. I'll look into that.

For the first point, I think godot-benchmark will be enough.

For the second, I will do something like passing the parameters of various types of arrays and iteration or something like that and measuring time.

Oct 04 '24 12:10 Nazarwadim

Benchmarks: results_master.json results_local_vector.json

About VariantParser::parse_value. Since array is usually typed and values is untyped, we will go to this part of the code. https://github.com/godotengine/godot/blob/f032af74536b317b23c7fca3bc7318ced5537344/core/variant/array.cpp#L248-L275 Where no COW occurs.

assign itself became slower due to copying(bigger array -> more time). In any case, assign with read only behavior is rarely used, so I don't think it will be a problem.

Oct 04 '24 18:10 Nazarwadim

So far this seems to make sense, and the bechmarks also appear to convalidate.

Oct 06 '24 12:10 reduz

What's the verdict, then?

Nov 13 '24 20:11 Mickeon

I thought the main reason to have Array (and packed arrays) be Vector internally was to to able to quickly and safely exchange data between GDScript and the engine, using CoW copies to ensure validity.

This change would eliminate this property from Array. I'm not sure if the performance benefits would be worth it. I expect far more time to be lost on allocation than on ref semantics.

Jan 05 '25 11:01 Ivorforce

One of Vector bottlenecks is the size() method. When profiling, I often see it take a lot, maybe even more than _copy_on_write. All this is because when using LoclVector we already have its size in cache, while for Vector we need to get the size from possibly a cold cache.

Also, I think our discussion about copying will be resolved by benchmarks in tps-demo and voxel. I will try to make 2 benchmarks. One standard to see where the frame time will be smaller, and the other to compare the number of copies that occur in CowData(_copy_on_write) and LocalVector(operator=, LocalVector(const LocalVector &p_from)) for Variant.

Jan 05 '25 11:01 Nazarwadim

One of Vector bottlenecks is the size() method. When profiling, I often see it take a lot, maybe even more than _copy_on_write. All this is because when using LoclVector we already have its size in cache, while for Vector we need to get the size from possibly a cold cache.

You're right, this irked me too. In fact, i have made a proposal for this exact problem! 😄 https://github.com/godotengine/godot-proposals/issues/11343

Also, I think our discussion about copying will be resolved by benchmarks in tps-demo and voxel.

I don't think benchmarks can really solve this problem because this is a conceptual change. You'll easily find examples where either implementation will be faster. It's up to us to decide which of the two is more important, and personally, i think the CoW semantics are essential enough in some places to warrant using Vector internally.

Jan 05 '25 12:01 Ivorforce