`<functional>`: Avoid double wrapping in `move_only_function` construction
WG21-P2548R6 relaxed some requirements for polymorphic function wrappers by adding wording in [func.wrap.general].
- Let `t` be an object of a type that is a specialization of `function`, `copyable_function`, or `move_only_function`, such that the target object `x` of `t` has a type that is a specialization of `function`, `copyable_function`, or `move_only_function`. Each argument of the invocation of `x` evaluated as part of the invocation of `t` may alias an argument in the same position in the invocation of `t` that has the same type, even if the corresponding parameter is not of reference type.
  [Example 1: `move_only_function<void(T)> f{copyable_function<void(T)>{[](T) {}}}; T t; f(t); // it is unspecified how many copies of T are made` -- end example]
- Recommended practice: Implementations should avoid double wrapping when constructing polymorphic wrappers from one another.
However, if I understand correctly, we can't avoid double wrapping in construction of `function`, even in vNext, because its target object is observable via the `target` member function.
For move_only_function and copyable_function, it seems possible to unwrap in construction, because the target object is not observable and thus can be non-existent under some conditions.
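To make the observability point concrete, here is a minimal sketch using only `std::function` (C++11). The function name `inner_is_observable` is made up for illustration, and the check assumes RTTI is enabled, since `target` relies on it in common implementations.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: reports whether the outer std::function exposes the
// wrapped std::function as its target object.
bool inner_is_observable() {
    std::function<int(int)> inner = [](int x) { return x * 2; };
    // The signatures differ, so `outer` stores a copy of `inner` as its target
    // rather than being copy-constructed from it.
    std::function<long(int)> outer = inner;
    assert(outer(21) == 42);
    // target<T>() returns a non-null pointer only if the stored target has type T.
    return outer.target<std::function<int(int)>>() != nullptr;
}
```

Because a program can ask for the inner wrapper like this, `function` has to keep it as a real object. `move_only_function` and `copyable_function` have no `target` member, so nothing equivalent can be observed there.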
I think it's better to treat the allowance as a DR against C++23 (i.e. to implement it for move_only_function unconditionally), because that is ABI-critical and libstdc++ has recently started doing so.
Personal concerns:
- When constructing a `move_only_function` from an empty `function`, we need to keep the throwing-on-invocation behavior, which means that the constructed `move_only_function` can't be empty.
- It might be better to recognize program-defined specializations and avoid invalid unwrapping for them. But since support for program-defined specializations of `function` is already broken, and the program-defined specializations can hardly be helpful, it might also be plausible not to do this.
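As a C++11 baseline for the first concern, the sketch below shows the throwing-on-invocation behavior with `std::function` alone. It is only an analogy for the `move_only_function` case; the helper name `empty_inner_still_throws` is hypothetical.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: returns true if invoking a wrapper that was built from
// an empty std::function throws std::bad_function_call.
bool empty_inner_still_throws() {
    std::function<int(int)> inner;           // empty
    std::function<long(int)> outer = inner;  // std::function propagates emptiness
    assert(!outer);
    try {
        outer(1);
    } catch (const std::bad_function_call&) {
        return true;
    }
    return false;
}
```

For `std::function` the outer wrapper can simply be empty, because invoking an empty `function` throws. As far as I understand, invoking an empty `move_only_function` is undefined behavior instead, which is why an unwrapping constructor there would need some non-empty state that still throws.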
> However, if I understand correctly, we can't avoid double wrapping in construction of `function`, even in vNext, because its target object is observable via the `target` member function.
I'm not sure. I mean, object composition does not necessarily preclude call unwrapping.
Imagine move_only_function wraps another move_only_function. Or even many times nested. And imagine that here:
https://github.com/microsoft/STL/blob/da7307d68dbf39a805372e91a740393cbf06595c/stl/inc/functional#L1691-L1693
instead of this->_Data we have some getter that returns the innermost data, and this->_Get_invoke() returns the innermost callable.
I definitely think we should ignore the potential for program-defined specializations of function. Nobody's going to go to that effort.
> I'm not sure. I mean, object composition does not necessarily preclude call unwrapping.
Good. It seems to me that we can unwrap calls in function::operator() in an ABI-preserving way.
I thought about it for some time.
We should distinguish call unwrapping and composition avoidance.
Ideally we'd have both, but one may be implemented without the other.
They are related, but not completely dependent on each other.
Call unwrapping
Call unwrapping is doing as few calls as possible.
For one functional object, there are 3 calls performed:
1. Call to the exterior interface. This passes parameters according to the signature passed as the template parameter. In an optimized build, it is expected to be inlined.
2. Call via pointer or vTable to the wrapper. This serves type erasure purposes, so it is never inlined. This passes parameters by reference.
3. Call to the inner callable. This passes parameters using `invoke`. This may or may not be inlined depending on the callable (notably, function pointers will not inline).
For a wrapped function object, there are 5 calls performed:
1. Call to the exterior interface.
2. Call via pointer or vTable to the wrapper.
3. Call to the inner callable == inner exterior interface.
4. Call via pointer or vTable to the wrapper.
5. Call to the inner callable.
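The two call chains above can be made visible with a toy wrapper. This is a sketch only: `tiny_fn`, `thunk_impl`, and `g_indirect_calls` are hypothetical names, the wrapper is non-owning and far simpler than anything in the STL, and it assumes C++17 for `std::invoke`.

```cpp
#include <cassert>
#include <functional>
#include <utility>

static int g_indirect_calls = 0;  // counts type-erased calls made through a pointer

template <class Sig> class tiny_fn;

// Hypothetical non-owning wrapper; it only references the callable it is given.
template <class R, class... Args>
class tiny_fn<R(Args...)> {
    void* obj_;
    R (*thunk_)(void*, Args&&...);

    template <class F>
    static R thunk_impl(void* p, Args&&... args) {
        ++g_indirect_calls;
        // Call to the inner callable, forwarding the arguments via invoke.
        return std::invoke(*static_cast<F*>(p), std::forward<Args>(args)...);
    }

public:
    template <class F>
    explicit tiny_fn(F& f) : obj_(&f), thunk_(&thunk_impl<F>) {}

    // Exterior interface: parameters follow the signature.
    R operator()(Args... args) const {
        // Type-erased call through a pointer: parameters go by reference.
        return thunk_(obj_, std::forward<Args>(args)...);
    }
};

// One wrapper around a lambda: a single indirect call.
int indirect_calls_single() {
    g_indirect_calls = 0;
    auto add = [](int a, int b) { return a + b; };
    tiny_fn<int(int, int)> f{add};
    assert(f(2, 3) == 5);
    return g_indirect_calls;
}

// A wrapper around another wrapper: the chain is walked twice.
int indirect_calls_nested() {
    g_indirect_calls = 0;
    auto add = [](int a, int b) { return a + b; };
    tiny_fn<int(int, int)> inner{add};
    tiny_fn<long(int, int)> outer{inner};
    assert(outer(2, 3) == 5);
    return g_indirect_calls;
}
```

In this toy, the nested case goes through `operator()`, `thunk_impl`, the inner `operator()`, the inner `thunk_impl`, and finally the lambda, which matches the five-call sequence.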
WG21-P2548R2 permits striking step 3 in this sequence. Doing only this, we get:
1. Call to the exterior interface.
2. Call via pointer or vTable to the wrapper.
3. Skip
4. Call via pointer or vTable to the wrapper.
5. Call to the inner callable.
This call elimination would not usually avoid a non-inlined call, but it avoids copying or moving objects.
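The copies and moves in question can be counted with a small experiment. This sketch uses only `std::function`; `Tracked` and the two helper functions are hypothetical names, and the exact counts depend on the standard library implementation, so none are stated here.

```cpp
#include <cassert>
#include <functional>

// Counts how many times an argument object is copy- or move-constructed.
struct Tracked {
    static int constructions;
    Tracked() = default;
    Tracked(const Tracked&) { ++constructions; }
    Tracked(Tracked&&) noexcept { ++constructions; }
};
int Tracked::constructions = 0;

// One wrapper around a lambda that takes its argument by value.
int constructions_direct() {
    std::function<int(Tracked)> f = [](Tracked) { return 0; };
    Tracked t;
    Tracked::constructions = 0;
    f(t);
    return Tracked::constructions;
}

// A wrapper around another wrapper. The return types differ, so `outer`
// stores `inner` as its target instead of being copy-constructed from it.
int constructions_nested() {
    std::function<int(Tracked)> inner = [](Tracked) { return 0; };
    std::function<void(Tracked)> outer = inner;
    Tracked t;
    Tracked::constructions = 0;
    outer(t);
    return Tracked::constructions;
}
```

Without any unwrapping, the nested call typically constructs one more `Tracked` than the direct call, because the inner exterior interface takes the argument by value again. Skipping that interface is what removes the extra construction.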
We can go further if the wrappers are compatible:
1. Call to the exterior interface.
2. Skip
3. Skip
4. Call via pointer or vTable to the wrapper.
5. Call to the inner callable.
Conversely, this call elimination does not further avoid copying or moving objects, but it does avoid one indirect call.
Allocation avoidance
Composition avoidance is effectively allocation avoidance.
When a function object is contained in another function object without any composition avoidance, one of the following is expected:
- Outer large function containing function object with inner small callable
- Outer large function containing function object with inner pointer to large callable
- Outer large function containing function object with no callable
The avoided composition cases are:
1. Outer small function containing original inner small callable. Avoids one allocation.
2. Outer large function containing pointer to original inner large callable. Avoids one allocation.
3. Outer small function containing pointer to original inner large callable. Avoids one allocation.
4. Outer small function containing a no-callable placeholder. Avoids one allocation.
2 and 3 are barely distinguishable ways of large function allocation avoidance.
4 is different from a potential "Outer function containing no callable" case because it allows the outer function to be non-empty and to preserve bad_function_call behavior if the inner one throws that (or termination behavior if the inner one does not).
Ideal scenario
We would ideally share machinery of move_only_function, copyable_function and function.
With wrapping in copyable_function or move_only_function, where we don't have .target(), we'd just transfer everything to the other, have full call unwrapping, and full allocation avoidance.
When we have .target(), we can no longer do any allocation avoidance. We can still do call unwrapping to 3 calls.
We can have this ideal scenario for move_only_function containing copyable_function, since copyable_function is not yet implemented (#3803).
For anything involving function, we can have that only in vNext.
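The "transfer everything to the other" idea can be sketched with two toy wrappers that share one representation. Everything here is hypothetical (`toy_rep`, `toy_inner`, `toy_outer`, `g_allocations`); the toy always heap-allocates and ignores small-buffer storage, copyability, and differing signatures, so it only illustrates the adoption step, not the STL's layout.

```cpp
#include <cassert>
#include <utility>

static int g_allocations = 0;  // heap allocations made by the toy wrappers

// Hypothetical shared representation: an owning target pointer plus an
// emulated vtable made of plain function pointers.
template <class R, class... Args>
struct toy_rep {
    void* obj = nullptr;
    R (*invoke)(void*, Args&&...) = nullptr;
    void (*destroy)(void*) = nullptr;

    template <class F>
    static R invoke_impl(void* p, Args&&... args) {
        return (*static_cast<F*>(p))(std::forward<Args>(args)...);
    }
    template <class F>
    static void destroy_impl(void* p) { delete static_cast<F*>(p); }

    // Type-erase a callable; this toy always allocates.
    template <class F>
    static toy_rep make(F f) {
        ++g_allocations;
        toy_rep r;
        r.obj = new F(std::move(f));
        r.invoke = &invoke_impl<F>;
        r.destroy = &destroy_impl<F>;
        return r;
    }
    // Hand the target over and leave this representation empty.
    toy_rep steal() {
        toy_rep r = *this;
        *this = toy_rep{};
        return r;
    }
};

template <class Sig> class toy_outer;
template <class Sig> class toy_inner;

template <class R, class... Args>
class toy_inner<R(Args...)> {
    toy_rep<R, Args...> rep_;
    friend class toy_outer<R(Args...)>;

public:
    template <class F>
    explicit toy_inner(F f) : rep_(toy_rep<R, Args...>::make(std::move(f))) {}
    toy_inner(toy_inner&& o) noexcept : rep_(o.rep_.steal()) {}
    ~toy_inner() { if (rep_.obj) rep_.destroy(rep_.obj); }
    R operator()(Args... args) { return rep_.invoke(rep_.obj, std::forward<Args>(args)...); }
};

template <class R, class... Args>
class toy_outer<R(Args...)> {
    toy_rep<R, Args...> rep_;

public:
    template <class F>
    explicit toy_outer(F f) : rep_(toy_rep<R, Args...>::make(std::move(f))) {}
    // Composition avoidance: adopt the inner wrapper's target directly.
    explicit toy_outer(toy_inner<R(Args...)>&& in) noexcept : rep_(in.rep_.steal()) {}
    toy_outer(toy_outer&& o) noexcept : rep_(o.rep_.steal()) {}
    ~toy_outer() { if (rep_.obj) rep_.destroy(rep_.obj); }
    R operator()(Args... args) { return rep_.invoke(rep_.obj, std::forward<Args>(args)...); }
};

// Builds an inner wrapper, adopts it into an outer one, and reports allocations.
int allocations_after_adoption() {
    g_allocations = 0;
    toy_inner<int(int)> inner{[](int x) { return x + 1; }};
    toy_outer<int(int)> outer{std::move(inner)};
    assert(outer(41) == 42);
    return g_allocations;
}
```

Only the original callable is allocated. The adopting constructor reuses the inner wrapper's pointer and function table, so the call chain stays at three calls as well. A `target()` member would rule this out, since the inner wrapper no longer exists as an object after adoption.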
Current ABI
With inner function, we can't do call unwrapping to 3 calls, as we have real vTable in function and vTable emulation in move_only_function and copyable_function.
We can hypothetically still change the approach in move_only_function to use the same vTable. But I believe the current move_only_function approach is superior. It avoids runtime type information bloat and wasting one pointer to self-reference. See:
- #964
- #969
- #2267
Also, vTable emulation is more flexible; we can build it from individual separate pieces. This may become very useful if we do allocation avoidance without call unwrapping (for functions with distinct signatures).
So what we can do with regard to call unwrapping is unwrapping to 4 calls.
For allocation avoidance, I think we can easily do it for large functions. Small functions seem to need further careful analysis, especially regarding alignment, but it appears doable too.
An outer function can probably also unwrap to 4 calls, but not to 3 calls.
Allocation avoidance is not available for outer function ever due to .target().
@frederick-vs-ja, apart from function, move_only_function, and copyable_function, there's also function_ref.
function_ref also has the invoke in its thunk pointed to by thunk-ptr.
[func.wrap.general] does not mention function_ref. Do you think it should?
> [func.wrap.general] does not mention `function_ref`. Do you think it should?
This is LWG-4264. I guess it should. But since function_ref has reference semantics and other polymorphic function wrappers have value semantics, I'm a bit afraid that there would be some unexpected effects (as mentioned in LWG-4264).
> But since `function_ref` has reference semantics and other polymorphic function wrappers have value semantics, I'm a bit afraid that there would be some unexpected effects (as mentioned in LWG-4264).
I see. We have to let the LWG decide then what is more important: the correct reference semantics or the optimization.
But I still want to leave the door for this optimization open, in case it will be ultimately permitted, and will try to draft an implementation with the optimization enabled, to see that it can work.