ConstructionBase.jl
ConstructionBase.jl copied to clipboard
Using @generated less?
We are relying on @generated and it may create some concerns when using this package (e.g., https://github.com/mauro3/Parameters.jl/issues/50).
I think the biggest issue is that setproperties generates a function for different patch type (e.g., different combination or even ordering of the property names).
It would be nice to write setproperties based on constant-propagatable code (e.g., recursions) and avoid using @generated. However, the major obstacle is that fieldnames cannot be inferred. To workaround this, how about just use
@generated __fieldnames__(::Type{T}) where T = fieldnames(T)
? We already use fieldnames inside @generated so this is not worse than the current situation.
Packages that controls struct definition like Parameters.jl can even define __fieldnames__ and constructorof to completely avoid @generated.
I think we should investigate the following claims before we make a decision on recursion vs generated.
- Claim: Recursion generates as optimal code as
@generated. What I am worried about is that performance could be bad in rare cases, where inline heuristics or const prop fails. - Claim: Recursion is more compiler friendly. We should measure compilation time to see if tis true. I am somewhat sceptical. In the end we want to generate code that is as good as a handwritten constructor call. I don't see how the compiler could get there faster using recursion then generated funciton, where it is essentially already handed over the right thing.
A mechanism to hook into with Parameters.jl to avoid both generated and recursion-hoops would certainly be good.
@mauro3 Actually, I was wrong to present it like __fieldnames__ was the only solution. If you were going to overload __fieldnames__ (and constructorof) via struct-generating macro anyway, it would be solved by defining setproperties instead as well.
@jw3126 These are good points, and interesting ones. I just had a vague impression that Julia core devs generally discourage @generated.
I don't see how the compiler could get there faster using recursion then generated funciton, where it is essentially already handed over the right thing.
I guess for generating code, @generated would be strictly better because we control the end result more explicitly. But maybe @generated function can interact with inference when it appears in the middle of some complex code? Compiler would not be able to run the generator code without precise concrete types for all arguments but partial result may still propagate through normal (const-prop-friendly) code.
There is also if @generated. According to the manual
In this style of definition, the code generation feature is essentially an optional optimization. The compiler will use it if convenient, but otherwise may choose to use the normal implementation instead. This style is preferred, since it allows the compiler to make more decisions and compile programs in more ways, and since normal code is more readable than code-generating code. However, which implementation is used depends on compiler implementation details, so it is essential for the two implementations to behave identically.
I have no practical experience with if @generated though.
I think we should investigate the following claims before we make a decision on recursion vs generated.
* Claim: Recursion generates as optimal code as `@generated`. What I am worried about is that performance could be bad in rare cases, where inline heuristics or const prop fails. * Claim: Recursion is more compiler friendly. We should measure compilation time to see if tis true. I am somewhat sceptical. In the end we want to generate code that is as good as a handwritten constructor call. I don't see how the compiler could get there faster using recursion then generated funciton, where it is essentially already handed over the right thing.
See #22 for some investigations on these claims.
I also opened a question on discourse https://discourse.julialang.org/t/why-not-use-generated/29531
Most of my packages use recursion everywhere. Especially DimensionalData.jl and DynamicGrids.jl. I have a policy against generated because it reduces extensibility. But sometimes you need it anyway when something gets complicated, so I end up using a mix of both.
I think the key heuristic is to use recursion when I understand exactly what is happening inside a function call and know it will compile away, and to use generated if not.
With the pull request my instinct would be that setproperties is too complicated to compile away as well as the generated version, especially the do block. The compiler seems to back away from resolving constant propagation to decide if else blocks at some point. It might work if dispatch was used for everything.
@rafaqz Are you talking about the restriction that the generator body has to be pure? But I guess it is not an issue here since we don't call API functions insde setproperties?
As @jw3126's analysis https://github.com/JuliaObjects/ConstructionBase.jl/pull/22#issuecomment-538684801 shows that there is no benefit for going @generated-less approach (or at least to use my implementation), how about keeping this issue open but postpone actions until someone finds a concrete practical example where it makes sense to use non-@generated version?
Having said that, another hybrid approach may be to define a function asnamedtuple using generated function. We can then let Base.merge do the hard work:
function setproperties(obj, patch::NamedTuple) =
...some argument checking...
return constructorof(typeof(obj))(Tuple(merge(asnamedtuple(obj), patch))...)
end
This only invokes generated function for different types of obj, not for every combination of obj and patch. One benefit of going this way is that asnamedtuple can actually be useful sometimes.
I like the asmergedtuple function and the idea to implement setproperties using it. I agree that we can leave the issue as is until practical need arises (or somebody implements measurably better or equal solution than the current for fun).
FYI some discussion of if @generated started in https://stackoverflow.com/questions/59601652/when-should-i-use-optionally-generated-functions