HDF5.jl
HDF5.jl copied to clipboard
RFC: Initialize runtime globals during __init__
This isn't necessarily a real proposal — I wanted to explore how far we could get with actually doing runtime initialization of the in-principle dynamic global "constants" from the HDF5 library while simultaneously not adding a burden on users to know whether a value is a true constant or a runtime-constant.
As it turns out, I was able to hide the difference from users by using (abusing?) getproperty defined on structs. The basic idea is:
- An overload of
getpropertycan evaluate an expression for each use ofname.symbol, so a largeif-elseifblock can be used to hide whether a value is a true constant (and therefore returnsymbolname) or a dynamic runtime value stored in aRef(and therefore returnsymbolname[]). getpropertycannot be defined on a module, so I've "abused" a singletonstructfor method dispatch. The runtime values are actually stored in a [lightly] name-mangled bare module.- Storing the values within a module and not just reusing the existing top-level names means deprecation bindings can be appropriately set since the representation of using the constants directly does have to change for the
Refcontainers.
- Storing the values within a module and not just reusing the existing top-level names means deprecation bindings can be appropriately set since the representation of using the constants directly does have to change for the
- Likewise, overloading
setindex!allows easily initializing the value during__init__no matter the internal implementation.
Pros:
- This is the "correct" thing to do given runtime-constant values.
- It puts something like defining a HDF5 data type for
Float16on similar footings to all the built-in data types — the datatype is constructed at initialization and just fills in itsRefcontainer. No need for runtime@evalwhich is otherwise necessary to avoid requiring users to do the Ref-indexing.- Note that the last commit is insufficient for
Float16to actually work — there's some changes to uses ofh5t_get_native_typein #341 that I haven't replicated here, and not including those changes means the genericread()doesn't work due to size mismatch on the actual Float16 datatype (2 bytes) versus it's native-ified datatype (4 bytes). I've just included the last commit to show that adding a new "constant" is actually very easy with this scheme.
- Note that the last commit is insufficient for
- The constants are namespaced, so they can actually be more conveniently access by just exporting the top-level "module" names. An overload of
propertynamesalso means you get very convenient tab completion as well.
Cons:
- This adds a non-trivial macro to the precompile path. The macro definitely makes defining this "too clever" hack easier, but it must surely add to the precompile time.
- Likewise, runtime initializing all of the
Refcontainers at__init__must add a bit of load time on normal uses as well. - I've been playing around with this in Julia v1.6 master where Julia seems to constant-propagate the
getpropertys to just the relevant expression, but I haven't checked prior versions for whether this is actually adding a large runtime if-else chain to every use. - It's in general maybe just too clever for its own good :-P
Nice, I haven't fully looked at the PR in detail, but I think we should go for something like this.
I actually prototyped a similar idea before hand over at https://github.com/musm/HDF5.jl/tree/refs, however I used constant global refs, which where initialized at runtime, in a dummy way without magic (which is super nice in terms of the overall user interface and ergonomics). I never fully pushed it through, but this PR got me interested to compare how the different approaches effect package precompilation and load times.
Here are the timings I get (using your gist for timings)
master precompile: 2.271 ± 0.0576
master pkg load: 0.597 ± 0.0195
# https://github.com/musm/HDF5.jl/tree/refs
refs precompile: 2.313 ± 0.0670
refs pkg load: 0.626 ± 0.0389
# this pr
init precompile: 3.116 ± 0.0706
init pkg load: 0.688 ± 0.0316
I'm not too concerned with the precompile times, since the PR here does more codegen . Comparing the pkg load times, it seems that the hit is really not too bad in general. Using const refs, does seem to offer improved precompilation/pkgload times, which I am assuming come from the fact that they are global constants and that this helps compilation since the types are known a priori to runtime initialization.
In summary, the pkgload time hit using this PR is 15%, and using constant refs 5% (https://github.com/musm/HDF5.jl/tree/refs). The constant ref approach, does make using the constants less ergonomic, which is why I think I held off from it, since I couldn't think of a clean interface for easily accessing the constants.
EDIT: I should actually review the PR in detail before commenting, because this does use const refs, I just didn't realize it since I didn't look at the PR in detail 😄
I think in principle it should be possible to segfault master HDF5, by opening multiple HDF5.jl instances from different Julia processes, which would demonstrate why we need such a PR that makes these constants runtime initialized. Although, I haven't been able to successfully to do so.
I just added a new commit which brings down both the precompile and load time just a little bit. The issue was that I was defining the getproperty/setproperty! overloads on the singleton types, but that leads to a huge number of invalidations in Base. By instead defining a singleton instance of the type and then defining the functions on the instance, the invalidations are eliminated, which gives a small speed-up.
The SnoopCompile setup:
julia> using SnoopCompileCore
julia> invs = @snoopr using HDF5;
julia> using SnoopCompile
Before: (commit b6ab18b using singleton types)
julia> trees = invalidation_trees(invs)
1-element Vector{SnoopCompile.MethodInvalidations}:
inserting getproperty(::Type{H5E}, sym::Symbol) in HDF5 invalidated:
backedges: 1: superseding getproperty(x::Type, f::Symbol) in Base at Base.jl:28 with MethodInstance for getproperty(::DataType, ::Symbol) (1220 children)
3 mt_cache
After: (commit f09463a using singleton instances)
julia> trees = invalidation_trees(invs)
SnoopCompile.MethodInvalidations[]
And timing precompile and load from master and the before/after singleton commits:
master precompile: 2.085 ± 0.0499
master pkg load: 0.594 ± 0.0299
# using singleton types
runtime precompile: 2.807 ± 0.0309
runtime pkg load: 0.650 ± 0.0287
# using singleton instances
runtime precompile: 2.542 ± 0.0290
runtime pkg load: 0.637 ± 0.0226
So in the new change H5P is just an alias for the singleton type _H5P.H5P() ?
So in the new change H5P is just an alias for the singleton type _H5P.H5P() ?
Yes.
Yeah I think this is pretty cool. In general, I'm comfortable with pushing through with this, because in principle it is the correct thing to do (runtime initialization) and the ergonomics of using these types is enhanced, albeit behind some chicanery, which I don't think is all that different from, say factorization using (get/set)property in LinearAlgebra. The only 'mischievous' thing here is that we are overloading a singleton instance. Could we piggyback off of this and use these instances for functions as well? I.e. h5r_create -> H5R.create? It feels like since we are going this route perhaps we should just commit to it for the interal api as well. (we could probably piggy back of the genbinding macro.
I wonder if we can block redefinitions of truly constant values or error them (without a runtime or compile time cost), because right now for non-refs things like H5S.runtimeconstant = 3 won't error , but is allowed, which probably should error.
Could we piggyback off of this and use these instances for functions as well? I.e. h5r_create -> H5R.create? It feels like since we are going this route perhaps we should just commit to it for the interal api as well. (we could probably piggy back of the genbinding macro.
It's definitely possible, but I'm a little worried that at some point the long chain of if-elseif statements are going to not optimize well and performance will suffer. It's something that could be looked at, though. (Maybe a two-level getproperty chain would resolve most of that, though — keep constants defined explicitly in getproperty and chain-call to a hidden module and let normal name resolution take place??)
I wonder if we can block redefinitions of truly constant values or error them (without a runtime or compile time cost), because right now for non-refs things like H5S.runtimeconstant = 3 won't error , but is allowed, which probably should error.
Ah, yes, I should definitely add an error path to both getproperty and setproperty!. They used to have one by default by way of chain-calling the type's getfield/setfield!, but since those went away after switching to using an instance, I should add back explicit errors.
It's definitely possible, but I'm a little worried that at some point the long chain of if-elseif statements are going to not optimize well and performance will suffer. It's something that could be looked at, though. (Maybe a two-level getproperty chain would resolve most of that, though — keep constants defined explicitly in getproperty and chain-call to a hidden module and let normal name resolution take place??)
I haven't thought about the implementation in great detail, but that sounds like it could work. I guess that means we would have to hide all of the lower-level functions in same 'private/api' module.
Should we also fully commit to the hack and define show methods :trollface: ?
Base.show(io::IO, ::typeof(H5)) = print("H5")
Should we also fully commit to the hack and define
showmethods :trollface: ?Base.show(io::IO, ::typeof(H5)) = print("H5")
I'd been wondering the same thing.
I guess we should make a decision on the type of 'public' interface we want to go ahead with in the future.
If we commit to this style of using 'dot' to prefix the various HDF5 modules, instead of our current convention of using underscore, then this change is very desirable. If not, this change is still desirable in the sense that it improves initialization of global and makes them much more ergonomic to use instead of trying to figure out which is a runtime global and having to use x[] vs x to reference these constants, and we can tell people to use _ for functions, which admittedly feels less nice, if the global are using . to prefix them.
I'd been wondering the same thing.
I'm kind of interested to see if any other packages have utilized similar in sprit, 'hacks' (that feels a bit too strong a word here, but I think you get my point)