HDF5.jl icon indicating copy to clipboard operation
HDF5.jl copied to clipboard

freeze when `hdf5_type_id` on self-referential datatype

Open lukas-weber opened this issue 1 year ago • 17 comments

For example

using HDF5

struct A
    x::A
end

HDF5.hdf5_type_id(A) # hangs

For a recursive type, the function will keep calling itself, leading to an infinite loop. I am not a specialist, but I think what Julia does internally is to store the field x as a reference even though A is immutable.

As far as I know, there can be no self-reference cycles across multiple structs, so it is easy to detect this within a single call of hdf5_type_id. Maybe for now it is okay to just throw an error and fail gracefully in that case. I can implement that if wanted.

One practical example for this problem appearing is when trying to save Measurement from Measurements.jl. It contains a field of Measurements.Derivatives which is self-referential.

lukas-weber avatar Oct 27 '23 15:10 lukas-weber

I'm trying to figure out if HDF5 compound types support self-referential types like this.

mkitti avatar Oct 27 '23 18:10 mkitti

When I had a short look, I couldn’t find it. If it doesn’t, one approach would be to wrap it in a variable length array type that stores all referenced copies together with indices that show their relation.

lukas-weber avatar Oct 27 '23 18:10 lukas-weber

I just tried this patch and this does not seem to work either:

diff --git a/src/typeconversions.jl b/src/typeconversions.jl
index 1e82c659..a3f3a6b3 100644
--- a/src/typeconversions.jl
+++ b/src/typeconversions.jl
@@ -73,7 +73,8 @@ function hdf5_type_id(::Type{T}, isstruct::Val{true}) where {T}
     dtype = API.h5t_create(API.H5T_COMPOUND, sizeof(T))
     for (idx, fn) in enumerate(fieldnames(T))
         ftype = fieldtype(T, idx)
-        API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), hdf5_type_id(ftype))
+        _hdf5_type_id = ftype == T ? dtype : hdf5_type_id(ftype)
+        API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), _hdf5_type_id)
     end
     return dtype
 end

I get the following:

julia> using HDF5

julia> struct A
           x::A
       end

julia> HDF5.hdf5_type_id(A)
ERROR: HDF5.API.H5Error: Error adding field x to compound datatype
libhdf5 Stacktrace:
 [1] H5Tinsert: Invalid arguments to routine/Bad value
     can't insert compound datatype within itself

mkitti avatar Oct 27 '23 19:10 mkitti

  1. We could proceed with my patch which essentially allows the HDF5 library itself to error.
  2. We could just detect the condition and error directly. This may allow for us to provide a more direct and friendlier error message.

mkitti avatar Oct 27 '23 20:10 mkitti

Is this even supported by Julia? How do you construct an instance of A?

simonbyrne avatar Oct 29 '23 20:10 simonbyrne

ah, you have to do it via inner constructors... the problem is that you end up with an undef at some point, which I don't think we can really support.

simonbyrne avatar Oct 29 '23 20:10 simonbyrne

julia> mutable struct A
           x::A
           function A()
               self = new()
               self.x = self
               return self
           end
       end

julia> a = A()
A(A(#= circular reference @-1 =#))

julia> pointer_from_objref(a)
Ptr{Nothing} @0x000000770cacabc0

julia> pointer_from_objref(a.x)
Ptr{Nothing} @0x000000770cacabc0

julia> unsafe_load(Ptr{Ptr{Nothing}}(pointer_from_objref(a)))
Ptr{Nothing} @0x000000770cacabc0

For a mutable struct, this just turns into a bunch of pointers.

To store this kind of structure in HDF5 I think we would need some kind of analog to pointers. Either we store the array index to what is being pointed at or we use references some how.

That said I'm not sure if I can think of a way to do that generically.

mkitti avatar Oct 30 '23 01:10 mkitti

Actually, I think there are non-obvious ways to create multi-type self-referential cycles that still freeze the patched hdf5_type_id

mutable struct B
   x::Any
end

struct A
   x::B
end

a = A(B(nothing))
a.x.x=a

So to catch everything, the types encountered so far all need to be kept track of. Serializing this kind of stuff generically seems quite challenging. Maybe it is possible using the HDF5-native reference system?

lukas-weber avatar Oct 30 '23 16:10 lukas-weber

To store this kind of structure in HDF5 I think we would need some kind of analog to pointers. Either we store the array index to what is being pointed at or we use references some how.

One solution would be to only support isbitstypes by default: any other Julia types would require some sort of manual conversion function.

simonbyrne avatar Oct 30 '23 17:10 simonbyrne

I think this would be fine if it errored instead of hung. I think we need a type id cache to break cycles.

mkitti avatar Oct 30 '23 17:10 mkitti

Self-referential structs are not isbitstype so failing for non-isbitstypes also removes any hangs.

lukas-weber avatar Oct 30 '23 18:10 lukas-weber

My current solution is the following.

import HDF5: hdf5_type_id, API

function hdf5_type_id(::Type{T}, isstruct::Val{true}) where {T}
    cache = try
        task_local_storage(:hdf5_type_id_cache)::Dict{DataType,Int}
    catch
        task_local_storage(:hdf5_type_id_cache, Dict{DataType, Int}())
    end
    if haskey(cache, T)
        error("Cannot create a HDF5 datatype with fields containing that datatype.")
    end
    dtype = API.h5t_create(API.H5T_COMPOUND, sizeof(T))
    cache[T] = dtype
    try
        for (idx, fn) in enumerate(fieldnames(T))
            ftype = fieldtype(T, idx)
            _hdf5_type_id = hdf5_type_id(ftype)
            API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), _hdf5_type_id)
        end
    catch err
        rethrow(err)
    finally
        delete!(cache, T)
    end
    return dtype
end

mkitti avatar Oct 30 '23 18:10 mkitti

julia> struct C{A}
           x::A
       end

julia> struct B{A}
           x::C{A}
       end

julia> struct A
           x::B{A}
       end

julia> HDF5.hdf5_type_id(A)
ERROR: Cannot create a HDF5 datatype with fields containing that datatype.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] hdf5_type_id(#unused#::Type{A}, isstruct::Val{true})
    @ Main ./REPL[3]:8
  [3] hdf5_type_id
    @ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
  [4] hdf5_type_id(#unused#::Type{C{A}}, isstruct::Val{true})
    @ Main ./REPL[3]:15
  [5] hdf5_type_id
    @ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
  [6] hdf5_type_id(#unused#::Type{B{A}}, isstruct::Val{true})
    @ Main ./REPL[3]:15
  [7] hdf5_type_id
    @ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
  [8] hdf5_type_id(#unused#::Type{A}, isstruct::Val{true})
    @ Main ./REPL[3]:15
  [9] hdf5_type_id(#unused#::Type{A})
    @ HDF5 ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71
 [10] top-level scope
    @ REPL[13]:1

mkitti avatar Oct 30 '23 18:10 mkitti

Any feedback on the above?

mkitti avatar Nov 14 '23 16:11 mkitti

It seems pretty bullet proof to me. I’m not an expert on task_local_storage, but also don’t see a simpler way to do without. In any case, it is much better than the current behavior.

lukas-weber avatar Nov 14 '23 17:11 lukas-weber

Can we even write non-isbitstypes? Why not just throw an error?

simonbyrne avatar Nov 14 '23 17:11 simonbyrne

A mutable struct is not a bitstype. I'm not sure if we can write them yet, but I do think it would be possible in the future for us to write some simple mutable structs. If a mutable struct contains all bitstypes, then I think we can map it to a corresponding NamedTuple which we could then write.

julia> struct Foo
           x::Int
       end

julia> mutable struct Bar
           x::Int
       end

julia> HDF5.Datatype(HDF5.hdf5_type_id(Foo))
HDF5.Datatype: H5T_COMPOUND {
      H5T_STD_I64LE "x" : 0;
   }

julia> HDF5.Datatype(HDF5.hdf5_type_id(Bar))
HDF5.Datatype: H5T_COMPOUND {
      H5T_STD_I64LE "x" : 0;
   }

julia> isbitstype(Foo)
true

julia> isbitstype(Bar)
false

julia> to_namedtuple(x::T) where T = NamedTuple{fieldnames(T)}(ntuple(i->getfield(x,i), fieldcount(T)))
to_namedtuple (generic function with 1 method)

julia> to_namedtuple(Bar(5))
(x = 5,)

mkitti avatar Nov 15 '23 16:11 mkitti