JLD.jl
A simple stack corruption case with HDF5 reader and JLD writer
I encountered a strange error while writing some Julia code for my project and distilled it into the code below. It reads data from an HDF5 file in a separate Task and saves calculated results to a JLD file.
using JLD
using HDF5

h5open("a.hdf5", "w") do file
    a = reshape([1],1,1,1,1)
    file["data"] = a
end

function io_task_impl()
    while true
        file = h5open("a.hdf5", "r")
        produce((1,1,file["data"]))
        close(file)
        produce((1,0,nothing))
    end
end

io_task = Task(io_task_impl)

while true
    jldopen("aaa.jld", "w") do file
        write(file, "arglist", (Float64, Vector([28*28, 300, 10]), 100))
    end
    d = consume(io_task)
end
The code above produces the following error after a few loops:
HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 0:
  #000: H5Dio.c line 271 in H5Dwrite(): can't prepare for writing data
    major: Dataset
    minor: Write failed
  #001: H5Dio.c line 352 in H5D__pre_write(): can't write data
    major: Dataset
    minor: Write failed
  #002: H5Dio.c line 788 in H5D__write(): can't write data
    major: Dataset
    minor: Write failed
  #003: H5Dcontig.c line 580 in H5D__contig_write(): contiguous write failed
    major: Dataset
    minor: Write failed
  #004: H5Dscatgath.c line 678 in H5D__scatgath_write(): datatype conversion failed
    major: Dataset
    minor: Can't convert datatypes
  #005: H5T.c line 4816 in H5T_convert(): data type conversion failed
    major: Attribute
    minor: Unable to encode value
  #006: H5Tconv.c line 2571 in H5T__conv_struct_opt(): unable to convert compound datatype member
    major: Datatype
    minor: Unable to initialize object
  #007: H5T.c line 4816 in H5T_convert(): data type conversion failed
    major: Attribute
    minor: Unable to encode value
  #008: H5Tconv.c line 2172 in H5T__conv_struct(): not a datatype
    major: Datatype
    minor: Inappropriate type
ERROR: Error writing dataset
 in h5d_write at /Users/gloine/.julia/v0.5/HDF5/src/plain.jl:1928
 [inlined code] from /Users/gloine/.julia/v0.5/HDF5/src/plain.jl:1803
 in write_compound at /Users/gloine/.julia/v0.5/JLD/src/JLD.jl:699
 in write at /Users/gloine/.julia/v0.5/JLD/src/JLD.jl:687
 in write at /Users/gloine/.julia/v0.5/JLD/src/JLD.jl:509
 in anonymous at none:3
 in jldopen at /Users/gloine/.julia/v0.5/JLD/src/JLD.jl:245
 [inlined code] from /Users/gloine/.julia/v0.5/JLD/src/JLD.jl:243
 in anonymous at no file:0
 in eval at /Applications/Julia-0.5.0-dev-b0a84f7a3b.app/Contents/Resources/julia/lib/julia/sys.dylib
Depending on what code I insert in between, it sometimes segfaults and sometimes produces the error above. The code writes the same tuple to the JLD file on every iteration, so a random failure is unexpected.
I am using the latest master branch (possibly several commits behind) on Mac OS X El Capitan. I read that HDF5 is not thread-safe by default. Would installing a thread-safe build of HDF5 solve the problem above?
Thanks, Gloine