HDF5.jl

Fix linking/unlinking in external.jl file

Open musm opened this issue 5 years ago • 4 comments

There are some hard-to-track bugs in this file. For one, the commented-out rm(fn2) is unable to unlink the file even though the handle is closed.

There's also the common warning that keeps showing up, both locally and on CI, when running the file directly:

┌ Warning: temp cleanup
│   exception =
│    IOError: unlink: resource busy or locked (EBUSY)
│    Stacktrace:
│     [1] uv_error
│       @ .\libuv.jl:97 [inlined]
│     [2] unlink(p::String)
│       @ Base.Filesystem .\file.jl:916
│     [3] rm(path::String; force::Bool, recursive::Bool)
│       @ Base.Filesystem .\file.jl:270
│     [4] temp_cleanup_purge(; force::Bool)
│       @ Base.Filesystem .\file.jl:516
│     [5] (::Base.var"#775#776")()
│       @ Base .\initdefs.jl:317
│     [6] _atexit()
│       @ Base .\initdefs.jl:338
│     [7] exit
│       @ .\initdefs.jl:28 [inlined]
│     [8] _start()
│       @ Base .\client.jl:488
└ @ Base.Filesystem file.jl:520

musm, Nov 03 '20 17:11

From https://portal.hdfgroup.org/display/HDF5/H5L_CREATE_EXTERNAL:

Restriction: A file close degree property setting (H5P_SET_FCLOSE_DEGREE) in the external link file access property list or in the external link callback function will be ignored. A file opened by means of traversing an external link is always opened with the weak file close degree property setting, H5F_CLOSE_WEAK.

My guess is that the external file reference is still open as Julia is shutting down, since close(source_file) isn't enough to actually close the implicit handle to target_file that is created when the external link is traversed. There is this function, though (https://portal.hdfgroup.org/display/HDF5/H5F_CLEAR_ELINK_FILE_CACHE), which might be enough if it were added to the close function for a file.
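For anyone unfamiliar with the setup, here is a minimal sketch of what "traversing an external link" looks like, written with h5py rather than HDF5.jl. The file names, dataset name, and helper function are hypothetical and not taken from external.jl. It only shows the normal case, where reading through the link makes the library open the target file implicitly; it does not reproduce the unlink failure.

```python
import os
import tempfile

import h5py
import numpy as np


def read_through_external_link(workdir):
    """Create a target file, link to it from a source file, and read via the link."""
    # Hypothetical file names, chosen only for this illustration.
    target_path = os.path.join(workdir, "target_file.h5")
    source_path = os.path.join(workdir, "source_file.h5")

    with h5py.File(target_path, "w") as target:
        target.create_dataset("data", data=np.arange(4.0))

    with h5py.File(source_path, "w") as source:
        # The link stores only a file name and an object path, not the data.
        source["linked"] = h5py.ExternalLink(target_path, "/data")

    with h5py.File(source_path, "r") as source:
        # Traversing the link causes the library to open target_file.h5
        # implicitly; user code never holds an explicit handle to it.
        values = source["linked"][...]
    return values


with tempfile.TemporaryDirectory() as workdir:
    values = read_through_external_link(workdir)
```

The point of interest is the last `with` block: only the source file is opened and closed explicitly, so whether the implicitly opened target is released is up to the library.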

jmert, Nov 03 '20 18:11

Trying that out doesn't actually work :-\

jmert, Nov 03 '20 18:11

Interesting. So, I found some breadcrumbs here: https://github.com/JuliaIO/HDF5.jl/issues/272. I tried following what h5py is doing, but it isn't working. Here's the branch: https://github.com/musm/HDF5.jl/tree/test_external_fix 😕

musm, Nov 03 '20 21:11

I think I ran into the same problem trying to access a file that contains virtual datasets. From what I can tell, HDF5 cannot access virtual datasets with an fclose_degree of H5F_CLOSE_STRONG. External links are probably subject to the same limitation.

To make sure it had nothing to do with Julia or HDF5.jl, I wrote a test case in C:

#include <stdlib.h>
#include <stdio.h>
#include <hdf5.h>

int main(int argc, char *argv[]) {
    hid_t aplid = H5Pcreate(H5P_FILE_ACCESS);
    if (aplid < 0) abort();
    //if (H5Pset_fclose_degree(aplid, H5F_CLOSE_STRONG) < 0) abort();

    hid_t fid = H5Fopen("test2.hdf5", H5F_ACC_RDONLY, aplid);
    if (fid < 0) abort();
    H5Pclose(aplid);

    hid_t did = H5Dopen(fid, "/test_copy", H5P_DEFAULT);
    if (did < 0) abort();

    double buf[4];
    if (H5Dread(did, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf) < 0) abort();

    H5Dclose(did);
    H5Fclose(fid);

    for (int i = 0; i < 4; i++)
        printf("%f\n", buf[i]);

    return 0;
}

The test files are created by the following Python script:

import h5py
import numpy as np

with h5py.File("/tmp/test1.hdf5", "w") as f:
    f.create_dataset("test", data=np.array([1,2,3,4]))

with h5py.File("/tmp/test2.hdf5", "w") as f:
    layout = h5py.VirtualLayout(shape=(4,), dtype=np.double)
    layout[:] = h5py.VirtualSource('/tmp/test1.hdf5', 'test', (4,))
    f.create_virtual_dataset('test_copy', layout)

It works as expected with line 8 of the C program (the H5Pset_fclose_degree call) commented out. Enabling line 8 generates the same error that I'm seeing in my Julia code.

In the latest master (since 1c2ccb5ccef745c0810d56ac482fca287b3e0ee2?), fclose_degree can be overridden by passing fclose_degree=:weak to h5open. Doing that allows me to read my test file with no issues. It may also enable your external data use case.
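As a rough illustration of what choosing the weak close degree amounts to, here is a sketch that sets it on a file access property list through h5py's low-level API. This is not the HDF5.jl call; it mirrors the commented-out H5Pset_fclose_degree line in the C test, with H5F_CLOSE_WEAK in place of H5F_CLOSE_STRONG. The helper name and file name are hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np


def open_with_weak_close(path):
    """Open an existing HDF5 file read-only with the weak file close degree."""
    # Build a file access property list and request H5F_CLOSE_WEAK on it.
    fapl = h5py.h5p.create(h5py.h5p.FILE_ACCESS)
    fapl.set_fclose_degree(h5py.h5f.CLOSE_WEAK)
    # The low-level open takes a bytes file name.
    fid = h5py.h5f.open(os.fsencode(path), h5py.h5f.ACC_RDONLY, fapl=fapl)
    # Wrap the low-level file id in the usual high-level File object.
    return h5py.File(fid)


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "weak_demo.h5")  # hypothetical file name
    with h5py.File(path, "w") as f:
        f.create_dataset("test", data=np.array([1.0, 2.0, 3.0, 4.0]))

    with open_with_weak_close(path) as f:
        values = f["test"][...]
```

This only shows where the setting lives; whether it resolves the virtual dataset or external link errors will depend on the HDF5 build in use.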

adambrewster, Nov 26 '21 20:11