jhdf
FileChannel usage causes blocking behavior
On Windows, there is a known bug where OpenJDK JREs do not fully release java.nio.channels.FileChannel objects on which FileChannel.map() was called. As a result, the referenced file cannot be deleted, even after the channel is closed. Since HdfFileChannel uses this call, it triggers the same bug, so any HDF file that jHDF opens and actually reads suffers the same fate.
For reference: https://bugs.openjdk.org/browse/JDK-4715154
Note: I encountered this with code that's a bit too complicated to post here, but this snippet should reproduce the problem. Test with an HDF5 file that contains at least one group and one dataset at root level (it may work with other layouts; this is just the one I tested with):
HdfFile h5 = new HdfFile(YOUR_PATH_HERE);
for (Node n : h5.getChildren().values()) {
    System.out.println(n.getName());
}
h5.close();
Files.delete(YOUR_PATH_HERE);
You'd expect this to work just fine, but instead you get an AccessDeniedException.
- jhdf 0.6.8
- Java version 1.8
- Windows 10
Note that OpenJDK has already stated that, while this is a bug, they have no intention of fixing it directly. If they ever overhaul their garbage collection, this Windows bug will be addressed at that point.
Thanks for using jHDF and reporting this issue.
This seems like a tricky issue for jHDF to fix, as it's a bug in Java on Windows. Do you have a suggestion for a fix you would like to see? Maybe I could add an option to disable memory mapping via a property, allowing this to be worked around at the expense of performance?
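To make the "disable memory mapping" idea concrete, here is a minimal sketch of reading a region of a file into an ordinary heap buffer with FileChannel.read() instead of FileChannel.map(). The class HeapRead and the method readToHeap are hypothetical names used for illustration only; they are not part of jHDF, and this is not how jHDF is implemented. The assumption is simply that a buffer which is not memory mapped does not keep the file locked once the channel is closed.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class HeapRead {

    // Hypothetical helper (not jHDF API): reads `length` bytes starting at
    // `offset` into a heap buffer, avoiding FileChannel.map() entirely.
    static ByteBuffer readToHeap(FileChannel channel, long offset, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        long position = offset;
        // A single read() may return fewer bytes than requested, so loop.
        while (buffer.hasRemaining()) {
            int read = channel.read(buffer, position);
            if (read < 0) {
                throw new EOFException("Reached end of file before reading " + length + " bytes");
            }
            position += read;
        }
        buffer.flip(); // prepare the buffer for reading by the caller
        return buffer;
    }
}
```

The trade-off is the one mentioned above: every read copies data into the Java heap, so large datasets would likely be slower and use more memory than with a mapped buffer.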
I'm not too sure. I've never gotten into the specifics of how HDF5 works.
On one hand, you could just ignore this problem. After all, jHDF is only a reader right now. If it's a concern for a user, they can make a copy in a tmp directory and delete their original, which makes this a low-priority issue. On the other hand, you could build that process into jHDF: an option that quietly makes a copy of the provided HDF file and operates on that. That would be handy for future read/write support, but it would surely take more space (and time) to copy. And the copy would still exist in the Windows temp dir... but it's at least an option.
I do like your solution though, it seems very clean. Maybe the best approach is to provide both solutions as options, and if the OS is Windows... pick one to default to? It seems a bit overengineered, but I don't have a better suggestion, sorry.
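For anyone who needs the copy-to-temp workaround today, a rough user-side sketch might look like the following. TempCopy and copyToTemp are hypothetical names, not jHDF API, and the "jhdf-copy-" prefix is arbitrary. The idea is to open the returned copy with jHDF instead of the original, so the original is never memory mapped and can be deleted freely.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempCopy {

    // Hypothetical helper (not jHDF API): copies the file into the system
    // temp directory and returns the path of the copy.
    static Path copyToTemp(Path original) throws IOException {
        Path copy = Files.createTempFile("jhdf-copy-", ".hdf5");
        // createTempFile already created an empty file, so overwrite it.
        Files.copy(original, copy, StandardCopyOption.REPLACE_EXISTING);
        // Best effort only: on Windows the delete at JVM exit may still fail
        // if the copy is memory mapped at that point.
        copy.toFile().deleteOnExit();
        return copy;
    }
}
```

As noted above, this costs extra disk space and copy time, and the temporary copy may linger in the temp directory.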