JSON.jl
JSON.jl copied to clipboard
use_mmap leaves file open
Calling parsefile with use_mmap=true leaves the file open until the finalizer for the result of mmap_array is called. This can be some time later, so the file typically can't immediately be written.
The fix is messy (not tested, just outline)
function parsefile(filename::AbstractString; ordered::Bool=false, use_mmap=true)
sz = filesize(filename)
open(filename) do io
a = nothing
if use_mmap
a = mmap_array(Uint8, (sz,), io)
s = UTF8String(a)
else
s = readall(io)
end
out = JSON.parse(s, ordered=ordered)
finalize(a)
out
end
end
so I wonder what value mmap is providing here, given that the array is converted to a string immediately anyway?
I don't understand the context of this issue. You say this is a fix but you don't show the code that would be replaced.
First, UTF8String
does not copy the array, it merely wraps it.
Second, the openfile(...) do io ... end
idiom does close the file, I believe.
The issue is not that the file object is kept open, it's that the file is kept mmap'd, which apparently causes issue on windows.
I don't think calling finalize
is a good idea though.
See https://github.com/JuliaLang/julia/issues/12011
@nalimilan Yes, pulling the fix from that thread back to here as it is currently an issue in JSON parsefile that means that if I call parsefile with default arguments, I cannot write to the file for some unknown amount of time. In my case, running in a notebook,I have needed to restart the kernel in order to write the file. It is a common pattern to read a file, do something, and write the same file, so this is a debilitating feature of the current implementation.
@yuyichao "Kept open" vs "kept mmapped" may be a distinction, but the outcome is the same.
@yuyichao I don't understand what you mean by "calling finalize is a [bad] idea". If you mean it might be slow, then that's easily trumped by correctness. If there are other potential difficulties, please state them to see if they can be fixed.
Bottom line: existing code leaks a filehandle or something like it. The do
pattern doesn't prevent it. Options to fix: ignore use_mmap on Windows, or finalize as above. Final option: add a comment like that on readdlm saying "this doesn't work on windows, set use_mmap to true only if you don't care about leaving the file unwriteable". I personally find that quite distasteful.
@dmbates Apologies, I'm new to Julia and to the conventions for issue management. I'm not proposing a patch, just getting feedback on how people would like to handle this issue. The code that is being replaced is the current function parsefile, defined in JSON.jl, i.e.
function parsefile(filename::AbstractString; ordered::Bool=false, use_mmap=true)
sz = filesize(filename)
open(filename) do io
s = use_mmap ? UTF8String(mmap_array(Uint8, (sz,), io)) : readall(io)
JSON.parse(s, ordered=ordered)
end
end
@yuyichao "Kept open" vs "kept mmapped" may be a distinction, but the outcome is the same.
I know. I'm just trying to explain what @dmbates might be confused about.
I don't understand what you mean by "calling finalize is a [bad] idea". If you mean it might be slow, then that's easily trumped by correctness.
More or less, but I think a better way would be to fix the interface of mmap'ed arrays to give a better finalization interface.
Has there been any progress in finalization of mmap'ed arrays in Julia since 2015?
I was visiting these pages in order to report that on Windows using JSON.parsefile
leaves the file locked, but then noticed that the issue had been raised back in 2015. It was good to learn that passing the use_mmap
argument as false
is a work-around (which I confirmed with some quick tests) but this feature is indeed quite debilitating on Windows, as @awf remarked.