bids-validator
bids-validator copied to clipboard
FIX: Close NIfTI files after read, before parsing headers
Replacement of #1583, revival of #691. Could close #675 if driven by open NIfTI files and not other open files.
Opening a separate PR to let CI do its magic and tell me if this is dumb.
Codecov Report
Base: 83.37% // Head: 83.38% // Increases project coverage by +0.00% :tada:
Coverage data is based on head (
02ec049) compared to base (66b7c5a). Patch coverage: 85.71% of modified lines in pull request are covered.
Additional details and impacted files
@@ Coverage Diff @@
## master #1590 +/- ##
=======================================
Coverage 83.37% 83.38%
=======================================
Files 91 91
Lines 3736 3737 +1
Branches 1141 1141
=======================================
+ Hits 3115 3116 +1
Misses 525 525
Partials 96 96
| Impacted Files | Coverage Δ | |
|---|---|---|
| bids-validator/utils/files/readNiftiHeader.js | 68.25% <85.71%> (+0.51%) |
:arrow_up: |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
@yarikoptic Here's a rewrite of #1583 that I believe does what we want. The callback mode of writing is still clunky to me. I'm not sure if I could have written instead:
fs.read(fd, buffer, 0, bytesRead, 0, function () {
+ fs.close(fd, function () {})
if (file.name.endsWith('.nii')) {
Either way, this passes, and I am reasonably confident that it closes the file descriptor as soon as possible, if you want to try your test and watch for open files.
@yarikoptic Here's a rewrite of #1583 that I believe does what we want. The callback mode of writing is still clunky to me. I'm not sure if I could have written instead:
fs.read(fd, buffer, 0, bytesRead, 0, function () { + fs.close(fd, function () {}) if (file.name.endsWith('.nii')) {Either way, this passes, and I am reasonably confident that it closes the file descriptor as soon as possible, if you want to try your test and watch for open files.
You could do that but fs.closeSync(fd) is better since it will close right away instead a future event loop, otherwise it's possible the close is deferred and stacks up depending on how this is called.
Garbage collection closes these pretty quickly anyways but if the explicit file handle close fixes things, it's likely the function closure contributes to slow collection of the file handles. Refactoring this to not use any closures with the file handle would probably also help (putting the file handle only on a function stack) and save a bit of memory in the process.
cool... I will need to recall how I have managed to try it all up last time and either it was not mitigated for me via ulimits already so I would not observe the ultimate problem ;)
@nellh something like this?
function peekFile(path, buffer, count) {
let fd;
let ret;
try {
fd = fs.openSync(path)
ret = fs.readSync(fd, buffer, 0, count, 0)
} finally {
if (fd !== undefined) {
fs.closeSync(fd)
}
}
return ret
}
function extractNiftiFile(file, callback) {
const bytesRead = 1024
const buffer = Buffer.alloc(bytesRead)
try {
ret = peekFile(file.path, buffer, bytesRead)
} catch (err) {
callback({ error: new Issue({ code: 44, file: file }) })
return
}
if (file.name.endsWith('.nii')) {
callback(parseNIfTIHeader(buffer, file))
} else {
try {
const data = pako.inflate(buffer)
callback(parseNIfTIHeader(data, file))
} catch (err) {
callback(handleGunzipError(buffer, file))
}
}
}
I don't know if you can do try/finally like in Python or you need a catch, or if it would be better to inline the peekFile code. Also not really sure about the scoping rules, so I kept the buffer allocation in extractNiftiFile.
@effigies Looks pretty good! try ... finally is newer to standard JavaScript but it is available in all the environments the validator targets now.
You could also use async with a stream instead of blocking I/O, since the caller just needs the callback to run eventually. This will yield to the event loop if the for await (... of stream) blocks waiting for I/O. This also closes the file handle internally after reading the byte range, so it doesn't need to be explicitly called. If there's an error while opening or reading from the file, it calls close on the file handle and throws the error to the try ... catch in extractNiftiFile.
async function peekFile(path, bytes) {
const stream = fs.createReadStream(path, { start: 0, end: bytes })
const blocks = []
for await (const block of stream) {
blocks.push(block)
}
return Buffer.concat(blocks)
}
async function extractNiftiFile(file, callback) {
const bytesRead = 1024
try {
const buffer = await peekFile(file.path, bytesRead)
if (file.name.endsWith('.nii')) {
callback(parseNIfTIHeader(buffer, file))
} else {
try {
const data = pako.inflate(buffer)
callback(parseNIfTIHeader(data, file))
} catch (err) {
callback(handleGunzipError(buffer, file))
}
}
} catch (err) {
callback({ error: new Issue({ code: 44, file: file }) })
}
}
could this PR be finalized/merged/released? I keep coming to https://github.com/bids-standard/bids-validator/issues/675 over and over again just to eventually finally recall about this problem / tentative solution.
Does it work? My impression was we hadn't figured out the issue fully.
could you remind me the trick on how we managed to run the modified code while reusing existing singularity (IIRC) container with all the libs ? I would give it a shot again
I would need to try to work it out again. I don't have a docker command in my shell history where I patched the source.