bids-validator icon indicating copy to clipboard operation
bids-validator copied to clipboard

FIX: Close NIfTI files after read, before parsing headers

Open effigies opened this issue 2 years ago • 10 comments

Replacement of #1583, revival of #691. Could close #675 if driven by open NIfTI files and not other open files.

Opening a separate PR to let CI do its magic and tell me if this is dumb.

effigies avatar Jan 30 '23 21:01 effigies

Codecov Report

Base: 83.37% // Head: 83.38% // Increases project coverage by +0.00% :tada:

Coverage data is based on head (02ec049) compared to base (66b7c5a). Patch coverage: 85.71% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1590   +/-   ##
=======================================
  Coverage   83.37%   83.38%           
=======================================
  Files          91       91           
  Lines        3736     3737    +1     
  Branches     1141     1141           
=======================================
+ Hits         3115     3116    +1     
  Misses        525      525           
  Partials       96       96           
Impacted Files Coverage Δ
bids-validator/utils/files/readNiftiHeader.js 68.25% <85.71%> (+0.51%) :arrow_up:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov[bot] avatar Jan 30 '23 22:01 codecov[bot]

@yarikoptic Here's a rewrite of #1583 that I believe does what we want. The callback mode of writing is still clunky to me. I'm not sure if I could have written instead:

      fs.read(fd, buffer, 0, bytesRead, 0, function () {
+       fs.close(fd, function () {})
        if (file.name.endsWith('.nii')) {

Either way, this passes, and I am reasonably confident that it closes the file descriptor as soon as possible, if you want to try your test and watch for open files.

effigies avatar Jan 30 '23 22:01 effigies

@yarikoptic Here's a rewrite of #1583 that I believe does what we want. The callback mode of writing is still clunky to me. I'm not sure if I could have written instead:

      fs.read(fd, buffer, 0, bytesRead, 0, function () {
+       fs.close(fd, function () {})
        if (file.name.endsWith('.nii')) {

Either way, this passes, and I am reasonably confident that it closes the file descriptor as soon as possible, if you want to try your test and watch for open files.

You could do that but fs.closeSync(fd) is better since it will close right away instead a future event loop, otherwise it's possible the close is deferred and stacks up depending on how this is called.

Garbage collection closes these pretty quickly anyways but if the explicit file handle close fixes things, it's likely the function closure contributes to slow collection of the file handles. Refactoring this to not use any closures with the file handle would probably also help (putting the file handle only on a function stack) and save a bit of memory in the process.

nellh avatar Jan 30 '23 22:01 nellh

cool... I will need to recall how I have managed to try it all up last time and either it was not mitigated for me via ulimits already so I would not observe the ultimate problem ;)

yarikoptic avatar Jan 30 '23 23:01 yarikoptic

@nellh something like this?

function peekFile(path, buffer, count) {
  let fd;
  let ret;
  try {
    fd = fs.openSync(path)
    ret = fs.readSync(fd, buffer, 0, count, 0)
  } finally {
    if (fd !== undefined) {
      fs.closeSync(fd)
    }
  }
  return ret
}

function extractNiftiFile(file, callback) {
  const bytesRead = 1024
  const buffer = Buffer.alloc(bytesRead)

  try {
    ret = peekFile(file.path, buffer, bytesRead)
  } catch (err) {
    callback({ error: new Issue({ code: 44, file: file }) })
    return
  }

  if (file.name.endsWith('.nii')) {
    callback(parseNIfTIHeader(buffer, file))
  } else {  
    try {
      const data = pako.inflate(buffer)
      callback(parseNIfTIHeader(data, file))
    } catch (err) {
      callback(handleGunzipError(buffer, file))
    }
  }
}

I don't know if you can do try/finally like in Python or you need a catch, or if it would be better to inline the peekFile code. Also not really sure about the scoping rules, so I kept the buffer allocation in extractNiftiFile.

effigies avatar Jan 31 '23 02:01 effigies

@effigies Looks pretty good! try ... finally is newer to standard JavaScript but it is available in all the environments the validator targets now.

You could also use async with a stream instead of blocking I/O, since the caller just needs the callback to run eventually. This will yield to the event loop if the for await (... of stream) blocks waiting for I/O. This also closes the file handle internally after reading the byte range, so it doesn't need to be explicitly called. If there's an error while opening or reading from the file, it calls close on the file handle and throws the error to the try ... catch in extractNiftiFile.

async function peekFile(path, bytes) {
    const stream = fs.createReadStream(path, { start: 0, end: bytes })
    const blocks = []
    for await (const block of stream) {
        blocks.push(block)
    }
    return Buffer.concat(blocks)
}

async function extractNiftiFile(file, callback) {
    const bytesRead = 1024
    try {
        const buffer = await peekFile(file.path, bytesRead)
        if (file.name.endsWith('.nii')) {
            callback(parseNIfTIHeader(buffer, file))
        } else {
            try {
                const data = pako.inflate(buffer)
                callback(parseNIfTIHeader(data, file))
            } catch (err) {
                callback(handleGunzipError(buffer, file))
            }
        }
    } catch (err) {
        callback({ error: new Issue({ code: 44, file: file }) })
    }
}

nellh avatar Jan 31 '23 04:01 nellh

could this PR be finalized/merged/released? I keep coming to https://github.com/bids-standard/bids-validator/issues/675 over and over again just to eventually finally recall about this problem / tentative solution.

yarikoptic avatar Mar 22 '23 13:03 yarikoptic

Does it work? My impression was we hadn't figured out the issue fully.

effigies avatar Mar 22 '23 13:03 effigies

could you remind me the trick on how we managed to run the modified code while reusing existing singularity (IIRC) container with all the libs ? I would give it a shot again

yarikoptic avatar Mar 23 '23 14:03 yarikoptic

I would need to try to work it out again. I don't have a docker command in my shell history where I patched the source.

effigies avatar Mar 23 '23 14:03 effigies