
Remote reading "determine_reader" optimization

Open evamaxfield opened this issue 4 years ago • 2 comments

In determining the correct reader for the file provided we currently have two options (as of #224).

  1. Providing the reader param to AICSImage, i.e. img = AICSImage("s3://some-file.ext", reader=readers.lif_reader.LifReader)
  2. Not providing a reader, and letting AICSImage loop over all SUPPORTED_READERS.

Option 1 is the fastest and safest way to load a file into AICSImage (without using the raw Reader class), for both local and remote files. For remote files, however, option 2 could be incredibly slow if the user's file is only supported by a Reader at the end of the SUPPORTED_READERS list, because every reader ahead of it in the list is tried first.

To add a third route / optimization we could do the following:

  1. Cheaply run through the dict of FORMAT_IMPLEMENTATIONS and compare the last characters of the URI with each format's known extension.
  2. Collect the readers for every format that matches into a list.
  3. Try those readers first.
  4. If all of them fail, fall back to trying all readers in a loop (current Option 2).
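
A rough sketch of the four steps above, not the actual AICSImageIO code. `FakeReader`, its `is_supported` method, `determine_reader`, and the shape of `format_implementations` (extension mapped to a list of readers) are all assumptions made up for illustration; `is_supported` stands in for whatever (potentially slow, remote) compatibility check a real reader performs.

```python
from typing import Dict, List


class UnsupportedFileFormatError(Exception):
    """Raised when no reader accepts the URI (hypothetical error type)."""


class FakeReader:
    """Hypothetical stand-in for a reader; counts how often it is probed."""

    def __init__(self, name: str, magic: str) -> None:
        self.name = name
        self.magic = magic
        self.probes = 0

    def is_supported(self, uri: str) -> bool:
        # Stand-in for an expensive check that would read remote bytes.
        self.probes += 1
        return self.magic in uri


def determine_reader(
    uri: str,
    format_implementations: Dict[str, List[FakeReader]],
    supported_readers: List[FakeReader],
) -> FakeReader:
    lowered = uri.lower()

    # Steps 1 + 2: cheap suffix comparison, no I/O involved.
    candidates: List[FakeReader] = []
    for ext, readers in format_implementations.items():
        if lowered.endswith("." + ext):
            for reader in readers:
                if reader not in candidates:
                    candidates.append(reader)

    # Step 3: try the extension matches first.
    for reader in candidates:
        if reader.is_supported(uri):
            return reader

    # Step 4: fall back to the full loop (current Option 2).
    for reader in supported_readers:
        if reader.is_supported(uri):
            return reader

    raise UnsupportedFileFormatError(uri)
```

With a `.lif` URI and the LIF reader last in `supported_readers`, only the LIF reader is probed. Note that in this version the fallback loop re-probes any extension-matched readers that already failed.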

Which would ultimately make the reader router be:

  1. User providing the reader
  2. AICSImageIO naively checking the extension for reader compatibility
  3. AICSImageIO checking all readers for compatibility

evamaxfield, Apr 06 '21 03:04

An alternate implementation that combines steps 2 and 3 of the reader router you describe: make a list of all readers, move the ones whose file extensions match to the top of the list, then iterate through the list in its new sorted order (i.e. modify step 3 to skip readers we already tried).
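
Something like the following, as a sketch only. `order_readers` and the `extensions_by_reader` mapping are made-up names for illustration, and readers are plain strings here to keep it self-contained. It relies on Python's `sorted` being stable, so readers keep their original relative order within the "matches" and "doesn't match" groups.

```python
from typing import Dict, List, Sequence


def order_readers(
    uri: str,
    supported_readers: Sequence[str],
    extensions_by_reader: Dict[str, Sequence[str]],
) -> List[str]:
    """Return all readers, extension matches first, each appearing once."""
    lowered = uri.lower()

    def matches(reader: str) -> bool:
        # Cheap suffix check only; no file access.
        return any(
            lowered.endswith("." + ext)
            for ext in extensions_by_reader.get(reader, ())
        )

    # False sorts before True, so matching readers move to the front.
    return sorted(supported_readers, key=lambda reader: not matches(reader))


readers = ["TiffReader", "CziReader", "LifReader", "DefaultReader"]
extensions = {
    "TiffReader": ["tif", "tiff"],
    "CziReader": ["czi"],
    "LifReader": ["lif"],
}
ordered = order_readers("s3://bucket/image.lif", readers, extensions)
```

Iterating over `ordered` once tries the likely reader first and never tries the same reader twice, so no separate fallback pass is needed.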

toloudis, Apr 06 '21 16:04

The idea of a two-phase test is interesting. I assume that would require two API calls (or some parameter for how hard to try like git diff -M50%). Bio-Formats does have "the suffix suffices" logic, so one could at least loop over those.

joshmoore, Apr 07 '21 07:04

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot], Mar 30 '23 01:03