aicsimageio
Remote reading "determine_reader" optimization
In determining the correct reader for the file provided we currently have two options (as of #224).
- Providing the reader param to AICSImage (i.e. img = AICSImage("s3://some-file.ext", reader=readers.lif_reader.LifReader))
- Not providing a reader, and AICSImage looping over all SUPPORTED_READERS.
Option 1 is the fastest and safest way to load a file into AICSImage (without using the raw Reader class), for both local and remote files. For remote files, however, option 2 could be incredibly slow if the user's file is supported by a Reader at the end of the SUPPORTED_READERS list, because every reader ahead of it has to check the remote file first.
To add a third route / optimization we could do the following:
- Cheaply run through the dict of FORMAT_IMPLEMENTATIONS and compare the last characters of the URI with each format's known extension.
- For each one that matches, add its readers to a list.
- Try those readers first.
- If all fail, fall back to trying all readers in a loop (current Option 2).
This would ultimately make the reader router:
- User providing the reader
- AICSImageIO naively checking the extension for reader compatibility
- AICSImageIO checking all readers for compatibility
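A minimal sketch of that three-step router, to make the order of checks concrete. Everything here is a stand-in rather than the actual aicsimageio code: the Reader classes, the can_read method, the PROBES log, and the shape of FORMAT_IMPLEMENTATIONS (extension mapped to a list of reader classes) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Type

PROBES: List[str] = []  # order in which readers were asked to check the file


class Reader:
    """Hypothetical reader base. `can_read` stands in for the (possibly slow,
    remote) compatibility check; it is not the real aicsimageio API."""

    suffixes: tuple = ()

    @classmethod
    def can_read(cls, uri: str) -> bool:
        PROBES.append(cls.__name__)
        return uri.lower().endswith(cls.suffixes)


class TiffReader(Reader):
    suffixes = (".tif", ".tiff")


class CziReader(Reader):
    suffixes = (".czi",)


class LifReader(Reader):
    suffixes = (".lif",)


SUPPORTED_READERS: List[Type[Reader]] = [TiffReader, CziReader, LifReader]

# Assumed shape: extension -> readers that claim that extension.
FORMAT_IMPLEMENTATIONS: Dict[str, List[Type[Reader]]] = {
    "tif": [TiffReader],
    "tiff": [TiffReader],
    "czi": [CziReader],
    "lif": [LifReader],
}


def determine_reader(
    uri: str, reader: Optional[Type[Reader]] = None
) -> Type[Reader]:
    # 1. The user provided the reader: use it without probing anything.
    if reader is not None:
        return reader

    # 2. Naive extension check: pure string comparison, no file access.
    lowered = uri.lower()
    candidates: List[Type[Reader]] = []
    for ext, impls in FORMAT_IMPLEMENTATIONS.items():
        if lowered.endswith("." + ext):
            candidates.extend(r for r in impls if r not in candidates)
    for candidate in candidates:
        if candidate.can_read(uri):
            return candidate

    # 3. Fall back to checking all readers in their default order.
    for candidate in SUPPORTED_READERS:
        if candidate.can_read(uri):
            return candidate

    raise ValueError(f"No supported reader found for {uri!r}")
```

With this ordering, a call such as determine_reader("s3://bucket/some-file.lif") probes only LifReader instead of walking the whole list, which is where the saving for remote files would come from.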
Alternate implementation combining steps 2 and 3 of the reader router you describe: make a list of all readers, move those whose file extensions match to the top of the list, then iterate through the list in its new sorted order (i.e. modify step 3 to skip readers we already tried).
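A rough sketch of that reordering idea, assuming readers can be identified by name and that each one has a known tuple of extensions. The reader names, KNOWN_EXTENSIONS, and prioritize_readers are hypothetical and only illustrate the sort; they are not taken from the library.

```python
from typing import Dict, List, Tuple

# Hypothetical data: readers in their default order, and the extensions each claims.
SUPPORTED_READERS: List[str] = [
    "TiffReader",
    "CziReader",
    "LifReader",
    "DefaultReader",
]
KNOWN_EXTENSIONS: Dict[str, Tuple[str, ...]] = {
    "TiffReader": (".tif", ".tiff"),
    "CziReader": (".czi",),
    "LifReader": (".lif",),
    "DefaultReader": (".png", ".jpg"),
}


def prioritize_readers(
    uri: str,
    readers: List[str] = SUPPORTED_READERS,
    extensions: Dict[str, Tuple[str, ...]] = KNOWN_EXTENSIONS,
) -> List[str]:
    """Return every reader exactly once, extension matches first."""
    lowered = uri.lower()

    def is_mismatch(name: str) -> bool:
        # False sorts before True, so matching readers move to the front.
        return not lowered.endswith(extensions.get(name, ()))

    # sorted() is stable: readers keep their default relative order
    # within the "match" group and within the "no match" group.
    return sorted(readers, key=is_mismatch)
```

Because each reader appears once in the returned list, a single loop over it covers both the extension-matched attempt and the full fallback, and no reader is tried twice.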
The idea of a two-phase test is interesting. I assume that would require two API calls (or some parameter for how hard to try like git diff -M50%). Bio-Formats does have "the suffix suffices" logic, so one could at least loop over those.