ome-zarr-py
bioformats2raw metadata support
- [x] Add Implicit spec to loop over metadata-less "collections"
- [x] Add Leaf & Root specs
- [x] Support entrypoint-based specs ("ome_zarr.spec")
- [x] Use entrypoint to add support for https://github.com/ome/ngff/pull/112
- [ ] add tests for SHOULD/MAY portions of the spec
Part of the investigation of metadata in https://github.com/ome/ngff/issues/104. This "implicit" group is the cheapest form of collection imaginable.
Currently, only groups within the given group (and not arrays or explicit files) will be further parsed.
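As a rough illustration of that rule (not the code in this PR), the sketch below lists only the direct children of a directory that are themselves Zarr v2 groups, i.e. directories holding a `.zgroup` file. The helper names `child_groups` and `make_demo` are hypothetical, and the sketch assumes a plain on-disk directory store.

```python
import json
import tempfile
from pathlib import Path


def child_groups(group_dir):
    """Return the names of direct children that are Zarr v2 groups.

    A child counts as a group if it is a directory containing a `.zgroup`
    file; arrays (`.zarray`) and plain files are skipped.
    """
    root = Path(group_dir)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / ".zgroup").is_file()
    )


def make_demo(root):
    """Build a tiny hierarchy: two groups, one array and one plain file."""
    root = Path(root)
    (root / ".zgroup").write_text(json.dumps({"zarr_format": 2}))
    for name in ("a", "b"):
        (root / name).mkdir()
        (root / name / ".zgroup").write_text(json.dumps({"zarr_format": 2}))
    (root / "arr").mkdir()
    (root / "arr" / ".zarray").write_text("{}")
    (root / "notes.txt").write_text("not a group")
    return root


with tempfile.TemporaryDirectory() as tmp:
    found = child_groups(make_demo(tmp))
    print(found)
```

Only `a` and `b` are reported; the array directory and the text file are ignored, which mirrors the behaviour described above.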
Codecov Report
Patch coverage: 77.41% and project coverage change: -0.89 :warning:
Comparison is base (8964374) 84.79% compared to head (28155d5) 83.90%.
:exclamation: Current head 28155d5 differs from pull request most recent head 836dfd2. Consider uploading reports for the commit 836dfd2 to get more accurate results
Additional details and impacted files
@@ Coverage Diff @@
## master #174 +/- ##
==========================================
- Coverage 84.79% 83.90% -0.89%
==========================================
Files 13 14 +1
Lines 1473 1591 +118
==========================================
+ Hits 1249 1335 +86
- Misses 224 256 +32
| Impacted Files | Coverage Δ | |
|---|---|---|
| ome_zarr/reader.py | 83.52% <65.38%> (-3.18%) | :arrow_down: |
| ome_zarr/bioformats2raw.py | 86.11% <86.11%> (ø) |
See https://github.com/ome/ome-zarr-metadata/releases/tag/0.1.0 for an example of an entrypoint. After creating a fake .zgroup under the output of bioformats2raw a.fake /tmp/a.ome.zarr, I get:
$ ome_zarr info /tmp/a.ome.zarr/0/test/
/private/tmp/a.ome.zarr/0/test [zgroup]
- metadata
- Implicit (1)
- Leaf (2)
- data
/private/tmp/a.ome.zarr/0 [zgroup]
- metadata
- Multiscales
- Leaf (2)
- data
- (1, 1, 1, 512, 512)
- (1, 1, 1, 256, 256)
/private/tmp/a.ome.zarr [zgroup]
- metadata
- bioformats2raw (3)
- Root (2)
- data
Notice:
- the Implicit spec scans groups that have no other metadata
- Leaf/Root work their way up and back down a hierarchy
- bioformats2raw reads OME/METADATA.ome.xml
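To make the last point concrete, here is a minimal sketch of pulling the image entries out of an OME-XML document with the standard library. It is not the parser used in this PR: the `SAMPLE` string is made up for illustration (a real run would read the text of OME/METADATA.ome.xml), the helper `image_entries` is hypothetical, and the 2016-06 schema namespace is assumed.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the OME 2016-06 schema namespace.
OME_NS = {"ome": "http://www.openmicroscopy.org/Schemas/OME/2016-06"}

# Made-up stand-in for the contents of OME/METADATA.ome.xml.
SAMPLE = """<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0" Name="test"/>
  <Image ID="Image:1" Name="test 2"/>
</OME>"""


def image_entries(xml_text):
    """Return (ID, Name) pairs for every top-level Image element."""
    root = ET.fromstring(xml_text)
    return [
        (image.get("ID"), image.get("Name"))
        for image in root.findall("ome:Image", OME_NS)
    ]


entries = image_entries(SAMPLE)
print(entries)
```

Each entry would then be matched to one of the numbered series groups under the root.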
For an Image in a Plate (12 Wells A-C, 1-4, all wells with labels), without this PR I get:
$ ome_zarr info 251.zarr/A/1/0/
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A/1/0 [zgroup]
- metadata
- Multiscales
- OMERO
- data
- (3, 1024, 1344)
- (3, 512, 672)
- (3, 256, 336)
- (3, 128, 168)
- (3, 64, 84)
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A/1/0/labels [zgroup] (hidden)
- metadata
- Labels
- data
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A/1/0/labels/0 [zgroup] (hidden)
- metadata
- Label
- Multiscales
- data
- (1, 1024, 1344)
- (1, 512, 672)
- (1, 256, 336)
- (1, 128, 168)
- (1, 64, 84)
- (1, 32, 42)
and with this PR I get all the sibling A Wells A2, A3, A4, but not B1-B4 or C1-C4. And I don't get labels for those Wells.
$ ome_zarr info 251.zarr/A/1/0/
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A/1/0 [zgroup]
- metadata
- Multiscales
- OMERO
- Leaf
- data
- (3, 1024, 1344)
- (3, 512, 672)
- (3, 256, 336)
- (3, 128, 168)
- (3, 64, 84)
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A/1/0/labels [zgroup] (hidden)
- metadata
- Labels
- Leaf
- data
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A/1/0/labels/0 [zgroup] (hidden)
- metadata
- Label
- Multiscales
- Leaf
- data
- (1, 1024, 1344)
- (1, 512, 672)
- (1, 256, 336)
- (1, 128, 168)
- (1, 64, 84)
- (1, 32, 42)
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A/1 [zgroup]
- metadata
- Well
- Leaf
- data
- (3, 1024, 1344)
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A [zgroup]
- metadata
- Implicit
- Leaf
- data
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A/2 [zgroup]
- metadata
- Well
- Leaf
- data
- (3, 1024, 1344)
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A/3 [zgroup]
- metadata
- Well
- Leaf
- data
- (3, 1024, 1344)
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr/A/4 [zgroup]
- metadata
- Well
- Leaf
- data
- (3, 1024, 1344)
/Users/wmoore/Desktop/ZARR/data/v4_omero/plates/251.zarr [zgroup]
- metadata
- Plate
- Root
- data
- (3, 768, 1344)
Without this PR, napari 251.zarr/A/1/0/ gives me just the 1 image + labels:

With this PR, I get everything as for 'info' above: All the A wells, but only labels for A1:

> and with this PR I get all the sibling A Wells A2, A3, A4, but not B1-B4 or C1-C4. And I don't get labels for those Wells.
@will-moore, do you approve of getting the siblings (assuming napari can be fixed)? If so, I'll look into why it's only for the one row.
Also, where do you get the labels for this plate? Is this the one you had a script for?
@will-moore, I've reverted the upwards parsing. It seemed like a good strategy but there are currently too many edge cases. I don't have labels on plates for testing at the moment, but I think with the current state along with https://github.com/ome/ome-zarr-metadata/commit/08e12f784e085adbf4ca6d384720443108f96cb6#diff-0bb17e0ecb4ac83835ee3800a1af71a12f644b0ce782c623ba97f8917916250eR54 all the following should be true:
| | non-bf2raw | bf2raw |
|---|---|---|
| HCS | unchanged | unchanged |
| non-HCS | unchanged | now loads all images |
The only other change I can think of is if you pass a group that previously did nothing, it will likely try to load the contents.
In discussing today with @dgault, @sbesson, @jburel and @melissalinkert, there was a case made for at least adding the flag (Leaf) to make it possible for clients to detect that there is more information that needs loading. Additional methods or parameters should then allow that loading.
To improve the codecov results, see https://github.com/zarr-developers/numcodecs/pull/300/files#diff-bc37cd9860eec1facdc18a47798e8a1a2c0ef5dabd999deee049de4a48a5d35fR1 for an option of in-repo testing of entrypoints.
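For anyone wanting to check which specs are registered in an environment, a sketch along these lines lists the entry points in the "ome_zarr.spec" group without importing them. It uses only `importlib.metadata`; the helper name `discover_specs` is hypothetical, and this is not necessarily how the reader in this PR performs the lookup.

```python
from importlib import metadata


def discover_specs(group="ome_zarr.spec"):
    """Map entry point names to their "module:attr" targets for a group.

    Nothing is imported here; call `.load()` on an entry point to get the
    actual object.
    """
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the result supports selection by group.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: the result is a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


specs = discover_specs()
print(specs)
```

With no plugin installed this prints an empty dict; after installing a package that declares the group, its spec names should appear.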
@joshmoore To help address the "don't have labels on plates for testing", I created https://gist.github.com/will-moore/0f4cb6b1fdd60a255fcbb956a54a645e which adds labels to a plate (currently assumes images axes are cyx) by segmenting one of the channels.
I don't know if I'm missing something, maybe not using ome_zarr properly, but it feels quite manual to e.g. iterate through Wells on a Plate - manually parsing JSON, joining paths etc and parse_url() for every Well and every Image.
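The manual approach described above looks roughly like the following sketch. It assumes the NGFF 0.4 layout on a local directory store, where the plate `.zattrs` has `plate.wells[].path` and each well `.zattrs` has `well.images[].path`. The helper names are hypothetical and the demo plate is fabricated just to exercise the loop.

```python
import json
import tempfile
from pathlib import Path


def read_attrs(group_dir):
    """Load the .zattrs JSON of a Zarr v2 group stored on disk."""
    return json.loads((Path(group_dir) / ".zattrs").read_text())


def iter_images(plate_dir):
    """Yield (well_path, image_path) for every image on the plate."""
    plate_dir = Path(plate_dir)
    for well in read_attrs(plate_dir)["plate"]["wells"]:
        well_dir = plate_dir / well["path"]
        for image in read_attrs(well_dir)["well"]["images"]:
            yield well["path"], image["path"]


with tempfile.TemporaryDirectory() as tmp:
    plate = Path(tmp)
    wells = ["A/1", "A/2"]
    # Fabricated plate metadata with two wells, one image each.
    (plate / ".zattrs").write_text(
        json.dumps({"plate": {"wells": [{"path": w} for w in wells]}})
    )
    for w in wells:
        (plate / w).mkdir(parents=True)
        (plate / w / ".zattrs").write_text(
            json.dumps({"well": {"images": [{"path": "0"}]}})
        )
    pairs = list(iter_images(plate))
    print(pairs)
```

Every level needs its own JSON read and path join, which is the boilerplate that a hierarchy-aware reader would ideally hide.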
See a quick use of this functionality:
- https://github.com/ome/ome-zarr-metadata/pull/1
- https://github.com/ome/napari-ome-zarr/pull/47
Migrated the bf2raw implementation from https://github.com/ome/ome-zarr-metadata :
$ bioformats2raw-0.5.0-SNAPSHOT/bin/bioformats2raw 'my&series=2.fake' test_output
$ ome_zarr info test_output/
/opt/ome-zarr-py/test_output [zgroup]
- metadata
- bioformats2raw
- data
/opt/ome-zarr-py/test_output/0 [zgroup]
- metadata
- Multiscales
- data
- (1, 1, 1, 512, 512)
- (1, 1, 1, 256, 256)
/opt/ome-zarr-py/test_output/1 [zgroup]
- metadata
- Multiscales
- data
- (1, 1, 1, 512, 512)
- (1, 1, 1, 256, 256)
Should we also discuss the name of the module itself?
We added an omero block of channel & rendering metadata to the multiscale .zattrs (because it came from omero) but we actually want other tools to read and write this metadata, which may be discouraged by the naming.
In the same way, bioformats2raw.layout is a spec that just happens to be produced originally by bioformats2raw, but it's really a spec that ALL tools should read/write.
I don't know if it's too late to think about a different name there, or if the name has already stuck?
Other than the string bioformats2raw.layout we're pretty free to change things here. (I'd say we definitely don't want to reproduce what we did with omero and we actually need to think about how to make that "transitional" as well)
Ah - yes, too late to change the "bioformats2raw.layout" key because data generated with this already exists.
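Since the key string is the one fixed point, detection on the client side can stay very small. The sketch below checks an attributes dict for that key; the function name `layout_version` is hypothetical, and the example value 3 is an assumption about what existing data carries.

```python
import json

# The key itself is fixed because data written with it already exists.
LAYOUT_KEY = "bioformats2raw.layout"


def layout_version(attrs):
    """Return the layout version from a group's attributes, or None."""
    return attrs.get(LAYOUT_KEY)


# Assumed example attributes; the value 3 is for illustration only.
with_layout = json.loads('{"bioformats2raw.layout": 3}')
without_layout = json.loads('{"multiscales": []}')

print(layout_version(with_layout), layout_version(without_layout))
```

Whatever the module ends up being called, only this string has to stay stable.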
This pull request has been mentioned on Image.sc Forum. There might be relevant details there:
https://forum.image.sc/t/intermission-ome-ngff-0-4-1-bioformats2raw-0-5-0-et-al/72214/1
This pull request has been mentioned on Image.sc Forum. There might be relevant details there:
https://forum.image.sc/t/saving-volumetric-data-with-voxel-size-colormap-annotations/85537/24