tiatoolbox
Image Resolution wrongly read in TIFFWSIReader
- TIA Toolbox version: 1.2.1
- Python version: 3.8
- Operating System:
Description
According to the TIFF specification, the tags XResolution and YResolution are of type RATIONAL, meaning that each value is stored as a numerator and a denominator and we should compute the fraction. In the code:
https://github.com/TissueImageAnalytics/tiatoolbox/blob/b3ace851ac61cbeee1f382b33925d8a9b0a1be55/tiatoolbox/wsicore/wsireader.py#L3245-L3248
we are only passing the first element. Instead we should pass res_x.value[0]/res_x.value[1] and res_y.value[0]/res_y.value[1].
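To make the point about RATIONAL tags concrete, here is a minimal sketch of the conversion. It is not the tiatoolbox implementation: the helper names `rational_to_float` and `resolution_to_mpp` are made up for illustration, and the example tag value is hypothetical. The only facts relied on are that a RATIONAL is a (numerator, denominator) pair and that the TIFF ResolutionUnit tag uses 2 for inches and 3 for centimetres.

```python
from fractions import Fraction

# TIFF ResolutionUnit tag values: 2 = inch, 3 = centimetre.
MICRONS_PER_UNIT = {2: 25_400.0, 3: 10_000.0}


def rational_to_float(value):
    """Convert a TIFF RATIONAL (numerator, denominator) pair to a float."""
    numerator, denominator = value
    if denominator == 0:
        raise ValueError("RATIONAL tag has a zero denominator")
    return float(Fraction(numerator, denominator))


def resolution_to_mpp(value, resolution_unit):
    """Convert a pixels-per-unit RATIONAL to microns per pixel (mpp)."""
    pixels_per_unit = rational_to_float(value)
    return MICRONS_PER_UNIT[resolution_unit] / pixels_per_unit


# Hypothetical tag value: 40000/1 pixels per centimetre.
x_resolution = (40000, 1)
print(resolution_to_mpp(x_resolution, resolution_unit=3))
```

Taking only `value[0]` gives the right answer when the denominator happens to be 1, which may be why the problem is easy to miss.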
I am also having some issues when reading at different pyramid levels: I think (but I am not sure) that these lines: https://github.com/TissueImageAnalytics/tiatoolbox/blob/b3ace851ac61cbeee1f382b33925d8a9b0a1be55/tiatoolbox/wsicore/wsireader.py#L3505-L3521
should look like this (use level location and size instead of baseline)
(
read_level,
level_location,
level_size,
post_read_scale,
baseline_read_size,
) = self.find_read_rect_params(
location=location,
size=size,
resolution=resolution,
units=units,
)
bounds = utils.transforms.locsize2bounds(
location=level_location, size=level_size
)
@rogertrullo Please can you explain these issues?
Keeping this issue open to address this comment https://github.com/TissueImageAnalytics/tiatoolbox/issues/452#issuecomment-1278991551
@shaneahmed, @John-P, when reading at a resolution other than the baseline, I get some errors. For example, when reading the thumbnail of CMU-1-SMALL-REGION (which I converted to ome.tiff using bioformats2raw and raw2ometiff):
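from tiatoolbox.wsicore.wsireader import TIFFWSIReader
# path_img is the path to the converted ome.tiff file
imghe = TIFFWSIReader(path_img)
he_lr = imghe.slide_thumbnail(resolution=5, units="mpp")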
I get an img like this:
reading at baseline resolution is fine:
w, h = imghe.info.as_dict()["slide_dimensions"]
he_hr = imghe.read_rect((0, 0), (w, h))
I think the issue comes from the fact that we are using location and size at baseline resolution instead of at the resolution of the level given by find_read_rect_params.
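As an illustration of why baseline coordinates cannot be used directly to index a lower-resolution level, here is a minimal sketch that maps a baseline-space location and size to bounds in level space. The helper `baseline_to_level_bounds` is hypothetical and is not how tiatoolbox computes this; the image size and downsample factor used in the example are assumptions.

```python
def baseline_to_level_bounds(location, size, downsample):
    """Map a baseline (x, y) location and (w, h) size to level-space bounds.

    `downsample` is the factor between baseline and the level being read,
    e.g. 4 means the level is 4x smaller than baseline along each axis.
    Returns (left, top, right, bottom), rounded outwards.
    """
    x, y = location
    w, h = size
    left = int(x // downsample)
    top = int(y // downsample)
    right = int(-(-(x + w) // downsample))  # ceiling division
    bottom = int(-(-(y + h) // downsample))  # ceiling division
    return left, top, right, bottom


# Hypothetical example: a 2220 x 2967 baseline region read from a level
# that is 4x downsampled.
print(baseline_to_level_bounds((0, 0), (2220, 2967), 4))
```

Indexing the 4x level with the baseline bounds (0, 0, 2220, 2967) would ask for a region far larger than the level itself, which is consistent with the broken thumbnail described above.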
@John-P @shaneahmed, another small issue I have seen is that when reading metadata here: https://github.com/TissueImageAnalytics/tiatoolbox/blob/b0d59811f67b5439571473e9e40b5a9bb27f8e3b/tiatoolbox/wsicore/wsireader.py#L3291-L3296
for an ome.tiff file, self.tiff.is_ome and self.tiff.pages[0].is_tiled are both True. Since we are checking only with if and not with elif, some metadata read by self._parse_ome_metadata() will be overwritten by self._parse_generic_tiled_metadata(), making, for example, objective_power None.
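The effect of two independent if checks versus if/elif can be shown with a toy example. This is not tiatoolbox code: `parse_metadata` and the two dictionaries are stand-ins invented for illustration, with the OME branch assumed to know the objective power and the generic tiled branch assumed not to.

```python
def parse_metadata(is_ome, is_tiled, exclusive):
    """Toy stand-in for the metadata dispatch described above."""
    ome = {"objective_power": 20.0}  # assumed result of the OME parser
    generic = {"objective_power": None}  # assumed result of the generic parser
    metadata = {}
    if exclusive:
        # if / elif: only the first matching parser runs.
        if is_ome:
            metadata = ome
        elif is_tiled:
            metadata = generic
    else:
        # Two independent ifs: the second result replaces the first.
        if is_ome:
            metadata = ome
        if is_tiled:
            metadata = generic
    return metadata


print(parse_metadata(True, True, exclusive=False))
print(parse_metadata(True, True, exclusive=True))
```

With both flags True, the non-exclusive version ends up with objective_power set to None, while the if/elif version keeps the OME value.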
Also having an issue with TIFFWSIReader. Certain ome.tiff files, when read, have array shape (H, W, C), while others are (C, H, W). This throws off your code. I suggest adding a line in there to make sure the array is always channel-last (or channel-first, whichever you prefer).
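One way to do the normalisation suggested above is to work out a transpose order from the axes string. The sketch below is only an illustration: `channel_last_permutation` is a hypothetical helper, and it assumes an axes string in the style tifffile reports (for example "YXS" or "CYX"), with "S" or "C" marking the channel axis.

```python
def channel_last_permutation(axes):
    """Return the transpose order that moves the channel axis to the end.

    `axes` is an axes string such as "YXS" or "CYX", where "S" (samples)
    or "C" (channel) marks the channel axis.
    """
    channel_positions = [i for i, axis in enumerate(axes) if axis in "SC"]
    if len(channel_positions) != 1:
        raise ValueError(f"expected exactly one channel axis in {axes!r}")
    channel = channel_positions[0]
    spatial = [i for i in range(len(axes)) if i != channel]
    return tuple(spatial + [channel])


print(channel_last_permutation("YXS"))  # already channel-last
print(channel_last_permutation("CYX"))  # channel-first
```

With a NumPy array, the result could then be applied as `array.transpose(channel_last_permutation(axes))`.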
@jessecanada Hi, I'd like to look into this for you. Are you able to provide a test image?
Hi @John-P
I sent a link to a test file to the tia email address found in setup.py
@rogertrullo
for an ome.tiff file self.tiff.is_ome and self.tiff.pages[0].is_tiled are both True....
This issue has now been addressed in release 1.3.0
Hi @jessecanada (and @shaneahmed), I've done a bit of preliminary investigation for the file you sent over and just wanted to update you on my initial assessment.
It looks like it is not one of the formats supported by openslide or tifffile which we use to read tiff files. However, it may be possible to get something working.
It appears that the axes in this file are in (H, W, C) or YXS order (according to tifffile), and I am able to decode the tiles, which are regular RGB JPEGs. Unfortunately, although tifffile can read arbitrary tiff files, it appears to be tripped up a bit by this one. For some reason tifffile doesn't produce a valid zarr view of the first IFD (page), which is what we use internally to decode regions of pixel data. This may be a bug with tifffile as the items in the zarr are simply metadata files instead of pixel data.
One easy option is to decode the whole image to a memory mapped numpy array with tifffile. However, this would have a very long start-up time and use a lot of disk space (~70GB). It is also possible to create an alternative method where we manually read individual tiles from a given page in the TIFF and append/crop the results for the desired region.
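The manual tile-reading option mentioned above mostly comes down to index arithmetic: work out which tiles overlap the requested region, decode and stitch them, then crop. Here is a minimal sketch of just that arithmetic. The function names are hypothetical, the tile decoding itself is left out, and a regular tile grid with exclusive right and bottom bounds is assumed.

```python
def tiles_for_region(bounds, tile_size):
    """List (row, col) indices of the tiles overlapping `bounds`.

    bounds    -- (left, top, right, bottom) in pixels, right/bottom exclusive
    tile_size -- (tile_width, tile_height) in pixels
    """
    left, top, right, bottom = bounds
    tile_w, tile_h = tile_size
    first_col, first_row = left // tile_w, top // tile_h
    last_col, last_row = (right - 1) // tile_w, (bottom - 1) // tile_h
    return [
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


def crop_offsets(bounds, tile_size):
    """Offset of the region's top-left corner inside the stitched tiles."""
    left, top, _, _ = bounds
    tile_w, tile_h = tile_size
    return left % tile_w, top % tile_h


region = (300, 100, 600, 300)  # hypothetical region in page coordinates
print(tiles_for_region(region, (256, 256)))
print(crop_offsets(region, (256, 256)))
```

Only the listed tiles would need to be decoded, so memory use scales with the requested region rather than with the whole page.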
It appears that openslide also attempts to read this as a generic tiled TIFF but hits an internal memory error when generating a thumbnail, which I have not investigated further at this point. However, openslide read_region appears to function. I can't see an obvious reason why openslide would fail here, as the first IFD appears to be a fairly standard JPEG compressed tiled TIFF IFD. I think it may be because it is not able to utilise the additional resolution levels.
@rogertrullo would you mind sending me that file which you have converted and are having issues with?
@jessecanada it appears that simply using the OpenSlideWSIReader will work for this file. The downside is that only the full resolution level is read. This means that reads at reduced resolution will be slower and use more memory than if the additional resolution levels were being used. This is perhaps why the thumbnail method is failing: I believe that the openslide thumbnail method is simply trying to use a ton of memory to read in the whole image.
import tiatoolbox as tt
reader = tt.wsicore.wsireader.OpenSlideWSIReader("filename.tif")
# Normal methods such as read_region and read_rect should function
region_1 = reader.read_region((0, 0), 0, (1024, 1024))
region_2 = reader.read_rect((0, 0), (1024, 1024), 1, "mpp")
@rogertrullo would you mind sending me that file which you have converted and are having issues with?
@John-P I put it here: https://drive.google.com/file/d/1PYNzfCvwElRUiLTDlqV1iqfcgiIc-cH_/view?usp=sharing
I think we should keep an eye on https://github.com/openslide/openslide/pull/344, as this may help resolve some of the problems.
Unfortunately, although tifffile can read arbitrary tiff files, it appears to be tripped up a bit by this one. For some reason tifffile doesn't produce a valid zarr view of the first IFD (page), which is what we use internally to decode regions of pixel data. This may be a bug with tifffile as the items in the zarr are simply metadata files instead of pixel data.
Could you open a tifffile issue or make the file available to me?