python-bioformats
Incorrect namespace detection leads to incomplete/incorrect metadata readout
Hi everyone!
First of all: Great and very much appreciated bioformats wrapper!
I encountered the following issue:
My aim is to filter my images by channel. The images were acquired on a Molecular Devices microscope, and the channel information is stored in the OME-XML as Pixels.Channel.Name = "TL20". Reading the files with Bio-Formats in Fiji works perfectly fine and uses the 2016-06 namespace.
When I use python-bioformats as follows
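import javabridge
import bioformats
# The JVM must be running with the Bio-Formats jars before any metadata is read
javabridge.start_vm(class_path=bioformats.JARS)
# Raw string: single backslashes are enough for a Windows path
path = r"path\to\20160130-corning-all-spheroids-p2-095hps_A01_w1.TIF"
omeXml = bioformats.get_omexml_metadata(path)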
the information displayed is partially wrong (PhysicalSize) and incomplete (the Channel Name is missing). I noticed that the XML returned by python-bioformats uses the 2015-01 namespace. Trying to track down where things go wrong, I found that omexml.py sets the default namespace to 2013-06, which get_namespaces then replaces with the top-level namespaces of the document. I also tried downloading the .xsd file directly from https://www.openmicroscopy.org/Schemas/OME/2016-06/ and associating the schema manually, but the correct information was still not displayed.
Is there a way to manually correct the namespaces in python-bioformats? Am I actually on the right track?
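To check which schema namespace a given OME-XML string actually declares, the root element can be inspected directly. The following is only a minimal sketch using the Python standard library, not python-bioformats' own get_namespaces; the detect_ome_namespace helper and the trimmed SAMPLE document are hypothetical and exist purely for illustration.

```python
import xml.etree.ElementTree as ET


def detect_ome_namespace(xml_text):
    """Return the namespace URI of the root element, or '' if it has none."""
    root = ET.fromstring(xml_text)
    # ElementTree stores qualified tags as "{namespace-uri}LocalName".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return ""


# Hypothetical, heavily trimmed OME-XML used only for illustration.
SAMPLE = (
    '<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">'
    '<Image ID="Image:0"><Pixels ID="Pixels:0">'
    '<Channel ID="Channel:0:0" Name="TL20" SamplesPerPixel="1"/>'
    '</Pixels></Image></OME>'
)

namespace = detect_ome_namespace(SAMPLE)
# The schema version is the last path component of the namespace URI.
schema_version = namespace.rsplit("/", 1)[-1]
print(namespace, schema_version)
```

Running the same helper on the string returned by get_omexml_metadata would show whether the 2015-01 or the 2016-06 schema is being declared.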
For some reason I cannot upload a .zip with the .tifs and the .xmls here, so I've uploaded it here: https://github.com/FannyGeorgi/SampleData
Some environment information: I am using Python 2.7, javabridge from http://www.lfd.uci.edu/%7Egohlke/pythonlibs/#javabridge and bioformats 0.1.14 with JRE 1.8.0_131 on Windows 10.
Thanks for your help! Fanny
Hey @FannyGeorgi,
I'm working on updating the XML in #83. Once that PR is merged, would you mind giving it a try? I hope it'll solve the issue you're having.
@mcquin I came across the same issue. The files are ImageXpress data, like the ones Fanny saved in the SampleData folder. The Channel Name is missing when python-bioformats is used to retrieve the metadata.
I am using Python 3.8 and python-bioformats 4.0.4, so this update doesn't seem to have fixed it.
When I use meta = bioformats.get_omexml_metadata(path=f1) to read the XML embedded in the TIFF file, the Channel Name should appear between Channel ID and SamplesPerPixel, but it is not there: <Channel ID="Channel:0:0" SamplesPerPixel="1">
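One way to confirm which attributes each Channel element carries, independent of the schema namespace in use, is to walk the returned XML with the standard library. This is a rough sketch under assumptions: local_name, channel_attributes and channels_missing_name are hypothetical helpers, and SAMPLE is a made-up document standing in for the real metadata string.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a leading "{namespace-uri}" prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def channel_attributes(xml_text):
    """Return one attribute dict per Channel element, whatever the namespace."""
    root = ET.fromstring(xml_text)
    return [dict(el.attrib) for el in root.iter()
            if local_name(el.tag) == "Channel"]


def channels_missing_name(xml_text):
    """Return the IDs of Channel elements that have no Name attribute."""
    return [attrs.get("ID") for attrs in channel_attributes(xml_text)
            if "Name" not in attrs]


# Hypothetical document: the first channel lacks Name, the second has it.
SAMPLE = (
    '<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2015-01">'
    '<Image ID="Image:0"><Pixels ID="Pixels:0">'
    '<Channel ID="Channel:0:0" SamplesPerPixel="1"/>'
    '<Channel ID="Channel:0:1" Name="TL20" SamplesPerPixel="1"/>'
    '</Pixels></Image></OME>'
)

missing = channels_missing_name(SAMPLE)
names = [attrs.get("Name") for attrs in channel_attributes(SAMPLE)]
print(missing, names)
```

If the Name attribute is absent here as well, it is missing from the XML string itself rather than being lost in a later parsing step.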
Thanks for your help!