HistoQC
Refactor the BaseImage class to support reading multiple WSI formats, such as DICOM, by using an abstract base class
Description
Currently, HistoQC uses OpenSlide as its WSI reader to manipulate whole-slide images. Unfortunately, OpenSlide is not really maintained anymore. As a result, HistoQC supports only older file formats such as svs, ndpi, and BigTIFF, and cannot handle newer file formats such as DICOM, Philips, and recent 3DHISTECH versions.
Issues & Potential solutions
- DICOM support -> use pydicom + wsidicom to create a new DICOM reader
- Support for newer file formats -> use bioformats to create a new Bio-Formats reader
- Sustainable support for the latest WSI formats -> refactor BaseImage and create an abstract reader using an Abstract Base Class (ABC)
- Bounding box? (HistoQC supports bounding boxes, but I'm not sure whether other WSI formats have them)
General Class Diagram
classDiagram
class BaseImage{
-WSIImageReader reader
-BoundingBox bbox
-List~int~ dimensions
-List levels
-String magnification
-int level_count
-List~int~ level_downsamples
-List~int~ level_dimensions
+getBoundingBox()
+getThumbnail()
+getTheBestThumbnail()
+getTheBestLevelForDownsample()
+readRegion()
}
class BoundingBox{
+int x
+int y
+int width
+int height
}
class Config{
+bool enableBoundingBox
+String imageWorkSize
+enum maskStatistics
}
class MaskStatistics{
<<enumeration>>
relative2mask
absolute
relative2image
}
class WSIImageReader~ABC~{
<<abstract>>
+getLevels()
+getDimensions()
+getLevelDimensions()
+getLevelDownsamples()
+getThumbnail()
+getRegion()
}
class DICOMReader~WSIImageReader~{
}
class OpenSlideReader~WSIImageReader~{
}
class BioformatReader~WSIImageReader~{
}
BaseImage *-- WSIImageReader
BaseImage *-- BoundingBox
BaseImage *-- Config
Config *-- MaskStatistics
WSIImageReader <|-- DICOMReader
WSIImageReader <|-- OpenSlideReader
WSIImageReader <|-- BioformatReader
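The abstract reader in the diagram could be sketched in Python roughly as follows. This is a minimal sketch, not HistoQC's actual code: the method names come from the diagram, but the signatures, the `Size` alias, and the `InMemoryReader` toy backend are assumptions made for illustration, and `getThumbnail()` is left out for brevity.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple

Size = Tuple[int, int]  # (width, height); alias introduced for this sketch


class WSIImageReader(ABC):
    """Abstract reader; method names follow the class diagram."""

    @abstractmethod
    def getDimensions(self) -> Size:
        """Size of the full-resolution level."""

    @abstractmethod
    def getLevelDimensions(self) -> List[Size]:
        """Size of every pyramid level, largest first."""

    @abstractmethod
    def getRegion(self, location: Size, level: int, size: Size) -> List[List[int]]:
        """Pixels of a region; `location` is given in level-0 coordinates."""

    def getLevels(self) -> int:
        # Shared behaviour lives in the base class, so backends stay small.
        return len(self.getLevelDimensions())

    def getLevelDownsamples(self) -> List[float]:
        full_width = self.getDimensions()[0]
        return [full_width / width for width, _ in self.getLevelDimensions()]


class InMemoryReader(WSIImageReader):
    """Hypothetical toy backend over a 2D grid of grey values."""

    def __init__(self, pixels: Sequence[Sequence[int]]):
        self._levels = [[list(row) for row in pixels]]
        # Build one extra level by keeping every second pixel.
        self._levels.append([row[::2] for row in self._levels[0][::2]])

    def getDimensions(self) -> Size:
        return self.getLevelDimensions()[0]

    def getLevelDimensions(self) -> List[Size]:
        return [(len(level[0]), len(level)) for level in self._levels]

    def getRegion(self, location, level, size):
        # Convert level-0 coordinates to coordinates of the requested level.
        scale = round(self.getLevelDownsamples()[level])
        x, y = location[0] // scale, location[1] // scale
        width, height = size
        return [row[x:x + width] for row in self._levels[level][y:y + height]]
```

A real `DICOMReader`, `OpenSlideReader`, or `BioformatReader` would implement the same three abstract methods on top of its own library, while `BaseImage` would depend only on `WSIImageReader`.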
Reader Dependencies
| Reader | Libraries | Dependencies |
|---|---|---|
| OpenSlideReader | OpenSlide | ??? |
| DICOMReader | wsidicom, pydicom | numpy, Pillow |
| BioformatReader | python-bioformats | javabridge, JVM |
Yup, this looks to be in line with what I was thinking, thanks! Top priority is the DICOM component (using wsidicom), and then we can regroup on the other components.
Might be similar to https://github.com/choosehappy/HistoQC/pull/221 (fork deleted sadly).
Since libraries like wsidicomizer and TiffSlide mostly mimic the interface of openslide, it is possible to simply encapsulate the osh object (currently the openslide handle) in a base class that provides a unified set of interfaces for methods such as read_region, and to provide a factory method that instantiates the corresponding handle.
This way, the modification of class BaseImage and other modules could be minimized (the osh is deeply coupled within most of the qc modules).
Note that the way metadata is stored may differ between libraries (for slides that openslide does not support), so simply implementing osh.properties may not be enough to adapt functions like getMag.
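A rough sketch of the wrapper-plus-factory idea is shown below. The names `SlideHandle`, `BACKENDS`, `open_slide`, and `get_mag` are hypothetical and not part of HistoQC. The sketch assumes each backend returns an object exposing `read_region`, `level_dimensions`, and `properties` in the openslide style, and that magnification lives under a backend-specific property key (`openslide.objective-power` for openslide).

```python
from typing import Callable, Dict


def _open_with_openslide(path: str):
    import openslide  # imported lazily so the other backends stay optional
    return openslide.OpenSlide(path)


def _open_with_tiffslide(path: str):
    import tiffslide  # imported lazily for the same reason
    return tiffslide.TiffSlide(path)


# Hypothetical registry: backend name -> function returning an openslide-like handle.
BACKENDS: Dict[str, Callable[[str], object]] = {
    "openslide": _open_with_openslide,
    "tiffslide": _open_with_tiffslide,
}


class SlideHandle:
    """Thin wrapper that hides which library produced the underlying handle."""

    def __init__(self, osh, mag_keys=("openslide.objective-power",)):
        self._osh = osh
        self._mag_keys = mag_keys  # property keys to try, in order

    def read_region(self, location, level, size):
        return self._osh.read_region(location, level, size)

    @property
    def level_dimensions(self):
        return self._osh.level_dimensions

    def get_mag(self, default=None):
        # Metadata layout differs between libraries, so look up known keys
        # instead of assuming a single openslide-style property name.
        props = getattr(self._osh, "properties", {})
        for key in self._mag_keys:
            if key in props:
                return float(props[key])
        return default


def open_slide(path: str, backend: str = "openslide") -> SlideHandle:
    """Factory: pick the backend by name and wrap the handle it returns."""
    try:
        opener = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return SlideHandle(opener(path))
```

With this shape, qc modules would call `open_slide(path, backend=...)` and use only `SlideHandle`, so adding a DICOM backend would mean registering one more opener rather than touching each module.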
Openslide has now incorporated DICOM support: https://openslide.org/news/.
Let's keep an eye on this. The latest openslide version is currently available only via a PPA. Once the Ubuntu repository is updated, consider letting openslide handle DICOM images natively instead of using wsidicom with a custom DICOM handle.
Cache customization is also intriguing, as it directly affects how fast functions like read_region can perform. It may also be nice to make the choice of image backend optional, as QuPath does, so that users can choose from as many options as possible based on what works best in their own environment. In any case, it is always beneficial to remove the direct coupling between the qc modules and the openslide APIs.
Hi everyone, while I was trying to use HistoQC on DICOM files, I stumbled across this Issue and was wondering what the current status is.
To my knowledge Openslide 4.0.0 and at least the latest version OpenSlide Python 1.3.1 can read DICOM files as stated here and here. I could only find an older version of OpenSlide Python in your code. Do you plan to update the version anytime soon or switch to bioformats/wsidicom at some point?
Best wishes, Daniela
Update your openslide binary to 4.0.0 (on Windows you also need to place the binaries in the bin folder we created under the histoqc path), and the openslide-python wrapper may work. As far as I recall, the new openslide does not break backward compatibility.
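To check which OpenSlide binary the Python wrapper actually loads, something like the following may help. It is a small sketch: `supports_dicom` and `installed_openslide_version` are hypothetical helper names, and the "major version 4 or later" rule is based only on the statement in this thread that DICOM support arrived with OpenSlide 4.0.0.

```python
def supports_dicom(library_version: str) -> bool:
    """True if an OpenSlide library version string such as '4.0.0' is >= 4."""
    major = int(library_version.split(".")[0])
    return major >= 4


def installed_openslide_version():
    """Return the loaded OpenSlide library version, or None if unavailable."""
    try:
        import openslide
    except (ImportError, OSError):
        # ImportError: wrapper not installed; OSError: binary could not be loaded.
        return None
    return getattr(openslide, "__library_version__", None)


version = installed_openslide_version()
if version is None:
    print("OpenSlide is not available in this environment")
else:
    print(f"OpenSlide {version}; DICOM support expected: {supports_dicom(version)}")
```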