Refactor BaseImage class to support reading multiple WSI formats such as DICOM by using an Abstract Base Class

Open nanli-emory opened this issue 1 year ago • 6 comments

Description

Currently, HistoQC uses OpenSlide as its WSI reader to manipulate whole-slide images. Unfortunately, OpenSlide is not really maintained anymore. As a result, HistoQC supports only older file formats such as svs, ndpi, and big-tiff, and cannot handle newer file formats such as DICOM, Philips, and recent 3DHISTECH versions.

Issues & Potential solutions

  1. DICOM support -> use pydicom + wsidicom to create a new DICOM reader
  2. Support for newer file formats -> use bioformats to create a new Bio-Formats reader
  3. Sustainable support of the latest WSI formats -> refactor BaseImage and create an abstract reader by using an Abstract Base Class (ABC)
  4. Bounding box? (HistoQC supports bounding boxes, but I'm not sure whether other WSI formats have them)

General Class Diagram

classDiagram
    
    class BaseImage{
        -WSIImageReader reader
        -BoundingBox bbox
        -[int] dimensions

        -List levels
        -String magnification
        -int level_count
        -List~int~ level_downsamples
        -List~int~ level_dimensions
        +getBoundingBox()
        +GetThumbnail()
        +getTheBestThumbnail()
        +getTheBestLevelForDownsample()
        +readRegion()
    }
    class BoundingBox{
        +int x
        +int y
        +int width
        +int height

    }
    class Config{
        +bool enableBoundingBox
        +String imageWorkSize
        +enum maskStatistics
    }
    class MaskStatistics{
        <<enumeration>>
        relative2mask
        absolute
        relative2image
    }
    class WSIImageReader~ABC~{
        <<abstract>>
        +getLevels()
        +getDimensions()
        +getLevelDimensions()
        +getLevelDownsamples()
        +getThumbnail()
        +getRegion()
    }

    class DICOMReader~WSIImageReader~{
    

    }

    class OpenSlideReader~WSIImageReader~{

    }

    class BioformatReader~WSIImageReader~{

    }
    BaseImage *-- WSIImageReader
    BaseImage *-- BoundingBox
    BaseImage *-- Config
    Config *-- MaskStatistics
    WSIImageReader <|-- DICOMReader
    WSIImageReader <|-- OpenSlideReader
    WSIImageReader <|-- BioformatReader
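
The diagram above could be sketched in Python roughly as follows. This is only a minimal illustration under assumptions, not the HistoQC implementation: the method names are snake_case versions of the ones in the diagram, and `InMemoryReader` is a hypothetical toy backend (a single-level image held as a list of rows) that stands in for the real DICOM, OpenSlide, and Bio-Formats readers.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class WSIImageReader(ABC):
    """Abstract reader; concrete backends implement the two abstract methods."""

    @abstractmethod
    def get_level_dimensions(self) -> List[Tuple[int, int]]:
        """Return (width, height) for every pyramid level, largest first."""

    @abstractmethod
    def get_region(self, location: Tuple[int, int], level: int,
                   size: Tuple[int, int]):
        """Return the pixels of a (width, height) region at (x, y)."""

    # The remaining accessors are derived from the level dimensions.
    def get_levels(self) -> int:
        return len(self.get_level_dimensions())

    def get_dimensions(self) -> Tuple[int, int]:
        return self.get_level_dimensions()[0]

    def get_level_downsamples(self) -> List[float]:
        base_width, _ = self.get_dimensions()
        return [base_width / width for width, _ in self.get_level_dimensions()]


class InMemoryReader(WSIImageReader):
    """Hypothetical toy backend: one level, pixels stored as a list of rows."""

    def __init__(self, pixels: List[List[int]]):
        self._pixels = pixels

    def get_level_dimensions(self) -> List[Tuple[int, int]]:
        return [(len(self._pixels[0]), len(self._pixels))]

    def get_region(self, location, level, size):
        if level != 0:
            raise ValueError("InMemoryReader has a single level (0)")
        x, y = location
        width, height = size
        return [row[x:x + width] for row in self._pixels[y:y + height]]


class BaseImage:
    """Holds a reader and delegates pixel access to it."""

    def __init__(self, reader: WSIImageReader):
        self.reader = reader

    def read_region(self, location, level, size):
        return self.reader.get_region(location, level, size)
```

With this shape, `BaseImage` never touches a backend library directly, so adding a format means adding one subclass of `WSIImageReader`.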

Readers Dependency

| Reader | Libraries | Dependency |
| --- | --- | --- |
| OpenSlideReader | OpenSlide | ??? |
| DICOMReader | wsidicom, pydicom | numpy, Pillow |
| BioformatReader | python-bioformats | javabridge, JVM |

nanli-emory avatar May 18 '23 20:05 nanli-emory

Yup, this looks to be in line with what I was thinking, thanks! Top priority is the DICOM component (using wsidicom), and then we can regroup on the other components.

choosehappy avatar May 19 '23 13:05 choosehappy

Might be similar to https://github.com/choosehappy/HistoQC/pull/221 (fork deleted, sadly).

Since libraries like wsidicomizer and TiffSlide mostly mimic the interface of openslide, it is possible to simply encapsulate the osh object (currently the openslide handle) in a base class that provides a unified set of interfaces for methods such as read_region, and to provide a factory method that instantiates the corresponding handle. This way, the modifications to class BaseImage and other modules could be minimized (the osh is deeply coupled with most of the qc modules).

Note that how meta info is stored may be different across each library (for slides that are not supported by openslide) and therefore it may not be trivial to simply implement osh.properties to adapt functions like getMag.
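
To illustrate the metadata problem, here is a hedged sketch of a magnification lookup that works on any dict-like set of properties. `get_magnification` is a hypothetical helper, not HistoQC's getMag. The keys `openslide.objective-power` and `openslide.mpp-x` are standard OpenSlide property names; the fallback of estimating magnification as 10 / mpp (so 0.25 µm/pixel maps to 40x) is a rough rule of thumb assumed for this example only.

```python
from typing import Mapping, Optional


def get_magnification(props: Mapping[str, str]) -> Optional[float]:
    """Return the objective power, or an estimate from mpp, or None."""
    power = props.get("openslide.objective-power")
    if power is not None:
        try:
            return float(power)
        except ValueError:
            pass  # malformed value: fall through to the mpp estimate

    mpp = props.get("openslide.mpp-x")
    if mpp is not None:
        try:
            mpp_value = float(mpp)
        except ValueError:
            return None
        if mpp_value > 0:
            # Assumption: magnification is roughly 10 / microns-per-pixel.
            return round(10.0 / mpp_value, 1)
    return None
```

A backend that stores its metadata elsewhere would need to translate it into these keys, or supply its own lookup, before such a function could be reused.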

CielAl avatar May 30 '23 07:05 CielAl

OpenSlide has now incorporated DICOM support: https://openslide.org/news/.

Let's keep an eye on this. The latest OpenSlide version is currently available only via a PPA repository. Once the Ubuntu repository is updated, consider allowing OpenSlide to handle DICOM images natively instead of using wsidicom with a custom DICOM handle.

jacksonjacobs1 avatar Nov 09 '23 18:11 jacksonjacobs1

Customization of the cache is also intriguing, as it directly affects how fast functions like read_region can perform. But it may also be nice to make the choice of image backend optional, as QuPath does, so that users can choose what works best in their own environment from as many options as possible. It is always beneficial to remove the direct coupling between the qc modules and the openslide APIs anyway.
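
One way to make the backend optional is to let the user give an ordered preference list and pick the first library that is actually installed. The sketch below assumes this approach; `pick_backend` is a hypothetical helper, and the names in the usage comment are simply the import names of the libraries mentioned in this thread.

```python
from importlib.util import find_spec
from typing import Iterable, Optional


def pick_backend(preferred: Iterable[str]) -> Optional[str]:
    """Return the first top-level module in `preferred` that is installed."""
    for module_name in preferred:
        # find_spec returns None when a top-level module cannot be found.
        if find_spec(module_name) is not None:
            return module_name
    return None


# Example preference order, e.g. read from a config file:
# backend = pick_backend(["openslide", "tiffslide", "wsidicom"])
```

Only the selected module then needs to be imported, so users do not have to install every supported reader.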

CielAl avatar Nov 09 '23 18:11 CielAl

Hi everyone, while I was trying to use HistoQC on DICOM files, I stumbled across this Issue and was wondering what the current status is.

To my knowledge, OpenSlide 4.0.0 and at least the latest version of OpenSlide Python (1.3.1) can read DICOM files, as stated here and here. I could only find an older version of OpenSlide Python in your code. Do you plan to update the version anytime soon or switch to bioformats/wsidicom at some point?

Best wishes, Daniela

DanielaSchacherer avatar Apr 09 '24 13:04 DanielaSchacherer

Update your openslide binary to 4.0.0 (on Windows you also need to place the binaries in the bin folder we created under the histoqc path), and the openslide-python wrapper may work. The new openslide does not break backward compatibility, iirc.
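
A quick way to check whether the installed binary is new enough is sketched below. The helper names are hypothetical; `openslide.__library_version__` is the attribute openslide-python exposes for the version of the underlying OpenSlide library, and the 4.0.0 threshold is the one mentioned in this thread.

```python
import re
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def openslide_reads_dicom(library_version: str) -> bool:
    """True if the OpenSlide library version is at least 4.0.0."""
    return parse_version(library_version) >= (4, 0, 0)


def installed_openslide_version() -> Optional[str]:
    """Return the OpenSlide library version, or None if it cannot be loaded."""
    try:
        import openslide
    except (ImportError, OSError):
        # OSError covers a wrapper that is installed without its binaries.
        return None
    return openslide.__library_version__
```

If `installed_openslide_version()` returns None or a version below 4.0.0, the binary rather than the Python wrapper is the part that needs updating.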

CielAl avatar Apr 09 '24 13:04 CielAl