TriPath

Data processing for PCa_Bx_3Dpathology dataset

Open Dylan-H-Wang opened this issue 6 months ago • 0 comments

Hi,

Thank you for the wonderful work!

I am still confused about running the algorithm on the PCa_Bx_3Dpathology dataset.

This is what I have done so far:

  1. Downloaded the data from https://www.cancerimagingarchive.net/collection/pca_bx_3dpathology, as mentioned in the paper. The WSI data in the dataset is saved as an HDF5 file containing two channels (nuclei and cyto).
  2. Ran Step 1: Tissue segmentation & patching.

Step 1 throws error messages. While debugging, I ran into some questions:

  1. It seems that ThreeDimImage, through the read_img function, expects metadata with a .dat extension and image files that are either .dcm or readable by OpenCV. However, the raw data in pca_bx_3dpathology is an HDF5 file with an .xml file as its metadata. How did you preprocess the raw data?
  2. You mentioned in the paper that "we replicate the nuclear channel data across the first two channels and set the eosin channel as the third". Does this also apply when you run segmentation on the image? I checked the segmentation implementation in SerialTwoDimImage: it is implemented with OpenCV in _getBinarizedImage() and should only work on RGB or grayscale images. Did you convert the two-channel data (nuclei and eosin) to RGB before segmentation?
  3. process_list_seg.csv only provides settings for c001-A. Do the same settings apply to all other files, i.e., c001-B, ..., c050-E?
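
The channel layout quoted in question 2 can be sketched as follows. This is only a minimal illustration of one possible reading of that sentence, not code from the TriPath repository: the function names to_three_channel and to_uint8 are hypothetical, the min-max rescaling to 8 bits is an assumption made so that OpenCV-style thresholding could consume the result, and loading the two channels from the HDF5 file is left out because the dataset keys are not known here.

```python
import numpy as np


def to_three_channel(nuclei: np.ndarray, eosin: np.ndarray) -> np.ndarray:
    """Stack two single-channel volumes into one array with a trailing channel axis.

    Channels 0 and 1 both hold the nuclear signal and channel 2 holds the
    eosin (cyto) signal, following the sentence quoted from the paper.
    """
    if nuclei.shape != eosin.shape:
        raise ValueError("nuclei and eosin volumes must have the same shape")
    return np.stack([nuclei, nuclei, eosin], axis=-1)


def to_uint8(volume: np.ndarray) -> np.ndarray:
    """Min-max rescale a volume to uint8 (an assumption, not the authors' method)."""
    v = volume.astype(np.float32)
    lo, hi = float(v.min()), float(v.max())
    if hi == lo:
        # Constant volume: avoid division by zero and return all zeros.
        return np.zeros(volume.shape, dtype=np.uint8)
    return np.round((v - lo) / (hi - lo) * 255.0).astype(np.uint8)


# Toy volumes standing in for the two channels read from the HDF5 file.
nuclei = np.arange(24, dtype=np.uint16).reshape(2, 3, 4)
eosin = nuclei[::-1].copy()

rgb_like = to_three_channel(to_uint8(nuclei), to_uint8(eosin))
print(rgb_like.shape)  # (2, 3, 4, 3)
```

If segmentation instead runs on a single grayscale channel, only to_uint8 would be needed; which of the two variants matches the actual pipeline is exactly what question 2 asks.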

Thank you!

Dylan-H-Wang, Aug 19 '24 04:08