
[multiresolutionimageinterface] Speed up patch loading

Open prerakmody opened this issue 2 years ago • 2 comments

Hi, I am trying to write an as-fast-as-possible (TensorFlow/Python) dataloader for WSI patches. I searched the issues for keywords like "fast", "speed", and "accelerate", but did not find any best practices.

This is what I have tried for the CAMELYON16 dataset. Maybe the maintainers/community can provide some insights?

# Import ASAP lib first!
import sys
sys.path.append('C:\\Program Files\\ASAP 2.1\\bin')
import multiresolutionimageinterface as mir
import numpy as np
reader = mir.MultiResolutionImageReader()

# Step 1 - Loop over random anchor points "pre-selected" from whole-slides-images

# res = {patient_key1: {KEY_POINTS: [[x1,y1], [x2,y2], ....]}, ...}
patch_width   = ...
patch_height  = ...
patient_level = ...
 
for patient_key in res:
    
    path_img  = ...
    path_mask = ...
    wsi_img   = reader.open(str(path_img)) 
    wsi_mask  = reader.open(str(path_mask))
    ds_factor = wsi_mask.getLevelDownsample(patient_level)
    
    # Step 2 - Loop over points for a particular patient
    for point in res[patient_key][KEY_POINTS]:
        
        # Scale the anchor by the level's downsample factor; cast to int for the integer coordinate arguments
        x0, y0 = int(point[0] * ds_factor), int(point[1] * ds_factor)
        wsi_patch_mask  = np.array(wsi_mask.getUCharPatch(x0, y0, patch_width, patch_height, patient_level))
        wsi_patch_img   = np.array(wsi_img.getUCharPatch( x0, y0, patch_width, patch_height, patient_level))

        yield(wsi_patch_img, wsi_patch_mask)

Full code can be found here
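
To make the coordinate handling in the snippet above explicit, here is a minimal sketch of the scaling step on its own. It assumes the key points are stored in the coordinates of `patient_level` and that the patch call expects integer start coordinates at level 0. The helper name `to_level0` and the example values are hypothetical, not part of ASAP.

```python
def to_level0(point, ds_factor):
    """Scale an (x, y) anchor from a downsampled level to integer level-0 coordinates."""
    # ds_factor is a float (e.g. from getLevelDownsample), so cast the result to int
    return int(point[0] * ds_factor), int(point[1] * ds_factor)


ds_factor = 4.0  # hypothetical downsample factor for the chosen level
print(to_level0([150, 75], ds_factor))  # (600, 300)
```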

My concern is that I load many patches from the same patient (with some randomization), and only once a fixed set of N patches has been loaded do I move on to the next patient. Is there a way to speed up patch loading for a single patient? Or should I load the whole image at once, even though that may exhaust memory?
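
One middle ground between per-patch reads and loading the whole slide might be to read a single region that covers a group of nearby anchor points and then slice the patches out of it in memory. The sketch below only shows the bookkeeping for that idea. Whether it is actually faster is untested, and the helper names `bounding_region` and `patch_offsets` and the example points are hypothetical.

```python
def bounding_region(points, patch_width, patch_height):
    """Return (x0, y0, width, height) covering every patch anchored at `points`.

    All values are in the coordinates of the level being read.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    width = max(xs) - x0 + patch_width
    height = max(ys) - y0 + patch_height
    return x0, y0, width, height


def patch_offsets(points, x0, y0):
    """Offset of each patch inside the region, for slicing region[oy:oy+h, ox:ox+w]."""
    return [(p[0] - x0, p[1] - y0) for p in points]


points = [[100, 200], [180, 260], [120, 220]]  # hypothetical nearby anchors
region = bounding_region(points, patch_width=256, patch_height=256)
offsets = patch_offsets(points, region[0], region[1])
print(region)   # (100, 200, 336, 316)
print(offsets)  # [(0, 0), (80, 60), (20, 20)]
```

The region would then be read once (after scaling `x0` and `y0` to level 0 as in the loop above), and each patch cut out with array slicing. This only pays off if the points in a group are close together. Otherwise the covering region grows toward the whole-image case and the memory concern returns.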

prerakmody • Nov 24 '22 13:11