How to generate patches from coordinate list by a multiprocessing way?

Open ZeningZeng opened this issue 2 years ago • 1 comments

Hi, I recently tried to use pyvips to generate WSI patches, but I ran into a problem. Given a coordinate list of shape N × 2, where N is the number of patches, how can I generate the patches using multiprocessing? I have tried implementing it with PyTorch's DataLoader:

class SingleWSIDataset(Dataset):
    def __init__(self, cor_list, wsi, patch_size):
        super().__init__()
        self.wsi = wsi
        self.patch_size = patch_size
        self.transform = T.Compose([T.ToTensor(), T.Resize((224, 224), antialias=True), T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])
        self.cor_list = cor_list

    def __len__(self):
        return len(self.cor_list)

    def __getitem__(self, idx):
        x, y = self.cor_list[idx]
        tile = self.wsi.crop(x, y, self.patch_size, self.patch_size).numpy()[..., :3]
        tile = self.transform(tile)
        return tile
wsidataset = SingleWSIDataset(cor_list, wsi, 512)
loader = DataLoader(wsidataset, batch_size=256, shuffle=True, num_workers=0, pin_memory=True)

However, when num_workers > 0, CPU utilization drops to 0 and the program keeps running indefinitely without reporting any error. With num_workers=0 it works. I don't know what causes this. Can you tell me the reason or suggest a better method?

ZeningZeng • Dec 27 '23 16:12

Hi @z1186464862,

tile = self.wsi.crop(x, y, self.patch_size, self.patch_size).numpy()[..., :3]

This will be very slow. It's much better to pass the rgb flag to openslideload, so the loader returns RGB directly and you don't need to drop the alpha band afterwards.
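
As a rough sketch of that suggestion (not code from this thread): the helper names `open_slide_rgb` and `read_patch` are made up for illustration, and it assumes the installed libvips is recent enough for `openslideload` to accept `rgb=True`.

```python
def open_slide_rgb(path):
    """Open a slide so that the loader itself returns RGB (no alpha band)."""
    # Imported lazily so this module can be imported even where pyvips is absent.
    import pyvips

    # Assumption: this libvips build has OpenSlide support and knows the rgb flag.
    return pyvips.Image.openslideload(path, rgb=True)


def read_patch(image, x, y, size):
    """Crop one size x size patch and convert it to a NumPy array.

    With an RGB image there is no alpha band to slice away afterwards,
    so the trailing [..., :3] from the original snippet is not needed.
    """
    return image.crop(x, y, size, size).numpy()
```

Usage would look something like `read_patch(open_slide_rgb("slide.svs"), 0, 0, 512)`, where the file name is a placeholder.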

How large are your tiles? If they are small (e.g. 32x32 pixels), then fetch can be much faster than crop.
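
A minimal sketch of the fetch route, assuming an 8-bit image and a known band count. The function names here are hypothetical; the pyvips calls used are `pyvips.Region.new(image)` and `region.fetch(left, top, width, height)`, which returns the raw pixel bytes.

```python
def make_region(image):
    """Create a region on an image; one region can be reused for many fetches."""
    import pyvips  # lazy import, see the note in the lead-in

    return pyvips.Region.new(image)


def fetch_patch_bytes(region, x, y, size, bands=3):
    """Fetch a size x size patch as raw bytes and sanity-check the length.

    Assumption: samples are 8-bit, so the buffer holds size * size * bands bytes.
    """
    buf = region.fetch(x, y, size, size)
    expected = size * size * bands
    if len(buf) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(buf)}")
    return buf


def to_array(buf, size, bands=3):
    """Wrap the raw bytes as a (size, size, bands) uint8 NumPy array."""
    import numpy as np

    return np.frombuffer(buf, dtype=np.uint8).reshape(size, size, bands)
```

Whether this actually beats crop for your patch size is something to measure; for 512x512 patches the difference may be small.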

You can get much better performance if you sort the coordinate list; perhaps you already do this.
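
One possible way to sort, as a sketch: group the coordinates by the tile they start in, row by row, so neighbouring patches are read one after another. The function name and the `tile_size` value of 256 are assumptions for illustration; the best ordering depends on how the slide is tiled internally.

```python
def sort_coords(coords, tile_size=256):
    """Return (x, y) pairs ordered by tile row, then tile column, then position.

    tile_size is a guess at the slide's internal tile size, not a value
    read from the file.
    """
    return sorted(
        coords,
        key=lambda xy: (xy[1] // tile_size, xy[0] // tile_size, xy[1], xy[0]),
    )


coords = [(600, 10), (0, 300), (10, 0), (300, 20)]
print(sort_coords(coords))
# [(10, 0), (300, 20), (600, 10), (0, 300)]
```

Note that `shuffle=True` in the DataLoader randomises the index order again, so the sorted order only takes effect with `shuffle=False` (or a sampler that preserves locality).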

It's hard to comment on code fragments in detail without being able to run them or see the context. It's best to post a complete runnable program I can test.
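
In the meantime, one thing that may be worth trying for the `num_workers > 0` hang is to open the slide inside each worker instead of passing an already-open image into the dataset. This is an assumption about the cause (an image opened in the parent process may not be safe to use in forked workers), not something confirmed in this thread. `LazySlideDataset` and `open_slide` are hypothetical names; in real use the class would subclass `torch.utils.data.Dataset` and apply the transform.

```python
def open_slide(path):
    """Default opener: load the slide with pyvips in the calling process."""
    import pyvips  # lazy import so each worker process initialises it itself

    return pyvips.Image.new_from_file(path)


class LazySlideDataset:
    """Map-style dataset that stores only the path and opens the slide on demand."""

    def __init__(self, path, coords, patch_size, opener=open_slide):
        self.path = path
        self.coords = list(coords)
        self.patch_size = patch_size
        self.opener = opener
        self._wsi = None  # filled on first __getitem__, in whichever process calls it

    def __len__(self):
        return len(self.coords)

    def __getitem__(self, idx):
        if self._wsi is None:
            # First access in this process: open the slide here, not in __init__.
            self._wsi = self.opener(self.path)
        x, y = self.coords[idx]
        return self._wsi.crop(x, y, self.patch_size, self.patch_size).numpy()
```

The lazy open only helps if nothing indexes the dataset in the parent process before the workers start; otherwise the parent's image is inherited again.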

jcupitt • Dec 27 '23 17:12