
[QST] How to enable cucim compatibility mode - Debian 9

diricxbart opened this issue 2 years ago • 3 comments

Installed cucim version: 22.4.0. Our system has 4 V100 GPUs and is running on Debian 9.

Tue Apr 19 22:17:23 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.60.02    Driver Version: 510.60.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:61:00.0 Off |                    0 |
| N/A   38C    P0    58W / 300W |   3986MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  Off  | 00000000:62:00.0 Off |                    0 |
| N/A   34C    P0    40W / 300W |      3MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  Off  | 00000000:89:00.0 Off |                    0 |
| N/A   33C    P0    38W / 300W |      3MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  Off  | 00000000:8A:00.0 Off |                    0 |
| N/A   34C    P0    43W / 300W |      3MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

On that Debian 9 OS we run the NVIDIA Ubuntu CUDA Docker image:

FROM nvcr.io/nvidia/cuda:11.6.2-devel-ubuntu20.04
...
RUN pip3 install --upgrade cucim

When loading an svs image using cucim, I get:

[Error] cuFileHandleRegister fd: 36 (/data/<my_file>.svs), status: internal error. Would work with cuCIM's compatibility mode.

Installing GDS on my native Debian 9 OS is not possible: gds-tools are only available for debian10 and debian11, but not for debian9 (https://developer.download.nvidia.com/compute/cuda/repos/). And this system is used by multiple users, so updating the OS version is non-trivial...

Reverting to a cucim version from before GDS support was added is also not an option, since those versions don't support SVS files yet.

So I tried disabling GDS according to these instructions: https://docs.nvidia.com/gpudirect-storage/troubleshooting-guide/index.html#enable-comp-mode

The first instruction (removing the nvidia-fs kernel driver), however, yields:

sudo rmmod nvidia-fs
rmmod: ERROR: Module nvidia_fs is not currently loaded

I confirmed it is not loaded using lsmod | grep nvidia

My /etc/cufile.json file also has ['properties']['allow_compat_mode'] set to true
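
(For reference, a minimal sketch of how I check that setting; it assumes the standard GDS config path and strips the // comments the shipped file may contain.)

import json
import re

# Read /etc/cufile.json; the default file may contain "//" comment lines, so strip them before parsing.
with open("/etc/cufile.json") as f:
    text = re.sub(r"^\s*//.*$", "", f.read(), flags=re.MULTILINE)

cfg = json.loads(text)
print(cfg.get("properties", {}).get("allow_compat_mode"))  # prints True in my case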

So I would expect to be running in compatibility mode, which is not what the above error message implies.

How would I get cucim 22.4.0 working in GDS compatibility mode? Or do you have any other advice (excluding updating the OS :-) )?

Thanks in advance...

diricxbart avatar Apr 19 '22 20:04 diricxbart

Hi @diricxbart !

When loading an svs image using cucim

Could you please share the line of code you used to load the SVS image, so we can understand the context?

AFAIU, the message below wouldn't appear unless you called the read_region() method with the device='cuda' parameter or used cuCIM's filesystem package (cucim.clara.filesystem).

[Error] cuFileHandleRegister fd: 36 (/data/<my_file>.svs), status: internal error. Would work with cuCIM's compatibility mode.

There is not much benefit to specifying device='cuda' in the read_region() method to use GDS+nvJPEG at the moment (we need to improve the performance).

If you used the fs.open() method, note that it is not for loading an image file (it is for reading a block of a file), and you can use the "rp" option to avoid using GDS.

import cucim.clara.filesystem as fs

fd = fs.open("input/image.tif", "r")
fs.close(fd)  # same with fd.close()

# Open file without using GDS
fd2 = fs.open("input/image.tif", "rp")
fs.close(fd2)  # same with fd2.close()

gigony avatar Apr 22 '22 00:04 gigony

Hi Gigon,

Thank you for your response...

This code (so indeed with device set to 'cuda'):

from cucim import CuImage

slide = CuImage('/data/pathology/TCGA-G7-A8LE-01A-01-TS1.39D4D79F-6CE4-441C-8EBD-42323F7B9C11.svs')
img = slide.read_region(level=3, device='cuda')

Results in this error:

[Error] cuFileHandleRegister fd: 101 (/data/pathology/TCGA-G7-A8LE-01A-01-TS1.39D4D79F-6CE4-441C-8EBD-42323F7B9C11.svs), status: internal error. Would work with cuCIM's compatibility mode.

If I set device='cpu', then it does work properly...

Could you please elaborate on the purpose / impact / consequences of the device property? My understanding was that setting device to 'cuda' results in a cupy array on the GPU, while 'cpu' results in a numpy array on CPU?

diricxbart avatar Apr 26 '22 11:04 diricxbart

Hi @diricxbart

By default, the read_region() method doesn't use GPU-accelerated libraries to decode compressed image data (JPEG/JPEG 2000) in .svs or .tif files, so you don't need to specify device="cpu" in the method.

cuCIM's TIFF(-like) image loader is optimized, so its reads on the CPU are faster than those of other libraries.
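
(If you want to check this on your own data, a rough timing sketch along these lines can help; it assumes OpenSlide is installed and reuses the file from your example.)

import time

import openslide
from cucim import CuImage

path = "/data/pathology/TCGA-G7-A8LE-01A-01-TS1.39D4D79F-6CE4-441C-8EBD-42323F7B9C11.svs"

# Time a CPU read of the same region with cuCIM (CPU decoding is the default) ...
t0 = time.perf_counter()
region_cucim = CuImage(path).read_region((0, 0), (4096, 4096), level=0)
t1 = time.perf_counter()

# ... and with OpenSlide (note its argument order: location, level, size).
t2 = time.perf_counter()
region_openslide = openslide.OpenSlide(path).read_region((0, 0), 0, (4096, 4096))
t3 = time.perf_counter()

print(f"cuCIM:     {t1 - t0:.3f} s")
print(f"OpenSlide: {t3 - t2:.3f} s")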

Using GPU-accelerated decoding libraries is particularly useful when

  • the image to decode is large, and
  • multiple images are decoded in parallel (batch loading)

And not using the GPU (CUDA)-based image loader is particularly useful when

  • the GPU needs to be reserved for training, or
  • you are using a multi-process data loader while training
    • https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

In v22.02.00, we introduced multithreading and batch-processing features in read_region() that leverage GDS and nvJPEG to load and decode JPEG-compressed image data when device="cuda" is given.

https://github.com/rapidsai/cucim/wiki/release_notes_v22.02.00#2-supporting-multithreading-and-batch-processing

cuCIM now supports loading the entire image with multi-threads. It also supports batch loading of images.

If the device parameter of the read_region() method is "cuda", it loads the relevant portion of the image file (compressed tile data) into GPU memory using cuFile (GDS, GPUDirect Storage), then decompresses that data using nvJPEG's Batched Image Decoding API.
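
(For illustration only, since this path is not recommended yet, as noted below: a rough sketch of a batched GPU read. The locations-list/batch_size/num_workers usage follows my reading of the release notes linked above, so please check there for the exact signature.)

import cupy as cp
from cucim import CuImage

slide = CuImage('/data/pathology/TCGA-G7-A8LE-01A-01-TS1.39D4D79F-6CE4-441C-8EBD-42323F7B9C11.svs')

# A grid of patch locations to read and decode in batches on the GPU.
locations = [(x, y) for y in range(0, 2048, 256) for x in range(0, 2048, 256)]

# With device="cuda", compressed tiles are read via cuFile (GDS) and decoded with nvJPEG.
# batch_size/num_workers here are assumptions based on the release notes above.
regions = slide.read_region(locations, (256, 256), level=0,
                            batch_size=16, num_workers=8, device="cuda")

for batch in regions:
    batch_gpu = cp.asarray(batch)  # the batch already lives on the GPU
    print(batch_gpu.shape)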

The current implementation is not efficient (we are not yet utilizing CUDA streams and GPU memory well), and its performance is poor compared to the CPU implementation. We plan to improve this over the next releases.

For this reason, we do not recommend using device="cuda" for now.

If you want to move the loaded data to the GPU, please convert the CuImage object (the output of the read_region() method) to a CuPy array with the cupy.asarray() method.

import cupy as cp
from cucim import CuImage

slide = CuImage('/data/pathology/TCGA-G7-A8LE-01A-01-TS1.39D4D79F-6CE4-441C-8EBD-42323F7B9C11.svs')
img = slide.read_region(level=3)  # you can add `, num_workers=8` to load the image with 8 threads

img_gpu = cp.asarray(img)

Using the cache feature would make it faster when loading multiple patches of arbitrary locations and sizes from a WSI image.

from cucim import CuImage

CuImage.cache('per_process', memory_capacity=1024)  # Use 1GB of system memory for the cache.

slide = CuImage('/data/pathology/TCGA-G7-A8LE-01A-01-TS1.39D4D79F-6CE4-441C-8EBD-42323F7B9C11.svs')
img = slide.read_region((100, 100), (256, 256), level=0)
img = slide.read_region((110, 110), (256, 256), level=0)  # much faster this time as it uses cached tile data

gigony avatar Apr 26 '22 22:04 gigony

Debian 9 is EOL:

  • https://wiki.debian.org/LTS
  • https://www.debian.org/releases/stretch/

If this is still occurring with a more recent OS, let's open a new issue with a reproducer, taking into account the feedback already given here.

Thanks all! 🙏

jakirkham avatar Apr 16 '24 20:04 jakirkham