best compression settings for a wsi viewer
hi, I have a project that builds WSI images from frames captured by a camera on a microscope. My maker code uses opencv and numpy to store the final array and update it with every frame. I have some questions:
1- When saving the final file, what are the best settings to save it as a pyramid so it's suitable for viewing in an analysis viewer without quality loss? (I was saving with opencv's imwrite function at first, but when I needed to rotate the final image it took double the memory to build the rotated copy before saving, so I decided to do this with pyvips.)
2- Is there any other method that can speed this process up, or reduce memory usage? Right now I allocate a 50k x 50k x 3 array at program start and just update it, or pad it when I reach the edges. It takes 7.5GB of memory, and while the pad runs it goes up to double that, then drops back to the real size once the pad finishes.
3- Is it normal that when I open these images with Windows Photo Viewer it takes several seconds to minutes to open and show them?
This is the code I'm using to rotate and save the tiff image that was saved with opencv imwrite:
image = pyvips.Image.new_from_file('../wsi.tiff')
image = image.rotate(90)
image.tiffsave('../wsi2.tiff', tile=True, compression='jpeg', bigtiff=False, pyramid=True, Q=100, tile_width=256, tile_height=256, properties=True)
Hello @sinamcr7,
-
I think I'd use Q=85, that's roughly what slide scanner companies use. You could possibly use 512x512 tiles, though 256x256 is a good choice. 1024x1024 is probably too large.
-
You could move more of your array assembly processing to pyvips, it should speed it up and drop memory use. I would look at
merge and friends: https://www.libvips.org/API/current/libvips-mosaicing.html#vips-merge
It does a pair-wise join of two images with a feathered edge. You can merge your n camera images and save as a pyramidal tiff in one step with no intermediates.
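The feathered join that merge performs can be sketched in plain numpy. Everything below is invented for illustration (the real vips operator also handles vertical joins, arbitrary offsets, and runs lazily without materialising the whole result):

```python
import numpy as np

def feathered_hjoin(left, right, overlap):
    """Join two equal-height RGB frames horizontally, cross-fading
    over `overlap` columns -- a toy version of a feathered merge."""
    # linear blend weights across the overlap region, broadcast over rows/bands
    w = np.linspace(0.0, 1.0, overlap)[None, :, None]
    lo = left[:, -overlap:, :].astype(np.float32)
    ro = right[:, :overlap, :].astype(np.float32)
    blend = (1.0 - w) * lo + w * ro
    return np.concatenate(
        [left[:, :-overlap, :],
         blend.astype(left.dtype),
         right[:, overlap:, :]],
        axis=1)

a = np.full((4, 10, 3), 100, dtype=np.uint8)
b = np.full((4, 10, 3), 200, dtype=np.uint8)
joined = feathered_hjoin(a, b, overlap=4)
print(joined.shape)  # (4, 16, 3)
```

The feathering hides small exposure differences between adjacent frames, which a hard-edged paste would show as visible seams.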
-
Windows photo viewer is not designed for large images. It will just decompress the entire thing to RAM and then paint the screen from that, so it'll be very slow and need colossal amounts of memory. I made a simple image viewer which should be quick:
https://github.com/jcupitt/vipsdisp
There's a windows build here:
https://github.com/jcupitt/vipsdisp/releases/tag/v3.0.4
Just unzip it somewhere and double-click the exe. The keyboard shortcuts are useful:
https://github.com/jcupitt/vipsdisp?tab=readme-ov-file#shortcuts
It supports things like colour management for slide images, which can be handy if you have a profile for your camera and microscope.
I've just realised you already have the entire image in memory as a numpy array, is that right?
In which case you can simply do:
image = pyvips.Image.new_from_array(big_numpy_array)
image = image.rot90()
image.tiffsave("some-filename.tif",
compression="jpeg",
Q=85,
tile=True,
tile_width=256,
tile_height=256,
pyramid=True)
libvips and numpy will share the memory, so there will (I think!) be no copy and no extra memuse.
.rot90() does a fixed 90 degree rotate, so it's faster and more accurate than .rotate(90).
You could also use pyvips to save as OME-TIFF, though it'll be slower. Have you looked at QuPath?
thanks for the suggestions. Yes, I allocate the image array at program launch, so it takes 7.5GB of memory from the start; that's to reduce the need for padding later. Is it better to convert all operations to vips, or is opencv enough? No, I haven't tried QuPath yet. I'm interested in using vips for making the WSI if it reduces memory usage. I have a pyqt gui that shows windows while making the WSI; currently, when I reach the edges I have to pad with np.pad, and it takes some time, which adds delays to the gui. That's why I chose a big array with an optimal starting point, to avoid pads as much as I can. Is there any workaround for this issue?
I'd stick to opencv if your code is working. pyvips ought to be able to make the pyramidal tiff directly from the numpy array with only a relatively small amount of extra memory.
It depends how you are making the slide image. What corrections do you apply to frames from the microscope? How accurate is your stage? We'd need to get into a lot of detail before I could answer.
ok thanks, then I'll keep the opencv part. I also have an issue with np.pad: when my app reaches that point it takes double the array's memory, and after several seconds it goes back down to the memory it should actually take. Is there a faster, lower-memory method to pad an array?
Sorry, no: resizing an array needs to reallocate its memory and copy the contents across, so memory use must briefly double.
You could change to a tiled array. Cut your image into eg. 1024x1024 tiles and keep a meta-array of references to them. Now you can pad by just making a new column of tiles on the right and nulling the references to the old column of tiles on the left.
... though a tiled array will make save difficult, of course.
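A minimal sketch of that tiled-array idea, assuming lazily allocated 1024x1024 tiles keyed by tile coordinates (all names here are invented for illustration):

```python
import numpy as np

TILE = 1024

class TileStore:
    """Toy tiled canvas: tiles are allocated lazily, so 'padding' on any
    side just widens the index range -- no big reallocation or copy."""
    def __init__(self):
        self.tiles = {}   # (ty, tx) -> uint8 array of shape (TILE, TILE, 3)

    def tile(self, ty, tx):
        if (ty, tx) not in self.tiles:
            self.tiles[(ty, tx)] = np.zeros((TILE, TILE, 3), dtype=np.uint8)
        return self.tiles[(ty, tx)]

    def write(self, y, x, frame):
        """Paste a frame at pixel position (y, x), splitting it across tiles."""
        h, w, _ = frame.shape
        for ty in range(y // TILE, (y + h - 1) // TILE + 1):
            for tx in range(x // TILE, (x + w - 1) // TILE + 1):
                t = self.tile(ty, tx)
                # intersection of the frame with this tile, in pixel coords
                y0, y1 = max(y, ty * TILE), min(y + h, (ty + 1) * TILE)
                x0, x1 = max(x, tx * TILE), min(x + w, (tx + 1) * TILE)
                t[y0 - ty * TILE:y1 - ty * TILE,
                  x0 - tx * TILE:x1 - tx * TILE] = \
                    frame[y0 - y:y1 - y, x0 - x:x1 - x]

store = TileStore()
store.write(1000, 1000, np.ones((100, 100, 3), dtype=np.uint8))
print(sorted(store.tiles))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Growing the canvas at an edge is then just writing at new coordinates; only the tiles that are actually touched ever consume memory.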
I've built several imaging systems like this. I've always done it in two stages: first, scan the camera over the slide and capture a set of frames to your storage. You can keep a low-res version in memory to show the user what the slide looks like during the scan. You can do geometric and radiometric correction at this stage. You examine the overlap areas and estimate the frame positions from the content.
Second, have an "assembly" phase which reads the corrected tiles from your storage, merges them together using offsets computed in stage 1, possibly does some extra corrections for lighting consistency or colour, and saves as a pyramidal tiff. pyvips would be a reasonable choice for the assembly stage.
Commercial slide scanners usually work in the same way, though some very high throughput systems will have a huge area of RAM for assembly rather than temporary files on your drive.
well, our algorithm needs that big array for some calculations. If we remove that part, I don't know how I'd compute exact locations and overlaps; also, without it, how can I show the thumbnail and the current frame to the user?
also, I tried vips tiffsave with jpeg at Q=85 and Q=100, and opencv imwrite. There wasn't much difference in quality, but it seems opencv imwrite has higher quality. Is that right, or did I do something wrong? Here's the code:
cv2.imwrite('wsi.tiff', self.image)
image = cv2.cvtColor(self.image, cv2.COLOR_BGR2RGB)
image = numpy_to_pyvips(image)
image = image.rot90()
image.tiffsave('wsi2.tiff', tile=True, compression='jpeg', pyramid=True, Q=85, tile_width=256, tile_height=256, properties=True)
image.tiffsave('wsi3.tiff', tile=True, compression='jpeg', pyramid=True, Q=100, tile_width=256, tile_height=256, properties=True)
The scanners I've made have driven the stage in an approximate grid, then examined the overlaps to find a set of exact offsets. The set of offsets are an overdetermined linear system, so you can use eg. least mean square to find a set of frame positions which minimise overall positioning error.
Friends have made systems that calibrate the stepper motors instead: they use a known target, then drive the stage over the field grabbing frames and refining a pair of XY positioning tables. This can work very well, but calibration takes a long time and you have to redo it fairly often as the threads in the stages wear.
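The overdetermined solve for frame positions can be sketched with numpy's lstsq. The pairwise offsets below are invented (real ones would come from cross-correlating the overlap areas), and one frame is pinned at the origin so the system has a unique solution:

```python
import numpy as np

# Measured x-offsets between overlapping frame pairs (i, j, dx_ij),
# meaning "frame j sits dx_ij pixels to the right of frame i".
pairs = [(0, 1, 900.0), (1, 2, 905.0), (0, 2, 1803.0), (2, 3, 898.0)]

n = 4                       # number of frames
A = np.zeros((len(pairs) + 1, n))
b = np.zeros(len(pairs) + 1)
for row, (i, j, dx) in enumerate(pairs):
    A[row, j] = 1.0         # one equation per overlap: x_j - x_i = dx
    A[row, i] = -1.0
    b[row] = dx
A[-1, 0] = 1.0              # pin frame 0 at x = 0 to fix the gauge
b[-1] = 0.0

x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x, 1))
```

Because the measurements are slightly inconsistent (900 + 905 != 1803), least squares spreads the error over all the positions instead of letting it accumulate along one chain of joins. The same construction extends directly to y-offsets and to a 2D grid of frames.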
You can keep eg. a 10k x 10k image in memory plus the current frame to show the user, then generate the full 100k x 100k image during save.
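Keeping that preview in memory might look roughly like this: decimate each incoming frame and paste it into a small canvas (the scale, sizes, and names are all made up for illustration):

```python
import numpy as np

SCALE = 8                                            # preview at 1/8 resolution
preview = np.zeros((1250, 1250, 3), dtype=np.uint8)  # stands in for 10k/8 pixels

def paste_preview(frame, y, x):
    """Update the on-screen preview with a freshly captured frame.
    Nearest-neighbour decimation is crude but cheap; the full-quality
    pixels only ever exist in the per-frame files on disk."""
    small = frame[::SCALE, ::SCALE]
    py, px = y // SCALE, x // SCALE
    h, w = small.shape[:2]
    preview[py:py + h, px:px + w] = small

frame = np.full((800, 800, 3), 255, dtype=np.uint8)
paste_preview(frame, 4000, 4000)     # frame captured at full-res (4000, 4000)
print(preview[550, 550], preview[0, 0])   # updated vs untouched region
```

The preview stays a fixed, small allocation no matter how large the final slide grows, so the GUI never stalls on a big reallocation.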
The settings mostly affect file size. I see:
$ vips copy CMU-1.svs x-85.tif[compression=jpeg,tile,pyramid,Q=85]
$ vips copy CMU-1.svs x-100.tif[compression=jpeg,tile,pyramid,Q=100]
$ vips copy CMU-1.svs x.tif
$ ls -l x* CMU-1.svs
-rw-r--r-- 1 john john 794542522 Aug 5 13:37 x-100.tif
-rw-r--r-- 1 john john 174346312 Aug 5 13:33 x-85.tif
-rw-r--r-- 1 john john 6056321460 Aug 5 13:38 x.tif
-rw-r--r-- 1 john john 177552579 Feb 13 2021 CMU-1.svs
The uncompressed TIFF is obviously the best quality, but at 6GB it's far too large to be practical; almost no users will want to use it. Q100 is 10x smaller, Q85 is 40x smaller. Aperio (the slide scanner that produces SVS-format slides) uses Q85.
oh, you're talking about motorized scanners; I'm working with manual WSI scanning. The user should be able to see the area around the FOV, the FOV itself for focus checking, and a thumbnail of the big image so they can see when it's reaching the edges and needs to wait for the pad. If there are any errors, they use tools to clear the wrong frames back to the last correct frame, then start scanning again.
Ah OK, I've never tried with a manual stage, I agree that would need a different approach.
Other friends have made interactive scanners using a manual stage, but I don't know what technical solution they used. Probably allocating the complete image in memory before starting.
Hi @jcupitt --
looking back at the libvips 8.15.1 release, I noticed that Q >= 90 is not supported.
But in the statement above by @sinamcr7 (copied here):
image.tiffsave('wsi3.tiff', tile=True, compression='jpeg', pyramid=True, Q=100, tile_width=256, tile_height=256, properties=True)
I'm curious: will Q=100 result in a falsely coloured image?
Much appreciated.
8.15.1 fixed that bug, so Q100 should be fine.
I used 8.15.3 with this image: http://merovingio.c2rmf.cnrs.fr/iipimage/PalaisDuLouvre.tif
$ vips --version
vips-8.15.3
$ vips im_vips2tiff PalaisDuLouvre.tif output_image.tif:jpeg:90,tile:256x256,pyramid
and got a falsely coloured image. Thx.
I see:
$ vips --version
vips-8.15.1
$ vips im_vips2tiff PalaisDuLouvre.tif output_image.tif:jpeg:90,tile:256x256,pyramid
I can view it in vd and eog:
Here's the file it generated:
www.rollthepotato.net/~john/output_image.tif
It could be a bug in your image viewer -- try downloading and viewing the version I made.
You are using the old vips7 syntax, which is deprecated.
The new CLI syntax is:
$ vips tiffsave PalaisDuLouvre.tif output_image.tif --compression jpeg --Q 90 --tile-width 256 --tile-height 256 --pyramid
You can also write:
$ vips copy PalaisDuLouvre.tif output_image.tif[compression="jpeg",Q=90,tile-width=256,tile-height=256,pyramid]
The new interface is often faster, so it's worth changing over. It's easier to read as well, of course.
I'm getting this output in qupath
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
img = mpimg.imread("output_image.tif")
plt.imshow(img)
plt.show()
and got this
I tried with Q=89
and got this
I think v8.15.3 cannot take Q >= 90 and generate a correctly compressed image.
Did you download the image I made and test that?
I tried that image in QuPath 0.5 and I see:
What version of QuPath are you using?
Maybe QuPath on mac is using some system library that can't read these files? In any case, it's clearly a bug in your image viewer, the file is correct.
But how come the python image viewers showed the same false colours as QuPath?
I suppose it's also picking up a buggy library. Did you download the version of the image that I made?
I'll try on my mac.
Thanks.
I tried in macos Preview and the image looks fine. Did you download the version of the image that I made? What do you see with that exact file in macos Preview?
It displayed fine. What's the recommended way to install vips on my mac so I can reproduce your results? thx