best compression settings for a wsi viewer
hi, I have a project that builds WSI images from frames captured by a camera on a microscope. My maker code uses opencv and numpy to store the final array and update it with every frame. I have some questions:
1- When saving the final file, what are the best settings to save it as a pyramid so it's suitable for viewing in an analysis viewer without quality loss? (I was saving with opencv's imwrite function at first, but when I needed to rotate the final image it took double the memory to build the rotated copy before saving, so I decided to do this with pyvips.)
2- Is there any other method that can speed this process up, or reduce memory usage? Right now I allocate a 50k x 50k x 3 array at program start and just update it, or pad it when I reach the edges. It takes 7.5GB of memory, and while the pad runs it goes up to double that, then drops back to the real size once the pad finishes.
3- Is it normal that when I open these images with Windows Photo Viewer it takes several seconds to minutes to open and show them?
This is the code I'm using to rotate and save the tiff image that was saved with opencv imwrite:
image = pyvips.Image.new_from_file('../wsi.tiff')
image = image.rotate(90)
image.tiffsave('../wsi2.tiff', tile=True, compression='jpeg', bigtiff=False, pyramid=True, Q=100, tile_width=256, tile_height=256, properties=True)
Hello @sinamcr7,
-
I think I'd use Q=85, that's roughly what slide scanner companies use. You could possibly use 512x512 tiles, though 256x256 is a good choice. 1024x1024 is probably too large.
-
You could move more of your array assembly processing to pyvips, it should speed it up and drop memory use. I would look at
merge and friends: https://www.libvips.org/API/current/libvips-mosaicing.html#vips-merge
It does a pair-wise join of two images with a feathered edge. You can merge your n camera images and save as a pyramidal tiff in one step with no intermediates.
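The feathered join that merge performs can be sketched in plain numpy. Everything below is invented for illustration (the real vips operator also handles vertical joins, arbitrary offsets, and runs lazily without materialising the whole result):

```python
import numpy as np

def feathered_hjoin(left, right, overlap):
    """Join two equal-height RGB frames horizontally, cross-fading
    over `overlap` columns -- a toy version of a feathered merge."""
    # linear blend weights across the overlap region, broadcast over rows/bands
    w = np.linspace(0.0, 1.0, overlap)[None, :, None]
    lo = left[:, -overlap:, :].astype(np.float32)
    ro = right[:, :overlap, :].astype(np.float32)
    blend = (1.0 - w) * lo + w * ro
    return np.concatenate(
        [left[:, :-overlap, :],
         blend.astype(left.dtype),
         right[:, overlap:, :]],
        axis=1)

a = np.full((4, 10, 3), 100, dtype=np.uint8)
b = np.full((4, 10, 3), 200, dtype=np.uint8)
joined = feathered_hjoin(a, b, overlap=4)
print(joined.shape)  # (4, 16, 3)
```

The feathering hides small exposure differences between adjacent frames, which a hard-edged paste would show as visible seams.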
-
Windows photo viewer is not designed for large images. It will just decompress the entire thing to RAM and then paint the screen from that, so it'll be very slow and need colossal amounts of memory. I made a simple image viewer which should be quick:
https://github.com/jcupitt/vipsdisp
There's a windows build here:
https://github.com/jcupitt/vipsdisp/releases/tag/v3.0.4
Just unzip it somewhere and double-click the exe. The keyboard shortcuts are useful:
https://github.com/jcupitt/vipsdisp?tab=readme-ov-file#shortcuts
It supports things like colour management for slide images, which can be handy if you have a profile for your camera and microscope.
I've just realised you already have the entire image in memory as a numpy array, is that right?
In which case you can simply do:
image = pyvips.Image.new_from_array(big_numpy_array)
image = image.rot90()
image.tiffsave("some-filename.tif",
compression="jpeg",
Q=85,
tile=True,
tile_width=256,
tile_height=256,
pyramid=True)
libvips and numpy will share the memory, so there will (I think!) be no copy and no extra memuse.
.rot90() does a fixed 90 degree rotate, so it's faster and more accurate than .rotate(90).
You could also use pyvips to save as OME-TIFF, though it'll be slower. Have you looked at QuPath?
thanks for the suggestions. Yes, I allocate the image array at program launch, so it takes 7.5GB of memory from the start; that's to reduce the need for padding later. Is it better to convert all operations to vips, or is opencv enough? No, I haven't tried QuPath yet. I'm interested in using vips for making the WSI if it reduces memory usage. I have a pyqt gui that shows windows while making the WSI; currently, when I reach the edges I have to pad with np.pad, and it takes some time, which adds delays to the gui. That's why I chose a big array with an optimal starting point, to avoid pads as much as I can. Is there any workaround for this issue?
I'd stick to opencv if your code is working. pyvips ought to be able to make the pyramidal tiff directly from the numpy array with only a relatively small amount of extra memory.
It depends how you are making the slide image. What corrections do you apply to frames from the microscope? How accurate is your stage? We'd need to get into a lot of detail before I could answer.
ok thanks, then I'll keep the opencv part. I also have an issue with np.pad: when my app reaches that point it takes double the array's memory, and after several seconds it goes back down to the memory it should actually take. Is there a faster, lower-memory method to pad an array?
Sorry, no: resizing an array needs to reallocate its memory and copy the contents across, so memory use must briefly double.
You could change to a tiled array. Cut your image into eg. 1024x1024 tiles and keep a meta-array of references to them. Now you can pad by just making a new column of tiles on the right and nulling the references to the old column of tiles on the left.
... though a tiled array will make save difficult, of course.
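A minimal sketch of that tiled-array idea, assuming lazily allocated 1024x1024 tiles keyed by tile coordinates (all names here are invented for illustration):

```python
import numpy as np

TILE = 1024

class TileStore:
    """Toy tiled canvas: tiles are allocated lazily, so 'padding' on any
    side just widens the index range -- no big reallocation or copy."""
    def __init__(self):
        self.tiles = {}   # (ty, tx) -> uint8 array of shape (TILE, TILE, 3)

    def tile(self, ty, tx):
        if (ty, tx) not in self.tiles:
            self.tiles[(ty, tx)] = np.zeros((TILE, TILE, 3), dtype=np.uint8)
        return self.tiles[(ty, tx)]

    def write(self, y, x, frame):
        """Paste a frame at pixel position (y, x), splitting it across tiles."""
        h, w, _ = frame.shape
        for ty in range(y // TILE, (y + h - 1) // TILE + 1):
            for tx in range(x // TILE, (x + w - 1) // TILE + 1):
                t = self.tile(ty, tx)
                # intersection of the frame with this tile, in pixel coords
                y0, y1 = max(y, ty * TILE), min(y + h, (ty + 1) * TILE)
                x0, x1 = max(x, tx * TILE), min(x + w, (tx + 1) * TILE)
                t[y0 - ty * TILE:y1 - ty * TILE,
                  x0 - tx * TILE:x1 - tx * TILE] = \
                    frame[y0 - y:y1 - y, x0 - x:x1 - x]

store = TileStore()
store.write(1000, 1000, np.ones((100, 100, 3), dtype=np.uint8))
print(sorted(store.tiles))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Growing the canvas at an edge is then just writing at new coordinates; only the tiles that are actually touched ever consume memory.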
I've built several imaging systems like this. I've always done it in two stages: first, scan the camera over the slide and capture a set of frames to your storage. You can keep a low-res version in memory to show the user what the slide looks like during the scan. You can do geometric and radiometric correction at this stage. You examine the overlap areas and estimate the frame positions from the content.
Second, have an "assembly" phase which reads the corrected tiles from your storage, merges them together using offsets computed in stage 1, possibly does some extra corrections for lighting consistency or colour, and saves as a pyramidal tiff. pyvips would be a reasonable choice for the assembly stage.
Commercial slide scanners usually work in the same way, though some very high throughput systems will have a huge area of RAM for assembly rather than temporary files on your drive.
well, our algorithm needs that big array for some calculations. If we remove that part, I don't know how I'd compute exact locations and overlaps; also, without it, how can I show the thumbnail and the current frame to the user?
also, I tried vips tiffsave with jpeg at Q=85 and Q=100, and opencv imwrite. There wasn't much difference in quality, but it seems opencv imwrite has higher quality. Is that right, or did I do something wrong? Here's the code:
cv2.imwrite('wsi.tiff', self.image)
image = cv2.cvtColor(self.image, cv2.COLOR_BGR2RGB)
image = numpy_to_pyvips(image)
image = image.rot90()
image.tiffsave('wsi2.tiff', tile=True, compression='jpeg', pyramid=True, Q=85, tile_width=256, tile_height=256, properties=True)
image.tiffsave('wsi3.tiff', tile=True, compression='jpeg', pyramid=True, Q=100, tile_width=256, tile_height=256, properties=True)
The scanners I've made have driven the stage in an approximate grid, then examined the overlaps to find a set of exact offsets. The set of offsets are an overdetermined linear system, so you can use eg. least mean square to find a set of frame positions which minimise overall positioning error.
Friends have made systems that calibrate the stepper motors instead: they use a known target, then drive the stage over the field grabbing frames and refining a pair of XY positioning tables. This can work very well, but calibration takes a long time and you have to redo it fairly often as the threads in the stages wear.
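The overdetermined solve for frame positions can be sketched with numpy's lstsq. The pairwise offsets below are invented (real ones would come from cross-correlating the overlap areas), and one frame is pinned at the origin so the system has a unique solution:

```python
import numpy as np

# Measured x-offsets between overlapping frame pairs (i, j, dx_ij),
# meaning "frame j sits dx_ij pixels to the right of frame i".
pairs = [(0, 1, 900.0), (1, 2, 905.0), (0, 2, 1803.0), (2, 3, 898.0)]

n = 4                       # number of frames
A = np.zeros((len(pairs) + 1, n))
b = np.zeros(len(pairs) + 1)
for row, (i, j, dx) in enumerate(pairs):
    A[row, j] = 1.0         # one equation per overlap: x_j - x_i = dx
    A[row, i] = -1.0
    b[row] = dx
A[-1, 0] = 1.0              # pin frame 0 at x = 0 to fix the gauge
b[-1] = 0.0

x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x, 1))
```

Because the measurements are slightly inconsistent (900 + 905 != 1803), least squares spreads the error over all the positions instead of letting it accumulate along one chain of joins. The same construction extends directly to y-offsets and to a 2D grid of frames.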
You can keep eg. a 10k x 10k image in memory plus the current frame to show the user, then generate the full 100k x 100k image during save.
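Keeping that preview in memory might look roughly like this: decimate each incoming frame and paste it into a small canvas (the scale, sizes, and names are all made up for illustration):

```python
import numpy as np

SCALE = 8                                            # preview at 1/8 resolution
preview = np.zeros((1250, 1250, 3), dtype=np.uint8)  # stands in for 10k/8 pixels

def paste_preview(frame, y, x):
    """Update the on-screen preview with a freshly captured frame.
    Nearest-neighbour decimation is crude but cheap; the full-quality
    pixels only ever exist in the per-frame files on disk."""
    small = frame[::SCALE, ::SCALE]
    py, px = y // SCALE, x // SCALE
    h, w = small.shape[:2]
    preview[py:py + h, px:px + w] = small

frame = np.full((800, 800, 3), 255, dtype=np.uint8)
paste_preview(frame, 4000, 4000)     # frame captured at full-res (4000, 4000)
print(preview[550, 550], preview[0, 0])   # updated vs untouched region
```

The preview stays a fixed, small allocation no matter how large the final slide grows, so the GUI never stalls on a big reallocation.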
The settings mostly affect file size. I see:
$ vips copy CMU-1.svs x-85.tif[compression=jpeg,tile,pyramid,Q=85]
$ vips copy CMU-1.svs x-100.tif[compression=jpeg,tile,pyramid,Q=100]
$ vips copy CMU-1.svs x.tif
$ ls -l x* CMU-1.svs
-rw-r--r-- 1 john john 794542522 Aug 5 13:37 x-100.tif
-rw-r--r-- 1 john john 174346312 Aug 5 13:33 x-85.tif
-rw-r--r-- 1 john john 6056321460 Aug 5 13:38 x.tif
-rw-r--r-- 1 john john 177552579 Feb 13 2021 CMU-1.svs
The uncompressed TIFF is obviously the best quality, but at 6GB it's far too large to be practical; almost no users will want to use it. Q100 is 10x smaller, Q85 is 40x smaller. Aperio (the slide scanner that produces SVS-format slides) uses Q85.
oh, you're talking about motorized scanners; I'm working with manual WSI scanning. The user should be able to see the area around the FOV, the FOV itself for focus checking, and a thumbnail of the big image so they can see when it's reaching the edges and needs to wait for the pad. If there are any errors, they use tools to clear the wrong frames back to the last correct frame, then start scanning again.
Ah OK, I've never tried with a manual stage, I agree that would need a different approach.
Other friends have made interactive scanners using a manual stage, but I don't know what technical solution they used. Probably allocating the complete image in memory before starting.
Hi @jcupitt --
looking back at the libvips 8.15.1 release, I noticed that Q >= 90 is not supported.
But in the statement above by @sinamcr7 (copied here):
image.tiffsave('wsi3.tiff', tile=True, compression='jpeg', pyramid=True, Q=100, tile_width=256, tile_height=256, properties=True)
I'm curious: will Q=100 result in a falsely coloured image?
Much appreciated.
8.15.1 fixed that bug, so Q100 should be fine.
I used 8.15.3 with this image: http://merovingio.c2rmf.cnrs.fr/iipimage/PalaisDuLouvre.tif
$ vips --version
vips-8.15.3
$ vips im_vips2tiff PalaisDuLouvre.tif output_image.tif:jpeg:90,tile:256x256,pyramid
and got a falsely coloured image. Thx.
I see:
$ vips --version
vips-8.15.1
$ vips im_vips2tiff PalaisDuLouvre.tif output_image.tif:jpeg:90,tile:256x256,pyramid
I can view it in vd and eog:
Here's the file it generated:
www.rollthepotato.net/~john/output_image.tif
It could be a bug in your image viewer -- try downloading and viewing the version I made.
You are using the old vips7 syntax, which is deprecated.
The new CLI syntax is:
$ vips tiffsave PalaisDuLouvre.tif output_image.tif --compression jpeg --Q 90 --tile-width 256 --tile-height 256 --pyramid
You can also write:
$ vips copy PalaisDuLouvre.tif output_image.tif[compression="jpeg",Q=90,tile-width=256,tile-height=256,pyramid]
The new interface is often faster, so it's worth changing over. It's easier to read as well, of course.
I'm getting this output in qupath
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
img = mpimg.imread("output_image.tif")
plt.imshow(img)
plt.show()
and got this
I tried with Q=89
and got this
I think v8.15.3 cannot take Q >= 90 and generate a correctly compressed image.
Did you download the image I made and test that?
I tried that image in QuPath 0.5 and I see:
What version of QuPath are you using?
Maybe QuPath on mac is using some system library that can't read these files? In any case, it's clearly a bug in your image viewer, the file is correct.
But how come the python image viewers showed the same false colours as QuPath?
I suppose it's also picking up a buggy library. Did you download the version of the image that I made?
I'll try on my mac.
Thanks.
I tried in macos Preview and the image looks fine. Did you download the version of the image that I made? What do you see with that exact file in macos Preview?
It displayed fine. What's the recommended way to install vips on my mac so I can reproduce your results? thx