Downsampling in Z dimension
I'm working on large images which require a lot of memory.
This is a sample image:

```
850x1900 ushort, 1 band, grey16, tiffload
width: 850
height: 1900
bands: 1
interpretation: grey16
n-pages: 791
```
I want to downsample this image to 0.5x the original size, i.e. (425, 950, 395).
Using pyvips I can successfully rescale the x and y dimensions to (425, 950, 791).
I do not want to load the image into memory, as I have images much larger than this, e.g. (7175, 8910, 2636).
Is this something I can do using vips?
Hi @nooneswarup,
Wow, that IS big. What's the exact TIFF format? Are they tiled? That would make it a bit easier.
Yes, it shouldn't be too hard. I suppose I'd do a x2 shrink first, then average pairs of pages.
Oh, you have an odd number of pages in your test image. How do you plan to handle that?
For the odd number of pages, I was planning to drop a slice, or, if it's possible, to round up, similar to how Fiji does the rescaling. Is there something you would suggest using vips?
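If it helps, the even/odd handling can be kept out of the vips pipeline entirely and done on the page list. A minimal sketch (the `even_pages` helper and its drop/repeat policy are my own, not part of pyvips or Fiji):

```python
def even_pages(pages, mode="drop"):
    """Return a page list with an even length.

    mode="drop":   discard the trailing page (loses one slice).
    mode="repeat": duplicate the last page, so the final output slice is
                   that page averaged with itself (rounds up).
    """
    if len(pages) % 2 == 0:
        return pages
    return pages[:-1] if mode == "drop" else pages + [pages[-1]]

print(len(even_pages(list(range(791)))))            # 790
print(len(even_pages(list(range(791)), "repeat")))  # 792
```

Because the pages are lazy pyvips images, dropping or duplicating a list entry costs nothing until the pipeline is evaluated.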
I am loading the image using n=-1 and using crop to split it into separate pages. I wonder if there is a different approach I could use for scaling in the Z dimension?
```python
pages = []
for y in range(0, image.height, page_height):
    cropped_image = image.crop(0, y, image.width, page_height)
    pages.append(cropped_image)
```
I made a test image like this:
```
$ vips copy nipguide.pdf[dpi=300,n=-1] x.tif[tile,compression=jpeg]
$ vipsheader -a x.tif
x.tif: 2480x3508 uchar, 3 bands, srgb, tiffload
width: 2480
height: 3508
bands: 3
format: uchar
coding: none
interpretation: srgb
xoffset: 0
yoffset: 0
xres: 11.811
yres: 11.811
filename: x.tif
vips-loader: tiffload
n-pages: 58
resolution-unit: in
bits-per-sample: 8
orientation: 1
```
Then with this test prog:
```python
#!/usr/bin/env python3

import sys
import pyvips

# load and split to an array of images
image = pyvips.Image.new_from_file(sys.argv[1])
pages = image.pagesplit()

# x2 xy shrink
pages = [page.resize(0.5) for page in pages]

# average pairs of pages
pages = [((pages[i] + pages[i + 1]) / 2).cast(pages[i].format)
         for i in range(0, len(pages), 2)]

# join the pages again
image = pyvips.Image.arrayjoin(pages, across=1)

# set the page height so the tiff saver knows how to break the image into
# pages
image.set("page-height", pages[0].height)

image.write_to_file(sys.argv[2])
```
I can run:
```
$ /usr/bin/time -f %M:%e ~/try/zshrink.py x.tif[n=-1] x2.tif[tile,compression=jpeg]
317952:46.65
$ vipsheader -a x2.tif
x2.tif: 1240x1754 uchar, 3 bands, srgb, tiffload
width: 1240
height: 1754
bands: 3
format: uchar
coding: none
interpretation: srgb
xoffset: 0
yoffset: 0
xres: 11.811
yres: 11.811
filename: x2.tif
vips-loader: tiffload
n-pages: 29
resolution-unit: in
bits-per-sample: 8
orientation: 1
```
So it runs in about 45s and needs a peak of around 310 MB of memory. The whole image is 2480 * 3508 * 58 * 3 bytes, or about 1.5 GB, so it's streaming it, not just loading the image into memory. You could get memory use down a bit by reducing the size of the threadpool (this PC has 32 hardware threads).
That's with tiled tiff. If you have a simple strip tiff, it's a little harder, I think you'd need to use sequential mode and a tilecache.
Hi @jcupitt, I used your test image and tried it without the jpeg compression. I dropped pages to make the z dimension even. (My end goal here is to make OME-Zarr files; I know vips does not support that yet.) Thanks!

For the original image, I tried opening it with sequential: true. It works up to a certain number of n-pages, beyond which I get the following error:
```
(process:39391): GLib-GObject-CRITICAL **: 12:37:06.749: value "10005930" of type 'gint' is invalid or out of range for property 'top' of type 'gint'
Error: unable to call crop
extract_area: parameter top not set
```
(The crop function has a limit of 10,000,000 for gint coordinates.) Fixed: reading about 1000 slices at a time and concatenating after cropping.
Ah! I should have noticed. Yes, libvips has a sanity check limiting it to 10m pixels in any axis.
8.16 makes this limit configurable with VIPS_MAX_COORD. You can use e.g.:

```
$ VIPS_MAX_COORD=100m ./some-pyvips-prog.py
```

to run with a higher limit.
Thank you! I was able to figure that out, but now I have been getting this error:

```
Error: unable to call VipsForeignSaveTiffFile
: out of order read -- at line 8944, but line 144 requested
```
The out of order read happens when you use sequential mode for your input files.
The best fix is to make sure they are tiled TIFF.
You can also not use sequential mode, but then you will see very long startup times and very high disc usage.
It might be possible to just expand the tile cache, but that depends on your code and your image files; you'd need to share a complete example.
Unfortunately it is not a tiled TIFF. I do not mind high disk usage or startup times.
I also do not see an 8.16 release of vips.
8.16 is the development version. You can run git master, or wait a few months.