
Compute mask and sample from it

Open gabrieldernbach opened this issue 2 years ago • 4 comments

I need to match the color distribution between some images using a custom algorithm. To estimate the distribution I would like to sample RGB values from the image. Typically there are very large white areas that I need to exclude first.

Can this be done with pyvips? If so can you tell me how?

gabrieldernbach, Jul 27 '21 14:07

Hi @gabrieldernbach,

It should be easy, yes. Could you give some more detail? Do you mean you want a histogram of pixels within a mask? Or do you want to sample random points within a masked area?

jcupitt, Jul 27 '21 14:07

It is the latter, I want to sample random points within a masked area.

I am aiming for an np.array of shape (n_samples, n_channels) for each of the two images to be matched, the source and the target. From these subsamples I want to infer a mapping, e.g. via optimal transport, sparse non-negative matrix factorization, etc.

More context: Currently, I unpack my tif file into tiles, learn a mapping on a subset of pixels, map each tile individually, and rebuild the tif. I feel this is duplicating pyvips functionality. But broadcasting a vectorized function over an image is probably a separate thread.

gabrieldernbach, Jul 27 '21 17:07

I think you'd probably need to fetch every tile in the image, then mask it before passing it to your network or whatever you are using.
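As a rough sketch of the "fetch every tile" part, the tile rectangles could be generated like this. `tile_rects` is a hypothetical helper, not part of pyvips, and the 128-pixel tile size is only an assumption. Tiles on the right and bottom edges are clipped so that every rectangle stays inside the image.

```python
from typing import Iterator, Tuple


def tile_rects(width: int, height: int, tile: int = 128) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (left, top, tile_width, tile_height) rectangles covering the image."""
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # clip tiles on the right and bottom edges so they stay inside the image
            yield left, top, min(tile, width - left), min(tile, height - top)
```

Each tuple has the same (left, top, width, height) order that a pyvips region fetch takes, so it could be unpacked straight into the call.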

You could generate the mask with pyvips I guess, something like (untested):

import pyvips

image = ...
# a one-band uint8 image: 255 where every band is 255 (white), 0 elsewhere
mask = (image == 255).bandand()

# region fetch is usually faster than crop / write_to_memory, though it depends on the tile size
image_region = pyvips.Region.new(image)
mask_region = pyvips.Region.new(mask)

# fetch a 128x128 block of pixels from each, at left=100, top=100 ...
# you can wrap a numpy array around these buffers
image_bytes = image_region.fetch(100, 100, 128, 128)
mask_bytes = mask_region.fetch(100, 100, 128, 128)
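To get from those two buffers to an array of shape (n_samples, n_channels), something like the following might work. It is a minimal sketch, not tested against real pyvips output: `sample_unmasked` is a made-up name, and it assumes uint8 pixels stored band-interleaved, with the mask being 255 on white pixels as above.

```python
import numpy as np


def sample_unmasked(image_bytes, mask_bytes, width, height, bands, n_samples, rng=None):
    """Return up to n_samples pixels, shape (n, bands), taken where the mask is 0."""
    rng = np.random.default_rng() if rng is None else rng
    # wrap the raw buffers without copying (assumption: uint8, band-interleaved)
    pixels = np.frombuffer(image_bytes, dtype=np.uint8).reshape(height * width, bands)
    mask = np.frombuffer(mask_bytes, dtype=np.uint8).reshape(height * width)
    # the mask is 255 on white pixels, so the candidates are the zeros
    candidates = np.flatnonzero(mask == 0)
    if candidates.size == 0:
        return np.empty((0, bands), dtype=np.uint8)
    # draw without replacement, capped at the number of non-white pixels
    chosen = rng.choice(candidates, size=min(n_samples, candidates.size), replace=False)
    return pixels[chosen]
```

For the fetch above this would be called with width=128, height=128 and bands=image.bands. Note that a tile which is entirely white gives back an empty (0, bands) array, so the per-tile results can be concatenated without special cases.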

jcupitt, Jul 29 '21 11:07

Discussion of fetch with some benchmarks, if you've not seen it: https://github.com/libvips/pyvips/issues/100#issuecomment-493960943

jcupitt, Jul 29 '21 11:07