Add support for high bit depth multichannel images

Open wiredfool opened this issue 9 years ago • 44 comments

Pillow (and PIL) is currently able to open 8 bit per channel multi-channel images (such as RGB), but it can open higher bit depth images (e.g. I16, I32, or Float32 images) only if they are single channel (e.g., grayscale).

Previous References

This has been requested many times: #1828, #1885, #1839, #1602, and farther back.

Requirements

  • We should be able to support common GIS formats as well as high bit depth RGB(A) images.
  • At least 4 channels, but potentially more (see #1839)
  • Different pixel formats, including I16, I32, and Float.
  • There should be definitions for the array interface to exchange images with numpy/scipy
  • There should be enough support to read and write TIFFs and raw image data.
  • Support for resize, crop, and convert operations at the very least.

Background Reference Info

The rough sequence for image loading is:

  • Image file is opened

  • Each of the ImagePlugin _accept functions have a chance to look at the first few bytes to determine if they should attempt to open the file

  • The *ImagePlugin._open method is called, giving the image plugin a chance to read more of the image and determine if it still wants to consider it a valid image of its particular type. If it does, it passes back a tile definition which includes a decoder and an image size.

  • If there is a successful _open call, at some point later *ImagePlugin._load may be called on the image, which runs the decoder producing a set of bytes in a raw mode. This is where things like compression are handled, but the output of the decoder is not necessarily what we're storing in our internal structures.

  • The image is unpacked (Unpack.c) from the raw mode (e.g. I16;BS) into a storage (Storage.c) mode (I).

  • It's now possible to operate on the image (e.g. crop, pixel access, etc)

    There are 3 (or 4) image data pointers, as defined in Imaging.h:

struct ImagingMemoryInstance {

    /* Format */
    char mode[IMAGING_MODE_LENGTH]; /* Band names ("1", "L", "P", "RGB", "RGBA", "CMYK", "YCbCr", "BGR;xy") */
    int type;       /* Data type (IMAGING_TYPE_*) */
    int depth;      /* Depth (ignored in this version) */
    int bands;      /* Number of bands (1, 2, 3, or 4) */
    int xsize;      /* Image dimension. */
    int ysize;

    /* Colour palette (for "P" images only) */
    ImagingPalette palette;

    /* Data pointers */
    UINT8 **image8; /* Set for 8-bit images (pixelsize=1). */
    INT32 **image32;    /* Set for 32-bit images (pixelsize=4). */

    /* Internals */
    char **image;   /* Actual raster data. */
    char *block;    /* Set if data is allocated in a single block. */

    int pixelsize;  /* Size of a pixel, in bytes (1, 2 or 4) */
    int linesize;   /* Size of a line, in bytes (xsize * pixelsize) */

    /* Virtual methods */
    void (*destroy)(Imaging im);
};

The only one that is guaranteed to be set is **image, which is an array of pointers to row data.
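
To make the unpack step and the row-pointer layout concrete, here is a minimal pure-Python sketch. It is not Pillow's C code: `unpack_i16bs` and its arguments are hypothetical names, and it assumes that the raw mode I;16BS means big-endian signed 16-bit samples which are widened to one integer per pixel, roughly as mode I does.

```python
import struct


def unpack_i16bs(raw, xsize, ysize):
    """Unpack big-endian signed 16-bit samples into rows of Python ints.

    Illustrative only: the returned list of rows plays the role of the
    **image array of row pointers described above.
    """
    src_pixelsize = 2                     # bytes per pixel in the raw data
    src_linesize = xsize * src_pixelsize  # bytes per raw row
    if len(raw) != src_linesize * ysize:
        raise ValueError("raw buffer does not match the image size")

    rows = []
    for y in range(ysize):
        line = raw[y * src_linesize:(y + 1) * src_linesize]
        # ">" = big-endian, "h" = signed 16-bit, one value per pixel
        rows.append(list(struct.unpack(f">{xsize}h", line)))
    return rows


# A 2x2 image: 0x0001, 0xFFFF on the first row, 0x7FFF, 0x8000 on the second
raw = bytes([0x00, 0x01, 0xFF, 0xFF, 0x7F, 0xFF, 0x80, 0x00])
image = unpack_i16bs(raw, xsize=2, ysize=2)
print(image)
```

Each row is addressed independently, which is why only the row-pointer array needs to be valid regardless of how the underlying memory was allocated.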

Changes Required

  • Definitions for all of the modes that we're planning, and potentially a [format];MB[#bands] style generic mode.

Core Imaging Structure

  • The imaging structure has the fields required to add the additional channels. (type, bands, pixelsize, linesize)
  • The **image pointer can be used for any width of pixel.
  • We may or may not want to set the **image32 pointer.
  • Currently type of IMAGING_TYPE_INT32 and IMAGING_TYPE_FLOAT32 imply 1 band. This will change.
  • Consider promoting int16 to IMAGING_TYPE_INT16
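
As a rough illustration of how pixelsize and linesize would generalize once a type no longer implies a single band, the following sketch computes both for a multi-band, high bit depth layout. The names `SAMPLE_BYTES` and `Layout` are hypothetical, and the sketch assumes a tightly packed, interleaved layout with no padding; it says nothing about how the C structure would actually be changed.

```python
from dataclasses import dataclass

# Bytes per sample for a few sample formats (illustrative names, not Pillow constants)
SAMPLE_BYTES = {"U8": 1, "I16": 2, "I32": 4, "F32": 4}


@dataclass
class Layout:
    sample_format: str  # key into SAMPLE_BYTES
    bands: int          # number of channels per pixel
    xsize: int          # image width in pixels

    @property
    def pixelsize(self):
        # Size of one pixel in bytes: every band uses the same sample format
        return SAMPLE_BYTES[self.sample_format] * self.bands

    @property
    def linesize(self):
        # Size of one row in bytes (xsize * pixelsize, as in Imaging.h)
        return self.xsize * self.pixelsize


rgba16 = Layout("I16", bands=4, xsize=1920)
print(rgba16.pixelsize, rgba16.linesize)
```

Note that a 4-band 16-bit pixel already needs 8 bytes, which is outside the "1, 2 or 4" range documented for pixelsize in the struct comment.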

Storage

  • Updates to Storage.c, Unpack.c, Pack.c, Access.c, PyAccess.py, and Convert.c

Ways to Help

We need a better definition of the format requirements. What are the various types of images that are used in GIS, Medical, or other fields that we'd want to interpret? We need small, redistributable versions of images that we can test against.

[in progress]

wiredfool avatar May 05 '16 16:05 wiredfool

I'm having the same problem with 16 bit single-channel paletted TIFFs, created by GDAL. It would be "really" nice if Pillow could play nicely with GIS and scientific image formats, as GDAL is a pain in the ass and I'd rather not use it.

tiffinfo as follows:

TIFFReadDirectory: Warning, Unknown field with tag 33550 (0x830e) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 33922 (0x8482) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 34735 (0x87af) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 34737 (0x87b1) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 42113 (0xa481) encountered.
TIFF Directory at offset 0x34293c6 (54694854)
  Image Width: 10774 Image Length: 12577
  Bits/Sample: 16
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: palette color (RGB from colormap)
  Samples/Pixel: 1
  Rows/Strip: 1
  Planar Configuration: single image plane
  Color Map: (present)
  Tag 33550: 4.999617,4.999789,0.000000
  Tag 33922: 0.000000,0.000000,0.000000,679006.067110,9955209.915048,0.000000
  Tag 34735: 1,1,0,7,1024,0,1,1,1025,0,1,1,1026,34737,22,0,2049,34737,7,22,2054,0,1,9102,3072,0,1,32736,3076,0,1,9001
  Tag 34737: WGS 84 / UTM zone 36S|WGS 84|
  Tag 42113: 0
  Predictor: horizontal differencing 2 (0x2)

terramars avatar May 23 '16 15:05 terramars

Any updates on this?

bodokaiser avatar Mar 14 '17 11:03 bodokaiser

Unfortunately, no.

wiredfool avatar Mar 14 '17 21:03 wiredfool

@wiredfool what do you think about adding support for multichannel images as a sequence of Image objects? For example, a 4-channel uint16 image would be represented (more or less equivalently) by ['<PIL.Image.Image image mode=I;16 size=... >', '<PIL.Image.Image image mode=I;16 size=...>', ..., '<PIL.Image.Image image mode=I;16 size=...>']. By that I mean, maybe, providing a class that inherits from Image and tuple and overrides all methods to work on a tuple of images... Sure, it looks like a hack, but it could unlock more features (and create issues :) ), at least when working with Image.fromarray.

vfdev-5 avatar Feb 20 '18 21:02 vfdev-5

To do anything useful with it, we'd have to have support in the C layer, so it would have to be at the core imaging layer, and especially Unpack/Pack.

wiredfool avatar Feb 21 '18 07:02 wiredfool

@wiredfool following your "Ways to help",

We need a better definition of the format requirements. What are the various types of images that are used in GIS, Medical, or other fields that we'd want to interpret?

For GIS, as there is a huge number of different formats (for example, the gdal format list), this can be left to GIS libraries such as gdal, rasterio etc. However, support in Image.fromarray for multi-channel (3, 4, 5, ...) input arrays of dtype np.uint16 or np.float32 would be, imho, essential.

We need small, redistributable versions of images that we can test against.

For GIS imagery, this can be easily created manually with gdal, rasterio.

I would like to give a hand on this, so, feel free to ask me.

vfdev-5 avatar Feb 21 '18 21:02 vfdev-5

PIL cannot handle processing multi-channel images. They get truncated to 3-ch images if you perform any transformation using PIL. #3160

edowson avatar Jun 07 '18 18:06 edowson

What is the status of this issue? It has been almost three years since the first proposal. I am unfortunately unable to provide any help since I have zero experience with coding in C, but I am among the people who are awaiting support for e.g. multi-channel floating-point images (with possibilities for negative pixel values). This is especially useful in deep learning, where it is preferable to have all values normalized with zero mean. PIL has some really awesome ImageOps, which is one of the reasons for wanting this support.

bthorsted avatar Feb 15 '19 14:02 bthorsted

@bjtho08 No updates.


https://github.com/python-pillow/Pillow/issues/2485 links to a multipage RGB TIFF containing float64 values.

hugovk avatar Feb 17 '19 14:02 hugovk

Please fix the issue with multi-channel 16 bit images. Thank you!

omaghsoudi avatar Jul 05 '19 01:07 omaghsoudi

I'm closing my other issue since I realize it is a duplicate of this one. Here is an example multichannel tiff dataset to work with for testing once folks get around to tackling this. It's publicly available data from NASA and the USGS: https://ucsb.box.com/s/taz9fb3rcur1d24bt6s7g6cw2ynkw747

Link to my issue with details: https://github.com/python-pillow/Pillow/issues/3984

Thanks for tracking this and all the hard work, I appreciate it!

rbavery avatar Jul 24 '19 21:07 rbavery

I can't even open 3 channel tif with PIL Image.open....

icml-compbio avatar May 21 '20 16:05 icml-compbio

@icml-compbio please open a new issue with more details about your problem, including the image that is failing for you

radarhere avatar May 21 '20 21:05 radarhere

Any update on this ? It seems to be a quite common issue. Currently multi-channels np.uint16 are not supported.

Conchylicultor avatar Apr 09 '21 12:04 Conchylicultor

I'm also surprised that float32 RGB or RGBA files are not supported. These are standard for VFX and post production when using a linear workflow, have been for over a decade, and should be supported, whether it's with TIFF files or OpenEXR. They need to be supported and not clamp values to 0-1 unless we choose for them to be. uint8 and even 16 does not suffice as they have hard limits and less precision. Currently Pillow sees their shapes as (1,1,3) regardless of dimensions.

I would also encourage you to make it possible to easily save layered TIFF and EXR files that can be read by Adobe apps, like Photoshop and After Effects, and other industry standard tools used for compositing. This is where these file formats excel and why EXR was created. There is no library available that makes this easy as far as I've been able to find, and the libraries that do it require you to manually set up the tags, which is not trivial unless you have knowledge of how this low-level stuff works. Seems a bit much to have to work out byte code to save discrete file layers in 2021 that Photoshop can read.

I've turned to imageio and tifffile, and other libraries made for geo data, and while they support float RGB without clamping, they don't spit out layered files, only multipage, which Adobe and other host apps do not support. I'm still banging away trying to get tags to work.

It's honestly very strange to me that Pillow does not support these things since it's the primary image library everyone uses and it has a nice short syntax and a lot of features that make it great.

And yes, I blame Adobe for using bizarre tagging, but it's what many of us have to work with to get the job done and/or stay employed, and we need layers and float32 RGB.

KeygenOld avatar Dec 15 '21 20:12 KeygenOld

Thanks for the work so far on this issue. Here's another datapoint on possible weird bit-depth formats that comes up in some of the microscopy data we handle.

The instrument we use outputs false-colored "grayscale" images that have asymmetric channel bitdepths (trimmed imagemagick output):

Format: TIFF (Tagged Image File Format)
  Mime type: image/tiff
  Geometry: 1920x1440+0+0
  Colorspace: sRGB
  Type: TrueColor
  Endianness: LSB
  Depth: 16-bit
  Channel depth:
    Red: 1-bit
    Green: 16-bit
    Blue: 1-bit

This is detected by Pillow/TiffImagePlugin as having symmetric bitdepths:

(II, 2, (1,), 1, (16, 16, 16), ()): ("RGB", "RGB;16L"),

I can upload one of these images if it is helpful.

This file gets truncated to 8-bit as above with our Python pipelines and other tools like CellProfiler which use Pillow. We've avoided the issue in our pipeline by having an ImageMagick preprocessing step prior to Pillow-dependent steps.

I unsuccessfully tried tracking down why the TiffImagePlugin is loading it symmetrically, but it's kind of moot until full loading of high bit depth multichannel images is there anyway. Technically, I think these could be loaded by having a special rawtype that loaded the 16-bit channel into the 32-bit buffer, but that would require assumptions like dropping the (in this case, uniformly-zero) 1-bit channels.

meson800 avatar Aug 08 '22 21:08 meson800

I can upload one of these images if it is helpful.

Please do.

cgohlke avatar Aug 08 '22 23:08 cgohlke

I can upload one of these images if it is helpful.

Please do.

.tif isn't an allowable upload so here's a .zip of a .tif asymmetric_bit_depth.zip

meson800 avatar Aug 09 '22 01:08 meson800

The ImageMagick output is a little confusing. The first image in the file is a simple 3-sample RGB image with 16 bits per sample, i.e. the BitsPerSample tag value is (16, 16, 16), not (1, 16, 1). The fact that two channels only contain zero values shouldn't concern the TIFF reader.

cgohlke avatar Aug 09 '22 01:08 cgohlke

Here is a test case for a 16 bit PNG image generated with GIMP.

import io
import base64
from PIL import Image

# Create a PIL.Image from a base64-encoded string
image = Image.open(io.BytesIO(base64.b64decode("""
iVBORw0KGgoAAAANSUhEUgAAAAcAAAACEAYAAADEDxojAAAAQ0lEQVQI10WMWw0AIBRCz91MYBcD
WMI89LGCFcyEH1cnG48PABvA/gQ7PVOCC0mS7PLLETakQq2tjQFzrrX3u9Cd9zg0Ai9H03VKQwAA
AABJRU5ErkJggg==""")))

expected_image_data = [
    # First row
    (0xffff, 0x0000, 0x0000, 0xffff), # R
    (0x0000, 0xffff, 0x0000, 0xffff), # G
    (0x0000, 0x0000, 0xffff, 0xffff), # B
    (0x0000, 0x0000, 0x0000, 0xffff), # Black
    (0xffff, 0xffff, 0xffff, 0xffff), # White
    (0x0000, 0x0000, 0x0000, 0x0000), # Transparent
    (0x8080, 0x8080, 0x8080, 0xffff), # Gray
    # Second row
    (0xffff, 0xffff, 0x0000, 0xffff), # Yellow
    (0xffff, 0x0000, 0xffff, 0xffff), # Fuchsia
    (0x0000, 0xffff, 0xffff, 0xffff), # Cyan
    (0x1212, 0x3434, 0x5656, 0xffff), # Darkish blue
    (0xaaaa, 0xbbbb, 0xcccc, 0xffff), # Grayish blue
    (0xffff, 0xffff, 0xffff, 0x8000), # White 50 % transparency
    (0xffff, 0xffff, 0xffff, 0x4000), # White 25 % transparency
]

assert image.mode == "RGBA"
assert image.size == (7, 2)
assert list(image.getdata()) == expected_image_data

The test currently passes only if the image data is truncated to 8 bits:

# Truncate to 8 bits as long as Pillow does not support 16 bit PNGs
expected_image_data = [tuple(x >> 8 for x in px) for px in expected_image_data]

OpenCV and PyPNG can load this file and decode it to the expected image data.

You can write the image file to disk with the following bash command:

echo 'iVBORw0KGgoAAAANSUhEUgAAAAcAAAACEAYAAADEDxojAAAAQ0lEQVQI10WMWw0AIBRCz91MYBcDWMI89LGCFcyEH1cnG48PABvA/gQ7PVOCC0mS7PLLETakQq2tjQFzrrX3u9Cd9zg0Ai9H03VKQwAAAABJRU5ErkJggg==' | base64 -d > '16_bit_rgba.png'
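
As a quick sanity check that the file really declares 16 bits per sample, the PNG header can be inspected with the standard library alone, without any image decoder. The sketch below decodes only the first 44 base64 characters of the string above (the 8-byte signature plus the complete IHDR chunk) and reads the fields in the order the PNG specification defines them; the variable names are illustrative.

```python
import base64
import struct

# First 44 base64 characters of the PNG above: signature + IHDR chunk (33 bytes)
head = base64.b64decode("iVBORw0KGgoAAAANSUhEUgAAAAcAAAACEAYAAADEDxoj")

assert head[:8] == b"\x89PNG\r\n\x1a\n"  # PNG signature
assert head[12:16] == b"IHDR"             # type of the first chunk

# IHDR payload: width and height (uint32), then bit depth, colour type,
# compression, filter and interlace (one byte each), all big-endian
width, height, bit_depth, color_type, _, _, _ = struct.unpack(
    ">IIBBBBB", head[16:29]
)
print(width, height, bit_depth, color_type)
```

A bit depth of 16 with colour type 6 (truecolour with alpha) is what makes the 8-bit result from Pillow a truncation rather than a faithful decode.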

99991 avatar Aug 25 '22 10:08 99991

I think the ImagingMemoryInstance struct needs to store the size of each band. I think that's what depth is for, but it's not currently being used. Every band would have to be the same size to work with our current code, so we would probably want to scale everything to the largest band in the image. I don't see a way to mix integer and floating point bands in the same image, so we would have to pick one and convert the rest.

Another thing that would be useful is to store the index of the alpha band, if there is one. Some of the code needs to know which band this is, and currently they figure this out based on the image mode. Having it as a property of the image would simplify that.
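
A minimal sketch of what "scale everything to the largest band" could mean for integer bands, assuming unsigned samples and a simple full-range rescale. `rescale_sample` and the per-band depths used here are hypothetical and only for illustration; nothing like this exists in the current code.

```python
def rescale_sample(value, src_bits, dst_bits):
    """Map a sample from [0, 2**src_bits - 1] onto [0, 2**dst_bits - 1]."""
    src_max = (1 << src_bits) - 1
    dst_max = (1 << dst_bits) - 1
    if not 0 <= value <= src_max:
        raise ValueError("sample out of range for its bit depth")
    # Integer arithmetic, rounded to the nearest output value
    return (value * dst_max + src_max // 2) // src_max


band_depths = (1, 16, 1)    # hypothetical per-band depths stored with the image
target = max(band_depths)   # every band is scaled to the deepest one
pixel = (1, 0x1234, 0)
scaled = tuple(rescale_sample(v, d, target) for v, d in zip(pixel, band_depths))
print(scaled)
```

With this mapping a full-scale sample stays full-scale at any depth, so a 1-bit "on" value becomes 0xffff rather than 1.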

Yay295 avatar Aug 28 '22 14:08 Yay295

Just here to echo @KeygenLLC 's comment above, RGB/RGBA images in 16 or 32-bit float is nearly mandatory for VFX, and we've been removing any usage of PIL we can find in tools we bring in, in favor of dealing directly with np.arrays and cv2 functions for manipulating the data as images. It's not as convenient as what PIL offers but 8bit is a deal breaker.

(We're also working with the EXR file format)

herronelou avatar May 15 '23 23:05 herronelou

Focusing specifically on the mode definition:

char mode[IMAGING_MODE_LENGTH]; /* Band names ("1", "L", "P", "RGB", "RGBA", "CMYK", "YCbCr", "BGR;xy") */

Definitions for all of the modes that we're planning, and potentially a [format];MB[#bands] style generic mode.

Do we have a complete list of all supported modes strings (essentially the grammar) that would support all use cases? Something like saying (this may be wrong, this is just how I'm interpreting the earlier discussions):

  • [format] can be U8, I16, U16, F32 etc (pixel format of a single channel).
  • [bands] is optional and specifies the channel names/interpretation (RGB, YCbCr etc). If it's not specified I would assume it's just grayscale.
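
To make the question concrete, here is one possible reading of such a grammar as a small parser. Everything in it is an assumption made for illustration: the regular expression, the `KNOWN_BANDS` table, and the `parse_mode` function are hypothetical and do not correspond to any mode strings Pillow accepts today.

```python
import re

# Hypothetical grammar: "<format>", "<format>;MB<count>" or "<format>;<band names>"
MODE_RE = re.compile(
    r"^(?P<fmt>[UIF]\d+)(?:;(?:MB(?P<count>\d+)|(?P<names>[A-Za-z]+)))?$"
)
KNOWN_BANDS = {"RGB": 3, "RGBA": 4, "CMYK": 4, "YCbCr": 3}


def parse_mode(mode):
    """Split a mode string into sample kind, bit width, band count and names."""
    m = MODE_RE.match(mode)
    if m is None:
        raise ValueError(f"unrecognised mode string: {mode!r}")
    kind, bits = m["fmt"][0], int(m["fmt"][1:])
    if m["count"]:
        bands, names = int(m["count"]), None     # generic multiband image
    elif m["names"]:
        names = m["names"]
        if names not in KNOWN_BANDS:
            raise ValueError(f"unknown band names: {names!r}")
        bands = KNOWN_BANDS[names]
    else:
        bands, names = 1, None                   # no band part: grayscale
    return {"kind": kind, "bits": bits, "bands": bands, "names": names}


for example in ("U8", "I16;MB5", "F32;RGBA"):
    print(example, parse_mode(example))
```

Separating the sample format from the band interpretation in this way would let pack and unpack code dispatch on the format alone, while colour-aware operations look only at the band names.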

As for the prior art, ffmpeg (which I'm more familiar with) has AVPixFmtDescriptor, which handles the memory layout for all their use cases; the equivalent of "modes" are then defined as the av_pix_fmt_descriptors static array. Is this sort of mechanism something that would be useful to reuse? Do we need to support packed/planar formats or half-resolution chroma planes?

What about extra embedded images that may have different dimensions, e.g. thumbnails or auxiliary depth/gain map/matte images? Should they be supported at all, and if so how?

fxthomas avatar Feb 25 '24 12:02 fxthomas

the equivalent of "modes" are then defined as the av_pix_fmt_descriptors static array

The closest thing to that in Pillow I think would be this:

https://github.com/python-pillow/Pillow/blob/274924e64f8b53f46d04b122fe5d959f848a99b0/src/libImaging/Storage.c#L44-L225

Yay295 avatar Feb 25 '24 17:02 Yay295

Maybe we can make some progress on this in 2024, pending acceptance of https://github.com/AcademySoftwareFoundation/tac/issues/631

aclark4life avatar Mar 19 '24 22:03 aclark4life

in favor of dealing directly with np.arrays and cv2 functions for manipulating the data as images. It's not as convenient as what PIL offers but 8bit is a deal breaker.

@herronelou Can you (or anyone?) say any more about the convenience of PIL and how meaningful > 8 bit multichannel support in PIL would be? Would you switch back to PIL if this feature were added and would you expect an uptick in usage from VFX studios in general? I got interested in VFX recently so I'm especially curious about this issue now.

aclark4life avatar Apr 15 '24 22:04 aclark4life

I can just say that for GIS, if you want to deal with tiffs that aren't extremely simple, you're stuck going into gdal internals to do anything, even just read them into an array. I'm still sad 7 years later that I had to waste time learning that tool and couldn't just do Image.open on them. Maybe someone else has implemented it by now, but I doubt it.

terramars avatar Apr 15 '24 22:04 terramars

@aclark4life For the most part, VFX studios tend to work with EXR file formats. Internally most of our software processes in 32bit float, although saving the resulting images in 16bit float is usually enough, except for a small number of specific data passes that we tend to store in other channels.

I've not been doing much personally recently that could have used PIL, the main cases I've run into when I posted were external tools we brought into our pipeline that used PIL for their image reading, and we had to strip it away so we could run our 16bit float images through without the loss caused by going through 8bit, so yes, absolutely, if PIL supported those natively we wouldn't need to go out of our way to strip PIL away when somebody uses it, which would be great.

herronelou avatar Apr 15 '24 23:04 herronelou

PIL is used in many ML frameworks for reading images, like FastAI and detectron2 and countless ML projects. When someone tries to use these frameworks or projects as examples with their high bit depth multichannel images, often the first thing to cause grief is this issue. On multiple occasions I've had to rewrite image data loaders for ML because Pillow does not support multichannel float32 tifs. This imagery is really common in geospatial analysis, most satellite imagery comes in high bit depth.

rbavery avatar Apr 17 '24 17:04 rbavery