Discrepancy in output of torchvision.io.read_image vs PIL.Image

Open 8uurg opened this issue 2 years ago • 4 comments

🐛 Describe the bug

Some images from the imagenetv2 dataset (downloadable here) decode to different pixel values when loaded with torchvision.io.read_image than when loaded with PIL.Image. For some images the differences in pixel values are large.

import torch
import torchvision.io
import numpy as np
from PIL import Image

def loadimage_pil(path):
    return torch.tensor(np.array(Image.open(path).convert("RGB"))).permute(2, 0, 1)

def loadimage_torchio(path):
    return torchvision.io.read_image(path, torchvision.io.ImageReadMode.RGB)

# assuming archive is unpacked in the same folder as script - change accordingly.
filepath = "./imagenetv2-matched-frequency-format-val/455/aaaf43c110a10aabce09700a6a3cfb2622b4847a.jpeg"
print(f"loading '{filepath}'")
img_pil = loadimage_pil(filepath)
img_tio = loadimage_torchio(filepath)
difference = img_pil.to(float) - img_tio.to(float)

error = torch.sqrt(torch.mean(torch.square(difference)))
print(error)
# > tensor(6.3985, dtype=torch.float64)

For the file used in the example, imagenetv2-matched-frequency-format-val/455/aaaf43c110a10aabce09700a6a3cfb2622b4847a.jpeg, the printed error (the root-mean-square difference over all pixels and channels) is 6.3985.

Versions

Collecting environment information...
PyTorch version: 2.1.0
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Fedora Linux 36 (Thirty Six) (x86_64)
GCC version: (GCC) 12.2.1 20220819 (Red Hat 12.2.1-2)
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.35

Python version: 3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:26:04) [GCC 10.4.0] (64-bit runtime)
Python platform: Linux-6.0.8-200.fc36.x86_64-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.7.99
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
<snip>

Nvidia driver version: 520.56.06
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True


Versions of relevant libraries:
[pip3] mypy==1.2.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.24.4
[pip3] torch==2.1.0
[pip3] torch-tb-profiler==0.4.3
[pip3] torchaudio==2.1.0
[pip3] torchinfo==1.8.0
[pip3] torchvision==0.16.0
[pip3] triton==2.1.0
[conda] blas                      1.0                         mkl    conda-forge
[conda] libblas                   3.9.0            16_linux64_mkl    conda-forge
[conda] libcblas                  3.9.0            16_linux64_mkl    conda-forge
[conda] libjpeg-turbo             2.0.0                h9bf148f_0    pytorch
[conda] liblapack                 3.9.0            16_linux64_mkl    conda-forge
[conda] liblapacke                3.9.0            16_linux64_mkl    conda-forge
[conda] mkl                       2022.1.0           h84fe81f_915    conda-forge
[conda] numpy                     1.24.4                   pypi_0    pypi
[conda] pytorch                   2.1.0           py3.10_cuda11.8_cudnn8.7.0_0    pytorch
[conda] pytorch-cuda              11.8                 h7e8668a_5    pytorch
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] torch-tb-profiler         0.4.3                    pypi_0    pypi
[conda] torchaudio                2.1.0               py310_cu118    pytorch
[conda] torchinfo                 1.8.0              pyhd8ed1ab_0    conda-forge
[conda] torchtriton               2.1.0                     py310    pytorch
[conda] torchvision               0.16.0              py310_cu118    pytorch

8uurg, Nov 02 '23 11:11

Hi @8uurg, I guess these come from diverging versions of libjpeg[-turbo].

Could you please share the PIL version, as well as the output of ldd _imaging.so (from https://stackoverflow.com/a/24397115)?

Could you also please share the output of:

import torch
import torchvision

print(f"{torch.ops.image._jpeg_version() = }")
print(f"{torch.ops.image._is_compiled_against_turbo() = }")
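
The Pillow side of this information can also be collected from Python instead of via ldd. The sketch below is a hypothetical helper (pillow_jpeg_info is not part of Pillow or torchvision). It assumes a Pillow release recent enough to provide PIL.features.version_codec and the "libjpeg_turbo" feature flag, and leaves the fields as None where that is not the case.

```python
def pillow_jpeg_info():
    """Report the Pillow version and the JPEG library it was built against."""
    try:
        import PIL
        from PIL import features
    except ImportError:
        # Pillow is not installed in this environment.
        return {"pillow": None, "libjpeg": None, "turbo": None}

    info = {"pillow": PIL.__version__, "libjpeg": None, "turbo": None}
    try:
        # Version string of the JPEG codec Pillow was compiled against.
        info["libjpeg"] = features.version_codec("jpg")
        # True if that codec is libjpeg-turbo rather than plain libjpeg.
        info["turbo"] = features.check_feature("libjpeg_turbo")
    except (AttributeError, ValueError):
        # Assumption: older Pillow releases may not offer these queries.
        pass
    return info


if __name__ == "__main__":
    print(pillow_jpeg_info())
```

Comparing the "turbo" field with torch.ops.image._is_compiled_against_turbo() shows at a glance whether the two loaders decode with the same library family.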

NicolasHug, Nov 02 '23 12:11

Hi @8uurg, I guess these come from diverging versions of libjpeg[-turbo].

Could you please share the PIL version, as well as the output of ldd _imaging.so (from https://stackoverflow.com/a/24397115)?

PIL.__version__ = '9.3.0'

ldd <path-to-virtualenv>/lib/python3.10/site-packages/PIL/_imaging.cpython-310-x86_64-linux-gnu.so 
        linux-vdso.so.1 (0x00007fff6d3eb000)
        libjpeg.so.9 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../../libjpeg.so.9 (0x00007f1b34cad000)
        libz.so.1 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../../libz.so.1 (0x00007f1b34c93000)
        libtiff.so.5 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../../libtiff.so.5 (0x00007f1b34c06000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f1b34800000)
        libwebp.so.7 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../.././libwebp.so.7 (0x00007f1b34b71000)
        libzstd.so.1 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../.././libzstd.so.1 (0x00007f1b34a9f000)
        liblzma.so.5 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../.././liblzma.so.5 (0x00007f1b34a76000)
        libLerc.so => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../.././libLerc.so (0x00007f1b34764000)
        libdeflate.so.0 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../.././libdeflate.so.0 (0x00007f1b34a66000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f1b34686000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f1b34d6d000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1b34a5f000)
        libstdc++.so.6 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../../././libstdc++.so.6 (0x00007f1b344d2000)
        libgcc_s.so.1 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../../././libgcc_s.so.1 (0x00007f1b34a46000)

Could you also please share the output of:

import torch
import torchvision

print(f"{torch.ops.image._jpeg_version() = }")
print(f"{torch.ops.image._is_compiled_against_turbo() = }")

torch.ops.image._jpeg_version() = 80
torch.ops.image._is_compiled_against_turbo() = True

8uurg, Nov 02 '23 12:11

Thanks for the output

    libjpeg.so.9 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../../libjpeg.so.9 (0x00007f1b34cad000)

I think that's it: PIL is relying on libjpeg while torchvision is relying on libjpeg-turbo. Both are JPEG-compliant and, in past experiments, models weren't sensitive to these decoding differences. I think that if you installed Pillow 10, you'd get libjpeg-turbo for PIL as well, and results closer to the torchvision ones.
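
For anyone repeating this check on many environments, the JPEG-related entries can be pulled out of the ldd output with a few lines of Python. This is only a sketch: jpeg_libs_from_ldd is a hypothetical helper, and the sample string is shortened from the output posted earlier, with a made-up /env prefix in place of the virtualenv path. The soname is a hint rather than proof of which implementation is in use, so it is worth confirming against the package that owns the file.

```python
import re


def jpeg_libs_from_ldd(ldd_output):
    """Return (soname, resolved_path) pairs for JPEG libraries in ldd output."""
    libs = []
    for line in ldd_output.splitlines():
        # ldd lines look like: "libjpeg.so.9 => /some/path/libjpeg.so.9 (0x...)"
        match = re.match(r"\s*(libjpeg\S*|libturbojpeg\S*)\s*=>\s*(\S+)", line)
        if match:
            libs.append((match.group(1), match.group(2)))
    return libs


if __name__ == "__main__":
    # Sample shortened from the thread; the /env prefix is made up.
    sample = (
        "\tlinux-vdso.so.1 (0x00007fff6d3eb000)\n"
        "\tlibjpeg.so.9 => /env/lib/libjpeg.so.9 (0x00007f1b34cad000)\n"
        "\tlibz.so.1 => /env/lib/libz.so.1 (0x00007f1b34c93000)\n"
    )
    for soname, path in jpeg_libs_from_ldd(sample):
        print(soname, "->", path)
```

In practice the input would come from running ldd on Pillow's _imaging shared object, for example via subprocess.run with capture_output=True and text=True.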

NicolasHug, Nov 02 '23 13:11

Thanks for the information!

I was investigating this because the validation performance of a model changed slightly after I changed how the images were loaded. The difference was not large (a single sample changed, I think), but I was expecting the result to be identical. For those following in my footsteps: when installed through conda in my environment, Pillow 10 doesn't seem to link libjpeg-turbo.

When I install Pillow via pip, I can confirm that it is indeed a difference between libjpeg and libjpeg-turbo.

8uurg, Nov 02 '23 13:11