
test_separate_tables failure with Pillow 11.2.1

Open kulikjak opened this issue 6 months ago • 4 comments

Hi, I am in the process of updating the Pillow package we deliver with Oracle Solaris to version 11.2.1, and I am hitting the following failure:

===================================================== FAILURES ======================================================
_________________________________________ TestFileJpeg.test_separate_tables _________________________________________

self = <Tests.test_file_jpeg.TestFileJpeg object at 0x7fd12ce1a8a0>

    def test_separate_tables(self) -> None:
        im = hopper()
        data = []  # [interchange, tables-only, image-only]
        for streamtype in range(3):
            out = BytesIO()
            im.save(out, format="JPEG", streamtype=streamtype)
            data.append(out.getvalue())

        # SOI, EOI
        for marker in b"\xff\xd8", b"\xff\xd9":
            assert marker in data[1]
            assert marker in data[2]
        # DHT, DQT
        for marker in b"\xff\xc4", b"\xff\xdb":
>           assert marker in data[1]
E           AssertionError: assert b'\xff\xc4' in b'\xff\xd8\xff\xdb\x00C\x00\x08\x06\x06\x07\x06\x05\x08\x07\x07\x07\t\t\x08\n\x0c\x14\r\x0c\x0b\x0b\x0c\x19\x12\x13\x0....342\xff\xdb\x00C\x01\x08\t\t\x0c\x0b\x0c\x18\r\r\x182!\x1c!22222222222222222222222222222222222222222222222222\xff\xd9'

Only the b"\xff\xc4" marker is problematic - when I remove it, the test passes again.
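
As a debugging aid, the markers in each stream can be listed segment by segment instead of searching the raw bytes. The following is a minimal sketch using only the standard library; `list_markers` and the `NAMES` table are hypothetical helpers written for this illustration, not part of Pillow or its test suite. Walking the segments also avoids matching a marker-like byte pair that happens to sit inside a table payload.

```python
import struct

# Markers that stand alone, with no length field after them:
# TEM, RST0-RST7, SOI, EOI.
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))

# Only the markers discussed in this thread are named; others print as hex.
NAMES = {0xD8: "SOI", 0xD9: "EOI", 0xC4: "DHT", 0xDB: "DQT", 0xDA: "SOS", 0xCC: "DAC"}


def list_markers(data: bytes) -> list:
    """Return the names of the markers found in a JPEG stream, in order."""
    found = []
    pos = 0
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            pos += 1  # entropy-coded data or stray byte
            continue
        code = data[pos + 1]
        if code == 0x00:
            pos += 2  # stuffed 0xFF inside entropy-coded data
            continue
        if code == 0xFF:
            pos += 1  # fill byte before a marker
            continue
        found.append(NAMES.get(code, f"0x{code:02X}"))
        pos += 2
        if code not in STANDALONE:
            if pos + 2 > len(data):
                break  # truncated segment header
            # The 2-byte big-endian length counts itself plus the payload.
            (length,) = struct.unpack(">H", data[pos:pos + 2])
            pos += length
    return found


# Synthetic tables-only stream whose DQT payload happens to contain FF C4.
sample = b"\xff\xd8" + b"\xff\xdb" + struct.pack(">H", 4) + b"\xff\xc4" + b"\xff\xd9"
print(b"\xff\xc4" in sample)   # True: the raw substring search is fooled
print(list_markers(sample))    # ['SOI', 'DQT', 'EOI']
```

With Pillow installed, each of the three `out.getvalue()` results from the test could be passed to `list_markers` to see which stream actually carries the DHT segment.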

I am getting the same error with all Pythons we have (3.9, 3.11 and 3.13) and with both libjpeg 9e and 9f.

Unfortunately, I am unsure what might be wrong here. Do you have any pointers or suggestions on where to look?

What are your OS, Python and Pillow versions?

  • OS: Oracle Solaris
  • Python: 3.9.22, 3.11.12, 3.13.3
  • Pillow: 11.2.1
--------------------------------------------------------------------
Pillow 11.2.1
Python 3.13.3 (main, May 16 2025, 09:09:42) [GCC 14.2.0]
--------------------------------------------------------------------
Python executable is /usr/bin/python3.13
System Python files loaded from /usr
--------------------------------------------------------------------
Python Pillow modules loaded from /userland-gate/components/python/pillow/build/prototype/i386/usr/lib/python3.13/vendor-packages/PIL
Binary Pillow modules loaded from /userland-gate/components/python/pillow/build/prototype/i386/usr/lib/python3.13/vendor-packages/PIL
--------------------------------------------------------------------
--- PIL CORE support ok, compiled for 11.2.1
*** TKINTER support not installed
--- FREETYPE2 support ok, loaded 2.13.3
--- LITTLECMS2 support ok, loaded 2.16
--- WEBP support ok, loaded 1.3.2
*** AVIF support not installed
--- JPEG support ok, compiled for 9.0
--- OPENJPEG (JPEG2000) support ok, loaded 2.5.3
--- ZLIB (PNG/ZIP) support ok, loaded 1.3.1
--- LIBTIFF support ok, loaded 4.6.0
*** RAQM (Bidirectional Text) support not installed
*** LIBIMAGEQUANT (Quantization method) support not installed
--- XCB (X protocol) support ok
--------------------------------------------------------------------

kulikjak avatar Jun 17 '25 15:06 kulikjak

> Only the b"\xff\xd8" marker is problematic - when I remove it, the test passes again.

I think you meant b'\xff\xc4'. That's the marker value in your error. As you can see from the comments, this is the DHT marker. The test came from #7491.

I was able to replicate the error by using libjpeg rather than libjpeg-turbo on Ubuntu - https://github.com/radarhere/Pillow/actions/runs/15720832336/job/44301259065

So an immediate solution would be to switch to libjpeg-turbo.

radarhere avatar Jun 17 '25 22:06 radarhere

I've created https://github.com/python-pillow/Pillow/pull/9025 to only check the DHT marker if libjpeg-turbo is being used.
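
For illustration only (this is not the code from the pull request), a conditional check of this kind could look roughly like the sketch below. `required_table_markers` is a hypothetical helper; `features.check_feature("libjpeg_turbo")` is Pillow's own feature query, and the import is guarded so the snippet also runs where Pillow is not installed.

```python
DQT = b"\xff\xdb"
DHT = b"\xff\xc4"


def required_table_markers(have_turbo: bool) -> list:
    """Markers a tables-only stream is required to contain (hypothetical helper)."""
    markers = [DQT]  # DQT is present in the failing libjpeg output quoted above
    if have_turbo:
        markers.append(DHT)  # DHT is only asserted when libjpeg-turbo is in use
    return markers


try:
    from PIL import features

    # Reports whether Pillow was built against libjpeg-turbo.
    have_turbo = bool(features.check_feature("libjpeg_turbo"))
except ImportError:
    have_turbo = False  # Assumption for the demo: no Pillow means plain libjpeg

print(required_table_markers(have_turbo))
```

In a test, the returned list would replace the fixed `b"\xff\xc4", b"\xff\xdb"` tuple in the loop over `data[1]`.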

radarhere avatar Jun 18 '25 12:06 radarhere

> I think you meant b'\xff\xc4'. That's the marker value in your error.

Ah, yes - sorry about that. I fixed it above.

Thank you for the pointers. I investigated further and, interestingly, it fails both assertions: the DHT marker is not present in data[1], and it is present in data[2].

The problem is apparently that when jpeg_write_tables/jpeg_suppress_tables is called, dc_huff_tbl_ptrs and ac_huff_tbl_ptrs are NULL, so writing/suppression is a no-op. The tables are only created once jpeg_start_compress is called (jpeg_start_compress -> prepare_for_pass -> start_pass_huff). Turbo seems to initialize the object slightly differently, so the tables are apparently created sooner (though I didn't test that, I just compared the code). None of that seems to point to any issue in Pillow.
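
The ordering effect described above can be modelled with a toy Python class. This is a rough sketch, not libjpeg code: `ToyCompressor`, its `eager_huffman` flag and its methods are invented for the illustration, and the "streams" it returns are bare marker bytes rather than valid JPEG data.

```python
SOI, EOI = b"\xff\xd8", b"\xff\xd9"
DQT, DHT, SOS = b"\xff\xdb", b"\xff\xc4", b"\xff\xda"


class ToyCompressor:
    """Toy stand-in for a compress object; NOT the libjpeg API."""

    def __init__(self, eager_huffman: bool) -> None:
        # eager_huffman=True models defaults that install Huffman tables up front;
        # False models tables that stay NULL until compression starts.
        self.huff_tables = ["dc0", "ac0"] if eager_huffman else None
        self.quant_sent = False
        self.huff_sent = False

    def write_tables(self) -> bytes:
        """Models a tables-only write: emit whatever tables exist right now."""
        out = SOI + DQT
        self.quant_sent = True
        if self.huff_tables is not None:
            out += DHT
            self.huff_sent = True
        # With no Huffman tables yet, there is nothing to write or mark as sent.
        return out + EOI

    def write_image(self) -> bytes:
        """Models starting compression: create missing tables, emit unsent ones."""
        if self.huff_tables is None:
            self.huff_tables = ["dc0", "ac0"]  # created on the fly
        out = SOI
        if not self.quant_sent:
            out += DQT
        if not self.huff_sent:
            out += DHT
        return out + SOS + EOI


lazy = ToyCompressor(eager_huffman=False)
tables, image = lazy.write_tables(), lazy.write_image()
print(DHT in tables, DHT in image)  # False True
```

With `eager_huffman=True` the same calls put DHT in the tables part and leave it out of the image part, which is the layout the test expects.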

Thank you for #9025. I will report my findings to libjpeg.

kulikjak avatar Jun 18 '25 12:06 kulikjak

Thanks. Let us know what they say.

radarhere avatar Jun 18 '25 12:06 radarhere

Hi, sorry it took so long, but I contacted Guido from libjpeg and here are his answers to my questions (merging important notes from several emails - hope it makes sense):

The major operating systems have native arithmetic coding support in current versions.

Arithmetic coded JPEG files don't have DHT markers. Therefore, the JPEG library does not set up default Huffman tables in jpeg_set_defaults() before jpeg_start_compress().

The default Huffman tables are no longer required, because the image may be arithmetic coded or otherwise the default tables are created on the fly when not available.

The application is required to set the entropy coding choice (either Huffman or arithmetic coding) between the jpeg_set_defaults() and jpeg_start_compress() calls. Therefore we don't want to unnecessarily set up default Huffman tables prematurely in jpeg_set_defaults() when they are not required for arithmetic coding. Only in jpeg_start_compress() can we know the actual entropy coding choice and create the tables if necessary.

The conclusion is simply that you should not require the presence of DHT tables in any JPEG datastream.

In the separated tables/image case, [Huffman tables] now belong to the image part rather than the tables part, so that DHT and DAC markers behave in the same way. This may be revised altogether, but is currently consistent for DHT and DAC.

So, in conclusion, it works as expected.

Guido agreed that the libjpeg documentation, which states that "the tables-only file should contain DHT markers", needs to be updated (and potentially some other parts as well), because that statement is no longer correct. The documentation was apparently written back when JPEG supported only Huffman coding and not arithmetic coding.

kulikjak avatar Dec 01 '25 11:12 kulikjak

Thanks for getting back to us about this.

radarhere avatar Dec 01 '25 12:12 radarhere