
Saving PNG images with PIL is 4 times slower than saving them with OpenCV

Open apacha opened this issue 3 years ago • 15 comments

What did you do?

I want to save an image to disk and noticed a severe performance bottleneck in the part of my code that uses PIL for saving images, compared to a similar part of my codebase that uses OpenCV to save the images.

What did you expect to happen?

I expected both methods to be somewhat similar in performance.

What actually happened?

PIL was at least four times slower than converting the PIL.Image into a numpy array and storing the array using cv2.imwrite.

What are your OS, Python and Pillow versions?

  • OS: macOS 12.1
  • Python: 3.9
  • Pillow: 9.0.0

Here is the benchmark code that I used:

import time
import cv2
import numpy
from PIL import Image
from tqdm import tqdm
from PIL.ImageDraw import ImageDraw

if __name__ == '__main__':
    image = Image.new("RGB", (4000, 2800))
    image_draw = ImageDraw(image)
    image_draw.rectangle((10, 20, 60, 120), fill=(230, 140, 25))
    trials = 50

    t1 = time.time()
    for i in tqdm(range(trials)):
        image.save("tmp1.png")
    t2 = time.time()
    print(f"Total time for PIL: {t2 - t1}s ")

    t1 = time.time()
    for i in tqdm(range(trials)):
        image_array = numpy.array(image)
        image_array = cv2.cvtColor(image_array, cv2.COLOR_RGB2BGR)
        cv2.imwrite("tmp2.png", image_array)
    t2 = time.time()
    print(f"Total time for OpenCV: {t2 - t1}s ")

    img1 = cv2.imread("tmp1.png")
    img2 = cv2.imread("tmp2.png")
    print(f"Images are equal: {numpy.all(img1 == img2)}")

which produced

100%|██████████| 50/50 [00:26<00:00,  1.91it/s]
Total time for PIL: 26.21s 
100%|██████████| 50/50 [00:06<00:00,  8.00it/s]
Total time for OpenCV: 6.24s 
Images are equal: True

The produced images differ slightly in file size, so a different compression setting may be in use.

Here are the two (black) images I obtained: tmp1.png (PIL image, 33KB) and tmp2.png (OpenCV image, 38KB)

My questions are:

  • Why is PIL so much slower?
  • Is there some way I can speed up PIL to match the performance of OpenCV in this scenario (other than converting to a numpy array and using OpenCV), e.g., by providing extra parameters to the save method?

Interestingly, if I switch from PNG to JPG, the results are flipped and PIL is faster than OpenCV:

# saving as "tmp1.jpg" and "tmp2.jpg" instead
100%|██████████| 50/50 [00:04<00:00, 11.46it/s]
Total time for PIL: 4.37s 
100%|██████████| 50/50 [00:08<00:00,  6.11it/s]
Total time for OpenCV: 8.17s 

Could this be a problem in the PNG encoding library?

apacha avatar Jan 25 '22 13:01 apacha

Here are some thoughts for you.

We allow the compression level to be set when saving PNGs - if I change your code to image.save("tmp1.png", compress_level=1), on my machine Pillow is almost as fast as OpenCV.

We also allow the compression type to be set when saving PNGs - with image.save("tmp1.png", compress_type=3), on my machine Pillow is evenly matched with OpenCV, sometimes faster, sometimes slower.
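
For reference, here is a minimal sketch (mine, not from the comment above) showing both options passed together; the output file name is just a placeholder:

import time

from PIL import Image

image = Image.new("RGB", (4000, 2800))

t0 = time.time()
# compress_level is the zlib compression level (0-9); compress_type selects
# the zlib compression strategy used by the encoder.
image.save("out.png", compress_level=1, compress_type=3)
print(f"save took {time.time() - t0:.2f}s")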

radarhere avatar Jan 25 '22 23:01 radarhere

Thanks for the hints. I tried to adapt my code and I'm getting the following results:

# image.save("tmp1.png", compress_type=3)
100%|██████████| 50/50 [00:16<00:00,  2.98it/s]
Total time for PIL: 16.7s 
100%|██████████| 50/50 [00:05<00:00,  9.34it/s]
Total time for OpenCV: 5.35s 

and

# image.save("tmp1.png", compress_level=1)
100%|██████████| 50/50 [00:15<00:00,  3.15it/s]
Total time for PIL: 15.90s 
100%|██████████| 50/50 [00:05<00:00,  9.64it/s]
Total time for OpenCV: 5.19s 

Both are still pretty far away from the OpenCV results, and lead to image sizes of 33KB (PIL, compress_type=3) and 147KB (PIL, compress_level=1) vs 38KB (OpenCV). Maybe it's a macOS-specific issue?

apacha avatar Jan 28 '22 14:01 apacha

No, it's not macOS specific. I am also a macOS user.

radarhere avatar Feb 14 '22 12:02 radarhere

I recall looking into this a while ago, so my memory may be a bit sketchy, but here is something I noticed.

For whatever reason, PIL implements its own PNG filtering (as opposed to using something like libpng).
The code lacks any SIMD, and looks relatively unoptimised (e.g. multiple passes over the data without any cache blocking), so even if you set zlib compression to 0, it's still awfully slow, because most of the CPU time is spent on filtering.

It'd be nice if PNG encoding could use a more speed optimised library.
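
A rough way to check this on your own machine (a sketch of mine, not from the comment above) is to time saves at a few zlib levels; whatever time remains at level 0 is then mostly Pillow's own per-row filtering plus I/O:

import time

from PIL import Image

image = Image.new("RGB", (4000, 2800))

# At compress_level=0 zlib stores the data without compressing it, so most of
# the remaining time is Pillow's own filtering and file I/O.
for level in (0, 1, 6):
    t0 = time.time()
    image.save("tmp.png", compress_level=level)
    print(f"compress_level={level}: {time.time() - t0:.2f}s")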

animetosho avatar May 22 '22 02:05 animetosho

I may have spoken too soon in my previous comment. Considering how long PNG has been around, I thought that the popular libraries would be reasonably well optimised; however, looking into this, it seems like they're predominantly optimised for decode only, not encode. Somewhat surprising to me, but not completely unreasonable I guess.

So I'm not sure why OpenCV is faster here - they appear to be using libpng for PNG creation, which doesn't use SIMD for encoding. Maybe the compiler's auto-vectorizer just happens to work there?

Regardless, I did find two speed-focused encoders, fpng and fpnge, which only surfaced relatively recently. I made some changes to the latter to make it more usable and made a quick-and-dirty Python module for it.
Being a speed-focused encoder that sacrifices some compression for performance, it's likely not suitable for integrating into Pillow. But if it helps, it has direct support for exporting a PIL.Image to PNG, if great compression isn't a high priority.
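
If anyone wants to try it, here is a hypothetical usage sketch; the function name is from memory and may not match the module's actual API, so check its README:

import fpnge  # the quick-and-dirty Python module mentioned above
from PIL import Image

image = Image.new("RGB", (4000, 2800))
# Assumed helper that encodes a PIL.Image straight to PNG bytes.
png_bytes = fpnge.fromPIL(image)
with open("fast.png", "wb") as f:
    f.write(png_bytes)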

Whilst writing this comment, I came across Python bindings for fpng. I haven't tried this myself, but it may also be worth checking out.

animetosho avatar Jun 24 '22 09:06 animetosho

Doing a basic investigation, I found that https://github.com/python-pillow/Pillow/blob/7ff05929b665653de4b124929cb50e59ccf5c888/src/libImaging/ZipEncode.c#L279 is where most of the time is spent.

I thought maybe changing a setting in deflateInit2 could improve the situation, but it already uses the maximum memLevel and the largest windowBits without switching to gzip encoding, and we've already discussed changing the compression level and type.
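
For context, those settings can be approximated from Python with the zlib module, which gives a rough idea of how long raw deflate alone takes on the same pixel data (a sketch of mine, not from the issue):

import time
import zlib

from PIL import Image

image = Image.new("RGB", (4000, 2800))
raw = image.tobytes()

t0 = time.time()
# Level 6, deflate method, wbits=15 (zlib framing, largest window) and
# memLevel=9 (the maximum) - roughly the settings described above.
compressor = zlib.compressobj(6, zlib.DEFLATED, 15, 9)
compressed = compressor.compress(raw) + compressor.flush()
print(f"raw zlib deflate: {time.time() - t0:.2f}s ({len(compressed)} bytes)")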

radarhere avatar Oct 18 '22 11:10 radarhere

Thanks for looking into it. If you set compression level to 0, is that still where most of the time is spent?

animetosho avatar Oct 18 '22 12:10 animetosho

Yes.

radarhere avatar Oct 18 '22 21:10 radarhere

That indeed is very surprising. It would be interesting to know what it's spending all its time on, even when it's doing no compression.

animetosho avatar Oct 19 '22 12:10 animetosho

Not sure if anything has changed in the meantime, but for me PIL is waaaay faster than OpenCV.

  • Python: 3.10.6
  • cv2: 4.6.0
  • Pillow: 9.2.0

I used compression level 9, and I moved the image_array conversion outside of the loop to make the benchmark fairer.

import time
import cv2
import numpy
from PIL import Image
from PIL.ImageDraw import ImageDraw

if __name__ == '__main__':
    image = Image.new("RGB", (4000, 2800))
    image_draw = ImageDraw(image)
    image_draw.rectangle((10, 20, 60, 120), fill=(230, 140, 25))
    trials = 20

    t1 = time.time()
    for i in range(trials):
        image.save("tmp1.png", compress_level=9)
    t2 = time.time()
    print(f"Total time for PIL: {t2 - t1}s ")

    compression_level = [cv2.IMWRITE_PNG_COMPRESSION, 9]
    image_array = numpy.array(image)
    image_array = cv2.cvtColor(image_array, cv2.COLOR_RGB2BGR)

    t1 = time.time()
    for i in range(trials):
        cv2.imwrite("tmp2.png", image_array, compression_level)
    t2 = time.time()
    print(f"Total time for OpenCV: {t2 - t1}s ")

    img1 = cv2.imread("tmp1.png")
    img2 = cv2.imread("tmp2.png")
    print(f"Images are equal: {numpy.all(img1 == img2)}")

Total time for PIL: 5.534168481826782s
Total time for OpenCV: 9.936758279800415s
Images are equal: True

Compress level 3 (which is the OpenCV default) gives me this:

Total time for PIL: 2.8487443923950195s
Total time for OpenCV: 7.241240978240967s
Images are equal: True

MathijsNL avatar Jan 19 '23 11:01 MathijsNL

Pillow is indeed faster in the code from the previous comment - but that isn't because anything has changed; it's just because the previous comment is using compression, whereas the original post isn't using compression.

If the previous comment shows an acceptable comparison for compression, then Pillow is slower without compression, but faster with it.
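
To make the "without compression" case concrete, here is a sketch (mine) in which both libraries are explicitly told to skip compression, reusing the test image from the earlier benchmarks:

import cv2
import numpy
from PIL import Image

image = Image.new("RGB", (4000, 2800))
image_array = cv2.cvtColor(numpy.array(image), cv2.COLOR_RGB2BGR)

# Both calls write an essentially uncompressed (stored) PNG.
image.save("tmp1.png", compress_level=0)
cv2.imwrite("tmp2.png", image_array, [cv2.IMWRITE_PNG_COMPRESSION, 0])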

radarhere avatar Aug 07 '23 07:08 radarhere

When I run the code from https://github.com/python-pillow/Pillow/issues/5986#issuecomment-1396849470, I still obtain the following result (numbers averaged over three runs):

Total time for PIL: 10.6s 
Total time for OpenCV: 6.6s 
Images are equal: True

It would be interesting to understand why the results are so contradictory when running the same code on different machines.

apacha avatar Aug 07 '23 08:08 apacha

whereas the original post isn't using compression.

That is simply not true. Doing so would result in a file size as big as a BMP image. I modified the drawing code slightly so the image isn't almost entirely black.

The table below shows the file size for each compression option. With the default option, the file is about 2.7x smaller with PIL. It is also interesting that with the default (i.e. not specifying compression at all), PIL clearly uses compression level 6, because the file size is the same as for level 6. OpenCV's default, on the other hand, lands somewhere between compression levels 3 and 4.

Compressing with level 0 gives a file that is about the size of the raw image data (4000x2800x3 = 33600000 bytes, plus some PNG headers).

Compress Level   OpenCV Size   PIL Size
Default             113301       42045
0                 33656757    33613677
1                   160752      160484
2                   160233      159916
3                   159575      159291
4                    41927       41745
5                    42023       41840
6                    42226       42045
7                    42212       42030
8                    41786       41578
9                    41792       41585

import random

from PIL import Image
from PIL.ImageDraw import ImageDraw

image = Image.new("RGB", (4000, 2800))
image_draw = ImageDraw(image)
image_draw.rectangle((10, 20, 60, 120), fill=(230, 140, 25))

# Draw 100 random rectangles so the image isn't almost entirely black.
num_squares = 100

for _ in range(num_squares):
    x1 = random.randint(0, 3950)
    y1 = random.randint(0, 2750)
    x2 = x1 + random.randint(10, 150)
    y2 = y1 + random.randint(10, 150)

    fill_color = (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))
    image_draw.rectangle((x1, y1, x2, y2), fill=fill_color)
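
For completeness, here is one possible way to generate such a file-size table, continuing from the drawing code above (a sketch of my own, not necessarily how the numbers in the table were produced):

import os

import cv2
import numpy

image_array = cv2.cvtColor(numpy.array(image), cv2.COLOR_RGB2BGR)

for level in range(10):
    image.save("pil.png", compress_level=level)
    cv2.imwrite("cv.png", image_array, [cv2.IMWRITE_PNG_COMPRESSION, level])
    print(level, os.path.getsize("cv.png"), os.path.getsize("pil.png"))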

It would be interesting to understand why the results are so contradictory when running the same code on different machines.

As for the save times, there does seem to be a big speed difference between Pillow 9.0 and 9.2! I just noticed that I used 9.2 in my earlier tests, and that is probably the cause of the difference in the measurements.

So my conclusion so far is that Pillow 9.0 does indeed have something odd going on with PNG save times, which can be resolved by updating to Pillow 9.1.0 or higher - I just tested that as well.

MathijsNL avatar Aug 07 '23 09:08 MathijsNL

whereas the original post isn't using compression.

That is simply not true. Doing so would result in a file size as big as a BMP image.

Perhaps I could have said that the code in the original post didn't specify compression.

radarhere avatar Aug 07 '23 09:08 radarhere

Just for another data point: when I was still using Adobe Photoshop last year, its "best compression" option, which tries multiple compression parameters to find the optimal result (or at least the best within some amount of time), took something like 10s for relatively small (~2560x1600-ish) images. Lossless JPEG 2000 was actually a decent amount faster, aside from being smaller. I remember reading a long time ago that finding the optimal PNG compression involved trying multiple parameter combinations within a range, since the results couldn't be predicted accurately in advance, but things may have changed since then.

NeedsMoar avatar Sep 06 '23 17:09 NeedsMoar