Pillow
Saving PNG images with PIL is 4 times slower than saving them with OpenCV
What did you do?
I want to save an image to disk, and I noticed a severe performance bottleneck in the part of my code that uses PIL to save images, compared to a similar part of my codebase that uses OpenCV.
What did you expect to happen?
I expected both methods to be somewhat similar in performance.
What actually happened?
PIL was at least four times slower than converting the PIL.Image into a numpy array and storing the array using cv2.imwrite.
What are your OS, Python and Pillow versions?
- OS: macOS 12.1
- Python: 3.9
- Pillow: 9.0.0
Here is the benchmark code that I used:
import time
import cv2
import numpy
from PIL import Image
from tqdm import tqdm
from PIL.ImageDraw import ImageDraw
if __name__ == '__main__':
    image = Image.new("RGB", (4000, 2800))
    image_draw = ImageDraw(image)
    image_draw.rectangle((10, 20, 60, 120), fill=(230, 140, 25))

    trials = 50

    t1 = time.time()
    for i in tqdm(range(trials)):
        image.save("tmp1.png")
    t2 = time.time()
    print(f"Total time for PIL: {t2 - t1}s ")

    t1 = time.time()
    for i in tqdm(range(trials)):
        image_array = numpy.array(image)
        image_array = cv2.cvtColor(image_array, cv2.COLOR_RGB2BGR)
        cv2.imwrite("tmp2.png", image_array)
    t2 = time.time()
    print(f"Total time for OpenCV: {t2 - t1}s ")

    img1 = cv2.imread("tmp1.png")
    img2 = cv2.imread("tmp2.png")
    print(f"Images are equal: {numpy.all(img1 == img2)}")
which produced
100%|██████████| 50/50 [00:26<00:00, 1.91it/s]
Total time for PIL: 26.21s
100%|██████████| 50/50 [00:06<00:00, 8.00it/s]
Total time for OpenCV: 6.24s
Images are equal: True
The produced images differ slightly in file size, so potentially a more sophisticated compression is being used.
Here are the two (black) images I obtained: tmp1.png (PIL image, 33KB) and tmp2.png (OpenCV image, 38KB).
My questions are:
- Why is PIL so much slower?
- Is there some way to speed up PIL to match the performance of OpenCV in this scenario (other than converting to a numpy array and using OpenCV), e.g., by providing extra parameters to the save method?
Interestingly, if I switch from PNG to JPG, the results are flipped and PIL is faster than OpenCV:
# saving as "tmp1.jpg" and "tmp2.jpg" instead
100%|██████████| 50/50 [00:04<00:00, 11.46it/s]
Total time for PIL: 4.37s
100%|██████████| 50/50 [00:08<00:00, 6.11it/s]
Total time for OpenCV: 8.17s
Could this be a problem in the PNG encoding library?
Here are some thoughts for you.
We allow the compression level to be set when saving PNGs - if I change your code to image.save("tmp1.png", compress_level=1), then on my machine Pillow is almost as fast as OpenCV.
We also allow setting the compression type when saving PNGs - with image.save("tmp1.png", compress_type=3), on my machine Pillow is evenly matched with OpenCV, sometimes faster, sometimes slower.
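For reference, the two options can also be combined. A minimal sketch (the speed-up, if any, will depend on your machine and zlib build; and if I read Pillow's ZIP encoder correctly, compress_type is passed through to zlib as the compression strategy, so 3 would correspond to Z_RLE):
from PIL import Image

image = Image.new("RGB", (4000, 2800))
# compress_level trades file size for speed (0 = store, 9 = max compression);
# compress_type appears to select the zlib strategy (assumption: 3 == Z_RLE)
image.save("tmp1.png", compress_level=1, compress_type=3)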
Thanks for the hints, I tried to adapt my code and I'm getting the following results:
# image.save("tmp1.png", compress_type=3)
100%|██████████| 50/50 [00:16<00:00, 2.98it/s]
Total time for PIL: 16.7s
100%|██████████| 50/50 [00:05<00:00, 9.34it/s]
Total time for OpenCV: 5.35s
and
# image.save("tmp1.png", compress_level=1)
100%|██████████| 50/50 [00:15<00:00, 3.15it/s]
Total time for PIL: 15.90s
100%|██████████| 50/50 [00:05<00:00, 9.64it/s]
Total time for OpenCV: 5.19s
both of which are still pretty far from the OpenCV results, and lead to image sizes of 33KB (PIL, compress_type=3) and 147KB (PIL, compress_level=1) vs 38KB (OpenCV). Maybe it's a macOS-specific issue?
No, it's not macOS specific. I am also a macOS user.
I recall looking into this a while ago, so my memory may be a bit sketchy, but something I noticed.
For whatever reason, PIL implements its own PNG filtering (as opposed to using something like libpng).
The code lacks any SIMD and looks relatively unoptimised (e.g. multiple passes over the data without any cache blocking), so even if you set the zlib compression level to 0, it's still awfully slow, because most of the CPU time is spent on filtering.
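To give a sense of the per-byte work involved, here is a rough pure-Python sketch of two of PNG's five scanline filters (Sub and Up); this is only an illustration, not Pillow's actual implementation:
def filter_sub(row, bpp):
    # PNG "Sub" filter: each byte minus the byte bpp positions to its left
    return bytes((row[i] - (row[i - bpp] if i >= bpp else 0)) & 0xFF for i in range(len(row)))

def filter_up(row, prev_row):
    # PNG "Up" filter: each byte minus the byte directly above it
    return bytes((cur - up) & 0xFF for cur, up in zip(row, prev_row))
An encoder typically tries several filters per scanline and keeps the best candidate, so this work is repeated for every row, which is why it can dominate when the zlib compression itself is cheap.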
It'd be nice if PNG encoding could use a more speed-optimised library.
I may have spoken too soon in my previous comment. Considering how long PNG has been around, I thought that the popular libraries would be reasonably well optimised; however, looking into this, it seems like they're predominantly optimised for decoding, not encoding. Somewhat surprising to me, but not completely unreasonable, I guess.
So I'm not sure why OpenCV is faster here - they appear to be using libpng for PNG creation, which doesn't use SIMD for encoding. Maybe the compiler's auto-vectorizer just happens to work there?
Regardless, I did find two speed-focused encoders, fpng and fpnge, which only surfaced relatively recently. I made some changes to the latter to make it more usable and made a quick-and-dirty Python module for it.
Being a speed-focused encoder that sacrifices some compression for performance, it's likely not suitable for integrating into Pillow. But if it helps, there's direct support for exporting a PIL.Image to PNG, if great compression isn't a high priority.
Whilst writing this comment, I came across Python bindings for fpng. I haven't tried this myself, but it may also be worth checking out.
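For anyone who wants to try the fpnge module, usage is roughly along these lines (fromPIL is the export helper I remember from its README; treat the exact name as an assumption and check the module's documentation):
import fpnge
from PIL import Image

image = Image.new("RGB", (4000, 2800))
# Encode the PIL image to PNG bytes with the speed-focused encoder
png_bytes = fpnge.fromPIL(image)  # assumed API, per the module's README
with open("tmp_fpnge.png", "wb") as f:
    f.write(png_bytes)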
Doing a basic investigation, I found that https://github.com/python-pillow/Pillow/blob/7ff05929b665653de4b124929cb50e59ccf5c888/src/libImaging/ZipEncode.c#L279 is where most of the time is spent.
I thought maybe changing a setting in deflateInit2 could improve the situation, but it already uses the maximum memLevel and the largest windowBits possible without switching to gzip encoding, and we've already discussed changing the compression level and type.
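For anyone unfamiliar with those knobs: Python's zlib module exposes the same deflateInit2 parameters, so the settings under discussion correspond roughly to the following (a sketch for illustration; Pillow itself calls zlib from C):
import zlib

# wbits=15 is the largest window without gzip framing,
# memLevel=9 is the maximum, and strategy is the value that
# Pillow's compress_type save option appears to map to
compressor = zlib.compressobj(
    level=6,              # compress_level
    method=zlib.DEFLATED,
    wbits=15,             # window size: 2**15 bytes
    memLevel=9,           # memory for the internal compression state
    strategy=zlib.Z_DEFAULT_STRATEGY,
)
data = compressor.compress(b"\x00" * 4096) + compressor.flush()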
Thanks for looking into it. If you set compression level to 0, is that still where most of the time is spent?
Yes.
That indeed is very surprising. Would be interesting to know what it's spending all its time on, even when it's doing no compression.
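A quick way to see this on any machine is to time saves across compression levels; at level 0, zlib only stores the data, so whatever time remains is filtering and other per-scanline overhead. A minimal sketch:
import time
from PIL import Image

image = Image.new("RGB", (4000, 2800))
for level in (0, 1, 6, 9):
    t1 = time.time()
    for _ in range(10):
        image.save("tmp_level.png", compress_level=level)
    print(f"compress_level={level}: {time.time() - t1:.2f}s")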
Not sure if anything has changed in the meantime, but for me PIL is waaaay faster than OpenCV.
Python 3.10.6, cv2 4.6.0, PIL 9.2.0.
I used compression level 9, and I moved the image_array conversion outside of the loop to make the benchmark fairer.
import time
import cv2
import numpy
from PIL import Image
from PIL.ImageDraw import ImageDraw
if __name__ == '__main__':
    image = Image.new("RGB", (4000, 2800))
    image_draw = ImageDraw(image)
    image_draw.rectangle((10, 20, 60, 120), fill=(230, 140, 25))

    trials = 20

    t1 = time.time()
    for i in range(trials):
        image.save("tmp1.png", compress_level=9)
    t2 = time.time()
    print(f"Total time for PIL: {t2 - t1}s ")

    compression_level = [cv2.IMWRITE_PNG_COMPRESSION, 9]
    image_array = numpy.array(image)
    image_array = cv2.cvtColor(image_array, cv2.COLOR_RGB2BGR)
    t1 = time.time()
    for i in range(trials):
        cv2.imwrite("tmp2.png", image_array, compression_level)
    t2 = time.time()
    print(f"Total time for OpenCV: {t2 - t1}s ")

    img1 = cv2.imread("tmp1.png")
    img2 = cv2.imread("tmp2.png")
    print(f"Images are equal: {numpy.all(img1 == img2)}")
Total time for PIL: 5.534168481826782s
Total time for OpenCV: 9.936758279800415s
Images are equal: True
Compress level 3 (which is the OpenCV default) gives me this:
Total time for PIL: 2.8487443923950195s
Total time for OpenCV: 7.241240978240967s
Images are equal: True
Pillow is indeed faster in the code from the previous comment - but that isn't because anything changed; it's just because the previous comment is using compression, whereas the original post isn't using compression.
If the previous comment is an acceptable comparison for compression, then Pillow is slower without compression, but faster with it.
When I'm running the code from https://github.com/python-pillow/Pillow/issues/5986#issuecomment-1396849470, I still obtain the following result (averaged the numbers from three runs):
Total time for PIL: 10.6s
Total time for OpenCV: 6.6s
Images are equal: True
It would be interesting to understand why the results are so contradictory when running the same code on different machines.
whereas the original post isn't using compression.
That is simply not true. Doing so would result in a file size as big as a BMP image. I modified the drawing code slightly so the image isn't almost entirely black.
The table below shows the file size for each compression option. With the default option, the PIL file is 2.7x smaller. It's also interesting that from the default value (i.e. not specifying compression at all) you can clearly see that PIL uses compression level 6 by default, because the file size is the same; OpenCV, on the other hand, lands somewhere between compression levels 3 and 4.
Compressing with level 0 gives a file that is about the size of the raw image data (4000 x 2800 x 3 = 33,600,000 bytes, plus some PNG headers).
Compress Level | OpenCV Size (bytes) | PIL Size (bytes)
---|---|---
Default | 113301 | 42045 |
0 | 33656757 | 33613677 |
1 | 160752 | 160484 |
2 | 160233 | 159916 |
3 | 159575 | 159291 |
4 | 41927 | 41745 |
5 | 42023 | 41840 |
6 | 42226 | 42045 |
7 | 42212 | 42030 |
8 | 41786 | 41578 |
9 | 41792 | 41585 |
import random

image = Image.new("RGB", (4000, 2800))
image_draw = ImageDraw(image)
image_draw.rectangle((10, 20, 60, 120), fill=(230, 140, 25))
num_squares = 100
for _ in range(num_squares):
    x1 = random.randint(0, 3950)
    y1 = random.randint(0, 2750)
    x2 = x1 + random.randint(10, 150)
    y2 = y1 + random.randint(10, 150)
    fill_color = (random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))
    image_draw.rectangle((x1, y1, x2, y2), fill=fill_color)
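A sketch of how a size table like the one above could be collected, reusing the image and imports from the earlier snippets (file names are arbitrary):
import os

# Save at every compression level with both libraries and record file sizes
for level in range(10):
    image.save("size_pil.png", compress_level=level)
    image_array = cv2.cvtColor(numpy.array(image), cv2.COLOR_RGB2BGR)
    cv2.imwrite("size_cv.png", image_array, [cv2.IMWRITE_PNG_COMPRESSION, level])
    print(level, os.path.getsize("size_cv.png"), os.path.getsize("size_pil.png"))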
It would be interesting to understand why the results are so contradictory when running the same code on different machines.
As for the save times, there does seem to be a big speed difference between Pillow 9.0 and 9.2! I just noticed that I used the older version in my earlier tests, and that is probably the cause of the difference in the time measurements.
So my conclusion so far is that Pillow 9.0 does indeed have something odd going on with PNG save times, which can be resolved by updating to Pillow 9.1.0 or higher (I just tested that as well).
whereas the original post isn't using compression.
That is simply not true. Doing so would result in a file size as big as a BMP image.
Perhaps I could have said that the code in the original post didn't specify compression.
Just for another data point: when I was still using Adobe Photoshop last year, its "best compression" option, which tries multiple compression parameters to find the optimal settings (or at least the best within some amount of time) for an image, took something like 10s for relatively small (~2560x1600) images. Lossless JPEG 2000 was actually a decent amount faster, aside from producing smaller files. I remember reading a long time ago that finding optimal PNG compression involved trying multiple parameter combinations, since the result couldn't be predicted accurately in advance, but things may have changed since then.
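On that note, Pillow exposes a related knob: passing optimize=True when saving a PNG instructs the writer to spend extra effort finding smaller output. A minimal sketch (the time/size trade-off will vary by image):
from PIL import Image

image = Image.new("RGB", (4000, 2800))
# optimize=True asks Pillow's PNG writer to make the output as small as possible,
# at the cost of extra processing time
image.save("tmp_optimized.png", optimize=True)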