blurhash-python add an optional max_size argument to encode() for speed & convenience

Providing max_size to encode() will first resize the image to max_size (by creating an Image thumbnail), and will then create the blurhash. Resizing first will achieve much faster hashing for large images.
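As a rough illustration of why resizing first helps: a BlurHash encoder visits every pixel once per component, so the work grows roughly with width x height x x_components x y_components. The sketch below is only a back-of-the-envelope estimate under that assumption. The helper names thumbnail_size and basis_evaluations are hypothetical (they are not part of blurhash-python or Pillow), the size computation only approximates what Pillow's thumbnail does, and the estimate ignores the cost of decoding and resizing the image.

```python
def thumbnail_size(width, height, max_size):
    """Approximate the aspect-preserving size that fits in max_size x max_size.

    Hypothetical helper: mimics the idea behind Pillow's Image.thumbnail
    (never upscales), but exact rounding may differ from Pillow.
    """
    scale = min(max_size / width, max_size / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def basis_evaluations(width, height, x_components=4, y_components=3):
    """Rough work estimate: one basis evaluation per pixel per component."""
    return width * height * x_components * y_components


original = (4000, 3000)                         # illustrative image size
small = thumbnail_size(*original, max_size=100)
speedup = basis_evaluations(*original) / basis_evaluations(*small)
print(small, speedup)
```

Under these assumptions the encoder does about three orders of magnitude less work on the thumbnail, which is consistent with the claim that large images hash much faster after resizing.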

Jan 19 '21 01:01 m4hmmd

Hello @m4hmmd, thank you for the pull request!

> Resizing first will achieve much faster hashing for large images.

This is interesting. I wonder if it would make sense to always resize to improve performance. I think the resulting blurhash shouldn't change much if the image is resized first. Do you have more details on this? I'd be interested in the image sizes, the max_size you've found effective, and the difference in speed. Anyway, I'll have to test this.

There's one improvement needed to the code changes you've suggested. New tests should be added, since you've added a new parameter to the public interface. The new tests should at least verify that the encode function works with the default value () like all tests do now, and also with some valid non-default value. It may be a good idea to test invalid parameter values as well, although the Pillow library should handle those and raise appropriate exceptions.

One minor change I'd suggest is to consider whether the max_size parameter could be renamed to something more descriptive, though a better name might be hard to find if resizing the image primarily affects performance and not the resulting blurhash. That's why I'm thinking about resizing by default.
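The test cases suggested above (default value, a valid non-default value, and an invalid value) could be sketched roughly as follows. Everything here is an assumption made for illustration: encode_with_max_size is a hypothetical stand-in for the proposed parameter, not the code in this pull request; the default of None and the explicit ValueError are guesses; and FakeImage and fake_encode are stubs so the sketch runs without Pillow or blurhash installed.

```python
def encode_with_max_size(image, encode, x_components, y_components, max_size=None):
    """Hypothetical stand-in for an encode() that accepts max_size.

    image:  object with Pillow-like copy() and thumbnail((w, h)) methods
    encode: callable taking (image, x_components=..., y_components=...)
    """
    if max_size is not None:
        if max_size <= 0:
            # Assumption: validate here rather than rely on Pillow's errors.
            raise ValueError("max_size must be a positive integer")
        image = image.copy()                    # leave the caller's image untouched
        image.thumbnail((max_size, max_size))   # resizes the copy in place
    return encode(image, x_components=x_components, y_components=y_components)


class FakeImage:
    """Stub image: clamps each side instead of preserving aspect ratio."""

    def __init__(self, size):
        self.size = size

    def copy(self):
        return FakeImage(self.size)

    def thumbnail(self, box):
        self.size = (min(self.size[0], box[0]), min(self.size[1], box[1]))


def fake_encode(image, x_components, y_components):
    """Stub encoder: reports what it was called with instead of hashing."""
    return f"{image.size[0]}x{image.size[1]}:{x_components}x{y_components}"


def test_default_value_does_not_resize():
    result = encode_with_max_size(FakeImage((400, 300)), fake_encode, 4, 3)
    assert result == "400x300:4x3"


def test_valid_non_default_value_resizes_a_copy():
    image = FakeImage((400, 300))
    result = encode_with_max_size(image, fake_encode, 4, 3, max_size=50)
    assert result == "50x50:4x3"
    assert image.size == (400, 300)             # original is unchanged


def test_invalid_value_raises():
    try:
        encode_with_max_size(FakeImage((400, 300)), fake_encode, 4, 3, max_size=0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for max_size=0")
```

In a real test suite the stubs would be replaced by an actual Pillow image and the library's encode function, and the invalid-value test would assert whatever exception the final implementation raises.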

Jan 19 '21 08:01 lautat

Hello @lautat!

Here is a brief test I ran on some images (size in bytes, times in seconds):

[Image: Screen Shot 2021-01-19 at 5 41 37 PM]

Note: images 1-5 were illustrations, whereas 6 was a photo.

I resize to 50x50, and the resulting blurhashes are, for all intents and purposes, indistinguishable. My use case is computing blurhashes for images sent via chat in my app, so I want the hash computation to take under 1 s even in the worst case. I agree that resizing by default might be better. As for the name, max_size can indeed be confusing; perhaps it could be renamed to resize_to.
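For anyone wanting to reproduce this kind of timing comparison, a minimal harness might look like the sketch below. The helper name best_time is hypothetical, and the workload shown is only a stand-in so the snippet runs without any image files; it is not the script used for the measurements above.

```python
import time


def best_time(fn, *args, repeats=3, **kwargs):
    """Return the fastest wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in workload; replace with a function that encodes an image.
elapsed = best_time(sum, range(100_000))
print(f"{elapsed:.6f} s")
```

To compare the two approaches, one would pass a function that opens an image and encodes it directly, then another that resizes the image before encoding, and compare the two results for each test image.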

Jan 19 '21 14:01 m4hmmd

Apologies, I completely forgot to get back to this. Since #17, encode can be called with an image object, so the same result can be achieved by loading the image with Pillow and resizing it before passing it to encode. The README contains the following example:

import blurhash
from PIL import Image

with Image.open("image.jpg") as image:
    image.thumbnail((100, 100))
    hash = blurhash.encode(image, x_components=4, y_components=3)

Thus, I'll close this pull request.

May 09 '23 09:05 lautat