
transform.resize is prohibitively slow (x15 compared to OpenCV or PIL)

Open kwikwag opened this issue 1 year ago • 9 comments

Description:

I want to use skimage as my main image-processing library (aside from PIL), but it is prohibitively slow for resizing. Ideally, I would be able to resize an ndarray directly (as opposed to a PIL Image), since converting to and from PIL incurs memory copies.

I tried three interpolation orders and got the following slowdowns for skimage:

nn slowdown vs PIL: 14.8
nn slowdown vs OpenCV: 17.8
nn slowdown vs SciPy: 1.2
bilinear slowdown vs PIL: 3.6
bilinear slowdown vs OpenCV: 47.8
bilinear slowdown vs SciPy: 3.2
bicubic slowdown vs PIL: 16.4
bicubic slowdown vs OpenCV: 138.8
bicubic slowdown vs SciPy: 1.1
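
For reference, each figure above is a ratio computed as in the script further down: the fastest skimage timing divided by the slowest timing of the other library, which makes the reported slowdown a conservative estimate. A minimal sketch of that calculation using only the standard library; `slow_op`, `fast_op` and `conservative_slowdown` are hypothetical stand-ins for illustration, not part of any library:

```python
import timeit


def slow_op():
    # Hypothetical stand-in for the slower call (e.g. skimage.transform.resize).
    return sum(i * i for i in range(20_000))


def fast_op():
    # Hypothetical stand-in for the faster call (e.g. cv2.resize).
    return sum(range(200))


def conservative_slowdown(slow, fast, repeat=5, number=20):
    """Return min(slow timings) / max(fast timings)."""
    slow_timings = timeit.repeat(slow, repeat=repeat, number=number)
    fast_timings = timeit.repeat(fast, repeat=repeat, number=number)
    return min(slow_timings) / max(fast_timings)


ratio = conservative_slowdown(slow_op, fast_op)
print("slowdown:", round(ratio, 1))
```

Taking the best case for the slow library and the worst case for the fast one means the true gap is, if anything, larger than the printed ratio.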

Following is a comparison chart for downscaling a 512x512 image using four methods (code further down in issue). The cv2 timing is invisible because it is so fast, comparatively. (Figure: bench_resize)

Way to reproduce:

# %%
import cProfile
import re
from pathlib import Path

import skimage.transform
import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.ndimage as ndi
import seaborn as sns
from PIL import Image
from tqdm import tqdm

# %%
im = (np.random.uniform(0, 255, (512, 512))).astype(np.uint8)
pil_im = Image.fromarray(im)

# %%
def scipy_resize(im, size, order=0):
    output_shape = (size[1], size[0])
    input_shape = im.shape
    zoom_factors = 1/np.divide(input_shape, output_shape)
    return ndi.zoom(im, zoom_factors, order=order, mode='constant', cval=0, grid_mode=True)

CV2_ORDER_MAP = {
    0: cv2.INTER_NEAREST,
    1: cv2.INTER_LINEAR,
    2: cv2.INTER_CUBIC,
}
def cv2_resize(im, size, order=0):
    return cv2.resize(im, size, interpolation=CV2_ORDER_MAP[order])

PIL_ORDER_MAP = {
    0: Image.NEAREST,
    1: Image.BILINEAR,
    2: Image.BICUBIC,
}
def pil_resize(im, size, order=0):
    return im.resize(size, PIL_ORDER_MAP[order])

def skimage_resize(im, size, order=0):
    # skimage expects output_shape as (rows, cols); size is (width, height)
    return skimage.transform.resize(im, (size[1], size[0]), order=order, anti_aliasing=False)

out_dir = Path('temp/bench_resize')
out_dir.mkdir(parents=True, exist_ok=True)
tests = {}
# Note: in skimage and SciPy, order=2 is a quadratic spline; cubic interpolation is order=3.
for order_name, order in (("nn", 0), ("bilinear", 1), ("bicubic", 2)):
    for size in ((91, 180), (128, 128)):
        for method in [cv2_resize, skimage_resize, pil_resize, scipy_resize]:
            size_name = 'x'.join(str(x) for x in size)
            method_name = re.sub('_resize$', '', method.__name__)
            tests[order_name, size_name, method_name] = (method, dict(size=size, order=order))


results = {}
progress = tqdm(tests.items())
for key, (f, kwargs) in progress:
    progress.set_postfix(dict(zip(('quality', 'size', 'method'), key)))
    with cProfile.Profile() as pr:
        x = pil_im if key[-1] == "pil" else im
        for _ in range(100):
            f(x, **kwargs)
        pr.dump_stats(out_dir/('bench_resize_' + '_'.join(key) + '.profile'))
    results[key] = %timeit -o -q f(x, **kwargs)
results

# %%
df = pd.DataFrame([
    [*key, timing*1e6]
    for key, result in results.items()
    for timing in result.timings
], columns=['quality', 'size', 'method', 'timing'])
df

# %%
sns.set_theme(style="whitegrid")

g = sns.catplot(
    data=df, kind="bar",
    x="quality", y="timing", hue="method",
    col='size',
    # hue_order=["method", "size"],
    errorbar="sd", palette="dark", alpha=.6, height=6,
)
g.despine(left=True)
g.set_axis_labels("", "Duration [µs]")
g.legend.set_title("")

# %%
g = sns.catplot(
    data=df[df['method'].isin({'pil', 'cv2'})], kind="bar",
    x="quality", y="timing", hue="method",
    col='size',
    # hue_order=["method", "size"],
    errorbar="sd", palette="dark", alpha=.6, height=6,
)
g.despine(left=True)
g.set_axis_labels("", "Duration [µs]")
g.legend.set_title("")

# %%
is_pil = df['method']=='pil'
is_cv2 = df['method']=='cv2'
is_skimage = df['method']=='skimage'
is_scipy = df['method']=='scipy'
for quality in ['nn', 'bilinear', 'bicubic']:
    choice = df['quality']==quality
    print(quality, "slowdown vs PIL:", round(df[is_skimage & choice].timing.min() / df[is_pil & choice].timing.max(), 1))
    print(quality, "slowdown vs OpenCV:", round(df[is_skimage & choice].timing.min() / df[is_cv2 & choice].timing.max(), 1))
    print(quality, "slowdown vs SciPy:", round(df[is_skimage & choice].timing.min() / df[is_scipy & choice].timing.max(), 1))
    

Version information:

SciKit-Image '0.20.0' on Windows 10 Anaconda Python 3.9.16

kwikwag avatar Jun 27 '23 17:06 kwikwag

Thanks for the detailed performance report! scikit-image uses scipy.ndimage.zoom under the hood, so it's not surprising that the results are similar. I am curious why Pillow's resizing is so much faster. I tracked Pillow's resize to _imaging.c#L1822 but didn't yet look further.
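
To illustrate the point about scipy.ndimage.zoom, the sketch below calls both functions with settings that are assumed to match: `anti_aliasing=False`, skimage's default `mode='reflect'` (assumed here to correspond to SciPy's `mode='mirror'`), and `grid_mode=True`. The parameter mapping is an assumption based on recent scikit-image releases and is not confirmed in this thread, so a printed difference near zero is suggestive only.

```python
import numpy as np
import scipy.ndimage as ndi
import skimage.transform

rng = np.random.default_rng(0)
im = rng.uniform(0, 1, (512, 512))  # float64 input, so no dtype conversion
output_shape = (128, 128)           # (rows, cols)

ski_out = skimage.transform.resize(im, output_shape, order=1, anti_aliasing=False)

# Assumed equivalent SciPy call; mode and grid_mode are guesses, not confirmed here.
zoom_factors = np.divide(output_shape, im.shape)
ndi_out = ndi.zoom(im, zoom_factors, order=1, mode='mirror', grid_mode=True)

print(ski_out.shape, ndi_out.shape)
print("max abs difference:", np.abs(ski_out - ndi_out).max())
```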

I only tested some cases, but the output from Pillow seems close to scikit-image's. As Pillow is a default dependency, I don't find it unreasonable to look into updating our approach and relying more on Pillow's implementation. I also noticed that they have their own implementation for affine transforms.

lagru avatar Jun 28 '23 09:06 lagru

Thank you @kwikwag for this report. It is an extension of #3122. I refactored your script to take into account two more parameters: input and output dtypes.

Code here
import cProfile
import re
from pathlib import Path

import skimage as ski
import cv2
import numpy as np
import scipy.ndimage as ndi
import seaborn as sns
from PIL import Image
from tqdm import tqdm
import pandas as pd
from time import time
import matplotlib.pyplot as plt


CV2_ORDER_MAP = {
    0: cv2.INTER_NEAREST,
    1: cv2.INTER_LINEAR,
    2: cv2.INTER_CUBIC,
}

PIL_ORDER_MAP = {
    0: Image.NEAREST,
    1: Image.BILINEAR,
    2: Image.BICUBIC,
}


def scipy_resize(im, size, order=0):
    output_shape = (size[1], size[0])
    input_shape = im.shape
    zoom_factors = 1/np.divide(input_shape, output_shape)
    return ndi.zoom(im, zoom_factors, order=order, mode='constant',
                    cval=0, grid_mode=False)


def cv2_resize(im, size, order=0):
    return cv2.resize(im, size, interpolation=CV2_ORDER_MAP[order])


def pil_resize(im, size, order=0):
    return im.resize(size, PIL_ORDER_MAP[order])


def skimage_resize(im, size, order=0):
    # skimage expects output_shape as (rows, cols); size is (width, height)
    return ski.transform.resize(im, (size[1], size[0]), order=order,
                                mode="constant", anti_aliasing=False)


def get_results(im, size_list, rep=200):

    pil_im = Image.fromarray(im)
    methods_list = [cv2_resize, skimage_resize, pil_resize, scipy_resize]
    out_dir = Path('temp/bench_resize')
    out_dir.mkdir(parents=True, exist_ok=True)
    tests = {}

    # Note: in skimage and SciPy, order=2 is a quadratic spline; cubic is order=3.
    for order_name, order in (("nn", 0), ("bilinear", 1), ("bicubic", 2)):
        for size in size_list:
            for method in methods_list:
                size_name = 'x'.join(str(x) for x in size)
                method_name = re.sub('_resize$', '', method.__name__)
                tests[order_name, size_name, method_name] = (method,
                                                             dict(size=size,
                                                                  order=order))

    results = {}
    progress = tqdm(tests.items(), leave=False)
    for key, (f, kwargs) in progress:

        if key not in results:
            results[key] = []
        progress.set_postfix(dict(zip(('quality', 'size', 'method'), key)))
        with cProfile.Profile() as pr:
            x = pil_im if key[-1] == "pil" else im
            t0 = time()
            for _ in range(rep):
                out = f(x, **kwargs)
            t1 = time()
            pr.dump_stats(
                out_dir/('bench_resize_' + '_'.join(key) + '.profile'))
        results[key].append((t1-t0)/rep)
        results[key].append(np.array(out).dtype.name)

    return results


def get_dataframe(results):

    # %%
    df = pd.DataFrame([
        [*key, timing*1e6, out_dtype]
        for key, (*result, out_dtype) in results.items()
        for timing in result
    ], columns=['quality', 'size', 'method', 'timing', 'out_dtype'])
    return df


def plot_results(df, dtype):

    sns.set_theme(style="whitegrid")

    g = sns.catplot(
        data=df, kind="bar",
        x="quality", y="timing", hue="method",
        col='size',
        errorbar="sd", palette="dark", alpha=.6, height=6,
    )
    g.despine(left=True)
    g.set_axis_labels("", "Duration [µs]")
    g.legend.set_title("")
    g.fig.suptitle(f"{dtype=}")

    g = sns.catplot(
        data=df[df['method'].isin({'pil', 'cv2'})], kind="bar",
        x="quality", y="timing", hue="method",
        col='size',
        errorbar="sd", palette="dark", alpha=.6, height=6,
    )
    g.despine(left=True)
    g.set_axis_labels("", "Duration [µs]")
    g.legend.set_title("")
    g.fig.suptitle(f"{dtype=}")


def print_results(df):

    is_pil = df['method'] == 'pil'
    is_cv2 = df['method'] == 'cv2'
    is_skimage = df['method'] == 'skimage'
    is_scipy = df['method'] == 'scipy'
    for size in sorted(set(df['size'])):  # ['91x180', '128x128']:
        tqdm.write(f"\n\t\t{size=}")
        for quality in sorted(set(df['quality'])):
            choice = np.logical_and(df['quality'] == quality,
                                    df['size'] == size)
            ski_timing = df[is_skimage & choice].timing.min()
            cv2_timing = df[is_cv2 & choice].timing.max()
            pil_timing = df[is_pil & choice].timing.max()
            spy_timing = df[is_scipy & choice].timing.max()

            ski_dtype = set(df[is_skimage & choice].out_dtype.values).pop()
            cv2_dtype = set(df[is_cv2 & choice].out_dtype.values).pop()
            pil_dtype = set(df[is_pil & choice].out_dtype.values).pop()
            spy_dtype = set(df[is_scipy & choice].out_dtype.values).pop()

            tqdm.write(f"\t\t\t{quality} slowdown vs PIL: "
                       # f"{ski_timing:.1f} vs {pil_timing:.1f} µs -> "
                       f"{ski_timing / pil_timing:.1f} "
                       f"({ski_dtype} vs {pil_dtype})")
            tqdm.write(f"\t\t\t{quality} slowdown vs openCV: "
                       # f"{ski_timing:.1f} vs {cv2_timing:.1f} µs -> "
                       f"{ski_timing / cv2_timing:.1f} "
                       f"({ski_dtype} vs {cv2_dtype})")
            tqdm.write(f"\t\t\t{quality} slowdown vs scipy: "
                       # f"{ski_timing:.1f} vs {spy_timing:.1f} µs -> "
                       f"{ski_timing / spy_timing:.1f} "
                       f"({ski_dtype} vs {spy_dtype})")


if __name__ == "__main__":

    for size in [512]:
        tqdm.write(f"\nInput {size=}")
        im = np.random.uniform(0, 255, (size, size))

        for dtype in tqdm(['uint8', 'float32', 'float64']):
            tqdm.write(f"\tInput {dtype=}")

            if dtype == 'float32':
                im = im / 255

            im = im.astype(dtype)

            results = get_results(im,
                                  size_list=((91, 180), (128, 128)))
            df = get_dataframe(results)
            # plot_results(df, dtype)
            print_results(df)

    plt.show()
Results here
Input size=512

    Input dtype='uint8'

        size='128x128'
            bicubic slowdown vs PIL: 11.1 (float64 vs uint8)
            bicubic slowdown vs openCV: 155.1 (float64 vs uint8)
            bicubic slowdown vs scipy: 1.2 (float64 vs uint8)
            bilinear slowdown vs PIL: 1.7 (float64 vs uint8)
            bilinear slowdown vs openCV: 35.4 (float64 vs uint8)
            bilinear slowdown vs scipy: 2.0 (float64 vs uint8)
            nn slowdown vs PIL: 15.8 (uint8 vs uint8)
            nn slowdown vs openCV: 26.0 (uint8 vs uint8)
            nn slowdown vs scipy: 1.4 (uint8 vs uint8)

        size='91x180'
            bicubic slowdown vs PIL: 10.3 (float64 vs uint8)
            bicubic slowdown vs openCV: 117.5 (float64 vs uint8)
            bicubic slowdown vs scipy: 1.2 (float64 vs uint8)
            bilinear slowdown vs PIL: 1.7 (float64 vs uint8)
            bilinear slowdown vs openCV: 31.7 (float64 vs uint8)
            bilinear slowdown vs scipy: 2.0 (float64 vs uint8)
            nn slowdown vs PIL: 22.7 (uint8 vs uint8)
            nn slowdown vs openCV: 31.3 (uint8 vs uint8)
            nn slowdown vs scipy: 1.9 (uint8 vs uint8)

    Input dtype='float32'

        size='128x128'
            bicubic slowdown vs PIL: 8.8 (float32 vs float32)
            bicubic slowdown vs openCV: 308.7 (float32 vs float32)
            bicubic slowdown vs scipy: 1.0 (float32 vs float32)
            bilinear slowdown vs PIL: 1.2 (float32 vs float32)
            bilinear slowdown vs openCV: 33.8 (float32 vs float32)
            bilinear slowdown vs scipy: 1.5 (float32 vs float32)
            nn slowdown vs PIL: 17.1 (float32 vs float32)
            nn slowdown vs openCV: 27.0 (float32 vs float32)
            nn slowdown vs scipy: 1.8 (float32 vs float32)

        size='91x180'
            bicubic slowdown vs PIL: 8.6 (float32 vs float32)
            bicubic slowdown vs openCV: 306.3 (float32 vs float32)
            bicubic slowdown vs scipy: 1.0 (float32 vs float32)
            bilinear slowdown vs PIL: 1.2 (float32 vs float32)
            bilinear slowdown vs openCV: 8.9 (float32 vs float32)
            bilinear slowdown vs scipy: 1.5 (float32 vs float32)
            nn slowdown vs PIL: 14.2 (float32 vs float32)
            nn slowdown vs openCV: 22.7 (float32 vs float32)
            nn slowdown vs scipy: 1.8 (float32 vs float32)

    Input dtype='float64'

        size='128x128'
            bicubic slowdown vs PIL: 9.8 (float64 vs float32)
            bicubic slowdown vs openCV: 63.7 (float64 vs float64)
            bicubic slowdown vs scipy: 1.2 (float64 vs float64)
            bilinear slowdown vs PIL: 1.3 (float64 vs float32)
            bilinear slowdown vs openCV: 30.7 (float64 vs float64)
            bilinear slowdown vs scipy: 1.7 (float64 vs float64)
            nn slowdown vs PIL: 19.9 (float64 vs float32)
            nn slowdown vs openCV: 19.4 (float64 vs float64)
            nn slowdown vs scipy: 2.2 (float64 vs float64)

        size='91x180'
            bicubic slowdown vs PIL: 10.3 (float64 vs float32)
            bicubic slowdown vs openCV: 78.2 (float64 vs float64)
            bicubic slowdown vs scipy: 1.2 (float64 vs float64)
            bilinear slowdown vs PIL: 1.3 (float64 vs float32)
            bilinear slowdown vs openCV: 15.5 (float64 vs float64)
            bilinear slowdown vs scipy: 1.7 (float64 vs float64)
            nn slowdown vs PIL: 17.8 (float64 vs float32)
            nn slowdown vs openCV: 15.7 (float64 vs float64)
            nn slowdown vs scipy: 2.2 (float64 vs float64)
You can also vary the input size, but the results appear to be consistent across input sizes. Things to notice:

skimage (+) vs others

  • skimage gives full control on image border management,
  • skimage automatically manages anti-aliasing with best practices depending on scaling factor, input data type and interpolation order,
  • PIL only outputs uint8 and float32 arrays (when a double precision array is provided, PIL casts it to single precision),

skimage (-) vs others

  • when order>0, the input image is converted to float if it is not already (probably one root of the problem),
  • Much slower :confounded: .

OpenCV is highly optimized (with great support from Intel), so such a performance gap is not a surprise, and I would not call this issue a bug...

rfezzani avatar Jun 28 '23 16:06 rfezzani

Even though OpenCV is optimized, comparing with PIL still shows a similar gap in performance. It seems conversion to float is not a major part of the problem for order==0 and order==2, as using PIL with float32 still produces results x15 and x8 faster (respectively) in these instances. Beyond that, for order==0 and order==1 there is still a x1.5 penalty over SciPy.

If I get a chance, I'll try taking a look at SciPy's, PIL's and OpenCV's implementations to understand whether there is a clear underlying cause... Not sure I will, though.

kwikwag avatar Jun 28 '23 20:06 kwikwag

On second thought - if you are able to profile the performance of zoom_shift that will give a better clue as to what is going on. I wonder if there is a redundant copy going on?
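
One way to start on such a profile is cProfile, which records calls into compiled extension functions as built-in entries (it cannot see inside them). A minimal sketch; `profile_call` is a hypothetical helper, and the workload is a stand-in so the example runs without extra dependencies:

```python
import cProfile
import io
import pstats


def profile_call(func, *args, repeat=100, top=10, **kwargs):
    """Run func repeatedly under cProfile and return the top entries by total time."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(repeat):
        func(*args, **kwargs)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("tottime").print_stats(top)
    return buffer.getvalue()


# Stand-in workload; replace with the resize call under investigation.
report = profile_call(sorted, list(range(1000, 0, -1)), repeat=50)
print(report)
```

With SciPy installed, a call such as `profile_call(ndi.zoom, im, zoom_factors, order=1)` would be expected to show how much time is spent in the compiled interpolation routine versus the Python wrapper around it. Finding a redundant copy inside the C code would need a native profiler instead.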

kwikwag avatar Jun 28 '23 20:06 kwikwag

And another P.S. - I just noticed OpenAI's Gym repo was looking at rescaling specifically (https://github.com/openai/gym/issues/2341) and they reference another library lycon2, turned into tinyscaler (https://github.com/Farama-Foundation/tinyscaler) which seems super-relevant.

kwikwag avatar Jun 28 '23 20:06 kwikwag

Great points @rfezzani! Removing the bug label and adding potential "enhancement" instead.

lagru avatar Jun 29 '23 14:06 lagru

In the meantime, what do you think about pointing out

img = np.array(Image.fromarray(im).resize(size))

in the docstring as a fast alternative when not all the bells and whistles of skimage.transform.resize are needed?
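
A slightly fuller sketch of that one-liner, wrapped in a hypothetical helper (`pil_resize_array` is an illustrative name, not an existing function). The detail worth spelling out is that Pillow takes `size` as (width, height) while ndarray shapes are (rows, cols):

```python
import numpy as np
from PIL import Image


def pil_resize_array(im, output_shape, resample=Image.BILINEAR):
    """Resize a 2-D uint8 array via Pillow; output_shape is (rows, cols)."""
    rows, cols = output_shape
    # Pillow expects size as (width, height), i.e. (cols, rows).
    return np.asarray(Image.fromarray(im).resize((cols, rows), resample))


rng = np.random.default_rng(0)
im = rng.integers(0, 256, (512, 512), dtype=np.uint8)
out = pil_resize_array(im, (180, 91))
print(out.shape, out.dtype)
```

As noted earlier in the thread, this route involves copies to and from the PIL Image and only yields uint8 or float32 output, so it trades flexibility for speed.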

lagru avatar Jun 29 '23 15:06 lagru

@lagru, this is in fact a good option :wink:

rfezzani avatar Jun 29 '23 15:06 rfezzani

Hello scikit-image core devs! There hasn't been any activity on this issue for more than 180 days. I have marked it as "dormant" to make it easy to find. To our contributors, thank you for your contribution and apologies if this issue fell through the cracks! Hopefully this ping will help bring some fresh attention to the issue. If you need help, you can always reach out on our forum. If you think that this issue is no longer relevant, you may close it, or we may do it at some point (either way, it will be done manually).

github-actions[bot] avatar Dec 27 '23 02:12 github-actions[bot]