It would be nice to have a super fast 100x faster than pixelmatch function

Open jasonkhanlar opened this issue 5 years ago • 8 comments

What are you trying to achieve? Use the sharp library to compare an input image to hundreds/thousands of images to find the closest match at speeds 100x faster than https://github.com/mapbox/pixelmatch

Have you searched for similar feature requests? https://github.com/lovell/sharp/issues/1067 https://github.com/lovell/sharp/issues/1067#issuecomment-580969548 (my comment)

What would you expect the API to look like? I'm not sure yet, but I prepared a proof of concept that I'm excited to mention (at least for me)

What alternatives have you considered? None.

Is there a sample image that helps explain? No.

jasonkhanlar avatar Feb 01 '20 00:02 jasonkhanlar

Perhaps you could publish the logic in https://github.com/lovell/sharp/issues/1067#issuecomment-580969548 as a separate module?

lovell avatar Feb 01 '20 10:02 lovell

Something like dhash would be interesting to try:

https://github.com/Nakilon/dhash-vips/blob/master/lib/dhash-vips.rb

It does approximately:

  • Chop the image into an 8 x 8 grid (like a chessboard)
  • For each of the 64 grid squares, is the average of pixels in that square lighter or darker than the overall image average? Set that square to 1 or 0
  • Turn the 64 bits into a 64-bit uint

Now you have a fingerprint of the large-scale structure of the image that you can store in a DB. To compare two fingerprints, XOR them together and count the number of set bits. A smaller number is a better match.
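The steps above can be sketched as two small pure functions (my own names and signatures, not part of dhash-vips). This assumes the image has already been reduced to an 8x8 grayscale grid of 64 values in the 0-255 range, e.g. via sharp's `.resize(8, 8).grayscale().raw().toBuffer()`:

```typescript
/** Build a 64-bit fingerprint: bit i is set if pixel i is >= the image mean. */
export function fingerprint(pixels: ArrayLike<number>): bigint {
    if (pixels.length !== 64) {
        throw new Error('expected 64 grayscale values (an 8x8 grid)');
    }
    let sum = 0;
    for (let i = 0; i < 64; i++) {
        sum += pixels[i];
    }
    const mean = sum / 64;
    let hash = 0n;
    for (let i = 0; i < 64; i++) {
        if (pixels[i] >= mean) {
            hash |= 1n << BigInt(i);
        }
    }
    return hash;
}

/** Compare two fingerprints: XOR them and count the set bits (Hamming distance). */
export function hammingDistance(a: bigint, b: bigint): number {
    let xor = a ^ b;
    let count = 0;
    while (xor !== 0n) {
        count += Number(xor & 1n);
        xor >>= 1n;
    }
    return count;
}
```

Because each fingerprint is a single 64-bit integer, it can be stored in a regular DB column and millions of comparisons per second are feasible.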

jcupitt avatar Feb 14 '20 09:02 jcupitt

@jcupitt thank you for mentioning it :D Though you have actually described aHash, not dHash. aHash compares each pixel to the image average; dHash compares direct neighbours; idHash is like dHash but checks only the most distinctive half of the pairs.

There is no commonly accepted set of test images or universal metric for measuring how accurately these algorithms work. Each algorithm has its own speed for hashing and for comparing hashes, and accuracy also depends on your input images: pixel-based algorithms like aHash, dHash, idHash, and perhaps pHash work well enough on common photos, but not on starry skies (those need specialized algorithms that know what a "star" is) or country flags (where the algorithm must not be color-blind). It's like sorting algorithms: you are free to use any of them, but some will suit your data better than others.

Since the issue title mentions "pixelmatch", aHash probably fits, but it is color-blind. And if two images are "the same" after resizing to 8x8, maybe we actually need to know whether there are any minor differences. Only after comparing the 8x8 hashes can you decide that you really needed 32x32, but by then there is no way to get it, because you stored only the hash and the original may be inaccessible. At the fingerprinting stage you cannot know whether you will lack depth at the comparison stage.

Nakilon avatar Apr 13 '20 01:04 Nakilon

@jasonkhanlar did you ever expand on your matching implementation as @lovell suggested?

FoxxMD avatar Oct 08 '21 14:10 FoxxMD

+1 for this... has there been any update on this concept of image comparison with sharp? Thanks!

GitMurf avatar Jan 03 '23 07:01 GitMurf

I don't have heaps of images to test how performant this is, but I took the logic from https://github.com/lovell/sharp/issues/1067#issuecomment-580969548 and wrote a self-contained async function in TypeScript that will output the diff of each comparison image:

import sharp, {Sharp} from 'sharp';

export async function imageDiff(
    baselineImage: Sharp,
    comparisonImages: ReadonlyArray<Sharp>,
    threshold = 0.15,
): Promise<number[]> {
    /** Disable sharp's cache so each .toBuffer() call reads fresh pixel data. */
    sharp.cache(false);
    /** ensureAlpha() forces RGBA output, so every buffer has exactly 4 channels. */
    const rawBaseline = await baselineImage.ensureAlpha().raw().toBuffer();

    const imageDiffs = await Promise.all(
        comparisonImages.map(async (comparisonImage, index): Promise<number> => {
            let diffPixels = 0;
            const rawComparison = await comparisonImage.ensureAlpha().raw().toBuffer();

            if (rawBaseline.equals(rawComparison)) {
                /** No need to check further if the buffers are exactly equal. */
                return 0;
            }

            if (rawBaseline.length !== rawComparison.length) {
                throw new Error(
                    `Image at index ${index} has different dimensions than the baseline.`,
                );
            }

            /** ensureAlpha() guarantees RGBA, so step 4 channels per pixel. */
            for (let i = 0; i < rawBaseline.length; i += 4) {
                if (rawComparison[i + 3] === 0 && rawBaseline[i + 3] === 0) {
                    /** Skip pixels that are fully transparent in both images. */
                    continue;
                } else if (rawComparison[i + 3] === 0 || rawBaseline[i + 3] === 0) {
                    /**
                     * If only one image's pixel is fully transparent, compare
                     * just the alpha channel against the threshold.
                     */
                    if (
                        Math.abs(rawComparison[i + 3]! - rawBaseline[i + 3]!) >
                        255 * threshold
                    ) {
                        diffPixels++;
                    }
                } else {
                    if (
                        Math.abs(rawComparison[i]! - rawBaseline[i]!) +
                            Math.abs(rawComparison[i + 1]! - rawBaseline[i + 1]!) +
                            Math.abs(rawComparison[i + 2]! - rawBaseline[i + 2]!) +
                            Math.abs(rawComparison[i + 3]! - rawBaseline[i + 3]!) >
                        255 * threshold
                    ) {
                        diffPixels++;
                    }
                }
            }

            /**
             * For more useful calculations of color distances/thresholds, see:
             * https://www.npmjs.com/package/d3-color
             * https://www.npmjs.com/package/d3-color-difference
             */
            return diffPixels;
        }),
    );

    return imageDiffs;
}

electrovir avatar Oct 20 '23 13:10 electrovir
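For anyone who wants to experiment with the per-pixel logic without pulling in sharp, the inner comparison can be isolated into a pure function over raw RGBA buffers (the function name and signature here are my own, not part of the snippet above):

```typescript
/**
 * Count differing pixels between two raw RGBA buffers of equal length.
 * Mirrors the per-pixel logic discussed above: `threshold` is a fraction
 * of the 0-255 channel range.
 */
export function countDiffPixels(
    baseline: ArrayLike<number>,
    comparison: ArrayLike<number>,
    threshold = 0.15,
): number {
    if (baseline.length !== comparison.length || baseline.length % 4 !== 0) {
        throw new Error('buffers must be RGBA and the same length');
    }
    let diffPixels = 0;
    for (let i = 0; i < baseline.length; i += 4) {
        if (baseline[i + 3] === 0 && comparison[i + 3] === 0) {
            // Skip pixels that are fully transparent in both buffers.
            continue;
        }
        const oneTransparent = baseline[i + 3] === 0 || comparison[i + 3] === 0;
        // If only one pixel is fully transparent, compare just the alpha
        // channel; otherwise sum the absolute differences of all 4 channels.
        const delta = oneTransparent
            ? Math.abs(comparison[i + 3] - baseline[i + 3])
            : Math.abs(comparison[i] - baseline[i]) +
              Math.abs(comparison[i + 1] - baseline[i + 1]) +
              Math.abs(comparison[i + 2] - baseline[i + 2]) +
              Math.abs(comparison[i + 3] - baseline[i + 3]);
        if (delta > 255 * threshold) {
            diffPixels++;
        }
    }
    return diffPixels;
}
```

This makes the thresholding behavior easy to unit test with hand-built pixel arrays before wiring it up to real decoded image buffers.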

This looks great! Do you have any rough benchmarks on how long this takes for different image sizes?

GitMurf avatar Oct 20 '23 14:10 GitMurf

I'm operating on images that are nearly all the same size at the moment, but here are some super rough benchmarks on an M1 Max MacBook Pro running Node.js v20 (for a single image compared to another single image):

  • 300x300: 3-16 milliseconds
  • 300x400: 12-16 milliseconds

Performance is not my constraint (currently at least), so I haven't run any rigorous benchmarks or optimizations.

electrovir avatar Oct 23 '23 12:10 electrovir