Aborted when loading a specific png in ARM64 environments

Open tamaina opened this issue 2 years ago • 9 comments

Possible bug

Is this a possible bug in a feature of sharp, unrelated to installation?

  • [x] Running npm install sharp completes without error.
  • [x] Running node -e "require('sharp')" completes without error.

Are you using the latest version of sharp?

  • [x] I am using the latest version of sharp as reported by npm view sharp dist-tags.latest.

What is the output of running npx envinfo --binaries --system --npmPackages=sharp --npmGlobalPackages=sharp?

  System:
    OS: Linux 5.13 Ubuntu 20.04.4 LTS (Focal Fossa)
    CPU: (4) arm64 Neoverse-N1
    Memory: 20.15 GB / 23.31 GB
    Container: Yes
    Shell: 5.0.17 - /bin/bash
  Binaries:
    Node: 18.1.0 - /usr/bin/node
    npm: 8.8.0 - /usr/bin/npm
  npmPackages:
    sharp: ^0.30.4 => 0.30.4 

(Oracle Cloud Infrastructure A1 instance)

What are the steps to reproduce?

Download PixelFed's favicon.png: https://pixelfed.tokyo/img/favicon.png?v=2

Load it:

await sharp('favicon.png').toBuffer();

It throws an error.
This error kills our software:

malloc(): corrupted top size
Aborted (core dumped)

What is the expected behaviour?

No error, and the process is not killed.

tamaina avatar May 14 '22 15:05 tamaina

Information:

It happens with sharp 0.30.0 or later.

Using libjemalloc2 resolves this error but I hope it will work without any manipulation on our part.
https://misskey.xianon.net/notes/8zrrs9qm9c

tamaina avatar May 14 '22 15:05 tamaina

Thanks for reporting this, I'll try to reproduce on some ARM64 hardware when I can.

Please can you test the following code with the same image to help narrow down if this might be a decode or encode problem.

await sharp('favicon.png').raw().toBuffer(); // decode only, no PNG encoding

sharp v0.29.x provides libspng v0.6.3 whereas sharp v0.30.x provides libspng v0.7.1.

The sample input is palette-based, and when decoded on ARM64 will use libspng v0.7.1's NEON optimised path, which leads me to suspect this might relate to https://github.com/randy408/libspng/issues/189

lovell avatar May 15 '22 11:05 lovell
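
To check whether a given PNG is palette-based without involving sharp at all, the header can be read directly. The following is a minimal sketch, not part of sharp or libspng: `describePng` is a hypothetical helper that reads the IHDR fields defined by the PNG specification and notes whether a tRNS chunk (palette transparency) appears before the image data. It does not validate CRCs.

```javascript
// Hypothetical helper (not a sharp/libspng API): inspects a PNG header.
// Colour type 3 in IHDR means the image is palette-based (indexed colour).
function describePng(buf) {
  const signature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  // 8-byte signature + 25-byte IHDR chunk (length, type, 13 data bytes, CRC)
  if (buf.length < 33 || !buf.subarray(0, 8).equals(signature)) {
    throw new Error('not a PNG file');
  }
  if (buf.toString('latin1', 12, 16) !== 'IHDR') {
    throw new Error('IHDR is not the first chunk');
  }
  const info = {
    width: buf.readUInt32BE(16),
    height: buf.readUInt32BE(20),
    bitDepth: buf.readUInt8(24),
    colourType: buf.readUInt8(25),
    hasTransparencyChunk: false,
  };
  info.isPalette = info.colourType === 3;

  // Walk the chunk list; tRNS, if present, must come before the first IDAT.
  let offset = 8;
  while (offset + 8 <= buf.length) {
    const length = buf.readUInt32BE(offset);
    const type = buf.toString('latin1', offset + 4, offset + 8);
    if (type === 'tRNS') info.hasTransparencyChunk = true;
    if (type === 'IDAT' || type === 'IEND') break;
    offset += 12 + length; // length + type + data + CRC
  }
  return info;
}
```

For example, `describePng(require('fs').readFileSync('favicon.png'))` would report `isPalette` and `hasTransparencyChunk` for the downloaded file. A palette image with a tRNS chunk is the kind of input that is likely to be expanded to RGBA rather than RGB on decode.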

Thanks for your reply.
It also aborts with decode only.

tamaina avatar May 15 '22 12:05 tamaina

I was able to reproduce this crash with the libvips v8.12.2-build2 Windows ARM64 binaries on my Raspberry Pi 4B by simply doing:

vips.exe avg favicon.png

Doing the same with the (unpublished) v8.12.2 binaries, built against libspng 0.7.2, works without problems.

While looking at the temporary fix (commit https://github.com/randy408/libspng/commit/a4270af83467c343f85bf3cf308751017d57f67b), I noticed that only expand_palette_rgb8_neon was disabled in libspng 0.7.1. Given that this is a palettized RGBA8 PNG image, see:

$ vipsheader favicon.png
favicon.png: 153x152 uchar, 4 bands, srgb, pngload
$ vipsheader -f palette-bit-depth favicon.png
8

I think expand_palette_rgba8_neon also suffered from the same bug (https://github.com/randy408/libspng/issues/188). The fix available in libspng 0.7.2 (commit https://github.com/randy408/libspng/commit/f1c7735a13c58fc32506a62fcdbaa793cbf5ef9c) fixed this for both the expand_palette_rgb8_neon and expand_palette_rgba8_neon functions.

kleisauke avatar May 16 '22 15:05 kleisauke

I think expand_palette_rgba8_neon also suffered from the same bug

This is confirmed to be the case. When the quickfix was released the root cause was not yet known, and the image from the initial bug report did not trigger the same problem in the other function. This new image does.

The last release in February fixed both functions and also another issue that is listed in the announcement, so there was good reason to upgrade to the latest version, as it already declared the previous version (v0.7.1) buggy for other reasons.

randy408 avatar May 17 '22 20:05 randy408

Thanks all for confirming. The future v0.31.0 release of sharp will provide prebuilt binaries with a more recent version of libspng that includes the fix.

lovell avatar May 19 '22 10:05 lovell

In the meantime, pinning to "sharp": "0.30.1" in resolutions avoids the crash for me.

maximeg avatar May 24 '22 07:05 maximeg

Same problem here on M1 Mac, fixing version to 0.29.3 helped. In my case it influences NextJS and the upload-plugin of Strapi.

ngladbach avatar May 24 '22 11:05 ngladbach

Fixing the version to 0.29.3 helps on aarch64 Fedora.

Lambdac0re avatar Jun 30 '22 11:06 Lambdac0re

All versions above 0.28.3 crash on Windows 11, Intel i7-11800H with this error. Try/catch doesn't work; the error takes the node process down (not sure how?).

oebilgen avatar Aug 13 '22 18:08 oebilgen

@oebilgen This issue relates only to ARM64. If you're seeing it on Intel hardware, please open a new issue and provide as much information as possible, including sample code with appropriate error handling, sample images, and ideally a backtrace of the crash.

lovell avatar Aug 13 '22 19:08 lovell
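
On why try/catch cannot help here: an abort raised inside native code terminates the whole process with a signal instead of throwing a JavaScript exception. One possible way to contain it is to run the risky operation in a child process and inspect how it exited. The sketch below uses only Node's built-in `child_process` module; `runIsolated` is a hypothetical helper name, and the commented probe script is an assumption about how it might be used with sharp.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: runs a script string in a separate Node process so
// that a native abort only takes down the child, not the caller.
function runIsolated(script) {
  const result = spawnSync(process.execPath, ['-e', script], { encoding: 'utf8' });
  return {
    ok: result.status === 0,  // exited normally with code 0
    status: result.status,    // exit code, or null if killed by a signal
    signal: result.signal,    // signal name if killed by one, otherwise null
    stderr: result.stderr,    // e.g. any message printed before the abort
  };
}

// Possible usage with the image from this report (assumes sharp is installed):
// const probe = `require('sharp')(${JSON.stringify('favicon.png')})
//   .raw().toBuffer()
//   .then(() => process.exit(0), () => process.exit(1));`;
// const outcome = runIsolated(probe);
// if (!outcome.ok) console.error('decode failed', outcome.status, outcome.signal);
```

This trades some startup cost per call for isolation, so it is probably better suited to diagnosing a crash or to validating untrusted uploads than to a hot path.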

v0.31.0 now available with prebuilt binaries that contain the upstream fix, thanks all for reporting/helping with this.

lovell avatar Sep 05 '22 09:09 lovell
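
As a rough summary of the thread, the affected combination can be expressed as a version and architecture check. This is a sketch that takes the reports above at face value (crash from sharp 0.30.0 onwards, ARM64 only, fixed in 0.31.0); `isAffected` and its helpers are hypothetical names, and pre-release suffixes are ignored.

```javascript
// Parses "major.minor.patch" into numbers, ignoring any suffix.
function parseVersion(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) throw new Error(`unrecognised version: ${version}`);
  return match.slice(1).map(Number);
}

// Returns -1, 0 or 1 depending on how a compares to b.
function compareVersions(a, b) {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] < pb[i] ? -1 : 1;
  }
  return 0;
}

// Assumption from this thread: 0.30.0 <= version < 0.31.0 on ARM64 is affected.
function isAffected(sharpVersion, arch = process.arch) {
  return (
    arch === 'arm64' &&
    compareVersions(sharpVersion, '0.30.0') >= 0 &&
    compareVersions(sharpVersion, '0.31.0') < 0
  );
}
```

Node reports `process.arch` as `arm64` on both aarch64 Linux and Apple M1, which matches the platforms mentioned in the comments above.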