Big slow-down in WebAssembly (wasm)

Open mpizenberg opened this issue 5 years ago • 4 comments

Hi, I'm coming from a discussion in https://github.com/image-rs/image-png/issues/114. My issue is regarding the very slow reading of png images in wasm. @HeroicKatora identified that the issue might come from inflate calls. So I've tried to set up a very simple example to verify performance drops.

For this I'm using the file at https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-all-titles-in-ns0.gz. I decoded it and then re-encoded it with libflate, because the original encoding was not decodable by inflate. The file is 296 MB decoded and 88 MB encoded. In the example code, main.rs roughly measures native decoding performance, and lib.rs exposes functions to wasm, so I can copy-paste different versions of the inflating code and try them in the browser.

So I've tried each version available from inflate in native, and 3 versions in wasm. Here are the results.

| version | native speed | wasm speed |
| --- | --- | --- |
| inflate_bytes | 2.8s | 20.6s |
| InflateStream | 2.8s | not tested |
| InflateWriter | 2.9s | not tested |
| DeflateDecoder | 0.64s | 4.6s |
| DeflateDecoderBuf | 3.0s | not tested |
| libflate | 2.2s | 4.8s |

As we can see, inflate is roughly 7x slower in wasm than in native (20.6s vs 2.8s for inflate_bytes, 4.6s vs 0.64s for DeflateDecoder). Libflate, however, is "only" about 2x slower in a wasm context. In addition, DeflateDecoder is a lot faster than InflateStream, which is the one used in the png crate (there are probably reasons for this that I'm not aware of).
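To put numbers on the slowdown, the ratios can be computed directly from the table above. This is only a small arithmetic sketch; the `slowdown` helper is a hypothetical name and is not part of inflate or libflate.

```rust
/// Ratio of wasm time to native time, both given in seconds.
/// Hypothetical helper for illustration only.
fn slowdown(native_s: f64, wasm_s: f64) -> f64 {
    wasm_s / native_s
}

fn main() {
    // (label, native seconds, wasm seconds), copied from the table above.
    // Only the rows that were measured in both environments are listed.
    let rows = [
        ("inflate_bytes", 2.8, 20.6),
        ("DeflateDecoder", 0.64, 4.6),
        ("libflate", 2.2, 4.8),
    ];
    for (label, native_s, wasm_s) in rows.iter() {
        println!("{}: {:.1}x slower in wasm", label, slowdown(*native_s, *wasm_s));
    }
}
```

With these figures the two inflate entry points come out at roughly 7x and libflate at roughly 2x.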

I'm not familiar enough with DEFLATE to understand what in the code might cause this slowdown, but I wanted to report the issue. I hope you might have an idea of what is wrong, and probably a fix to enjoy inflate in wasm ^^.

mpizenberg avatar Apr 23 '19 19:04 mpizenberg

I believe the main reason is that DeflateDecoder was added to the API much more recently than InflateStream.

InflateStream (and DeflateDecoderBuf, which uses it) has a 32k buffer; DeflateDecoder doesn't, so maybe there is some difference there. Not sure what libflate has by default.

If you are curious about speed, you may also want to compare flate2 with the Rust back-end enabled.

oyvindln avatar Apr 23 '19 20:04 oyvindln

I've opened a pull request for the benchmark harness for inflate that I made ages ago, because right now inflate has no benchmarking facilities at all: #56. Compared to a year or so ago, inflate performance has regressed by 33% according to these benchmarks.

According to the investigation in https://github.com/image-rs/image-png/issues/114, 85% of time is spent in inflate::InflateStream::next_state, so that's where you should look if you want to fix this.

@mpizenberg for this to actually be tackled, I suggest providing a step-by-step guide to reproducing the setup that exhibits the slowdown. If I were a library maintainer and had never dealt with wasm before, I wouldn't bother unless it was very clear what to do.

Shnatsel avatar Jul 06 '19 09:07 Shnatsel

@Shnatsel ok, I'll update the example repo with exact instructions to reproduce the behavior in the coming days.

mpizenberg avatar Jul 07 '19 13:07 mpizenberg

I have updated the associated repository (https://github.com/mpizenberg/wasm-inflate) with instructions in the readme to reproduce this benchmark. Versions have changed since last April. On my computer, with rustc 1.36.0, inflate 0.4.5, libflate 0.1.25, and wasm-bindgen 0.2.47, I get the following results with native Rust compilation:

Elapsed (inflate_bytes): Ok(2.85300905s)
Elapsed (DeflateDecoder): Ok(3.021417847s)
Elapsed (DeflateDecoderBuf): Ok(2.992962828s)
Elapsed (InflateStream): Ok(2.565384845s)
Elapsed (InflateWriter): Ok(2.905209633s)
Elapsed (libflate): Ok(2.312502443s)
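For readers wondering about the `Ok(...)` wrapper in these lines: it is what you get when a `Result<Duration, _>` is printed with `{:?}`, as returned by `std::time::SystemTime::elapsed`. The sketch below shows a timing helper of that shape. It is a guess at the style of measurement, not the code from the repository; `timed` and `fake_decode` are hypothetical names, and `fake_decode` is a stand-in for a real decoder call.

```rust
use std::time::{Duration, SystemTime, SystemTimeError};

/// Runs `work` once and returns its result together with the elapsed
/// wall-clock time. Hypothetical helper, not taken from the benchmark repo.
fn timed<T, F: FnOnce() -> T>(work: F) -> (T, Result<Duration, SystemTimeError>) {
    let start = SystemTime::now();
    let out = work();
    // `elapsed` returns a Result because the system clock can move backwards.
    (out, start.elapsed())
}

/// Hypothetical stand-in for a decoder call: it does NOT inflate anything,
/// it only touches every byte so that there is some work to time.
fn fake_decode(input: &[u8]) -> Vec<u8> {
    input.iter().map(|b| b.wrapping_add(1)).collect()
}

fn main() {
    let input = vec![0u8; 1 << 20];
    let (decoded, elapsed) = timed(|| fake_decode(&input));
    // Debug-printing the Result gives the `Ok(<seconds>s)` shape seen above.
    println!("Elapsed (fake_decode): {:?}", elapsed);
    assert_eq!(decoded.len(), input.len());
}
```

For new measurements, `std::time::Instant` is usually the safer choice because it is monotonic and its `elapsed` returns a plain `Duration`.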

And in wasm with Firefox 67.0.4 I got:

inflate_bytes: 9991 milliseconds.
deflate_decoder: 9434 milliseconds.
deflate_decoder_buf: 9258 milliseconds.
inflate_stream: 7964 milliseconds.
inflate_writer: 8848 milliseconds.
libflate: 4902 milliseconds.

Two things are already noticeably different from last time in April.

  1. inflate_bytes in wasm is twice as fast as before.
  2. DeflateDecoder is much slower here and now runs at roughly the same speed as the other methods.

Unfortunately I don't think I'll have time to investigate this further for quite some time, but at least it's better documented now.

mpizenberg avatar Jul 10 '19 18:07 mpizenberg