Experiment: comment out all inline annotations in the executor

Open Robbepop opened this issue 1 year ago • 6 comments

Locally I saw no huge regressions (~0-20%) and even some improvements. The best improvement was in fib/iter, which improved by 24%, which is actually pretty big. My hope is that this resolves some of the large performance issues seen on certain CPUs and platforms, which may be caused by some inline annotations having been over-optimized for my particular development system.
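For readers unfamiliar with the kind of change being tested, here is a minimal sketch of what commenting out an inline annotation looks like. The `Executor` type, its methods, and the choice of `#[inline(always)]` are hypothetical illustrations and are not taken from the Wasmi code base; the thread only says that inline annotations in the executor were commented out.

```rust
// Hypothetical miniature register machine; the names below are illustrative
// and are not taken from the Wasmi code base.
pub struct Executor {
    regs: [i64; 4],
}

impl Executor {
    pub fn new(regs: [i64; 4]) -> Self {
        Self { regs }
    }

    // Before the experiment: inlining is forced on the handler.
    #[inline(always)]
    pub fn exec_add_annotated(&mut self, result: usize, lhs: usize, rhs: usize) {
        self.regs[result] = self.regs[lhs].wrapping_add(self.regs[rhs]);
    }

    // During the experiment: the annotation is commented out, so rustc and
    // LLVM decide on their own whether to inline the handler.
    // #[inline(always)]
    pub fn exec_add_plain(&mut self, result: usize, lhs: usize, rhs: usize) {
        self.regs[result] = self.regs[lhs].wrapping_add(self.regs[rhs]);
    }

    pub fn get(&self, index: usize) -> i64 {
        self.regs[index]
    }
}

fn main() {
    let mut a = Executor::new([0, 2, 3, 0]);
    let mut b = Executor::new([0, 2, 3, 0]);
    a.exec_add_annotated(0, 1, 2);
    b.exec_add_plain(0, 1, 2);
    // The annotation only affects code generation, never the computed result.
    assert_eq!(a.get(0), b.get(0));
    println!("r0 = {}", a.get(0));
}
```

Both handlers compute the same value; only the generated machine code may differ, which is why the effect has to be measured per platform rather than reasoned about from the source.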

Unfortunately, coremark-wasm shows a significant performance regression on my local machine: 1520 -> 1405 points (-8%).
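As a quick check of the quoted figure, the relative change between the two CoreMark scores can be computed as below. The helper name `percent_change` is made up for this sketch; the result is about -7.6%, which rounds to the -8% stated above.

```rust
/// Relative change from `before` to `after`, in percent.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // CoreMark scores quoted above (higher is better).
    let change = percent_change(1520.0, 1405.0);
    println!("coremark-wasm: {change:.1}%");
}
```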

Robbepop, Mar 20 '24 15:03

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 82.31%. Comparing base (8136778) to head (657ac85).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #960      +/-   ##
==========================================
+ Coverage   81.87%   82.31%   +0.44%     
==========================================
  Files         260      260              
  Lines       23904    23941      +37     
==========================================
+ Hits        19572    19708     +136     
+ Misses       4332     4233      -99     


codecov[bot], Mar 20 '24 15:03

@yamt In case you are able to spare some of your precious time, it would be great to know how this particular branch of Wasmi (register) performs on your machine, which previously displayed very poor execution performance. At this point this is just guesswork on my part about what could cause the poor performance on some platforms. Unfortunately, I cannot run those cross-platform benchmarks on my own since the Wasmi benchmarking CI did not survive the Wasm repository transition.

Robbepop, Mar 20 '24 15:03

the rustc version i'm currently using (1.73.0) doesn't meet the new requirement. i'm not in the mood to update rustc on my machine, as it's used for other things. i guess rustup has functionality to switch between multiple rustc versions, but that's something i need to learn (i.e. it will take some time).

yamt, Mar 24 '24 05:03

@yamt I am very sorry about this! Thank you anyways for trying it out. :)

If you are using rustup you can run `rustup update` to update all your toolchains. You can also install a particular toolchain via `rustup toolchain install 1.73.0` and then run `rustup default 1.73.0` to go back to your current toolchain, or `rustup default stable` to use the current stable toolchain.

Robbepop, Mar 24 '24 07:03

i ran it twice. these values should be comparable with values in https://github.com/yamt/toywasm/pull/143

spacetanuki% ./test/run-ffmpeg.sh /usr/bin/time -l wasmi_cli-9870647987eb16a2a04975a47833f074a791e952 --dir .video --
       31.28 real        31.24 user         0.03 sys
           115326976  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
               28307  page reclaims
                   0  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                   0  voluntary context switches
                  87  involuntary context switches
        239392329125  instructions retired
        127343262177  cycles elapsed
            94154752  peak memory footprint
spacetanuki% ./test/run-ffmpeg.sh /usr/bin/time -l wasmi_cli-9870647987eb16a2a04975a47833f074a791e952 --dir .video --
       31.36 real        31.32 user         0.03 sys
           112349184  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
               30758  page reclaims
                   0  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                   0  voluntary context switches
                 358  involuntary context switches
        239404061550  instructions retired
        127072794200  cycles elapsed
            91176960  peak memory footprint
spacetanuki% 
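To make the two reports easier to compare, the sketch below derives instructions per cycle (IPC) and the run-to-run wall-time difference from the counters printed above. The `Run` struct and the helper names are made up for this illustration; only the numbers come from the output. It works out to an IPC of roughly 1.88 for both runs and a wall-time difference of well under 1%.

```rust
/// Counters copied from one `/usr/bin/time -l` report above.
struct Run {
    real_secs: f64,
    instructions: u64,
    cycles: u64,
}

impl Run {
    /// Instructions retired per cycle elapsed.
    fn ipc(&self) -> f64 {
        self.instructions as f64 / self.cycles as f64
    }
}

/// Difference between `a` and `b`, in percent of `a`.
fn percent_diff(a: f64, b: f64) -> f64 {
    (b - a) / a * 100.0
}

/// The two runs reported above, in order.
fn runs() -> [Run; 2] {
    [
        Run { real_secs: 31.28, instructions: 239_392_329_125, cycles: 127_343_262_177 },
        Run { real_secs: 31.36, instructions: 239_404_061_550, cycles: 127_072_794_200 },
    ]
}

fn main() {
    let [first, second] = runs();
    println!("IPC, run 1: {:.3}", first.ipc());
    println!("IPC, run 2: {:.3}", second.ipc());
    println!(
        "wall-time difference: {:.2}%",
        percent_diff(first.real_secs, second.real_secs)
    );
}
```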

yamt, Mar 24 '24 11:03

Oh wow, so there is no big difference at all. I did not expect that. That is actually very interesting. Thanks a lot for running the benchmarks, @yamt!

Robbepop, Mar 24 '24 11:03

Given the inconclusive results, and seeing that things look different again on the current Rust nightly, I'd rather not merge this and will try to resolve the performance issues another way. Thanks again @yamt for sharing your results. That really helped me here.

Robbepop, Mar 28 '24 16:03