wasmi icon indicating copy to clipboard operation
wasmi copied to clipboard

Optimization: Use compact encoding of bytecode

Open pepyakin opened this issue 7 years ago • 7 comments

Depends on #98

For now, every instruction is represented by an enum value. It means that each instruction occupies size required for a tag plus a size of the instruction that has the largest payload (probably BrTable) (which requires to be properly aligned). Now each instruction occupies about 24 bytes, even though most instructions are 1 byte wide.

Fortunately, we don't need to address instructions by an index and can iterate them sequentially, and branches can specify the exact byte offset for their target.

Shrinking the size of one instruction will allow us to use cache more efficiently and also will reduce memory overhead greatly.

pepyakin avatar Jun 20 '18 15:06 pepyakin

I got this working with a crude encoding scheme. Each isa::Instruction is identified with a single u8, followed by any operands (usually u32). As of now, I'm seeing a 10-15% performance improvement in cargo bench results.

The savings aren't as large as I had hoped. A napkin-math analysis over the test suite gives me a thought:

Bytes per instruction

Most of the u32 operands are slots or offsets or similar small integers. Maybe a variable length encoding scheme is worth a go?

willglynn avatar Oct 08 '18 04:10 willglynn

10-15% is a very good figure actually!

This is basically a trade-off and we can easily get more overhead than we save by compacting the bytecode. What we might want to explore is introduction of a special narrow versions of opcodes that takes, say, u16 instead of u32 immediates.

pepyakin avatar Oct 10 '18 17:10 pepyakin

Well… then I started thinking about how WebAssembly is already a compact bytecode format with variable-length encoding of operands. FunctionReader::read_instruction() swizzles around flow control, but many more instructions are validated then emitted 1:1. Maybe it's better not to produce a second bytecode, but to instead produce a mix of new/enriched flow control instructions along with references to the original bytecode.

willglynn avatar Oct 10 '18 17:10 willglynn

Maybe it's better not to produce a second bytecode, but to instead produce a mix of new/enriched flow control instructions along with references to the original bytecode.

Can you expand a little on this? I'm not sure I follow.

pepyakin avatar Oct 11 '18 06:10 pepyakin

This issue proposes decoding instructions for validation, re-encoding them to a new bytecode, and decoding that bytecode in the interpreter.

I'm now thinking about decoding instructions for validation, building &[u8]s of which instructions do not require special handling, and then decoding those instructions from the original bytecode again in the interpreter. Something like:

mod isa {
  pub enum Operation {   // formerly Instruction
    GetLocal(u32),
    SetLocal(u32),
    TeeLocal(u32),

    Br(Target),
    BrIfEqz(Target),
    BrIfNez(Target),
    BrTable(Box<[Target]>),
    Return(DropKeep),

    /// WebAssembly bytecode, guaranteed not to contain:
    ///
    ///  * `Nop`
    ///  * `Block`
    ///  * `Loop`
    ///  * `If`, `Else`, `End`
    ///  * `Br`, `BrIf`, `BrTable`
    ///  * `Return`
    ///  * `GetLocal`, `SetLocal`, `TeeLocal`
    ///
    /// All other instructions are permissible and have valid operands.
    ValidatedWasmBytecode(&[u8]),
  }
}

The existing bytecode format is equivalent for most instructions, and wasmi already knows how to decode it. It's denser than my u32 immediate format and thus ought to make better use of memory bandwidth. I intend to experiment in this direction :-)

willglynn avatar Oct 11 '18 15:10 willglynn

Related:

  • https://github.com/paritytech/parity-wasm/issues/135
  • https://github.com/paritytech/parity-wasm/pull/154
  • https://github.com/yurydelendik/wasmparser.rs/pull/45

pepyakin avatar Nov 02 '18 11:11 pepyakin

I am going to implement a more compact encoding of wasmi bytecode for the v1 implementation introduced in https://github.com/paritytech/wasmi/pull/287.

Robbepop avatar Jan 06 '22 15:01 Robbepop

The mentioned PR (https://github.com/paritytech/wasmi/pull/287) shrank the size of the enum bytecode from 24 bytes to 16 bytes by disentangling BrTable variant. The BrCode operand now only hosts the number of targets excluding the default target and is followed by Branch and Return operands by construction of the wasmi bytecode.

Closing this issue now since we probably won't get way lower than that given that we embed constant values into the bytecode since indirectly accessing constants (that must be 64-bits) would create too much overhead.

Robbepop avatar Aug 22 '22 20:08 Robbepop