Read-only memory optimizations

Open tlively opened this issue 5 years ago • 13 comments

This came up in the context of thinking about how to optimize file system calls, but it could be more generally useful. The idea is to tell Binaryen what memory is read-only, essentially telling it where the .rodata section is. Then optimizations like precompute and constant propagation would be able to resolve loads from that memory at compile time. This could particularly help for precomputing string operations like strlen and strcmp when their arguments are constant strings.
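A minimal sketch of the idea, not Binaryen code: the memory contents, the `READ_ONLY_RANGES` list, and the helper names below are all assumptions made for illustration. It models how a pass could fold a load or a `strlen` call once it knows a range is read-only.

```python
import struct

# Hypothetical model: linear memory initialised from data segments, plus
# a list of [start, end) ranges that the toolchain declares read-only.
MEMORY = bytearray(64)
MEMORY[8:12] = struct.pack("<i", 1)   # i32 value 1 at address 8
MEMORY[16:22] = b"hello\x00"          # NUL-terminated constant string at 16
READ_ONLY_RANGES = [(8, 22)]


def is_read_only(addr, size):
    """True if [addr, addr + size) lies entirely inside one read-only range."""
    return any(start <= addr and addr + size <= end
               for start, end in READ_ONLY_RANGES)


def fold_i32_load(addr):
    """Return the constant an i32.load at a constant address yields, or None."""
    if not is_read_only(addr, 4):
        return None  # memory may change at run time, so keep the load
    return struct.unpack_from("<i", MEMORY, addr)[0]


def fold_strlen(ptr):
    """Return strlen(ptr) if the string, including its NUL, is read-only."""
    end = ptr
    while end < len(MEMORY) and is_read_only(end, 1):
        if MEMORY[end] == 0:
            return end - ptr
        end += 1
    return None  # ran off the read-only data before finding a terminator
```

Note that the load is folded only when the whole access fits inside the range; an access that straddles the boundary is left alone.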

tlively avatar Oct 19 '20 18:10 tlively

We also talked about this with @dcodeIO a while ago. We were thinking of some API (or user-supplied metadata) that marks a data range as read-only, a fact that optimizations could then use.

MaxGraey avatar Oct 19 '20 18:10 MaxGraey

Context: Our use case is that we recently switched to representing functions as managed objects, i.e. a memory segment that also contains the table index, instead of just a bare table index. If we could tell Binaryen that the region in memory where the first-class function lives is read-only, the table index could be obtained from memory at compile time, in turn allowing us to take advantage of directize again.
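As a rough sketch of that use case (the object layout, `INDEX_OFFSET`, and the table contents are hypothetical and not the actual AssemblyScript layout): if the index field sits in read-only memory, an indirect call through the object can be resolved to a direct target. A real pass would also need to know that the table slot itself cannot change.

```python
import struct

# Hypothetical layout: a managed function object is a small header followed
# by the table index of the function it wraps, stored as a little-endian u32.
TABLE = ["$add", "$sub", "$mul"]   # function table contents (assumed immutable)
MEMORY = bytearray(32)
FUNC_OBJECT_ADDR = 16
INDEX_OFFSET = 4                   # assumed offset of the index field
struct.pack_into("<I", MEMORY, FUNC_OBJECT_ADDR + INDEX_OFFSET, 2)
READ_ONLY_RANGES = [(16, 24)]      # [start, end) covering the function object


def resolve_indirect_target(object_addr):
    """Return the direct call target for a call through the object, or None."""
    field = object_addr + INDEX_OFFSET
    if not any(s <= field and field + 4 <= e for s, e in READ_ONLY_RANGES):
        return None  # the index could be overwritten, so keep call_indirect
    (index,) = struct.unpack_from("<I", MEMORY, field)
    if index >= len(TABLE):
        return None  # an out-of-bounds index traps at run time, do not fold
    return TABLE[index]
```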

dcodeIO avatar Oct 19 '20 19:10 dcodeIO

Sounds good to me!

How about on PassOptions, something like readOnlyMemoryRange? I doubt we need more than one such range. (In fact probably just a single value "up to here" is enough?)

kripken avatar Oct 19 '20 20:10 kripken

Ideal for us would be per memory segment, so we don't have to relocate all read-only memory to a specific range once compilation is done. For instance, when an array is encountered its memory segment may be mutable, but the next function encountered might not be, and so on.
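To make the difference between the two proposals concrete, here is a small sketch. The `Segment` class and its `read_only` flag are assumptions for illustration; neither exists in Binaryen as written here. With interleaved segments, a single "up to here" bound cannot mark the function object read-only without also covering the mutable array before it.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    offset: int       # constant start address of the data segment
    data: bytes
    read_only: bool   # hypothetical per-segment flag


def read_only_by_bound(addr, size, bound):
    """Single-value model: everything in [0, bound) is read-only."""
    return 0 <= addr and addr + size <= bound


def read_only_by_segment(addr, size, segments):
    """Per-segment model: the access must fit inside one read-only segment."""
    return any(seg.read_only
               and seg.offset <= addr
               and addr + size <= seg.offset + len(seg.data)
               for seg in segments)


# Mutable and immutable segments interleaved in emission order.
SEGMENTS = [
    Segment(8, b"\x01\x00\x00\x00", read_only=False),   # array contents
    Segment(12, b"\x02\x00\x00\x00", read_only=True),   # function object
    Segment(16, b"\x03\x00\x00\x00", read_only=False),  # another array
]
```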

dcodeIO avatar Oct 19 '20 20:10 dcodeIO

Native toolchains like clang and gcc tend to group read-only, read-write, and bss data together, so you end up with 3 separate sections. It might be good to mimic that model since that is the most common form of Binaryen input today.

Is it particularly hard for your toolchain to relocate and group data in this way?

sbc100 avatar Oct 21 '20 00:10 sbc100

The AS compiler essentially emits Binaryen IR in a single pass, leaving everything further down the road to Binaryen (passes). As such there is no mechanism to relocate or otherwise do a custom second pass over the IR atm. One could say AS relates to Binaryen as Clang relates to LLVM - or - Binaryen is AS's backend (that I guess would do the grouping).

dcodeIO avatar Oct 21 '20 05:10 dcodeIO

Ah OK, maybe my comment is less relevant to AS then.

It seems like it would be good to be able to encode this kind of information directly into the binary somehow, kind of like the "linking" section that LLVM currently emits for object files, which contains extra information for the downstream tool (i.e. the linker). In the same way, the static linker itself emits downstream information for the runtime linker. Maybe this kind of metadata fits into that category?

That way this kind of thing could "just work" for somebody who runs wasm-ld + wasm-opt
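A sketch of what such metadata could look like, assuming a custom section that lists read-only ranges as LEB128-encoded (start, end) pairs. The section name `readonly-ranges` and the payload layout are invented for illustration. Only the outer framing (section id 0, byte size, then a length-prefixed name) follows the WebAssembly binary format.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(buf, pos):
    """Decode unsigned LEB128 at buf[pos]; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


SECTION_NAME = b"readonly-ranges"  # hypothetical name, not an existing convention


def encode_section(ranges):
    """Build a custom section holding a count and (start, end) pairs."""
    payload = encode_uleb128(len(SECTION_NAME)) + SECTION_NAME
    payload += encode_uleb128(len(ranges))
    for start, end in ranges:
        payload += encode_uleb128(start) + encode_uleb128(end)
    # Custom sections use section id 0, followed by the size of the contents.
    return b"\x00" + encode_uleb128(len(payload)) + payload


def decode_section(buf):
    """Parse a section produced by encode_section back into a list of ranges."""
    assert buf[0] == 0, "not a custom section"
    _size, pos = decode_uleb128(buf, 1)
    name_len, pos = decode_uleb128(buf, pos)
    name = bytes(buf[pos:pos + name_len])
    pos += name_len
    assert name == SECTION_NAME, "unexpected section name"
    count, pos = decode_uleb128(buf, pos)
    ranges = []
    for _ in range(count):
        start, pos = decode_uleb128(buf, pos)
        end, pos = decode_uleb128(buf, pos)
        ranges.append((start, end))
    return ranges
```

A linker could emit this once per module, and an optimizer could read it without any change to individual loads.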

sbc100 avatar Oct 21 '20 07:10 sbc100

Does anyone have implementation plans for this?

MaxGraey avatar Dec 15 '20 23:12 MaxGraey

No, not that I know of.

tlively avatar Dec 16 '20 02:12 tlively

Another option is reusing the intrinsic mechanism, as with "call.without.effects". Just add a new intrinsic, "call.readonly", and use it as:

(module
  ;; imports must precede memory and function definitions in the text format
  (import "binaryen-intrinsics" "call.readonly.i32" (func $call-readonly-i32 (param i32) (result i32)))
  (import "binaryen-intrinsics" "call.readonly.f64" (func $call-readonly-f64 (param f64) (result f64)))

  (memory $0 1)

  (data (i32.const 8) "\01\00\00\00")            ;; 1_i32
  (data (i32.const 12) "\00\00\00\00\00\00\f0?") ;; 1.0_f64
  
  (export "readI32" (func $readI32))
  (export "readF64" (func $readF64))
  (export "readF64_skip_opt" (func $readF64_skip_opt))
  
  (func $readI32 (result i32)
    (call $call-readonly-i32 
      (i32.load (i32.const 8))
    )
  )
  
  (func $readF64 (result f64)
    (call $call-readonly-f64
      (f64.load (i32.const 12))
    )
  )
  
  (func $readF64_skip_opt (param $ptr i32) (result f64)
    (call $call-readonly-f64
      (f64.load (local.get $ptr)) ;; non-constant pointers just unwrap
    )
  )
)

which would be optimized to:

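(module
  ;; memory and data are kept because the remaining f64.load still needs them
  (memory $0 1)

  (data (i32.const 8) "\01\00\00\00")            ;; 1_i32
  (data (i32.const 12) "\00\00\00\00\00\00\f0?") ;; 1.0_f64
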
  (export "readI32" (func $readI32))
  (export "readF64" (func $readF64))
  (export "readF64_skip_opt" (func $readF64_skip_opt))
  
  (func $readI32 (result i32)
    (i32.const 1)
  )
  
  (func $readF64 (result f64)
    (f64.const 1.0)
  )
  
  (func $readF64_skip_opt (param $ptr i32) (result f64)
    (f64.load (local.get $ptr)) ;; non-constant pointers just unwrap
  )
)

@kripken WDYT?

MaxGraey avatar Jul 28 '22 16:07 MaxGraey

@MaxGraey I think that could work on the binaryen side. But how easy would it be to emit from compilers?

For LLVM I think encoding the information in a special section, as @sbc100 mentioned before, would be simpler than emitting these intrinsics. The special section would just say "range X-Y is read-only" once instead of modifying all the relevant reads.

kripken avatar Jul 28 '22 16:07 kripken

Well, range records in a custom / special section are also a good approach, but they are less explicit. With an intrinsic, you can explicitly wrap / unwrap / rewrap loads. It will be much easier to test (especially in lit), and it will also support handwritten wat modules. It is also more flexible: what if you decide not to inline a specific load that still falls within the read-only range, for example for debugging purposes? Of course there are plenty of disadvantages as well: bloated IR and the need for an additional conversion step for LLVM.

MaxGraey avatar Jul 28 '22 16:07 MaxGraey

Being read-only is a property of the data, not individual loads, so I don't think intrinsics make much sense. What would it mean to have a read-only read with a non-constant address? What if you have a non-intrinsic read whose address is constant and in the read-only range only after other optimizations?

tlively avatar Jul 28 '22 19:07 tlively