Get string address and type

Open orsinium opened this issue 1 year ago • 8 comments

According to the docs, strings get transformed into Orb.Memory.Range (the docs say Slice instead of Range, but I think that's a typo). However, that's not what happens. When I define a function:

  def log_debug(s) do
    Firefly.Bindings.Misc.log_debug(
      Orb.Memory.Range.get_byte_offset(s),
      Orb.Memory.Range.get_byte_length(s)
    )
  end

And then call this function:

  log_debug("hello")

The function receives a string, not a Range:

** (FunctionClauseError) no function clause matching in Orb.Memory.Range.get_byte_offset/1    
    
    The following arguments were given to Orb.Memory.Range.get_byte_offset/1:
    
        # 1
        "hello"
    
    Attempted function clauses (showing 2 out of 2):
    
        def get_byte_offset(-<<byte_offset::integer-little-unsigned-size(32), _::integer-little-unsigned-size(32)>>-)
        def get_byte_offset(-range = %{push_type: _}-)
    
    (orb 0.0.46) lib/orb/memory/range.ex:29: Orb.Memory.Range.get_byte_offset/1
    (firefly 0.1.0) lib/misc.ex:6: Firefly.Misc.log_debug/1
    (firefly 0.1.0) lib/demo/triangle.ex:5: Firefly.Demo.Triangle.__wasm_body__/1
    (orb 0.0.46) lib/orb/module_definition.ex:45: Orb.ModuleDefinition.get_body_of/1
    (orb 0.0.46) lib/orb/compiler.ex:18: Orb.Compiler.run/2
    (firefly 0.1.0) lib/demo/triangle.ex:1: Firefly.Demo.Triangle.__wasm_module__/0
    (orb 0.0.46) lib/orb.ex:1011: Orb.to_wat/1
    (firefly 0.1.0) lib/mix/tasks/wasm.ex:11: Mix.Tasks.Wasm.run/1

I guess it gets transformed only when passed into a host-defined function? The problem I'm trying to solve is that while the Range type packs both the string offset and the string length into a single i64 value, I need to pass two separate i32 values to the host for offset and length (as shown in the snippet above).

orsinium avatar Jun 23 '24 15:06 orsinium

For reference, here is how the same function looks in Rust:

https://github.com/firefly-zero/firefly-rust/blob/main/src/misc.rs#L4-L10

orsinium avatar Jun 23 '24 15:06 orsinium

Yes, apologies, this is one of the main decisions to be made for the alpha: how to model strings (#7). Currently in Orb it’s an i32 pointer to a nul-terminated string. But those have security issues because they make it too easy to cause a buffer overflow or underflow.

So I’d like to always have the string length included whenever you reference a string. But WebAssembly doesn’t let you pass around tuples, so packing i32+i32 into an i64 is my best idea currently.
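To illustrate the packing idea in plain Elixir (outside of Orb): a minimal sketch, assuming the layout suggested by the get_byte_offset clause in the error message above, where the first little-endian 32 bits hold the byte offset and the next 32 bits hold the length. The PackedRange module and its pack/unpack functions are hypothetical names for illustration, not part of Orb's API.

```elixir
defmodule PackedRange do
  # Hypothetical helper, not part of Orb. The bit layout is an assumption
  # based on the binary pattern shown in the FunctionClauseError above.

  # Pack a 32-bit offset and a 32-bit length into one unsigned 64-bit integer.
  def pack(offset, length) do
    <<packed::integer-little-unsigned-size(64)>> =
      <<offset::integer-little-unsigned-size(32),
        length::integer-little-unsigned-size(32)>>

    packed
  end

  # Split a packed 64-bit integer back into {offset, length}.
  def unpack(packed) do
    <<offset::integer-little-unsigned-size(32),
      length::integer-little-unsigned-size(32)>> =
      <<packed::integer-little-unsigned-size(64)>>

    {offset, length}
  end
end
```

With this layout the offset lands in the low 32 bits of the integer, so a host expecting two separate i32 values would need the equivalent of unpack before the call.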

The Range and Slice naming is another decision to be made. Orb doesn’t have an ownership model, the memory is just there and you can create pointer references to parts. So I want that to be clear in the name, but I also want it to feel natural, similar to how you work with strings in other languages. Are you working with a pointer range or a slice of the memory? i.e. Is the thing you are working with the pointer or the memory itself? I’m leaning towards Slice, and that’s why that crept into the site before the code has actually been updated.

I’ll make progress on this soon and make a PR to the https://github.com/firefly-zero/firefly-elixir project. It looks really great and readable!

Feel free to share your opinions on the above as I’m curious what you find works well with say Zig and Rust.

RoyalIcing avatar Jun 27 '24 06:06 RoyalIcing

Wasm supports tuples. You can return multiple values from a function (which is supported by all runtimes), and you can represent what is one argument in your code as multiple arguments in wasm.

I'm writing a programming language that compiles into wasm. Well, not right now, I switched to Firefly for now, but before that I made good progress on it. Ask me questions if you get stuck.

For representing values in memory and passing them around and things like that I suggest following the ABI described in the component model.

orsinium avatar Jun 27 '24 07:06 orsinium

Sure, would appreciate any knowledge you can bring!

Multiple return values are awesome but the limitation is you can’t store tuples in locals. Orb currently exposes the underlying primitives as-is, so to support more complex locals I’d either have to create a C-like stack abstraction, or wait for the component model.

Orb targets currently shipping WebAssembly runtimes, so it’s going to be pretty conservative even when the component model matures.

RoyalIcing avatar Jun 27 '24 23:06 RoyalIcing

You can still make more locals for tuples, and nobody will complain. If variable a in the Elixir code stores a 2-element tuple, just make a.0 and a.1 locals in wasm.
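A rough sketch of that lowering step in plain Elixir, assuming locals are described as a keyword list of name and type, with a tuple of types standing for a tuple-valued local. The TupleLocals module is a hypothetical illustration, not something Orb provides.

```elixir
defmodule TupleLocals do
  # Hypothetical helper, not part of Orb: expands tuple-typed locals
  # into one scalar local per tuple element, named "name.index".
  def flatten(locals) do
    Enum.flat_map(locals, fn
      # A tuple of types becomes several locals: a.0, a.1, ...
      {name, types} when is_tuple(types) ->
        types
        |> Tuple.to_list()
        |> Enum.with_index()
        |> Enum.map(fn {type, index} -> {"#{name}.#{index}", type} end)

      # A scalar local is kept as a single entry.
      {name, type} ->
        [{to_string(name), type}]
    end)
  end
end
```

For example, TupleLocals.flatten(a: {:i32, :i32}, n: :i64) yields entries for a.0, a.1 and n, each with a single wasm value type.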

orsinium avatar Jun 28 '24 06:06 orsinium

The component model describes many things. One of them is a standard ABI, and you should follow that ABI when storing values in the linear memory. You don't need any special support from the runtime side to do that.

orsinium avatar Jun 28 '24 06:06 orsinium

For reference, here is how the same function looks in Rust:

https://github.com/firefly-zero/firefly-rust/blob/main/src/misc.rs#L4-L10

@RoyalIcing Any chance you could provide an example of this kind of usage with the new Str API? As far as I can tell, it is now treated as a tuple {i32, i32}, but what I am looking for is the 2 separate values. When I use pattern matching, I get the error below.

I'm sure it's entirely possible I'm just doing something wrong, but I looked through all the tests and examples and cannot find a similar use case.

The snippet

  defw handleState(input_ptr: I32, input_len: I32, output_len: I32), I32,
    s: Str,
    addr: I32,
    len: I32 do
    s = const("adasdasd")

    {addr, len} = s
 
    Env.console_log(0, len)

The error

Compiling 1 file (.ex)
    error: cannot invoke remote function Orb.VariableReference.local/2 inside a match
    │
 12 │   defw handleState(input_ptr: I32, input_len: I32, output_len: I32), I32,
    │   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    │
    └─ lib/elixir_strategy.ex:12: ElixirStrategy.__wasm_body__/1

mangas avatar Jan 05 '25 10:01 mangas

@mangas In v0.2.1 the Str type can now expand to two arguments or two locals. So your code would be:

  defw handleState(input: Str, output_len: I32), I32,
    s: Str,
    addr: I32,
    len: I32 do
    s = "adasdasd"
    addr = s[:ptr]
    len = s[:size]
 
    Env.console_log(0, len)
  end

Thanks for your patience with this feature. Let me know what else you run into.

RoyalIcing avatar Mar 25 '25 11:03 RoyalIcing