Can the same sandbox instance be shared by the same extension (such as a Filter)?
Take the 'proxy_get_buffer' API as an example:
Filter A (context ID: 1) and Filter B (context ID: 2) are XxxFilter extension instances. Filter A is handling OnHttpRequestHeader and Filter B is handling OnHttpResponseBody. Both Filter instances invoke the 'proxy_get_buffer' method to get the HTTP header, but they don't pass a contextId. On the host side, how do you know which HTTP request header to return correctly?
If the contextId is passed to the host, the host will correctly identify the Filter instance and return the HTTP header.
According to my understanding, each instance of a WASM module should be equivalent to an isolation sandbox, and sharing the same instance should save resources. If my understanding is incorrect, please correct me.
@zonghaishang the answer here is "it depends on the host implementation".
But take Envoy for example: yes, of course. Envoy creates Wasm VMs (each corresponding to a plugin) per thread, which makes it easy to identify the different headers/bodies/trailers/metadata for each context created in the Wasm VMs (here, a Wasm VM is the sandboxed instance you mention).
Note that contexts (each corresponding to a request) are not "sandboxed" against each other since, as I said, multiple contexts are created in the per-thread VMs and they share the underlying Wasm VM's resources, including its linear memory.
If the contextId is passed to the host, the host will correctly identify the Filter instance and return the HTTP header.
And this is unnecessary, because Wasm VMs are always executed by the host implementation, and the host should be able to know which context is currently being executed regardless of its implementation.
@mathetake thank you for your reply. It sounds like, because Envoy uses a shared single-threaded WASM VM instance, when the plugin calls the host, the host can obtain the WASM VM instance's context from the current thread.
If the host is the Go runtime (the MOSN sidecar), thread-local storage is not supported, and this seems problematic.
Filter A (context ID: 1) and Filter B (context ID: 2) are XxxFilter extension instances. Filter A is handling OnHttpRequestHeader and Filter B is handling OnHttpResponseBody. Both Filter instances invoke the 'proxy_get_buffer' method to get the HTTP header, but they don't pass a contextId. On the host side, how do you know which HTTP request header to return correctly?
"Context" is extremely overloaded in the current implementation, so let me be a little more explicit:
Filter A (plugin_id: 1) and Filter B (plugin_id: 2) are plugins that define code and configuration, and HTTP and TCP events aren't called on them, but on a specific HTTP or TCP context, e.g. for an incoming HTTP request, you'll create a context for that HTTP request for Filter A (context_id: 3, plugin_id: 1) and for Filter B (context_id: 4, plugin_id: 2).
Now, when HTTP headers are read on the host side, it's going to call proxy_on_http_request_headers(context_id=3, ...) (notice it's the HTTP request's context ID, not the plugin's context ID), and since the WasmVM is single-threaded, the host side knows that it's currently handling context_id=3, so all host functions called from within the WasmVM are assumed to be requested for context_id=3 (unless it's changed using proxy_set_effective_context), so when the WasmVM calls proxy_get_buffer(...), the host knows to return the buffer for the HTTP request matching context_id=3.
Does it make sense?
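To make that concrete, here's a minimal Go sketch of the implicit tracking described above; the WasmVM type, its fields, and the call into the guest are all hypothetical, not any real host's API:

```go
package host

// Hypothetical host-side state for a single-threaded WasmVM; per-context
// HTTP buffers are keyed by context_id.
type WasmVM struct {
	currentContextID int32            // context the VM is executing right now
	buffers          map[int32][]byte // per-context HTTP request buffers
}

// The host records the active context *before* calling the guest export,
// e.g. proxy_on_http_request_headers(context_id=3, ...).
func (vm *WasmVM) OnHTTPRequestHeaders(contextID int32) {
	vm.currentContextID = contextID
	// vm.guest.Call("proxy_on_http_request_headers", contextID, ...) // hypothetical call into Wasm
}

// Host function imported by the guest. Note there is no context_id
// parameter: the host resolves it from currentContextID (which
// proxy_set_effective_context may have changed).
func (vm *WasmVM) ProxyGetBuffer() []byte {
	return vm.buffers[vm.currentContextID]
}
```

Since the VM is single-threaded, a plain field is enough here; the MOSN discussion below is about what happens when that assumption no longer holds.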
If the contextId is passed to the host, the host will correctly identify the Filter instance and return the HTTP header.
Do you mean that we should explicitly include context_id in the host calls? e.g. call proxy_get_buffer(context_id=3, ...) instead of proxy_get_buffer(...) and have the host side track the current context_id?
I think that's a good idea, and I suggested it a while ago myself, but I didn't get buy-in from other people working on the project at the time. Maybe it's a good time to revisit this.
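For comparison, the two shapes as hypothetical Go host-function signatures (parameter lists simplified; not the real proxy_get_buffer declaration):

```go
package abi

// Current shape: context is implicit; the host must track which
// context_id the VM is currently executing.
type ProxyGetBufferImplicit func(bufferType, start, maxSize int32) ([]byte, error)

// Proposed shape: the guest passes context_id explicitly, so a host that
// can't rely on "one VM, one active context" needs no current-context state.
type ProxyGetBufferExplicit func(contextID, bufferType, start, maxSize int32) ([]byte, error)
```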
According to my understanding, each instance of a WASM module should be equivalent to an isolation sandbox, and sharing the same instance should save resources. If my understanding is incorrect, please correct me.
Multiple plugins are already supported in the same WasmVM, regardless of the context_id being included or not.
I think the thing you were missing is that current host implementations (e.g. Envoy) automatically track the current context_id.
@mathetake thank you for your reply. It sounds like, because Envoy uses a shared single-threaded WASM VM instance, when the plugin calls the host, the host can obtain the WASM VM instance's context from the current thread.
If the host is the Go runtime (the MOSN sidecar), thread-local storage is not supported, and this seems problematic.
The WasmVM that MOSN uses is still single-threaded, right? So there is always only one context_id that's being executed.
Currently, each request corresponds to a coroutine (not a single thread). If the host needs to follow the ABI specification and maintain the current context_id on the host side, we need to lock the VM instance to ensure that request processing is serial (Go does not support thread-local storage).
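A minimal sketch of what that serialization would look like in Go, assuming a hypothetical single VM instance shared by per-request goroutines (a mutex stands in for the thread-local storage Go lacks):

```go
package host

import "sync"

type WasmVM struct {
	mu               sync.Mutex
	currentContextID int32
}

// Each request runs in its own goroutine, but every call into the VM is
// serialized under mu, so currentContextID is always the context of the
// one call currently inside the VM.
func (vm *WasmVM) DispatchHTTPRequestHeaders(contextID int32) {
	vm.mu.Lock()
	defer vm.mu.Unlock()
	vm.currentContextID = contextID
	// vm.guest.Call("proxy_on_http_request_headers", contextID, ...) // hypothetical call into Wasm
}
```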
Your description is very detailed, and my understanding is consistent with yours.
Do you mean that we should explicitly include context_id in the host calls? e.g. call proxy_get_buffer(context_id=3, ...) instead of proxy_get_buffer(...) and have the host side track the current context_id?
I think that's a good idea, and I suggested it a while ago myself, but I didn't get buy-in from other people working on the project at the time. Maybe it's a good time to revisit this.
If the contextId is passed on calls into the host, I think it provides great flexibility to host implementations (non-Envoy ones).
Currently, each request corresponds to a coroutine (not a single thread). If the host needs to follow the ABI specification and maintain the current context_id on the host side, we need to lock the VM instance to ensure that request processing is serial (Go does not support thread-local storage).
Right, each HTTP request corresponds to a coroutine on the host side, but you only have a single WasmVM instance (that's effectively single-threaded) and you don't create a WasmVM for each coroutine, right? If so, you should be able to track which context_id is being executed within that WasmVM.
If the contextId is passed on calls into the host, I think it provides great flexibility to host implementations (non-Envoy ones).
Agreed. I'm redesigning the ABI right now, and I'll take this into consideration.
If so, you should be able to track which context_id is being executed within that WasmVM.
Yes. To reduce lock contention, multiple WASM VM instances will also be considered; at least for now, that's not the best solution.
Agreed. I'm redesigning the ABI right now, and I'll take this into consideration.
For me, this is good news.
Yes. To reduce lock contention, multiple WASM VM instances will also be considered; at least for now, that's not the best solution.
You'll end up using too much memory with multiple WasmVM instances. We already have issues with Envoy consuming too much memory when using a single WasmVM per loaded bytecode per CPU.