wasm-micro-runtime Add APIs to read and destroy sections


#3354

Apr 30 '24 00:04 eloparco

is this basically a copy from wamr-app-framework?

Apr 30 '24 09:04 yamt

is this basically a copy from wamr-app-framework?

Yes, I should have mentioned it in the PR description.

destroy_all_wasm_sections, destroy_all_aot_sections, destroy_part_wasm_sections and destroy_part_aot_sections are exactly the same. wasm_runtime_read_to_sections is adapted from wasm_app_module_on_install_request_byte_arrive.

We want to be able to free as many resources as possible once they are no longer needed after module loading or instantiation (especially on constrained embedded devices).

Apr 30 '24 09:04 eloparco

is this basically a copy from wamr-app-framework?

Yes, I should have mentioned it in the PR description.

please update the description then.

destroy_all_wasm_sections, destroy_all_aot_sections, destroy_part_wasm_sections and destroy_part_aot_sections are exactly the same. wasm_runtime_read_to_sections is adapted from wasm_app_module_on_install_request_byte_arrive.

We want to be able to free as many resources as possible once they are no longer needed after module loading or instantiation (especially on constrained embedded devices).

if you just want to save some memory, why do you need this complicated api? i suspect what you need is an option to wasm_runtime_load etc.

May 02 '24 01:05 yamt

sometimes it's useful to have an api to examine sections. (especially custom sections) eg. https://github.com/yamt/toywasm/tree/master/libdyld#dylink0-custom-section but it isn't the purpose of this PR, is it?

May 02 '24 01:05 yamt

is this basically a copy from wamr-app-framework?

Yes, I should have mentioned it in the PR description.

please update the description then.

destroy_all_wasm_sections, destroy_all_aot_sections, destroy_part_wasm_sections and destroy_part_aot_sections are exactly the same. wasm_runtime_read_to_sections is adapted from wasm_app_module_on_install_request_byte_arrive. We want to be able to free as many resources that are not necessary anymore after module instantiation or loading (especially for constrained embedded devices).

if you just want to save some memory, why do you need this complicated api? i suspect what you need is an option to wasm_runtime_load etc.

wasm_runtime_load has been used for a long time, we had better not change its prototype to add an option?

And we don't know how the wasm/aot file buffer was allocated; normally it can be freed only after wasm_runtime_load. So if we also create sections in wasm_runtime_load and allocate memory for section->section_body, there will be almost two copies of the wasm/aot file in memory at the same time, together with the memory for the module data structures. The peak memory usage may be large (though much of it may be freed later), which may not meet the requirements of some embedded/IoT devices. And since we have documented that the buffer passed to wasm_runtime_load is still referenced after loading, changing that behavior would be an ABI change, and developers may not know that the buffer can be freed.

So I think it is reasonable to provide another way. This PR just introduces two APIs, wasm_runtime_read_to_sections and wasm_runtime_destroy_sections, which should be acceptable once documentation and a sample are added.

May 02 '24 05:05 wenyongh

is this basically a copy from wamr-app-framework?

Yes, I should have mentioned it in the PR description.

please update the description then.

destroy_all_wasm_sections, destroy_all_aot_sections, destroy_part_wasm_sections and destroy_part_aot_sections are exactly the same. wasm_runtime_read_to_sections is adapted from wasm_app_module_on_install_request_byte_arrive. We want to be able to free as many resources that are not necessary anymore after module instantiation or loading (especially for constrained embedded devices).

if you just want to save some memory, why do you need this complicated api? i suspect what you need is an option to wasm_runtime_load etc.

wasm_runtime_load has been used for a long time, we had better not change its prototype to add an option?

we have introduced wasm_runtime_load_ex very recently.

And we don't know how the wasm/aot file buffer was allocated; normally it can be freed only after wasm_runtime_load. So if we also create sections in wasm_runtime_load and allocate memory for section->section_body, there will be almost two copies of the wasm/aot file in memory at the same time, together with the memory for the module data structures. The peak memory usage may be large (though much of it may be freed later), which may not meet the requirements of some embedded/IoT devices. And since we have documented that the buffer passed to wasm_runtime_load is still referenced after loading, changing that behavior would be an ABI change, and developers may not know that the buffer can be freed.

So I think it is reasonable to provide another way. This PR just introduces two APIs, wasm_runtime_read_to_sections and wasm_runtime_destroy_sections, which should be acceptable once documentation and a sample are added.

iiuc, there are three memory allocations involved:

1. user buffer (as large as the module)
2. sections (almost same size)
3. loaded module/instance

with this PR, the peak memory usage is max(1+2, 2+3), right?

if the peak memory usage is a problem, it's better to introduce a stream-style api so that a user can use a fixed-sized buffer. wasm module format is designed to be easily loaded that way. that is,

user buffer (1 byte at minimum)
loaded module/instance

and/or, maybe we can simply make a user specify a "free the buffer" callback so that some buffers can be freed a bit earlier.
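The max(1+2, 2+3) estimate above can be worked through with concrete numbers. The following is a minimal sketch only: the struct, the function names, and any sizes passed in are hypothetical and are not measurements from this PR. It compares the sections approach with a stream-style load that keeps only a fixed-size chunk buffer next to the loaded module.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes in bytes; none of these fields come from WAMR. */
typedef struct {
    size_t user_buffer; /* (1) whole wasm/aot file held by the caller */
    size_t sections;    /* (2) section copies, roughly the file size */
    size_t module;      /* (3) loaded module/instance structures */
} mem_sizes_t;

static size_t
max_size(size_t a, size_t b)
{
    return a > b ? a : b;
}

/* Sections approach: build the sections from the buffer (1+2 live),
   free the buffer, then load from the sections (2+3 live). */
static size_t
peak_with_sections(const mem_sizes_t *m)
{
    assert(m != NULL);
    return max_size(m->user_buffer + m->sections, m->sections + m->module);
}

/* Stream-style approach: a fixed-size chunk buffer plus the module. */
static size_t
peak_with_stream(const mem_sizes_t *m, size_t chunk_size)
{
    assert(m != NULL);
    return chunk_size + m->module;
}
```

With an assumed 100000-byte file, 98000 bytes of sections and a 60000-byte module, the sections approach peaks at 198000 bytes, while streaming through a 512-byte chunk peaks at 60512 bytes.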

May 02 '24 06:05 yamt

is this basically a copy from wamr-app-framework?

Yes, I should have mentioned it in the PR description.

please update the description then.

destroy_all_wasm_sections, destroy_all_aot_sections, destroy_part_wasm_sections and destroy_part_aot_sections are exactly the same. wasm_runtime_read_to_sections is adapted from wasm_app_module_on_install_request_byte_arrive. We want to be able to free as many resources that are not necessary anymore after module instantiation or loading (especially for constrained embedded devices).

if you just want to save some memory, why do you need this complicated api? i suspect what you need is an option to wasm_runtime_load etc.

wasm_runtime_load has been used for a long time, we had better not change its prototype to add an option?

we have introduced wasm_runtime_load_ex very recently.

Yes, it is good to modify wasm_runtime_load_ex only, we may add an option for wasm_runtime_load_ex to choose whether to create sections, load and free sections. And even add another option of freeing buffer callback to allow wasm_runtime_load_ex to free the buffer, so it can create sections, free wasm/aot file buffer, load and free sections to reduce the peak memory usage.

And we don't know how the wasm/aot file buffer was allocated; normally it can be freed only after wasm_runtime_load. So if we also create sections in wasm_runtime_load and allocate memory for section->section_body, there will be almost two copies of the wasm/aot file in memory at the same time, together with the memory for the module data structures. The peak memory usage may be large (though much of it may be freed later), which may not meet the requirements of some embedded/IoT devices. And since we have documented that the buffer passed to wasm_runtime_load is still referenced after loading, changing that behavior would be an ABI change, and developers may not know that the buffer can be freed.

So I think it is reasonable to provide another way. This PR just introduces two APIs, wasm_runtime_read_to_sections and wasm_runtime_destroy_sections, which should be acceptable once documentation and a sample are added.

iiuc, there are three memory allocations involved:

1. user buffer (as large as the module)
2. sections (almost same size)
3. loaded module/instance

with this PR, the peak memory usage is max(1+2, 2+3), right?

Yes, the user buffer can be freed after the sections are created.

if the peak memory usage is a problem, it's better to introduce a stream-style api so that a user can use a fixed-sized buffer. wasm module format is designed to be easily loaded that way. that is,

user buffer (1 byte at minimum)

loaded module/instance

and/or, maybe we can simply make a user specify a "free the buffer" callback so that some buffers can be freed a bit earlier.

How about letting the developer pass a read stream callback to read n bytes from the wasm/aot file, somewhat like

typedef uint32 (*read_file_callback_t)(char *buf, uint32 count);  //read count bytes to buffer

wasm_module_t
wasm_runtime_load_from_stream(read_file_callback_t, char *error_buf, uint32_t error_buf_size);

Or just extend wasm_runtime_load_ex to allow developer to pass the read file callback to LoadArgs structure.

The loader can then read and create the sections first (read the section type and section size, allocate a section body of that size, then read the data into the section body), load from the sections, and destroy the sections after loading. In this way, we can reduce both the peak memory usage and the memory usage after loading.
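The steps described above (read section type, read section size, allocate the body, read the data) could look roughly like the sketch below. It is not WAMR code: demo_section_t, demo_read_sections and mem_read are hypothetical names, and the callback takes an extra user-data pointer that the proposed read_file_callback_t does not have, which is an assumption made so the reader needs no global state. A real loader would also bound the section size before allocating.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback: reads up to count bytes, returns bytes read. */
typedef uint32_t (*read_cb_t)(void *user, uint8_t *buf, uint32_t count);

typedef struct demo_section {
    struct demo_section *next;
    uint8_t type;
    uint32_t size;
    uint8_t *body;
} demo_section_t;

static void
demo_sections_destroy(demo_section_t *s)
{
    while (s) {
        demo_section_t *next = s->next;
        free(s->body);
        free(s);
        s = next;
    }
}

/* Decode an unsigned LEB128 u32, pulling one byte at a time. */
static bool
read_leb_u32(read_cb_t cb, void *user, uint32_t *out)
{
    uint32_t result = 0;
    for (unsigned shift = 0; shift < 35; shift += 7) {
        uint8_t b;
        if (cb(user, &b, 1) != 1)
            return false;
        if (shift == 28 && (b & 0x70))
            return false; /* value does not fit in 32 bits */
        result |= (uint32_t)(b & 0x7f) << shift;
        if (!(b & 0x80)) {
            *out = result;
            return true;
        }
    }
    return false;
}

/* Build a section list from the stream; the whole file is never buffered. */
static bool
demo_read_sections(read_cb_t cb, void *user, demo_section_t **out)
{
    static const uint8_t header[8] = { 0x00, 0x61, 0x73, 0x6d,
                                       0x01, 0x00, 0x00, 0x00 };
    uint8_t got[8];
    demo_section_t *head = NULL, **tail = &head;

    assert(cb != NULL && out != NULL);
    if (cb(user, got, 8) != 8 || memcmp(got, header, 8) != 0)
        return false;

    for (;;) {
        uint8_t type;
        uint32_t size;
        if (cb(user, &type, 1) != 1) { /* end of stream between sections */
            *out = head;
            return true;
        }
        if (!read_leb_u32(cb, user, &size))
            goto fail;
        demo_section_t *s = calloc(1, sizeof(*s));
        if (!s)
            goto fail;
        s->type = type;
        s->size = size;
        *tail = s;
        tail = &s->next;
        if (size) {
            s->body = malloc(size);
            if (!s->body || cb(user, s->body, size) != size)
                goto fail;
        }
    }
fail:
    demo_sections_destroy(head);
    return false;
}

/* Example reader over an in-memory buffer; a file or UART reader would
   implement the same callback. */
typedef struct {
    const uint8_t *data;
    uint32_t size, pos;
} mem_stream_t;

static uint32_t
mem_read(void *user, uint8_t *buf, uint32_t count)
{
    mem_stream_t *ms = user;
    uint32_t left = ms->size - ms->pos;
    uint32_t n = count < left ? count : left;
    memcpy(buf, ms->data + ms->pos, n);
    ms->pos += n;
    return n;
}
```

Only the section list and whatever the callback buffers internally are alive during reading, so the full-file user buffer disappears from the peak.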

May 02 '24 08:05 wenyongh

The specific problem I'm trying to solve is the one described in #3354.

I'm using the wasm-c-api and module and module instance are saved into the store, that can't be freed until the end of wasm execution. And the module in the store has a copy of the module binary https://github.com/bytecodealliance/wasm-micro-runtime/blob/c0e33f08b04a5e03220ef74c2a9dbea8f1f35996/core/iwasm/common/wasm_c_api.c#L2290 which is kept until the store is destroyed.

By using the APIs described in this PR, I can free the sections after module instantiation. Especially the code and data sections, that can be quite large, saving some memory before starting the wasm execution. I don't think it's possible to obtain the same result with existing APIs.

May 03 '24 09:05 eloparco

Actually I was thinking, what if we only save the pointer here https://github.com/bytecodealliance/wasm-micro-runtime/blob/c0e33f08b04a5e03220ef74c2a9dbea8f1f35996/core/iwasm/common/wasm_c_api.c#L2286-L2292 instead of copying? That would allow more control and fix the problem I have when using the wasm c api. We could specify that behavior in LoadArgs from wasm_module_new_ex.
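As a rough illustration of the "save the pointer instead of copying" idea, the sketch below uses a boolean option to choose between cloning the binary and borrowing the caller's buffer. All names here (demo_load_args_t, demo_module_new, clone_binary) are hypothetical and do not reproduce the real LoadArgs or wasm_module_new_ex definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option struct standing in for a LoadArgs-style flag. */
typedef struct {
    bool clone_binary; /* true: copy the bytes; false: keep the pointer */
} demo_load_args_t;

typedef struct {
    const uint8_t *binary;
    uint32_t size;
    bool owns_binary; /* whether delete must free the binary */
} demo_module_t;

static demo_module_t *
demo_module_new(const uint8_t *bytes, uint32_t size,
                const demo_load_args_t *args)
{
    assert(bytes != NULL && args != NULL);

    demo_module_t *m = calloc(1, sizeof(*m));
    if (!m)
        return NULL;

    if (args->clone_binary) {
        /* Copying doubles the memory held for the binary. */
        uint8_t *copy = malloc(size ? size : 1);
        if (!copy) {
            free(m);
            return NULL;
        }
        memcpy(copy, bytes, size);
        m->binary = copy;
        m->owns_binary = true;
    }
    else {
        /* Borrowing: the caller keeps ownership and decides when the
           buffer may be freed or trimmed. */
        m->binary = bytes;
        m->owns_binary = false;
    }
    m->size = size;
    return m;
}

static void
demo_module_delete(demo_module_t *m)
{
    if (!m)
        return;
    if (m->owns_binary)
        free((void *)m->binary);
    free(m);
}
```

The trade-off is lifetime responsibility: with borrowing, the caller must keep the buffer valid for as long as the module still reads from it.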

May 03 '24 12:05 eloparco

Actually I was thinking, what if we only save the pointer here

https://github.com/bytecodealliance/wasm-micro-runtime/blob/c0e33f08b04a5e03220ef74c2a9dbea8f1f35996/core/iwasm/common/wasm_c_api.c#L2286-L2292

instead of copying? That would allow more control and fix the problem I have when using the wasm c api. We could specify that behavior in LoadArgs from wasm_module_new_ex.

@eloparco it seems to be a common issue so I think we can have a try. And we can also try to enable other methods in the future:

Enable LoadArgs->no_wasm_binary_copy or LoadArgs->clone_wasm_binary for wasm_runtime_load_ex, like yamt mentioned. For example, create sections internally and free parts of sections at the end of loading. And if free callback is provided in LoadArgs, runtime can call it to free the input buffer after creating sections to reduce peak memory usage.
Create sections from a stream when a read stream callback is provided, in this way, no need to read the whole wasm file to a buffer, so as to reduce the peak memory usage and total memory usage after loading.
Use wasm_runtime_read_to_sections and wasm_runtime_destroy_sections that this PR implements. Could you check the comment for destroy_sections and help address it if it is good to you? My suggestion is to merge this PR after it is well reviewed, but had better get others' feedback.

May 06 '24 02:05 wenyongh

How about letting the developer pass a read stream callback to read n bytes from the wasm/aot file, somewhat like
typedef uint32 (*read_file_callback_t)(char *buf, uint32 count);  //read count bytes to buffer

wasm_module_t
wasm_runtime_load_from_stream(read_file_callback_t, char *error_buf, uint32_t error_buf_size);

i feel that a callback based api in this case is a bit cumbersome to use for many use cases. consider a loop with non-blocking recv and poll, which loads a module received via a socket. in that case, a context-based api similar to what we have in wamr-app-framework would be much simpler to use.

May 07 '24 03:05 yamt

How about letting the developer pass a read stream callback to read n bytes from the wasm/aot file, somewhat like
typedef uint32 (*read_file_callback_t)(char *buf, uint32 count);  //read count bytes to buffer

wasm_module_t
wasm_runtime_load_from_stream(read_file_callback_t, char *error_buf, uint32_t error_buf_size);
i feel that a callback based api in this case is a bit cumbersome to use for many use cases. consider a loop with non-blocking recv and poll, which loads a module received via a socket. in that case, a context-based api similar to what we have in wamr-app-framework would be much simpler to use.

Yes, that is better and it would be great if it supports receiving wasm file buffer from socket, uart or from a disk file. Not sure what the context-based api will be like?
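One possible shape for a context-based API is a small state machine that the caller feeds with whatever bytes have arrived, for example after each non-blocking recv. The sketch below is an assumption for illustration only: it does not reproduce the wamr-app-framework API, every name in it is hypothetical, and it only tracks section boundaries rather than actually loading anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef enum {
    ST_HEADER,       /* collecting the 8-byte magic + version */
    ST_SECTION_ID,   /* expecting a section id (or end of module) */
    ST_SECTION_SIZE, /* decoding the LEB128 section size */
    ST_SECTION_BODY, /* consuming the section payload */
    ST_ERROR
} demo_state_t;

typedef struct {
    demo_state_t state;
    uint32_t have;  /* bytes consumed in the current state */
    uint32_t size;  /* size of the section being processed */
    unsigned shift; /* LEB128 bit position */
    uint32_t section_count;
} demo_loader_ctx_t;

static void
demo_loader_init(demo_loader_ctx_t *ctx)
{
    assert(ctx != NULL);
    memset(ctx, 0, sizeof(*ctx));
    ctx->state = ST_HEADER;
}

/* Push any number of bytes; returns false once the input is invalid.
   State is kept in ctx, so chunks may split the module anywhere. */
static bool
demo_loader_feed(demo_loader_ctx_t *ctx, const uint8_t *data, uint32_t len)
{
    static const uint8_t magic[8] = { 0x00, 0x61, 0x73, 0x6d,
                                      0x01, 0x00, 0x00, 0x00 };
    assert(ctx != NULL && (data != NULL || len == 0));

    for (uint32_t i = 0; i < len; i++) {
        uint8_t b = data[i];
        switch (ctx->state) {
            case ST_HEADER:
                if (b != magic[ctx->have]) {
                    ctx->state = ST_ERROR;
                    return false;
                }
                if (++ctx->have == 8)
                    ctx->state = ST_SECTION_ID;
                break;
            case ST_SECTION_ID:
                ctx->size = 0;
                ctx->shift = 0;
                ctx->state = ST_SECTION_SIZE;
                break;
            case ST_SECTION_SIZE:
                if (ctx->shift > 28 || (ctx->shift == 28 && (b & 0x70))) {
                    ctx->state = ST_ERROR;
                    return false;
                }
                ctx->size |= (uint32_t)(b & 0x7f) << ctx->shift;
                ctx->shift += 7;
                if (!(b & 0x80)) {
                    ctx->section_count++;
                    ctx->have = 0;
                    ctx->state = ctx->size ? ST_SECTION_BODY : ST_SECTION_ID;
                }
                break;
            case ST_SECTION_BODY:
                /* A real loader would pass the byte to a section parser. */
                if (++ctx->have == ctx->size)
                    ctx->state = ST_SECTION_ID;
                break;
            case ST_ERROR:
                return false;
        }
    }
    return true;
}

/* True when the bytes fed so far end exactly on a section boundary. */
static bool
demo_loader_done(const demo_loader_ctx_t *ctx)
{
    assert(ctx != NULL);
    return ctx->state == ST_SECTION_ID;
}
```

Because the caller drives the loop, the same context works with a socket, a UART, or a disk file: read whatever is available, call demo_loader_feed, and go back to poll. The per-byte loop keeps the sketch short; a real implementation would copy section bodies in bulk.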

May 09 '24 01:05 wenyongh

Since https://github.com/bytecodealliance/wasm-micro-runtime/pull/3389 was merged instead, closing this PR.

May 30 '24 08:05 wenyongh
