tinygo
tinygo copied to clipboard
wasm: do not export malloc, calloc, realloc, free
These functions were exported by accident, because the compiler had no way of saying these functions shouldn't be exported. Unfortunately, they can easily be misused and bloat WebAssembly programs.
Not exporting them can be a big code size reduction for small programs. Before:
$ tinygo build -o test.wasm -target=wasi -no-debug -scheduler=none ./testdata/alias.go && ls -l test.wasm
-rwxrwxr-x 1 ayke ayke 2947 8 sep 13:47 test.wasm
After:
$ tinygo build -o test.wasm -target=wasi -no-debug -scheduler=none ./testdata/alias.go && ls -l test.wasm
-rwxrwxr-x 1 ayke ayke 968 8 sep 13:47 test.wasm
That's a 2kB reduction in binary size, because the GC isn't needed anymore.
This change also adds support for using //go:wasm-module
to set the module name of an exported function (the default remains env).
I expect this can be a controversial change, because some people have been relying on the exported malloc
function. While this function does work, it is not intended to be used like that (it's only to be used for C linked using CGo). Instead, memory should be allocated by exporting a specific function for it. For example:
var allocatedBytes *byte
//export allocateBytes
func allocateBytes(n int) *byte {
buf := make([]byte, n)
allocatedBytes = &buf[0]
return &buf[0]
}
This is a safe way to export memory.
I think this will break a significant amount of the TinyGo wasm community, so this should sit at least a few days to deal with timezones etc.
Can we re-export them with a compiler flag but turn them off by default or other code patch?I think most people shouldn't need them.
What's one project this would break? Maybe we could add one to our test corpus.
How about adding a flag to enable/disable malloc export first, and flip the default to disable after next release?
Yeah, we can keep them in the form of a unsafe_malloc_export
build tag for example. That way people get a bit more time to transition to a supported way of allocating memory.
@codefromthecrypt would that work for you?
Do we want to add a file with those build tags in this PR or another one?
First, I would like to decouple this from me. I don't actually build end wasm, just I respond to many people who are using tinygo to various levels of success. Also, the "malloc" "free" functions have been in use before wasm was a glint in my eye, and they are commonly used in emscripten and other wasm toolchains. Main point is that regardless of my response, these functions are absolutely in use, so how to communicate is very important.
I would suggest the following:
- some sort of flag so that people don't have to change their code to upgrade
- maybe look at other toolchains and make an arbitrary, ideally similar decision.
-Wl,--export=malloc -Wl,--export=free
or-s EXPORTED_FUNCTIONS=_malloc,_free
- maybe look at other toolchains and make an arbitrary, ideally similar decision.
- make a library in tinygo org that is tested and obviates this via the recommended approach
- this is required because copy/paste is error prone and too technical
- if not here, I can create one in tetratelabs org
- communicate clearly in the release notes a workaround (ex flag if one exists), a recommended approach (copy paste or include a lib that does the same), and benefits of this break (some people have 2k smaller code, some concrete example of exactly how these functions break .. eg not theoretical, a failing test)
my 2p!
Thanks @codefromthecrypt - in my experience, functions to allocate memory within the GC are essential for being able to pass data in from the host to wasm, so losing them will make code harder to write. I think a short term of adding flags to export the functions to preserve existing code, and a longer term of a library that can be imported to provide implementations reusable among TinyGo programs, could be a good way to make this easier for users.
#3148 looks like it would remove the safety issues in the built-in "malloc", a primary reason to discourage its use. effectively it changes it from unsafe to safe.
So, I suggest we wait on #3148 before making a firm outcome here. If "malloc" becomes safe, I believe and correct me if wrong, the rationale here boils down to not wanting to export built-in functions (even if they are safe).
ps we created a lib which if used can decouple things. since this is integration tested it should be usable regardless of decision here https://github.com/tetratelabs/tinymem/pull/2
I believe and correct me if wrong, the rationale here boils down to not wanting to export built-in functions (even if they are safe).
Correct, I'd like to remove them (by default at least) even if they are made safe. The reason is the reduction in wasm binary size, which many people will appreciate.
I have made a small change to this PR, so that it is still possible to export malloc. Example:
// #cgo LDFLAGS: --export=malloc --export=free
import "C"
This exports the malloc and free functions, for people who need it. Would that be a usable workaround?
@aykevl Thanks for progressing.
So if I understand correctly, users won't be able to export via flags to tinygo build
, rather add this to their main.go before they build it.
// #cgo LDFLAGS: --export=malloc --export=free
import "C"
Instead of a blank import of something that exports 3rd party version of similar, like
import (
_ "github.com/tetratelabs/tinymem"
)
The idea being that while this requires a source change, it avoids a 3rd party lib. Avoiding a 3rd party dep will allow code that currently has no dependencies to stay that way without copy/pasta.
Do I get it?
The idea being that while this requires a source change, it avoids a 3rd party lib. Avoiding a 3rd party dep will allow code that currently has no dependencies to stay that way without copy/pasta.
Yes, that sounds correct. (But of course the #cgo
line could also be in a 3rd party library).
The main reason why I did it the way I did it is because it is a very small and simple change.
As far as I'm concerned the workaround is helpful, if it is planned to stick around. If you think it will be removed later, than maybe still helpful but should mention that. In any case a "what to do" part in the release notes like written above will help folks.
And especially mention the rationale about the 2k savings. I'll do one last stab to put into context the savings. It might make you change your mind about defaulting to have these unavailable vs making it possible to exclude them instead.
Let's take this cat program which carefully avoids big packages like fmt.
package main
import (
"os"
)
// main runs cat: concatenate and print files.
//
// Note: main becomes WASI's "_start" function.
func main() {
// Start at arg[1] because args[0] is the program name.
for i := 1; i < len(os.Args); i++ {
bytes, err := os.ReadFile(os.Args[i])
if err != nil {
os.Exit(1)
}
// Use write to avoid needing to worry about Windows newlines.
os.Stdout.Write(bytes)
}
}
2KB is less than 1pct for a default build
$ tinygo build -o cat.wasm -target=wasi cat.go
$ du -k cat.wasm
260 cat.wasm
2KB is less than 4pct for a trimmed build
$ tinygo build -o cat.wasm -scheduler=none --no-debug -target=wasi cat.go
$ du -k cat.wasm
60 cat.wasm
And especially mention the rationale about the 2k savings. I'll do one last stab to put into context the savings. It might make you change your mind about defaulting to have these unavailable vs making it possible to exclude them instead.
It's not just that. It was never intended this way, like a private API that accidentally got exported. I want to make the absolute minimum available in an API so that the maintenance burden is smaller.
Admittedly malloc
is a rather standard API but it should never have been exposed in the first place.
2KB is less than 4pct for a trimmed build
4% is enormous for a compiler developer. Even a 1% saving is really big to a compiler developer. Most changes to reduce code size are less than that. The thing is that there are many of these little changes and they add up. TinyGo would produce much larger binaries if I wasn't paying attention to all these small savings.
I can understand where you are coming from. keep in mind that tinygo is for end users, not its developers.
Also, as I mentioned, a normal developer would probably use "fmt" instead of dancing around things, and the default wasm size would be 404KB.
I know you are going to remove this now, I'm mainly trying to provide you a perspective you can consider for the release notes which are for end users, some of whom will break, and some of whom may have megabytes or more wasm.
I can understand where you are coming from. keep in mind that tinygo is for end users, not its developers.
Absolutely. But end users wouldn't be using TinyGo if it didn't produce small binaries.
Also, as I mentioned, a normal developer would probably use "fmt" instead of dancing around things, and the default wasm size would be 404KB.
I guess there are many different kinds of developers and use cases. I try to make TinyGo useful to as many as possible (which also means making malloc
and free
available for those who need it).
Anyway, I agree this will need some special mention in the release notes so that wasm people will know the next release might break their code.
Last call for comment for interested WASM/WASI people!
I personally agree with the reasons for this PR. The #cgo LDFLAGS
solution does not seem too burdensome for those who have ended up using this feature and need some time to modify their code, or start using tinymem
, or whatever is the best option based on their scenario.
@deadprogram and I suppose modifying code is in any case necessary meanwhile, as we have no other way to pass cgo LDFLAGS another way (such as CGO_LDFLAGS
env variable). Correct?
@k33g (wasm-university etc) @juntao (wasmedge examples) @knqyf263 (trivy, go-plugin) FYI the tinygo built-in exports for "malloc" and "free" will be removed by default and need a code change to put them back, starting next release.
You may need to change docs or code generators when this happens. Here's a synopsis of two ways to do it.
https://github.com/tinygo-org/tinygo/pull/3142#issuecomment-1246621492
Personally, I think the cgo LDFLAG thing is a bit scary for newcomers to look at, so will change wazero's docs to not suggest doing that. Up to you with your respective audiences!
From what I can tell from this thread and the associated code, the fix for any third party packages that are currently using malloc()
would be either:
- add a file someplace in the third party package that includes the
LDFLAGS
mentioned above. - perform memory allocations in the package using code similar to that mentioned in the original comment for this PR https://github.com/tinygo-org/tinygo/pull/3142#issue-1366164801
The first option would be a "quick fix" if package maintainers need it. The second would be preferable, due to being safer, and also backward compatible.
For end-user applications that are using malloc()
directly, the same applies. But if the app itself is not doing this, then the above changes to whatever package they are using should make it invisible to the end-developer.
@aykevl can you please confirm if my summary is correct?
Let's wait a bit longer with this PR since it's a bit more contentious than I expected.
The main problem with malloc
and free
has been fixed with #3148 so that alleviates one of my concerns with it. The other (code size increase) isn't fixed yet though, so I still would like to merge this PR eventually - especially since #3148 made the difference even bigger.
@deadprogram yes this sounds correct.
Is the plan to merge this after the next release?
I don't have a plan, but yeah we could perhaps do that.
With 0.27 right around the corner, it seems like this will get cleaned up and merged shortly thereafter.
@dgryski yes let's do it then. If it gives any issues, there is plenty of time to look for a solution.
Hi @aykevl @dgryski just wanted to check, is this still planned to unexport malloc, etc by default? Since it didn't happen for 0.28, would we expect it for 0.29 if so?