wabt: Is it possible to have de-mangled function names in the output of wasm-decompile.exe?

The wasm-decompile.exe tool is great! It gives a much more readable format. Is it possible to have the mangled function name instead of the demangled one in the output file?

May 06 '20 01:05 Praveer1981

@aardappel

May 06 '20 18:05 binji

What is in the names section is often neither mangled (C++ linker symbol) nor de-mangled (C++ source level function name); it is a complete signature including non-identifier characters like (, and :: etc. The decompiler tries to make this into an identifier by filtering out such characters and also filtering out common strings (since these signatures can easily become several hundred characters long due to the STL types).

It all happens here: https://github.com/WebAssembly/wabt/blob/master/src/decompiler-naming.h
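
As a rough illustration of the kind of filtering described above, here is a minimal Python sketch. It is not the actual wabt code (that lives in decompiler-naming.h); the function name `to_identifier`, the `COMMON_NOISE` list, and the length cap are all assumptions made up for this example.

```python
import re

# Hypothetical list of "common strings" to drop; the real decompiler
# keeps its own list, which is not reproduced here.
COMMON_NOISE = ["std::__2::", "std::", "allocator", "char_traits", "basic_"]


def to_identifier(signature: str, max_len: int = 60) -> str:
    """Turn a name-section entry into something usable as an identifier."""
    # 1. Drop frequently occurring substrings that only add length.
    for noise in COMMON_NOISE:
        signature = signature.replace(noise, "")
    # 2. Collapse every run of non-identifier characters into one "_",
    #    then trim underscores from both ends.
    ident = re.sub(r"[^0-9A-Za-z_]+", "_", signature).strip("_")
    # 3. Make sure the result is non-empty and does not start with a digit.
    if not ident:
        ident = "f"
    if ident[0].isdigit():
        ident = "_" + ident
    # 4. Cap the length (an assumed limit, chosen for this sketch only).
    return ident[:max_len]


print(to_identifier("foo::bar(int, char const*)"))  # foo_bar_int_char_const
```

Note that in this sketch a name such as `_ZN3foo3barEv` comes out as `ZN3foo3barEv`, because leading underscores are trimmed along with the other filtered characters.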

What "demangled" name would you like to see? Just the function name? That is problematic, since it would require a C++ function signature parser just to find out which identifier is the function name, which can get complex given C++ type syntax. And then there's the issue that the name section is not even guaranteed to contain a C++ signature, this just happens to be the current convention. It may instead come from a different compiler or language, and contain any arbitrary syntax.

May 07 '20 16:05 aardappel

If it's a useful feature, we could have an option in wasm-ld that puts mangled names in the name section rather than human-readable demangled ones.

May 07 '20 17:05 sbc100

@sbc100 I think that would be great. Personally, I think that should be the default (since special characters have no business in a "name"? :)

May 07 '20 18:05 aardappel

Looks like user expectations would be that a disassembling tool would be responsible for demangling C++ names, rather than the linker itself.

Is there information we have in the linker that later stages don't have? Should we not-mangle by default? Maybe use the producers section to autodetect C++ demangling in binaryen et al, or just let that be user defined rather than autodetected.

May 07 '20 18:05 jgravelle-google

The information that we have in the linker and that later stages don't have is basically how to demangle symbols. Late-stage tools like devtools don't know how to demangle, and it's not clear that we want to teach them. The idea was that tools like devtools should not need to know the different mangling techniques used by different languages, and that the name section should contain the most user-friendly name.

The C++ name mangling convention used by LLVM is also not part of any Wasm standard or convention, so in theory it is subject to change and we wouldn't want it to leak outside the toolchain.

Given that, do you think that leaving the default alone and adding an option seems reasonable?

May 07 '20 18:05 sbc100

At the same time, de-mangling is a lot easier than mangling (if that requires a C++ parser in the general case), so to me it makes sense that things are stored in the simpler format.

Also, specifying somewhere what the format is seems like a great idea. Right now, if you feed the decompiler Wasm that comes from some other compiler and it contains identifiers that accidentally look like STL ones, they may get filtered, which is really not right.

May 07 '20 18:05 aardappel

In my eyes, doing the demangling earlier vs later is being presumptive about what users want. If you do want demangling, then you're happy, otherwise less so. If the consumer of the wasm is devtools, then great. If it's another post-processing tool though, less great. Here I think the linker's output will be primarily consumed by other tools, rather than piped directly into the browser. In emscripten we already run several binaryen operations on our linked wasms before the browsers get a chance to see them, so we have plenty of time to demangle at a later stage while still being convenient for the devtools use case (I expect emscripten output would primarily be consumed by browsers).

May 07 '20 19:05 jgravelle-google

I disagree that wasm-ld output should not be expected to be ship-able and run-able as is. I hope that for many uses it will be.

If wasm-ld is part of a toolchain/pipeline, then surely it's trivial for that pipeline to set whatever flags it wants to prevent the names section from being human readable, no? Seems like a flag would solve the problem, no?

May 07 '20 20:05 sbc100

I'm not saying it shouldn't be, just that I don't think it's our primary use case.

I do agree that wasm-ld being part of any toolchain will be able to set that flag either way they care about so it's less of an issue. And if you're trying to build your own toolchain... kinda ditto. So I don't feel too strongly about the default, but yeah a flag would be neat.

May 07 '20 21:05 jgravelle-google

For example, currently I get:

    function ZNK14AcDbShPropertyI26AcDbShPyramidSidesPropertyE11subGetValueEPK10AcRxObjectR9AcRxValue(a:int, b:int, c:int):int {
      var d:int_ptr = g_f;
      g_f = g_f + 32;

In order to get the meaningful name, I need to add _ as a prefix and use a demangler to get the actual name. Then I can search for this function in my code base. If the output file generated by wasm-decompile.exe already had the demangled name, I would not need to use another tool to demangle it.

It can save developer time actually.
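
The manual workaround described above (re-add the `_` prefix, then run a demangler) could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names `restore_mangled` and `demangle` are made up for this example, the check for "looks like a mangled name" is a crude heuristic that can misfire on ordinary names such as `ZIndex`, and demangling is delegated to the `c++filt` tool from binutils/LLVM if it happens to be on the PATH.

```python
import re
import shutil
import subprocess


def restore_mangled(name: str) -> str:
    """Re-add the leading "_" if the name looks like an Itanium-ABI mangled
    symbol whose underscore was stripped (heuristic, can misfire)."""
    if re.match(r"Z[A-Z0-9]", name):
        return "_" + name
    return name


def demangle(name: str) -> str:
    """Demangle via c++filt when available; otherwise return the name as is."""
    tool = shutil.which("c++filt")
    if tool is None:
        return name  # no demangler installed, nothing we can do here
    result = subprocess.run(
        [tool, restore_mangled(name)],
        capture_output=True,
        text=True,
        check=False,
    )
    # c++filt echoes names it cannot demangle, so fall back only on empty output.
    return result.stdout.strip() or name


decompiled = ("ZNK14AcDbShPropertyI26AcDbShPyramidSidesPropertyE"
              "11subGetValueEPK10AcRxObjectR9AcRxValue")
print(restore_mangled(decompiled))  # same name with "_" put back in front
print(demangle(decompiled))         # demangled form if c++filt is installed
```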

May 08 '20 06:05 Praveer1981

Wait, it sounds like you are in fact asking for the opposite of what we thought you were asking for. You are observing mangled names in decompiled output, and what you want to see is in fact demangled names(?). Well, that changes this whole discussion then.

I'm guessing that simply preserving the name section would be enough to fix your issue. I think that building with --profiling-functions is the simplest way to achieve that (without also including DWARF or otherwise affecting your build).
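
To check whether a given binary still carries a name section at all, a small script can walk the module's sections and list the custom ones. The sketch below is an illustration only: it relies on the WebAssembly binary layout (8-byte header, then sections made of an id byte, a LEB128 size, and the contents, with custom sections using id 0 and starting with a length-prefixed name), and the function names are made up for this example.

```python
def read_uleb128(data: bytes, pos: int):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def custom_section_names(wasm: bytes) -> list:
    """Return the names of all custom sections in a wasm binary."""
    if wasm[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    pos = 8  # skip the 4-byte magic and the 4-byte version
    names = []
    while pos < len(wasm):
        section_id = wasm[pos]
        pos += 1
        size, pos = read_uleb128(wasm, pos)
        if section_id == 0:  # custom section: contents begin with its name
            name_len, name_pos = read_uleb128(wasm, pos)
            names.append(wasm[name_pos:name_pos + name_len].decode("utf-8"))
        pos += size  # jump to the next section header
    return names


def has_name_section(wasm: bytes) -> bool:
    return "name" in custom_section_names(wasm)


# Tiny hand-built module: header plus one empty custom section called "name".
module = b"\0asm\x01\x00\x00\x00" + b"\x00\x05\x04name"
print(has_name_section(module))  # True
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to `has_name_section`.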

May 08 '20 16:05 sbc100

Hmm that's odd, what is the exact toolchain/options that built this binary? In my testing, I only get de-mangled full signatures as names.

Presumably in that case the function name should be PyramidSidesProperty; I'm not sure how we'd get that short of adding a de-mangler to WABT. And then there's the problem that nothing indicates that this is indeed a mangled name. It could be anything else.

May 11 '20 17:05 aardappel
