tree-sitter-bash

`Cannot read properties of undefined (reading 'apply')` during parsing in the browser sometimes

Open verhovsky opened this issue 2 years ago • 2 comments

The fastest way to see this issue is to go to https://curlconverter.com/ and paste this broken bash code

curl https://api.dropbox.com/oauth2/token \
    -d code=<AUTHORIZATION_CODE> \
    -d grant_type=authorization_code \
    -d redirect_uri=<REDIRECT_URI> \
    -d client_id=<APP_KEY> \
    -d client_secret=<APP_SECRET>

or you can reproduce it yourself in the browser like this

cd /tmp/
mkdir bashtstest
cd bashtstest
npm init -y
npm install web-tree-sitter tree-sitter-bash
npx tree-sitter build-wasm node_modules/tree-sitter-bash
cp node_modules/web-tree-sitter/tree-sitter.wasm .

then create index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <title>test tree-sitter-bash</title>
</head>
<body>

    <script src="/node_modules/web-tree-sitter/tree-sitter.js"></script>
    <script type="module">
        async function main() {
            const Parser = TreeSitter;

            await Parser.init();
            const Bash = await Parser.Language.load("/tree-sitter-bash.wasm");
            const parser = new Parser();
            parser.setLanguage(Bash);

            // make sure Tree-sitter works
            console.log(parser.parse('curl example.com'));

            // this doesn't
            const parsed = parser.parse('curl https://api.dropbox.com/oauth2/token \\\n -d code=<AUTHORIZATION_CODE> \\\n -d grant_type=authorization_code \\\n -d redirect_uri=<REDIRECT_URI> \\\n -d client_id=<APP_KEY> \\\n -d client_secret=<APP_SECRET>\n');
            console.log(parsed);
        }

        main();
    </script>
</body>
</html>

then serve the directory

python -m http.server

You will see this in the console:

Tree {0: 777776, language: Language, textCallback: ƒ}
Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'apply')
    at e.<computed> (tree-sitter.js:1:16465)
    at 00353ca2:0x2817
    at 00353ca2:0x1b06
    at tree-sitter.wasm:0x25842
    at Parser.parse (tree-sitter.js:1:53325)
    at main ((index):21:35)

but it shouldn't error and it doesn't error on the playground.

Stepping through the code it errors on this part

[...]
                function loadWebAssemblyModule(binary, flags, handle) {
                    var metadata = getDylinkMetadata(binary);
                    function loadModule() {
                        var firstLoad = !handle || !HEAP8[handle + 12 >> 0];
                        if (firstLoad) {
                            var memAlign = Math.pow(2, metadata.memoryAlign);
                            memAlign = Math.max(memAlign, STACK_ALIGN);
                            var memoryBase = metadata.memorySize ? alignMemory(getMemory(metadata.memorySize + memAlign), memAlign) : 0
                              , tableBase = metadata.tableSize ? wasmTable.length : 0;
                            handle && (HEAP8[handle + 12 >> 0] = 1,
                            HEAPU32[handle + 16 >> 2] = memoryBase,
                            HEAP32[handle + 20 >> 2] = metadata.memorySize,
                            HEAPU32[handle + 24 >> 2] = tableBase,
                            HEAP32[handle + 28 >> 2] = metadata.tableSize)
                        } else
                            memoryBase = HEAPU32[handle + 16 >> 2],
                            tableBase = HEAPU32[handle + 24 >> 2];
                        var tableGrowthNeeded = tableBase + metadata.tableSize - wasmTable.length, moduleExports;
                        function resolveSymbol(e) {
                            var t = resolveGlobalSymbol(e, !1);
                            return t || (t = moduleExports[e]),
                            t
                        }
                        tableGrowthNeeded > 0 && wasmTable.grow(tableGrowthNeeded);
                        var proxyHandler = {
                            get: function(e, t) {
                                switch (t) {
                                case "__memory_base":
                                    return memoryBase;
                                case "__table_base":
                                    return tableBase
                                }
                                if (t in asmLibraryArg)
                                    return asmLibraryArg[t];
                                var r;
                                t in e || (e[t] = function() {
                                    return r || (r = resolveSymbol(t)),
                                    r.apply(null, arguments)
// HERE                             ^^^^^^^^^^^^^^^
                                }
                                );
                                return e[t]
                            }
                        }
                          , proxy = new Proxy({},proxyHandler)
                          , info = {
                            "GOT.mem": new Proxy({},GOTHandler),
                            "GOT.func": new Proxy({},GOTHandler),
                            env: proxy,
                            wasi_snapshot_preview1: proxy
                        };
[...]

Here r is undefined and t is 'isalpha'; resolving it with resolveSymbol(t) returns undefined, so r.apply throws. On curlconverter.com this function is called a few times with values like _ZNSt3__212basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEaSERKS5_ and works fine, but the last value, the one that fails, is _ZNSt3__212basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE17__assign_no_aliasILb1EEERS5_PKcm.
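To make the failure mode easier to see in isolation, here is a minimal sketch of the same lazy-resolution pattern outside of tree-sitter. Everything in it (knownSymbols, makeImportProxy, callAndCaptureError, the strict flag) is a hypothetical stand-in and not part of tree-sitter.js or Emscripten; it only mimics the shape of the proxy handler quoted above.

```javascript
// Hypothetical symbol table. `isalpha` is deliberately missing, standing in
// for a libc function that the host module does not export.
const knownSymbols = {
  strlen: (s) => s.length,
};

// Stand-in for resolveSymbol(): returns undefined for unknown names.
function resolveSymbol(name) {
  return knownSymbols[name];
}

// Builds an import proxy that resolves symbols on first call, like the
// quoted handler. With `strict` set, an unresolved symbol produces a
// descriptive error instead of the bare TypeError.
function makeImportProxy(strict) {
  return new Proxy({}, {
    get(target, name) {
      if (!(name in target)) {
        let resolved;
        target[name] = function (...args) {
          if (!resolved) resolved = resolveSymbol(name);
          if (strict && typeof resolved !== "function") {
            throw new Error(`unresolved import: ${String(name)}`);
          }
          // When `resolved` is undefined, this line throws the TypeError.
          return resolved.apply(null, args);
        };
      }
      return target[name];
    },
  });
}

// Calls proxy[name](...args) and returns the thrown error, or null.
function callAndCaptureError(proxy, name, ...args) {
  try {
    proxy[name](...args);
    return null;
  } catch (err) {
    return err;
  }
}
```

Calling callAndCaptureError(makeImportProxy(false), "isalpha", 97) returns a TypeError, while the strict variant names the missing symbol in its message. The point is only that the stub is handed out before anyone knows whether the symbol exists, so the error surfaces at call time, deep inside parsing.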

This is still an issue when building from master, which I confirmed by replacing the tree-sitter-bash line in package.json with this one

    "tree-sitter-bash": "git+https://github.com/tree-sitter/tree-sitter-bash.git#f7239f638d3dc16762563a9027faeee518ce1bd9",

then

rm -rf node_modules/ tree-sitter-bash.wasm
npm install
npx tree-sitter build-wasm node_modules/tree-sitter-bash

See https://github.com/curlconverter/curlconverter/issues/574

verhovsky, Jan 26 '24 04:01

In the browser I can trigger this with

parser.parse('curl \\\n-d\n');

The space before the \ and the dash after the newline seem to be what's doing it; removing either one makes the error go away. On curlconverter.com it's even weirder: if I remove parts of the command the issue goes away, and then even if I type back exactly the characters I removed, the bug doesn't come back.
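A small harness makes it easier to compare input variants like these. This is only a sketch: findFailingInputs is a hypothetical helper, and the second and third variants are my reading of "removing either" (no space before the backslash, no dash after the newline), not inputs taken from the report.

```javascript
// Hypothetical helper: runs `parse` on each input and collects the ones
// that throw, together with the error message.
function findFailingInputs(parse, inputs) {
  const failures = [];
  for (const input of inputs) {
    try {
      parse(input);
    } catch (err) {
      // JSON.stringify makes the escapes (\\ and \n) visible in the report.
      failures.push({ input: JSON.stringify(input), message: String(err && err.message) });
    }
  }
  return failures;
}

const variants = [
  "curl \\\n-d\n", // the reported input: space before the backslash, dash after the newline
  "curl\\\n-d\n",  // assumed variant: space removed
  "curl \\\nd\n",  // assumed variant: dash removed
];
```

In the browser page from the reproduction steps, this would be called as findFailingInputs((s) => parser.parse(s), variants), where parser is the Parser instance created there.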

verhovsky, Jan 26 '24 04:01

Hm, I think it's because isalpha is actually a macro wrapping __ctype_b_loc, and this isn't exported

Are you able to try with a local build of the wasm that patches in __ctype_b_loc (___ctype_b_loc for the mangled name) inside exports.json and see if that works?
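For anyone trying this, a minimal sketch of the patch step, under the assumption that exports.json is a flat JSON array of symbol-name strings (check the file in your own checkout before relying on that). withExport is a hypothetical helper, and sampleExports is made-up sample data, not the real file contents.

```javascript
// Hypothetical helper: returns a copy of the export list with `symbol`
// appended, unless it is already present. The input is not modified.
function withExport(exportList, symbol) {
  if (!Array.isArray(exportList)) {
    throw new TypeError("expected a JSON array of symbol names");
  }
  return exportList.includes(symbol) ? [...exportList] : [...exportList, symbol];
}

// Sample data only; these are not the real contents of exports.json.
const sampleExports = ["_malloc", "_free"];
const patched = withExport(sampleExports, "___ctype_b_loc");
```

In a Node script the list would come from JSON.parse(fs.readFileSync(path, "utf8")) and be written back with JSON.stringify(patched, null, 2), after which the wasm has to be rebuilt for the change to take effect.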

amaanq, Jan 26 '24 04:01

Any fix for this? I'm facing a similar error with https://github.com/ganezdragon/tree-sitter-perl

ganezdragon, Mar 30 '24 03:03

@ganezdragon update tree-sitter-cli and regenerate the wasm file (you'll need to change build-wasm to build --wasm --output tree-sitter-perl.wasm)

verhovsky, Mar 30 '24 05:03

@verhovsky thank you :)

A hack I found was to use wctype.h instead of ctype.h, and also to replace string.h with a custom implementation of its functions with the same logic.

ganezdragon, Mar 30 '24 08:03