bun icon indicating copy to clipboard operation
bun copied to clipboard

TypeScript's tests hang when run with bun run --bun

Open jakebailey opened this issue 2 years ago • 17 comments

What version of Bun is running?

1.0.18+36c316a24

What platform is your computer?

Linux 5.15.90.1-microsoft-standard-WSL2 x86_64 x86_64

What steps can reproduce the bug?

https://github.com/microsoft/TypeScript/pull/55987

Basically, install bun, then run bun run --bun test --no-lint in the TypeScript repo.

The logs will show which commands are running; they should indicate that bun is being used, not node.

What is the expected behavior?

No hang, tests complete.

What do you see instead?

CPU usage goes to minimal, tests make extremely slow progress (but maybe would complete eventually).

Additional information

I can bisect to this having changed between v1.0.7 and v1.0.8. v1.0.7 makes progress, but v1.0.8's CPU usage falls down to nothing.

jakebailey avatar Dec 14 '23 20:12 jakebailey

Stacktrace:

❯ bun run --bun test -- --no-lint --bundle=false
$ hereby runtests-parallel --light=false --no-lint --bundle=false
Using ~/Build/TypeScript/Herebyfile.mjs to run runtests-parallel
Starting lib
Starting generate-diagnostics
> /Users/jarred/.bun/bin/bun scripts/processDiagnosticMessages.mjs src/compiler/diagnosticMessages.json
Starting shim-tests
Finished shim-tests in 6ms
Reading diagnostics from src/compiler/diagnosticMessages.json
Finished lib in 20ms
Finished generate-diagnostics in 37ms
Starting build-tests
Starting build-services
Finished build-tests in 13s
Finished build-services in 13s
Starting dts-services
> /Users/jarred/.bun/bin/bun ./scripts/dtsBundler.mjs --entrypoint ./built/local/typescript/typescript.d.ts --output ./built/local/typescript.d.ts
Bundling ./built/local/typescript/typescript.d.ts to ./built/local/typescript.d.ts and ./built/local/typescript.internal.d.ts
Finished dts-services in 4.9s
Starting dts-lssl
Finished dts-lssl in 3ms
Starting do-runtests-parallel
Running tests with config: {"light":false,"workerCount":9,"taskConfigsFolder":"/var/folders/hd/k_cxl9f55475q62m8_f_9k_c0000gn/T/ts-tests1","noColor":false,"timeout":40000,"keepFailed":false}
> /Users/jarred/.bun/bin/bun ./built/local/run.js
583 |     if (aNeedle[aLineName] <= 0) {
584 |       throw new TypeError('Line must be greater than or equal to 1, got '
585 |                           + aNeedle[aLineName]);
586 |     }
587 |     if (aNeedle[aColumnName] < 0) {
588 |       throw new TypeError('Column must be greater than or equal to 0, got '
                  ^
TypeError: Column must be greater than or equal to 0, got -1
      at SourceMapConsumer_findMapping (/Users/jarred/Build/TypeScript/node_modules/source-map/lib/source-map-consumer.js:588:13)
      at SourceMapConsumer_originalPositionFor (/Users/jarred/Build/TypeScript/node_modules/source-map/lib/source-map-consumer.js:653:17)
      at mapSourcePosition (/Users/jarred/Build/TypeScript/node_modules/source-map-support/source-map-support.js:244:28)
      at wrapCallSite (/Users/jarred/Build/TypeScript/node_modules/source-map-support/source-map-support.js:397:20)
      at prepareStackTrace (/Users/jarred/Build/TypeScript/node_modules/source-map-support/source-map-support.js:446:39)
      at createIOError (/Users/jarred/Build/TypeScript/built/local/harness/vfsUtil.js:1195:9)
      at _walk (/Users/jarred/Build/TypeScript/built/local/harness/vfsUtil.js:952:31)
      at mkdirpSync (/Users/jarred/Build/TypeScript/built/local/harness/vfsUtil.js:330:24)
      at _applyFilesWorker (/Users/jarred/Build/TypeScript/built/local/harness/vfsUtil.js:1050:17)
      at _applyFilesWorker (/Users/jarred/Build/TypeScript/built/local/harness/vfsUtil.js:1057:17)
      at _applyFiles (/Users/jarred/Build/TypeScript/built/local/harness/vfsUtil.js:1001:9)
      at new FileSystem (/Users/jarred/Build/TypeScript/built/local/harness/vfsUtil.js:54:13)
      at createFileSystem (/Users/jarred/Build/TypeScript/built/local/testRunner/unittests/config/configurationExtension.js:10:12)
      at /Users/jarred/Build/TypeScript/built/local/testRunner/unittests/config/configurationExtension.js:259:55
      at /Users/jarred/Build/TypeScript/built/local/testRunner/tests.js:26:1
      at /Users/jarred/Build/TypeScript/built/local/testRunner/runner.js:204:5
      at /Users/jarred/Build/TypeScript/built/local/run.js:1:8
Error in do-runtests-parallel in 443ms
Error: Process exited with code: 1
Completed runtests-parallel with errors in 18.5s

I do not believe it ran in v1.0.7, as it hangs if node isn't in PATH and when node is symlinked to Bun, it hangs.

Jarred-Sumner avatar Dec 15 '23 03:12 Jarred-Sumner

Hm. I had it working in CI and locally; https://github.com/microsoft/TypeScript/pull/55987#issuecomment-1854870300 has a screenshot of it completing.

jakebailey avatar Dec 15 '23 03:12 jakebailey

Hm. I had it working in CI and locally; microsoft/TypeScript#55987 (comment) has a screenshot of it completing.

I suspect it still ran in Node:

image

Jarred-Sumner avatar Dec 15 '23 03:12 Jarred-Sumner

It definitely runs bun for me locally. I can roll the CI back to 1.0.7 to verify.

jakebailey avatar Dec 15 '23 03:12 jakebailey

Locally:

image

I unfortunately didn't catch CI before the logs stopped working (good ol GHA logs), but it should appear later with 1.0.7.

I have had quite a bit of trouble getting bun to actually run and not node; the setup I have with a "global" install of the bun package is the only way I've gotten it to actually work, short of a wrapper command named node that goes onto PATH. (Which, as I understand it, will be slightly better in 1.0.18.)

jakebailey avatar Dec 15 '23 03:12 jakebailey

CI did run bun at 1.0.7.

https://github.com/microsoft/TypeScript/actions/runs/7217529866/job/19665492754?pr=55987

$ hereby runtests-parallel --light=false --no-lint --bundle=true
Using ~/work/TypeScript/TypeScript/Herebyfile.mjs to run runtests-parallel
Starting lib
Starting generate-diagnostics
> /home/runner/.bun/bin/bun scripts/processDiagnosticMessages.mjs src/compiler/diagnosticMessages.json
Finished lib in 30ms
Reading diagnostics from src/compiler/diagnosticMessages.json
Finished generate-diagnostics in 7[5](https://github.com/microsoft/TypeScript/actions/runs/7217529866/job/19665492754?pr=55987#step:7:6)ms
Starting bundle-tests
Starting build-tests
Starting build-services
Finished bundle-tests in 1s
Finished build-tests in 23.1s
Finished build-services in 23.1s
Starting dts-services
> /home/runner/.bun/bin/bun ./scripts/dtsBundler.mjs --entrypoint ./built/local/typescript/typescript.d.ts --output ./built/local/typescript.d.ts
Bundling ./built/local/typescript/typescript.d.ts to ./built/local/typescript.d.ts and ./built/local/typescript.internal.d.ts
Compiling https://plugins.dprint.dev/json-0.19.0.wasm
Compiling https://plugins.dprint.dev/typescript-0.88.3.wasm
Finished dts-services in 4s
Starting dts-lssl
Finished dts-lssl in 1ms
Starting do-runtests-parallel
Running tests with config: {"light":false,"workerCount":4,"taskConfigsFolder":"/tmp/ts-tests1","noColor":false,"timeout":40000,"keepFailed":false}
> /home/runner/.bun/bin/bun ./built/local/run.js
Discovered 306 unittest suites.
Discovering runner-based tests...
Discovered 18542 test files in 327ms.
Starting to run tests using 4 threads...
Batching initial test lists...
Batched into 4 groups with approximate total file sizes of 3[6](https://github.com/microsoft/TypeScript/actions/runs/7217529866/job/19665492754?pr=55987#step:7:7)4[9](https://github.com/microsoft/TypeScript/actions/runs/7217529866/job/19665492754?pr=55987#step:7:10)6[12](https://github.com/microsoft/TypeScript/actions/runs/7217529866/job/19665492754?pr=55987#step:7:13) bytes in each group. (90.5% of total tests batched)

Just pushed another run at 1.0.8 to show that's where things go wrong.

jakebailey avatar Dec 15 '23 03:12 jakebailey

I wonder if there's a regression in our implementation of Error.prepareStackTrace leading it to return an invalid column number

Jarred-Sumner avatar Dec 15 '23 03:12 Jarred-Sumner

In the trace where you saw the source-map-support, did you use some fancy build of bun to get that info? I suppose it's possible that every test is failing and that's why it's taking so long, but I'm actually seeing the tests eventually complete on my machine as though they did run.

jakebailey avatar Dec 15 '23 03:12 jakebailey

Just to clarify, hang is probably the wrong term, because it does complete on my machine:


  91738 passing (1h)

Finished do-runtests-parallel in 1h 24m 9.6s
Completed runtests-parallel in 1h 24m 10s
total time:  5050.14s
user time:   3128.98s
system time: 1488.52s
CPU percent: 91%
max memory:  2147 MB

It's just that this same command typically takes 1m 45s!

jakebailey avatar Dec 15 '23 04:12 jakebailey

I've been trying to bisect this myself this weekend, but not been able to get a working bun binary at any revision (if the commit I've chosen builds at all).

It took a ton of fiddling because these older versions are super fickle when it comes to what the child process runs (node versus bun versus... literally unrelated binaries), but I've at least been able to find that the change happened between 1.0.7-canary.20231024.1 and 1.0.7-canary.20231025.1.

I wanted to bisect further, but I haven't actually been able to get a build from source to work (either it doesn't build, or the binary it produces crashes when I run our tests).

jakebailey avatar Jan 21 '24 20:01 jakebailey

I finally got building to work locally; my zig version was wrong and I needed to pull the version out of the Dockerfile used to build the canaries/releases.

A bisect points to #6674.

jakebailey avatar Jan 22 '24 05:01 jakebailey

Does the test runner use a lot of setImmediate or setTimeout(cb, 0)? I think right now our setImmediate increments the reference count, keeping the event loop alive. It's possible that we shouldn't do that.

Jarred-Sumner avatar Jan 22 '24 05:01 Jarred-Sumner

I know it does at least once to kick testing off; I'll look again tomorrow.

jakebailey avatar Jan 22 '24 06:01 jakebailey

A quick hack you could try is globalThis.setImmediate = process.nextTick at the beginning of one of the files

It doesn't look like we use setImmediate in any of the node polyfills or in our code regularly, so it might be fine. There could also be strangeness with node streams though depending on which library is used with it

Jarred-Sumner avatar Jan 22 '24 06:01 Jarred-Sumner

Placing that line early enough in our tests does appear to "fix" things and allow the tests to make progress.

jakebailey avatar Jan 22 '24 17:01 jakebailey

TS hang due to setImmediate? May also be related to the playwright issue: https://github.com/oven-sh/bun/issues/8222 (though it appears with config)

jdalton avatar Jan 22 '24 18:01 jdalton

To be clear, TypeScript itself does not hang, tsc and the API all work fine. It's just our test suite (which is more or less custom + some use of mocha).

jakebailey avatar Jan 22 '24 19:01 jakebailey

@jakebailey This PR is relevant, it should fix the error source-map-support is throwing https://github.com/oven-sh/bun/pull/8657

Jarred-Sumner avatar Feb 03 '24 09:02 Jarred-Sumner

As far as I can tell, that didn't seem to affect anything (so we must be resistent to the problems solved by that PR).

I still need the globalThis.setImmediate = process.nextTick hack to get tests to run (but, I wasn't expecting the stack trace thing to be the solution there, of course).

jakebailey avatar Feb 05 '24 23:02 jakebailey

Is there some testing or something I could do to help with this? I would like to be able to stick bun into TypeScript's CI, but this sure makes it seem like something's wrong...

jakebailey avatar Jul 09 '24 02:07 jakebailey

in a debug build it eventually starts running some tests, and produces 7 MB of debug logs mostly related to IPC. image

random idea: what if you make it use JSON serialization mode for IPC? it probably will make your CI run faster regardless if you can use it

Jarred-Sumner avatar Jul 09 '24 03:07 Jarred-Sumner

It does eventually run, but starts to hang and go extremely slowly without globalThis.setImmediate = process.nextTick.

random idea: what if you make it use JSON serialization mode for IPC? it probably will make your CI run faster regardless if you can use it

json is the default serialization used for forked processes so I believe we're already doing that. I'm not sure how that plays with this hang, though.

jakebailey avatar Jul 09 '24 03:07 jakebailey

The parallel testing isn't very amenable to the command you ran, but I did push something up to https://github.com/jakebailey/TypeScript/tree/bun-ci-logging which splits logging by subprocess.

The tests do not make a lot of progress; I can see them almost exclusively doing:

[alloc] new() = src.bun.js.api.Timer.TimerObject@2000ffb92c0
[zig] Timeout.toJS
[zig] Timeout.fromJS
[EventLoop] exit() = 0
[EventLoop] enter() = 0
[alloc] new() = src.bun.js.api.Timer.TimerObject@2000ffb9320
[zig] Timeout.toJS
[zig] Timeout.fromJS
[EventLoop] exit() = 0
[EventLoop] enter() = 0
[alloc] new() = src.bun.js.api.Timer.TimerObject@2000ffb9380
[zig] Timeout.toJS
[zig] Timeout.fromJS
[EventLoop] exit() = 0
[EventLoop] enter() = 0
[alloc] new() = src.bun.js.api.Timer.TimerObject@2000ffb9080
[zig] Timeout.toJS
[zig] Timeout.fromJS
[EventLoop] exit() = 0
[EventLoop] enter() = 0
[alloc] new() = src.bun.js.api.Timer.TimerObject@2000ffb90e0
[zig] Timeout.toJS
[zig] ~Timeout 0x2000ffb9380
[alloc] destroy() = src.bun.js.api.Timer.TimerObject@2000ffb9380
[zig] ~Timeout 0x2000ffb9320
[alloc] destroy() = src.bun.js.api.Timer.TimerObject@2000ffb9320
[zig] ~Timeout 0x2000ffb92c0
[alloc] destroy() = src.bun.js.api.Timer.TimerObject@2000ffb92c0
[zig] ~Timeout 0x2000ffb9800
[alloc] destroy() = src.bun.js.api.Timer.TimerObject@2000ffb9800
[zig] ~Timeout 0x2000ffb97a0
[alloc] destroy() = src.bun.js.api.Timer.TimerObject@2000ffb97a0

Restore the setImmediate = nextTick hack and those go away, instead making progress:

[SYS] stat(tests/baselines/local/convertToAsyncFunction/convertToAsyncFunction_basic.ts) = -1
[zig] NodeJSFS.statSync(String, FinalObject)
[SYS] stat(tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_arrayBindingPattern.ts) = 0
[alloc] new() = src.bun.js.node.types.StatType(false)@2000ffb0740
[zig] Stats.toJS
[zig] Stats.fromJS
[zig] Stats.isFile()
[zig] NodeJSFS.readFileSync(String)
[SYS] openat(-100, tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_arrayBindingPattern.ts) = 15
[SYS] fstat(15[/home/jabaile/work/TypeScript/tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_arrayBindingPattern.ts]) = 0
[SYS] read(15[/home/jabaile/work/TypeScript/tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_arrayBindingPattern.ts], 344) = 328 (0.009ms)
[SYS] read(15[/home/jabaile/work/TypeScript/tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_arrayBindingPattern.ts], 16) = 0 (0.004ms)
[SYS] close(15[/home/jabaile/work/TypeScript/tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_arrayBindingPattern.ts])
[zig] NodeJSFS.statSync(String, FinalObject)
[SYS] stat(tests/baselines/local/convertToAsyncFunction/convertToAsyncFunction_arrayBindingPattern.ts) = -1
[zig] NodeJSFS.statSync(String, FinalObject)
[SYS] stat(tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_objectBindingPattern.ts) = 0
[alloc] new() = src.bun.js.node.types.StatType(false)@2000ffb07a0
[zig] Stats.toJS
[zig] Stats.fromJS
[zig] Stats.isFile()
[zig] NodeJSFS.readFileSync(String)
[SYS] openat(-100, tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_objectBindingPattern.ts) = 15
[SYS] fstat(15[/home/jabaile/work/TypeScript/tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_objectBindingPattern.ts]) = 0
[SYS] read(15[/home/jabaile/work/TypeScript/tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_objectBindingPattern.ts], 348) = 332 (0.011ms)
[SYS] read(15[/home/jabaile/work/TypeScript/tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_objectBindingPattern.ts], 16) = 0 (0.008ms)
[SYS] close(15[/home/jabaile/work/TypeScript/tests/baselines/reference/convertToAsyncFunction/convertToAsyncFunction_objectBindingPattern.ts])
[zig] NodeJSFS.statSync(String, FinalObject)
[SYS] stat(tests/baselines/local/convertToAsyncFunction/convertToAsyncFunction_objectBindingPattern.ts) = -1

(wish these log lines had timestamps...)

jakebailey avatar Jul 09 '24 05:07 jakebailey