TypeScript Module conversion tracking issue / notes

I'm using this thread to dump my brain and to note problems as I solve them (or to discuss).

Issues of note:

#49037
#27891
#46744 (Eli's old notes)
#35210
#46567

2022-05-31

Changes since the previous effort

The "typeformer" (the code that does the transform) now lives at https://github.com/jakebailey/typeformer, and has been rewritten using ts-morph to preserve comments and formatting.
- The old typeformer use TS's own transform and emitter, but the emitter is not careful enough to preform TS-to-TS transformations (I have a list of bugs I've discovered, which don't matter for JS emit)
My WIP changes are semi-automatically pushed to a stack of PRs on my fork, starting here.
- It's "best" to look at the difference between the "indent" commit and the last commit, e.g. this diff (but note that this link may not be up to date, is is as of now).
- Please don't comment on these PRs; the script I run to create this stack is not very smart and will force-push if I insert a new commit in the middle. I wish I could just use gerrit, but...
I have removed the esbuild step from the previous effort; not because we don't plan to use it, but because it's simplest to start by using raw modules and then tweaking the output later. If we're going to be modules, I'd hope we can be run through a bundler successfully.
The faked namespace modules (barrel modules that at runtime behave very similarly to the old namespaces) have been moved to a dedicated folder in each project, rather than being dumped at the top level.
#46944 has to be reverted; this code doesn't seem to compile properly when actually built.

typeformer bugs

The typeformer's inlineImports step converts globalThis.assert into assert, because it thinks it's a global function that should be settable. Technically true, but if we wanted that, we probably wouldn't have written globalThis explicitly. This is just a bug when I rewrote the step.
Sometimes #region comments are dropped. Likely a bug that can be worked around like other ts-morph oddities to do with comment preservation (which it is almost always good at, but sometimes not).
Sometimes comments get duplicated, e.g. // DestructuringAssignmentType in src/compiler/types.ts.
The output path is determined by the project name; this was changed in my iteration of the typeformer, but should be hand-tweaked back to the right path, e.g. the debug code (dbg.ts) that is dynamically loaded in, or loggedIO.

Unsolved problems

My version of the typeformer's "inlineImports" pass is very aggressive, importing everything from their original declaration if possible. This is closest to how we would write the code by hand (if we had done so from the start), but the circularity of the codebase has broken things.
- e.g., debug.ts imports semver.ts, but the Semver class creates a static property zero, which calls the Semver constructor, which then contains Debug.assert calls.
- This only happened because the top of debug.ts contains import { Semver } from './semver, whereas before, the code didn't do this but instead referred to the namespace, so never actually caused that file to be evaluated.
- It's totally possible that we'd hit this circularity issue today, except that most code is not at the top level and avoids the problem entirely, or has worked "magically" thanks to the order we happen to merge namespaces in.
- We could just import these from the namespaces (like #46567 does), which I think would fix this, but this is really unnatural code structure and auto-imports will not function properly.
The System interface (aka ts.sys) is set (ts.sys = ...) in order to support the browser, but in the module version, modules do import { sys } from '...'), which makes it very difficult to do like this. What can do do here?
Additionally, the System interface has getExecutingFilePath (among other functions), which only makes sense for TS packaged as a self contained file.

TODOs

[ ] Figure out what to do about the API; the past effort talks about API extractor, which is the right thing to do, but we need to make sure that we produce an output that describes the API as namespaces, not as reexported modules. #4529 makes this hard.
[ ] We need to create the entrypoints for the new exported API; right now it just outputs as outDir, but I don't think we intend to just allow everyone to import our internals directly. outFile made a file that did the right thing before, and after we just need to write something like tsc.js which exports * from _namespaces/ts.ts or something.
- See also exportAsModule, which hacks together a module.exports = ... line rather than emitting it via export.
[ ] Deprecations were previously done via namespace merging (i.e. deprecating "Map" in favor of "ESMap"); these are very messy when not coming up with a single file.
[ ] registerRefactor uses side effects of file loading to register refactorings. This technically works OK after modification if the fake namespace modules still import these files, but it'd be better to refactor this to work like the transforms (which are just imported in other files instead).
[ ] Loads of type names are duplicated; this previously didn't matter because you'd just fully qualify names for another namespace, but now that things are directly imported, you end up with these random fully-qualified names which would be better done differently. For example, fakeHosts.ts implements certain interfaces, but does so with classes named the same as the interface. Maybe just do FakeSomeInterface?

Other stuff

We should be able to hide the big refactor when it's merged using a .git-blame-ignore file. See:
- https://github.com/microsoft/vscode/blob/main/.git-blame-ignore
- https://docs.github.com/en/repositories/working-with-files/using-files/viewing-a-file#ignore-commits-in-the-blame-view

May 31 '22 23:05 jakebailey

My version of the typeformer's "inlineImports" pass is very aggressive, importing everything from their original declaration if possible. This is closest to how we would write the code by hand (if we had done so from the start), but the circularity of the codebase has broken things. e.g., debug.ts imports semver.ts, but the Semver class creates a static property zero, which calls the Semver constructor, which then contains Debug.assert calls. This only happened because the top of debug.ts contains import { Semver } from './semver, whereas before, the code didn't do this but instead referred to the namespace, so never actually caused that file to be evaluated. It's totally possible that we'd hit this circularity issue today, except that most code is not at the top level and avoids the problem entirely, or has worked "magically" thanks to the order we happen to merge namespaces in. We could just import these from the namespaces (like https://github.com/microsoft/TypeScript/pull/46567 does), which I think would fix this, but this is really unnatural code structure and auto-imports will not function properly.

I'd fixed this by hand a few times in my version - I usually swap deep imports for barrel file imports and reorder the barrel imports to match our tsconfig order, since that represents the canonical execution order we desire. Sometimes some barrel imports need to be replaced with deep imports to break circularities, but it wasn't too many, iirc.

This:

e.g., debug.ts imports semver.ts, but the Semver class creates a static property zero, which calls the Semver constructor, which then contains Debug.assert calls.

? I don't see a Semver reference in debug.ts locally, and semver comes after debug in our tsconfig, so semver's never available during debug's setup (not that it seems to need to be?).

or has worked "magically" thanks to the order we happen to merge namespaces in.

Nothing too magic about it. The tsconfig order is curated to be a correct execution order - so long as barrel files reproduce the same order (either automatically or by hand), we should be equivalent, at least.

The System interface (aka ts.sys) is set (ts.sys = ...) in order to support the browser, but in the module version, modules do import { sys } from '...'), which makes it very difficult to do like this. What can do do here?

We have a setSys function in sys already as part of my work years ago - we should swap external stuff over to using that if it isn't already.

Additionally, the System interface has getExecutingFilePath (among other functions), which only makes sense for TS packaged as a self contained file.

It still makes sense for a tree of files, imo - intuitively I feel like it should be the entrypoint file, or whatever was initially executed by node. (be it tsc.js or tsserver.js or bin/whatever)

Jun 01 '22 20:06 weswigham

? I don't see a Semver reference in debug.ts locally, and semver comes after debug in our tsconfig, so semver's never available during debug's setup (not that it seems to need to be?).

I meant to say Version, which is defined in semver.ts. It's referenced many times. (I was confusing it with the Semver package everyone else uses.)

I'd fixed this by hand a few times in my version - I usually swap deep imports for barrel file imports and reorder the barrel imports to match our tsconfig order, since that represents the canonical execution order we desire. Sometimes some barrel imports need to be replaced with deep imports to break circularities, but it wasn't too many, iirc.

Nothing too magic about it. The tsconfig order is curated to be a correct execution order - so long as barrel files reproduce the same order (either automatically or by hand), we should be equivalent, at least.

My overall goal was to end up with a codebase that felt like one that had been written with modules; this means eliminating these barrel imports from files. Otherwise, auto-imports will not function properly because auto-imports will prefer importing directly from the declaring modules. It also means not requiring that a given file import each file in a specific order either, given everyone (including our own import organizer) will not preserve this order.

Nothing too magic about it. The tsconfig order is curated to be a correct execution order - so long as barrel files reproduce the same order (either automatically or by hand), we should be equivalent, at least.

This is going to be rough to maintain in the long term, because I don't think the average person knows that this ordering has any impact. I sure didn't; I've never used a project that listed its files explicitly, only globs. I would hope that we can stop having so few files just to not have to add more, and instead break the code apart a bit. Especially if perf improves because we're not running every file in a closure where each access to the old namespace was actually fully-qualified under the hood.

We have a setSys function in sys already as part of my work years ago - we should swap external stuff over to using that if it isn't already.

My understanding had been that if one has a import { sys } from "..." that you're stuck with whatever value existed at import time, but searching it now, I guess export let ... is some sort of magical "live" view, which does fix this. That's good.

It still makes sense for a tree of files, imo - intuitively I feel like it should be the entrypoint file, or whatever was initially executed by node. (be it tsc.js or tsserver.js or bin/whatever)

Yep, this is more of a plumbing problem. I just wanted to note it so I could delete my random TODOs.

Jun 01 '22 20:06 jakebailey

This is going to be rough to maintain in the long term

I mean, it already is, it's one of the hidden costs if using namespaces. It just only comes up when someone finds they want to adjust the order (because, eg, they want to start using utilities somewhere new or something). Likewise, it'll only come up with modules when someone wants to import from a new module that's already in use elsewhere (and that module does static initialization), which, in practice, is pretty rare.

Jun 01 '22 20:06 weswigham

this means eliminating these barrel imports from files

So this is a style argument, really. Imo, you're supposed to import from the barrel files for everything other than your immediately adjacent files (and we've talked about our autoimport behavior re: this before, and the only way we'll get better at it is honestly using it ourselves and seeing the upsides and downsides of each approach). Still, our first concern should probably be retaining a functional execution order (T.T) and not a specific style - that can come later.

Jun 01 '22 20:06 weswigham

My next step is to revert my changes that made imports really aggressive in favor of importing from the barrels, just to confirm that things run again (and to double check the state of auto-imports), so that should hopefully get me back to a working codebase.

I was really hopeful that we could write things like everyone else, but it's a shame about the execution ordering.

Jun 01 '22 20:06 jakebailey

2022-08-29

I took a few months off of this to work on other issues, but have recently picked this back up again with the intent to make this breaking packaging change for 5.0 (it seems like a good idea to not do this on a minor release).

Last week, I hit a major milestone and got the test suite to run; all tests pass except the API tests (which don't pass because I haven't started on fixing the d.ts files yet).

The entire build is just generateLibs and tsc -b ./src, which is a great simplification from our current state. I expect this to only expand for things like api-extractor and potentially a bundler if we still need to support a single-file for some outputs. The build time is roughly the same as main for a clean build of every project, but a good deal faster than main when making an incremental change to checker.ts.
Getting tests running required a reordering of things in the namespaces, as well as moving some helper code around such that tests do not import other tests.
The tests only work via the serial test runner, i.e. what's built into mocha. Our own custom parallel runner does not work. I am investigating this, however, mocha has a built-in parallel test runner. Our own was built before it didn't, so I'm going to see if we can drop our own special runner and just stick to upstream.
I had to bring the module transform back to importing things from the faked namespaces (as noted in my previous comment); this is not my preference, but our code is currently too circular / execution order dependent to use direct imports.
- It's likely that we can just start with these namespace imports first, then fix the circularities over time. There's a tool that the VS Code team used to analyze their codebase for these sorts of issues, and would likely work on our own once it uses imports.
getExecutingFilePath is still replaced with a dummy implementation; reasonably I expect this API to change to something more like getTypeScriptLibRoot or similar such that the path to our lib folder can be provided rather than being deduced relative to tsc.js or something.

Aug 29 '22 17:08 jakebailey

I posted the above early in the day; I did get the parallel test suite working. Mocha's parallel runner unfortunately can't be used with our repo in its current state as it expects individual test files, which it then runs in parallel (never any actual tests within a file in parallel).

We do our own thing and end up with effectively one file. Potentially, we can switch to something "more normal", somehow, but it's no longer a blocker. If we were using something like jest or ava, they could potentially run tests within a file in parallel, but that's a much larger change.

Aug 29 '22 23:08 jakebailey

2022-08-31

I have the parallel test runner working; it came down to the fact that the entrypoint I was using for the test runner was the namespace file, which broke because runner.ts exported a mutable variable to another file, but runner.ts hadn't yet "returned" to the namespace object's require('./runner.ts') to add the binding to its own exports, so it appeared to be undefined. This wouldn't happen if we used direct imports, but making the entrypoint be runner.js was enough.

While working on this, I prototyped a Jest runner that could run our compiler tests instead of the relatively large bit of code we wrote to make Mocha work; I think that's a good idea for the future. I only tried it because I was frustrated trying to get our own parallel runner to work (so was looking for any alternative), but I'll defer that to another day now that I have all tests running besides the API tests.

Next up is api-extractor, and getting some proper entrypoints for things like tsc.js, tsserver.js, typescript.js, etc, which we should now be able to build unconditionally in build mode as they will be small.

Sep 01 '22 00:09 jakebailey

2022-09-02

I have the entrypoints set up and working for the libraries/executables, and a basic api-extractor config for one of them. Unfortunately, things seem to fail on that front. I've reported https://github.com/microsoft/rushstack/issues/3612. But, I've noticed that our d.ts output is a little suboptimal; the case that we are currently breaking on is an import that shouldn't even be emitted as it can be safely imported. I'm sure we've gotten someone reporting that issue before.

We'll have to eventually call api-extractor programatically; one transformation we will need to do again is to convert our const enum to enum; I doubt api-extractor does this for us.

Sep 02 '22 23:09 jakebailey

But, I've noticed that our d.ts output is a little suboptimal; the case that we are currently breaking on is an import that shouldn't even be emitted as it can be safely imported. I'm sure we've gotten someone reporting that issue before.

This turned out to be part of the bug; hand-fixing these in the d.ts file allowed api-extractor to run to completion, although the output did not seem to exclude @internal declarations. I'll have to come up with some sort of reproducer for this.

Sep 06 '22 20:09 jakebailey

2022-09-12

The aforementioned import() thing has to do with #44044 and #49730; the cases that crash api-extractor are ones where we don't emit an import statement because the original file didn't have one, so emit an import expression instead. api-extractor shouldn't be crashing for these cases, but I'm going to try and explicitly annotate the offending pieces of code and see how far I get.

Sep 12 '22 23:09 jakebailey

2022-09-14

I've gotten things far along enough to run perf benchmarks, and I'm overwealmingly happy to share the results.

Emitting CJS, 1:1 output to source, we see a performance loss of anywhere from 15 to 55 percent.
Emitting ESM, 1:1 output to source, we see a performance gain of 3-9%. 🎉

This difference is very likely due to the fact that we are going through our export helpers, which are like our old namespace accesses, but extra slow. Compare that to ESM, which is faster as it doesn't use those import helpers, but since we import things like helper functions (isIdentifier, createParser, etc), we see a performance boost as we are no longer accessing via an object for each invocation, instead using a local binding created by the import statement.

Now, for esbuild:

Running esbuild on our CJS output, we see about the same performance loss as individual CJS files (maybe slightly worse). Not surprising, our import helpers are still present, on top of esbuild's.
Running esbuild directly on our source files (no tsc emit), we see an incredible 7-25% performance boost! :tada: :tada: Looking at the output file, esbuild was able to lift all modules to the top level, meaning all accesses to functions in other modules are effectively local; no overhead. What's even more incredible is that this is without lowering const/let to var, something that esbuild currently does not support. (Could we go even faster?)
Running esbuild on our ESM output, we see about a 6-13% performance boost. This is somewhat surprising, as this is a case where we were using var instead of let/const, so I would have expected it to be even faster than the previous case.

Previously, we had intended to try and ship as individual files, a breaking change for 5.0. Since we're not likely to switch away from CJS, the performance penalty there is pretty unfortunate. But, if we just bundle our output, we would be producing a package that's still compatible with the previous one, and turns out to be 7-25% faster, and has an extremely fast build that we can make use of during development.

At this point, most things are working; I have some remaining tasks:

api-extractor; the bug I'm hitting needs to be minimized and reported. Declarations that are marked as @internal aren't stripped from the namespace objects produced for our namespace-barrel modules. Likely an oversight, as export * support is new.
Random loose end endpoints need to be handled, like the typings installer, cancellation token, etc.
dynamicImportCompat needs to be checked (and maybe rethought).
buildProtocol needs to be rethought (do we need to keep providing protocol.d.ts?)
createPlaygroundBuild needs to be checked to make sure it works; this was tricker when I was planning to output individual modules, but probably no longer a problem if we are bundling.
The build needs to be massively cleaned up. Things are a lot simpler, but I haven't been deleting anything from the gulpfile quite yet.
Whatever else comes up.

Sep 14 '22 23:09 jakebailey

2022-09-19

On #50811, I tested what it'd be like to downlevel let/const to var on esbuild's output. Since it doesn't support that itself, I just had our build run @babel/plugin-transform-block-scoping on the output.

The additional performance boost was as expected, roughly matching the performance difference I noted when trying to switch our target up past ES5 on #50022, or about 3%. Most of that difference comes from improving parsing performance by 5-9%, which indicates that it's doing a lot in the TDZ. The other parts of the compiler don't improve as greatly, but do generally get a few percent faster.

Running babel on the output of esbuild works, but is definitely a lot slower than esbuild itself. I don't think we need to downlevel everything to ES5 (https://github.com/evanw/esbuild/issues/297), so just --supported=let-and-const:false support in esbuild would be very helpful.

Benchmark results, bundled before and after let/const downleveling:

Metric	50811	50811	Delta	Best	Worst
Angular - node (v14.15.1, x64)
Memory used	328,472k ^±0.01%	328,287k ^±0.00%	-185k ^-0.06%	328,244k	328,312k
Parse Time	2.06s ^±0.52%	1.93s ^±0.41%	-0.13s ^-6.36%	1.91s	1.95s
Bind Time	0.70s ^±0.57%	0.69s ^±0.71%	-0.01s ^-0.72%	0.69s	0.71s
Check Time	5.35s ^±0.39%	5.28s ^±0.39%	-0.07s ^-1.31%	5.25s	5.33s
Emit Time	5.20s ^±0.82%	5.15s ^±0.56%	-0.06s ^-1.08%	5.07s	5.21s
Total Time	13.31s ^±0.37%	13.05s ^±0.26%	-0.25s ^-1.92%	12.98s	13.15s
Compiler-Unions - node (v14.15.1, x64)
Memory used	182,299k ^±0.44%	182,374k ^±0.39%	+75k ^+0.04%	181,976k	185,246k
Parse Time	0.89s ^±0.65%	0.81s ^±0.37%	-0.08s ^-8.91%	0.80s	0.81s
Bind Time	0.46s ^±0.87%	0.45s ^±0.50%	-0.01s ^-2.18%	0.44s	0.45s
Check Time	6.22s ^±0.67%	6.16s ^±0.50%	-0.07s ^-1.06%	6.10s	6.22s
Emit Time	1.99s ^±0.98%	1.96s ^±0.76%	-0.03s ^-1.31%	1.94s	2.01s
Total Time	9.56s ^±0.60%	9.38s ^±0.32%	-0.18s ^-1.87%	9.30s	9.43s
Monaco - node (v14.15.1, x64)
Memory used	315,704k ^±0.01%	315,437k ^±0.01%	-267k ^-0.08%	315,401k	315,488k
Parse Time	1.56s ^±0.48%	1.47s ^±0.35%	-0.09s ^-5.52%	1.46s	1.48s
Bind Time	0.63s ^±0.94%	0.62s ^±0.55%	-0.01s ^-1.74%	0.62s	0.63s
Check Time	5.14s ^±0.44%	5.12s ^±0.50%	-0.02s ^-0.29%	5.06s	5.18s
Emit Time	2.79s ^±0.71%	2.74s ^±0.42%	-0.05s ^-1.72%	2.71s	2.77s
Total Time	10.12s ^±0.29%	9.96s ^±0.32%	-0.16s ^-1.60%	9.89s	10.04s
TFS - node (v14.15.1, x64)
Memory used	280,802k ^±0.01%	278,434k ^±0.01%	-2,367k ^-0.84%	278,374k	278,495k
Parse Time	1.32s ^±0.76%	1.26s ^±1.47%	-0.06s ^-4.91%	1.23s	1.31s
Bind Time	0.58s ^±0.63%	0.59s ^±0.62%	+0.00s ^+0.17%	0.58s	0.59s
Check Time	5.03s ^±0.43%	4.97s ^±0.46%	-0.05s ^-1.07%	4.92s	5.02s
Emit Time	2.95s ^±0.56%	2.71s ^±0.54%	-0.24s ^-8.07%	2.69s	2.75s
Total Time	9.89s ^±0.28%	9.53s ^±0.30%	-0.36s ^-3.60%	9.49s	9.60s
material-ui - node (v14.15.1, x64)
Memory used	427,373k ^±0.00%	427,047k ^±0.00%	-326k ^-0.08%	427,006k	427,094k
Parse Time	1.87s ^±0.46%	1.77s ^±0.42%	-0.10s ^-5.25%	1.76s	1.79s
Bind Time	0.53s ^±0.65%	0.53s ^±0.71%	-0.00s ^-0.38%	0.52s	0.53s
Check Time	12.00s ^±0.48%	11.91s ^±0.35%	-0.09s ^-0.74%	11.81s	11.99s
Emit Time	0.00s ^±0.00%	0.00s ^±0.00%	0.00s ^NaN%	0.00s	0.00s
Total Time	14.40s ^±0.40%	14.21s ^±0.32%	-0.19s ^-1.30%	14.10s	14.31s
xstate - node (v14.15.1, x64)
Memory used	517,236k ^±0.01%	516,843k ^±0.00%	-394k ^-0.08%	516,784k	516,900k
Parse Time	2.68s ^±0.57%	2.53s ^±0.67%	-0.15s ^-5.63%	2.50s	2.57s
Bind Time	0.85s ^±0.86%	0.84s ^±0.68%	-0.01s ^-0.94%	0.83s	0.85s
Check Time	1.47s ^±0.50%	1.46s ^±0.52%	-0.00s ^-0.27%	1.45s	1.48s
Emit Time	0.07s ^±4.13%	0.07s ^±3.23%	-0.00s ^-4.17%	0.06s	0.07s
Total Time	5.08s ^±0.37%	4.91s ^±0.51%	-0.17s ^-3.39%	4.87s	4.98s

System

Machine Name	ts-ci-ubuntu
Platform	linux 4.4.0-210-generic
Architecture	x64
Available Memory	16 GB
CPUs	4 × Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz

Hosts

node (v14.15.1, x64)

Scenarios

Angular - node (v14.15.1, x64)
Compiler-Unions - node (v14.15.1, x64)
Monaco - node (v14.15.1, x64)
TFS - node (v14.15.1, x64)
material-ui - node (v14.15.1, x64)
xstate - node (v14.15.1, x64)

Summary

Benchmark	Name	Iterations
Current	50811	10
Baseline	50811	10

Also of note, I'm going to have to change our bundled output format to an IIFE; typescript uses a single file for both node and the browser, conditionally using module.exports. This isn't something that esbuild can do (without eval or new Function), but it looks like I can just use an IIFE with a footer and tack on some extra code at the bottom to set things up (per https://github.com/evanw/esbuild/issues/507#issuecomment-727690399; TS doesn't need full UMD support, but it's close). This also gives me the opportunity to eliminate dynamicImportCompat, I think, if I can just inject (id) => import(id) directly into the output.

Sep 19 '22 18:09 jakebailey

2022-09-29

At this point, everything is mostly done. I ended up writing my own small dts bundler tool that can cover our codebase specifically; we have some unique requirements, namely that we define our own Node, Symbol, Map, etc. types, which most rollup tools try and rename. The limited bundler produces namespaces that look pretty similar to our current d.ts files and makes use of the fact that we don't have any name duplication through our exports, something that is effectively already required by the fact that we do export * and that errors if you don't resolve the conflict.

At this point, the only remaining tasks are:

Backporting changes so the stack isn't so big, including build changes and other removals, such as...
Changes related to removing typescriptServices.js (#50758); this isn't strictly required but simplifies my build a little bit.
Fixing up createPlaygroundBuild, or remove it entirely.
Get a plan in place to correctly ignore the final commit in git blame using .git-blame-ignore; this will hopefully make this change hidden from GitHub blames and potentially GitLens or other tools.
Testing the developer experience when it comes to auto-importing from the new namespace barrel modules; I might have to move them to another location so they are chosen as the "right" place to import helpers from.

Sep 29 '22 20:09 jakebailey

2022-10-11

Down to basically odds and ends.

CodeQL appears to get stuck (https://github.com/microsoft/TypeScript/actions/runs/3229429756/jobs/5286722214); probably something to do with how circular the codebase is. I plan to try and run the same "layer violation" software that the VS Code team used to fixup their codebase, after the fact. It'd be nice to switch to direct, non-cyclical imports.
I have sent PRs for monaco to fix things up in prep for 5.0: https://github.com/microsoft/monaco-editor/pull/3344, https://github.com/microsoft/monaco-editor/pull/3356, which are already applied in https://github.com/microsoft/TypeScript-Make-Monaco-Builds and running for all playground builds. These changes allow us to remove typescriptServices.js, as well as no longer worry about our code changes accidentally breaking the source-level hacks applied in monaco.
- We have a copy of their import script in our own repo as createPlaygroundBuild, but it turns out that monaco's build has changed enough to the point that this script no longer works; monaco needs tsWorker.js, a file that only their build can produce as it includes code from their codebase along with ours.
I have automated .git-blame-ignore-revs to ignore the "conversion step" commits that do the bulk of the work and make all of the noise. When we merge the module branch into main, we'll have to use a merge commit rather than a squash to make this work. Ignoring a squash commit itself will cause git blames to look very odd as there is a lot of new code that I do want git blame to work on.
Auto-imports appear to work.

We should be good to dogfood this.

Also, since this effectively requires a whole new build, I've taken this opportunity to make a number of major build changes to speed up our build performance, remove dependencies, replace dependencies, and so on; changes that are hard to make during our normal dev cycle without causing churn. The module conversion itself is a huge speedbump to anyone bisecting around as it is.

Oct 11 '22 23:10 jakebailey

2022-10-19

Things are looking good, still at odds and ends. The dogfooding has netted some useful feedback, some issues I've fixed like tsserver needing to be a UMD-module for web, and so on.

The big news this week is that I've been able to shave an additional 5.7 MB off our package (TS 4.8 is 63 MB, post-modules we will be at 35.6 MB, ~43% smaller). This comes from a seemingly unrelated change; we have a compiler-debug.js that's built locally which adds in some helpers to format the code flow graph in debugging. This was not included into the compiler as there's no real need to ship this to end users. However, the way that it worked was to require it, then call an init function with the ts namespace. By moving that code directly into our Debug namespace (~300 lines), we no longer need to pass in the ts namespace, which enables bundlers to actually tree shake it. Since the Debug namespace is used in every bundle, this means that both tsc and typingsInstaller shrunk, giving that 5.7 MB number. Notably, this doesn't affect typescript.js, tsserver.js, or tsserverlibrary.js, as those support the plugin API, which has to pass in the ts namespace just like the debug helper did. So, those ones are effectively unable to tree shake, which is unfortunate. But in any case, it's a win.
I am no longer planning on handling let/const in any way; I considered sending a PR to esbuild that does the "easy" examples, but I believe this is actually a bad idea because in the future we'll likely be emitting ESM and we don't want to emit something like export var such that consumers only see mutable bindings.
All of the changes to monaco have been merged, so our playground will continue to work as expected, and in a way that lets us do things like minification in the future without trouble.
All of our pipelines are updated in prep for the new task runner.

After some design meeting bikeshedding:

We're going to target ES2018; I picked this as it was the ES version we list for Node 10, and it doesn't seem like we'd gain much from newer syntax besides nullish colalescing, which only appeared in Node 14. (I'd rather not leave Node 12 people in the dust quite yet.)
- Related to this, we are also all okay with dropping ts.Map, ts.Set, etc; with our target no longer being ES5, we can just start using the real types. In a previous version of TS, I deleted our shims entirely, and nobody has been bitten yet, so this seems safe plus/minus the API break.
We're not going to do anything automated to make imports sorted / have a max line limit. All of the different things we think we could do here are undesirable in various ways; we can't make each import its own line (the files would be huge), we can't use a formatter to reflow them as that would cause major merge conflicts, and so on. This is easy to address in the future, and the reality is that it's rare to add a new import to our worst files like the checker.

Remaining work:

CodeQL analysis of our repo times out. Still figuring out who to talk to there.
- https://github.com/github/codeql/issues/10937
The debugging experience in VS Code is a little slow; need to talk to the VS Code debugger people about that one.
- https://github.com/microsoft/vscode-js-debug/issues/1433
My stack needs to be cleaned up again; I've made lots of small changes and since this PR will be merged (not squashed), it should be cleaner.

Oct 19 '22 22:10 jakebailey

2022-10-25

Probably the last update before we make the switch in 5.0.

First off, more performance wins; tsc.js is now 30% faster to start. Measured using hyperfine:

Benchmark 1: tsc main
  Time (mean ± σ):     115.3 ms ±   1.1 ms    [User: 103.3 ms, System: 11.5 ms]
  Range (min … max):   112.5 ms … 123.9 ms    1000 runs

Benchmark 2: tsc module-ified
  Time (mean ± σ):      89.2 ms ±   1.1 ms    [User: 78.3 ms, System: 10.1 ms]
  Range (min … max):    86.8 ms … 101.4 ms    1000 runs

Summary
  'tsc module-ified' ran
    1.29 ± 0.02 times faster than 'tsc main'

Expand me for tsserver.js, typescript.js, tsserverlibrary.js

Benchmark 1: tsserver main
  Time (mean ± σ):     161.0 ms ±   1.6 ms    [User: 147.1 ms, System: 13.7 ms]
  Range (min … max):   157.4 ms … 170.8 ms    1000 runs

Benchmark 2: tsserver module-ified
  Time (mean ± σ):     151.9 ms ±   1.4 ms    [User: 137.6 ms, System: 15.0 ms]
  Range (min … max):   148.4 ms … 161.9 ms    1000 runs

Summary
  'tsserver module-ified' ran
    1.06 ± 0.01 times faster than 'tsserver main'

Benchmark 1: typescript main
  Time (mean ± σ):     146.3 ms ±   1.7 ms    [User: 132.4 ms, System: 14.2 ms]
  Range (min … max):   142.9 ms … 165.6 ms    1000 runs

Benchmark 2: typescript module-ified
  Time (mean ± σ):     132.6 ms ±   1.5 ms    [User: 120.0 ms, System: 13.4 ms]
  Range (min … max):   129.6 ms … 147.3 ms    1000 runs

Summary
  'typescript module-ified' ran
    1.10 ± 0.02 times faster than 'typescript main'

Benchmark 1: tsserverlibrary main
  Time (mean ± σ):     153.0 ms ±   1.3 ms    [User: 139.2 ms, System: 14.1 ms]
  Range (min … max):   149.7 ms … 161.4 ms    1000 runs

Benchmark 2: tsserverlibrary module-ified
  Time (mean ± σ):     143.1 ms ±   1.5 ms    [User: 127.7 ms, System: 16.0 ms]
  Range (min … max):   139.5 ms … 158.4 ms    1000 runs

Summary
  'tsserverlibrary module-ified' ran
    1.07 ± 0.01 times faster than 'tsserverlibrary main'

The script for this is located here: https://gist.github.com/jakebailey/362aedfb19bff6086585509e223b7a64

I have filed bugs for the problems cross-team, linked in the previous post.

These are two other bugs in TypeScript that I've experienced during the development of the conversion:

https://github.com/microsoft/TypeScript/issues/51301
https://github.com/microsoft/TypeScript/issues/35004

The former is an annoyance; the latter is a half-blocker towards converting the Debug namespace into a module (which will allow bundlers to directly call our debug helpers).

I also have a big writeup which will go into the final PR's description + be converted into a blog post.

See you all then.

Oct 25 '22 21:10 jakebailey

still have some require calls. any plan to move it to ts.sys?

Oct 28 '22 06:10 Jack-Works

I prototyped a Jest runner that could run our compiler tests

please consider jest's snapshot test! manually update some test is pain

Oct 28 '22 07:10 Jack-Works

still have some require calls. any plan to move it to ts.sys?

I'm not sure what you're asking; the output format is still something like UMD so the same files serve Node and the browser, so those can't really go away. I don't have a good long term plan for eliminating them given our synchronous requirements.

please consider jest's snapshot test! manually update some test is pain

This turned out to be a failure; custom runners are not in a good place, and we'd have to completely overhaul how we do tests. Maybe one day, but right now things are still fine. Post modules, my build is a little faster at accepting baselines.

Oct 28 '22 07:10 jakebailey

we don't want to emit something like export var such that consumers only see mutable bindings.

they're immutable even if they're let/var anyway

Oct 28 '22 07:10 Jack-Works

The PR has been posted! #51387

Nov 02 '22 17:11 jakebailey

The modules PR went in earlier today. Here's a set of follow-up issues and PRs that I am already working on which build on the modules change:

#51438
#51439
#51440
#51441
#51442
#51443
#51444

Nov 08 '22 05:11 jakebailey

This wiki page is still references `protocol.d.ts": https://github.com/Microsoft/TypeScript/wiki/Standalone-Server-%28tsserver%29

Should it just point to here then? https://github.com/microsoft/TypeScript/blob/main/src/server/protocol.ts

Dec 03 '22 16:12 kungfooman

It should probably just say "check the ts.server.protocol" namespace, reflecting the suggestion in my PR. Thanks for finding that.

Dec 05 '22 21:12 jakebailey

Hello @jakebailey

The TDZ performance finding quoted here (the 5% — 9% parser win) is being quoted as one of the reasons to perform TDZ optimisation in ESBuild. Meaning ESBuild is replacing const and let with var to improve performance with no way to opt-out.

We all want faster code and so this optimization may well be justified but I'd like to dig in a bit more before concluding. Because excess downlevelling hampers debugging, as you have previously recognized. And when we abandon using modern features in production via tooling that downlevels, it reduces the incentive for engines to optimize those features - a negative feedback loop.

Specifically I'd like to understand if there is a missing V8 optimisation here. So two questions:

Have you replicated this performance delta on any other engine?
Do you have an isolated/minimal repro that could be provided to V8 team as part of a performance bug report?

Feb 06 '23 09:02 robpalme

Meaning ESBuild is replacing const and let with var to improve performance with no way to opt-out.

Having read that linked thread, I honestly don't know how esbuild could stop using var without some major performance changes in v8 and such; the var + lazy init function pattern is explicitly one where if let were used, the TDZ checks would affect basically all uses.

because excess downlevelling hampers debugging, as you have previously recognized.

This is true, but in this particular case, switching to var does not really hurt debugging so long as the vars that are switched don't generate any other additional code such as the extra closures needed for for loop block scoping. But, it did catch at least one bug.

Have you replicated this performance delta on any other engine?

The benchmark suite that tests TypeScript largely runs tsc, and tsc only runs in node, so I have not tested anything other than v8, no. Perhaps recent Bun changes mean I can do a drop-in test and see what happens. (I'm pretty sure Deno is v8 so I wouldn't test that.)

Do you have an isolated/minimal repro that could be provided to V8 team as part of a performance bug report?

I don't; what I really need is some sort of tool that acts as a "TDZ blame" and can tell me where all of my TDZ time is being spent, but no such tool exists. The parser is very large and besides simply running our code through a transform which changes all const/let to var, it's hard for me to know what I can do to actually find the problem than to sit down and just replace a bunch and see what happens.

Feb 06 '23 18:02 jakebailey

it's hard for me to know what I can do to actually find the problem than to sit down and just replace a bunch and see what happens.

That being said, I could likely do a differential profile between the two and look there.

Feb 06 '23 18:02 jakebailey

@robpalme Just to show the current state, https://github.com/microsoft/TypeScript/pull/52656#issuecomment-1421249664 is what it'd be like if we downleveled all let/const to var; the performance difference is really stunning.

I still need to go try this on another engine, but the delta shows that there really is a lot of perf sitting on the table.

Feb 07 '23 20:02 jakebailey