rules_ts
Worker mode should have better strategies to GC workers which are consuming a lot of memory and options for configuring the number of possible workers.
tsc --watch is pretty memory-heavy in general: a project with just 6 files and 5-6 dependencies in package.json ends up taking ~300 MB. Our company's projects are pretty massive, ~300 files in some of them! I have seen tsc workers taking around 22 GB of memory on our projects, and sometimes they get OOMed. I tracked down the problem to this function in ts_project_worker.js:
function isNearOomZone() {
    const stat = v8.getHeapStatistics()
    // Percentage of the v8 heap limit currently in use.
    const used = (100 / stat.heap_size_limit) * stat.used_heap_size
    return 100 - used < NEAR_OOM_ZONE
}
This check only looks at v8's used heap memory; there is no check on the memory used by the entire system. I changed the function as follows to get around the OOMs:
function isNearOomZone() {
    const stat = v8.getHeapStatistics()
    // Percentage of the v8 heap limit currently in use.
    const used = (100 / stat.heap_size_limit) * stat.used_heap_size
    // Also consider free memory on the whole system (needs `os` from 'os').
    const availableMem = os.freemem()
    const totalMem = os.totalmem()
    return (
        100 - used < NEAR_OOM_ZONE ||
        (100 * availableMem) / totalMem < NEAR_OOM_ZONE
    )
}
and implemented the following GC strategy as a workaround (GC the worker whose program has the fewest source files):
function sweepLeastRecentlyUsedWorkers() {
    while (workers.size > 0 && isNearOomZone()) {
        // Find the program with the fewest source files and kill it.
        const m = Array.from(workers.entries()).map(([k, v]) => {
            return [k, v.program.getCurrentProgram().getSourceFiles().length]
        })
        m.sort((a, b) => a[1] - b[1])
        const [killKey] = m[0]
        // Needs `fs` from 'fs'; newline keeps log entries on separate lines.
        fs.appendFileSync(
            'WorkerLog.txt',
            `Killing worker with key: ${killKey} at time ${Date.now()}\n`
        )
        workers.get(killKey).program.close()
        workers.delete(killKey)
    }
}
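Despite its name, the function above evicts the worker with the fewest source files rather than the least recently used one. A true LRU sweep is a small variation; the sketch below is hypothetical and assumes each worker entry records a `lastUsed` timestamp whenever it serves a request, which is not something ts_project_worker.js does today:

```javascript
// Hypothetical LRU eviction: assumes each worker entry carries a
// `lastUsed` timestamp that is updated on every request it serves.
function sweepByLeastRecentlyUsed(workers, isNearOomZone) {
    const evicted = []
    while (workers.size > 0 && isNearOomZone()) {
        // Pick the worker that has been idle the longest.
        const entries = Array.from(workers.entries())
        entries.sort((a, b) => a[1].lastUsed - b[1].lastUsed)
        const [killKey, worker] = entries[0]
        if (worker.program && worker.program.close) worker.program.close()
        workers.delete(killKey)
        evicted.push(killKey)
    }
    return evicted
}
```

A trade-off worth noting: evicting by source-file count frees the most memory per kill, while LRU avoids repeatedly killing the large program that the developer is actively rebuilding.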
There's been a lot of discussion about memory control for workers; Bazel lacks resource control and estimation when it comes to workers.
See: https://github.com/bazelbuild/bazel/issues/10662 and https://github.com/bazelbuild/bazel/issues/12165.
The principled fix here is letting Bazel control system-wide resource management. OTOH, I'd be willing to take a peek at why it takes 300 MB of RAM for each target; 300 MB seems excessive. Maybe I could come up with additional optimizations, like sharing ASTs across targets to reduce duplicate ASTs.
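Sharing ASTs could work along the lines of TypeScript's document-registry pattern: parse each (fileName, version) pair once and hand the same immutable SourceFile to every program that asks for it. The sketch below is a hypothetical plain-JS illustration of that caching idea (names like `createSharedAstCache` and `acquire` are made up for this example, not the TypeScript API):

```javascript
// Hypothetical shared-AST cache: one parse per (fileName, version),
// reused by every target/program that needs the file.
function createSharedAstCache(parse) {
    const cache = new Map()
    let parses = 0
    return {
        // Return a cached AST when the version matches; re-parse otherwise.
        acquire(fileName, version) {
            const key = `${fileName}@${version}`
            if (!cache.has(key)) {
                parses++
                cache.set(key, parse(fileName))
            }
            return cache.get(key)
        },
        stats: () => ({ entries: cache.size, parses }),
    }
}
```

With something like this, ten ts_project targets that all depend on the same library would hold one AST for it instead of ten, at the cost of needing a cache-invalidation story when files change.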
You can do this simply by adding args to the target:
args = [
    "--generateTrace",
    "/tmp/traces",
]
Hey, there's definitely scope for optimization around sharing watchers and ASTs. I guess it requires a deeper dive into the TS compiler's code. I saw tsc --watch (the CLI command) consuming around 300 MB on a project with 4-5 TS files and 6-7 deps in package.json.
Regarding "sharing watchers": with the virtual fs implementation present, watchers are barely a problem. As for ASTs: the problem mostly arises from having to store hundreds of them. Now that I think about it, you might even be suffering from memory leaks if you have multiple ts_project targets in the same BUILD file.
I am warm to the idea of having different GC algorithms, but I am not entirely sure that would work for everyone.
Due to bugs like this one, we are moving away from supporting the Persistent Worker in the next major 2.0 release of rules_ts, and likely will never fix this, sorry!