OOM on large monorepo project

Open avatarneil opened this issue 3 years ago • 9 comments

Search terms

OOM, Out Of Memory, Heap Allocation, Crash, Performance

Expected Behavior

I expected pnpm typedoc --packages 'packages/*' to run without crashing!

Actual Behavior

pnpm typedoc --packages 'packages/*' crashes when running on >3 packages within our monorepo. It's worth noting, however, that TypeDoc executes perfectly fine when attempting to generate docs for only a handful of packages within the monorepo (e.g. pnpm typedoc --packages 'packages/mds-types' --packages 'packages/mds-agency').

Stack Trace

[45163:0x118008000] 52690 ms: Mark-sweep 4047.0 (4134.4) -> 4036.0 (4139.4) MB, 1383.4 / 0.0 ms (average mu = 0.423, current mu = 0.122) allocation failure scavenge might not succeed
[45163:0x118008000] 54171 ms: Mark-sweep 4052.5 (4140.0) -> 4040.8 (4144.0) MB, 1434.9 / 0.0 ms (average mu = 0.269, current mu = 0.032) allocation failure scavenge might not succeed

<--- JS stacktrace --->

FATAL ERROR: MarkCompactCollector: young object promotion failed Allocation failed - JavaScript heap out of memory
 1: 0x101390285 node::Abort() (.cold.1) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
 2: 0x1000c5e59 node::Abort() [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
 3: 0x1000c5fbf node::OnFatalError(char const*, char const*) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
 4: 0x100242fd7 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
 5: 0x100242f73 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
 6: 0x1003fe015 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
 7: 0x10045b674 v8::internal::EvacuateNewSpaceVisitor::Visit(v8::internal::HeapObject, int) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
 8: 0x100442ceb void v8::internal::LiveObjectVisitor::VisitBlackObjectsNoFail<v8::internal::EvacuateNewSpaceVisitor, v8::internal::MajorNonAtomicMarkingState>(v8::internal::MemoryChunk*, v8::internal::MajorNonAtomicMarkingState*, v8::internal::EvacuateNewSpaceVisitor*, v8::internal::LiveObjectVisitor::IterationMode) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
 9: 0x100442816 v8::internal::FullEvacuator::RawEvacuatePage(v8::internal::MemoryChunk*, long*) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
10: 0x100442516 v8::internal::Evacuator::EvacuatePage(v8::internal::MemoryChunk*) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
11: 0x1004603ae v8::internal::PageEvacuationTask::RunInParallel(v8::internal::ItemParallelJob::Task::Runner) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
12: 0x100418102 v8::internal::ItemParallelJob::Task::RunInternal() [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
13: 0x100418588 v8::internal::ItemParallelJob::Run() [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
14: 0x100444625 void v8::internal::MarkCompactCollectorBase::CreateAndExecuteEvacuationTasks<v8::internal::FullEvacuator, v8::internal::MarkCompactCollector>(v8::internal::MarkCompactCollector*, v8::internal::ItemParallelJob*, v8::internal::MigrationObserver*, long) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
15: 0x1004441e7 v8::internal::MarkCompactCollector::EvacuatePagesInParallel() [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
16: 0x10042ef87 v8::internal::MarkCompactCollector::Evacuate() [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
17: 0x10042c84b v8::internal::MarkCompactCollector::CollectGarbage() [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
18: 0x1003fe734 v8::internal::Heap::MarkCompact() [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
19: 0x1003fb723 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
20: 0x1003f95cd v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
21: 0x10040722a v8::internal::Heap::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
22: 0x1004072b1 v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
23: 0x1003d41dd v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
24: 0x100759ebf v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
25: 0x100ad6939 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit [/Users/neil/.nvm/versions/node/v15.11.0/bin/node]
 ERROR  Command was killed with SIGABRT (Aborted): typedoc --packages packages/*

It seems like this could be pretty tricky to trace down, as there are no pointers in the stack trace to TypeDoc code (at least as far as I can tell) :/

I wonder if this could simply be due to over-parallelization, and if that's the case (perhaps this is a dumb suggestion), could there be an option added to generate documentation serially instead of in parallel to lighten the memory pressure?
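
The two GC lines in the trace already say a fair amount: the used heap sits at roughly 4 GB, and each full mark-sweep frees only a few MB. Below is a minimal sketch for pulling those numbers out of such a line. The `parseMarkSweep` helper and `GcSample` type are made up for illustration (they are not part of Node.js or TypeDoc), and the sketch assumes the first figure on each side of `->` is the used heap, with the committed size in parentheses.

```typescript
// Hypothetical helper: extract used-heap sizes (MB) from a V8 "Mark-sweep" trace line.
interface GcSample {
  usedBeforeMb: number;
  usedAfterMb: number;
  reclaimedMb: number;
}

function parseMarkSweep(line: string): GcSample | null {
  // Matches e.g. "Mark-sweep 4047.0 (4134.4) -> 4036.0 (4139.4) MB"
  const pattern =
    /Mark-sweep\s+([\d.]+)\s+\([\d.]+\)\s+->\s+([\d.]+)\s+\([\d.]+\)\s+MB/;
  const match = pattern.exec(line);
  if (match === null) return null;
  const usedBeforeMb = Number(match[1]);
  const usedAfterMb = Number(match[2]);
  return { usedBeforeMb, usedAfterMb, reclaimedMb: usedBeforeMb - usedAfterMb };
}

// The two lines from the trace above, shortened to the part that matters.
const gcLines = [
  "52690 ms: Mark-sweep 4047.0 (4134.4) -> 4036.0 (4139.4) MB, 1383.4 / 0.0 ms",
  "54171 ms: Mark-sweep 4052.5 (4140.0) -> 4040.8 (4144.0) MB, 1434.9 / 0.0 ms",
];

for (const line of gcLines) {
  const sample = parseMarkSweep(line);
  if (sample !== null) {
    console.log(
      `used ${sample.usedBeforeMb} MB, reclaimed ${sample.reclaimedMb.toFixed(1)} MB`,
    );
  }
}
```

With well over a second spent per collection and so little reclaimed, the process is probably holding on to live data rather than churning through garbage, which would fit every package's program being kept in memory at once.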

Steps to reproduce the bug

Apologies in advance for not being able to create a minimal-repro project for this; I believe this is an issue that arises due to the size of a project.

  • Clone mds-core
  • git checkout feature/neil/upgrade-typedoc-to-0.21.0
  • Ensure pnpm is installed (this is a pnpm based project, YMMV using npm/yarn)
  • pnpm clean && pnpm build (clears any local build artifacts that may exist, installs dependencies, builds JS artifacts)
  • pnpm typedoc --packages 'packages/*'

Environment

  • Typedoc version: 0.21.0
  • TypeScript version: 4.2.4
  • Node.js version: 15.11.0
  • OS: macOS 12.0b1

avatarneil avatar Jun 22 '21 15:06 avatarneil

Hmm... does typedoc crash before or after creating all the programs? (Does rendering json instead of HTML work?) Unfortunately typedoc relies on having program info throughout the rendering process, so there isn't currently a good way to switch it around to only create one ts.Program at a time. Fixing this would nearly enable generating docs from TypeDoc's json output, which I'd like to be able to someday anyways...

Gerrit0 avatar Jun 23 '21 03:06 Gerrit0

@Gerrit0 It seems like with the default memory allocated to node on my machine, emitting to JSON also fails; however, when I bump node's memory limit up to ~32GB, after about 10 minutes it almost succeeds! It seems like memory usage begins dropping, but prior to the emission of the JSON I am faced with a different (I imagine likely unrelated) error 😿 :

TypeDoc exiting with unexpected error:
TypeError: Cannot read property 'map' of undefined
    at Object.convertType (/Users/neil/Documents/mds.nosync/mds-core/node_modules/.pnpm/[email protected][email protected]/node_modules/typedoc/dist/lib/converter/types.js:254:57)
    at convertType (/Users/neil/Documents/mds.nosync/mds-core/node_modules/.pnpm/[email protected][email protected]/node_modules/typedoc/dist/lib/converter/types.js:96:34)
    at /Users/neil/Documents/mds.nosync/mds-core/node_modules/.pnpm/[email protected][email protected]/node_modules/typedoc/dist/lib/converter/types.js:254:71
    at Array.map (<anonymous>)
    at Object.convertType (/Users/neil/Documents/mds.nosync/mds-core/node_modules/.pnpm/[email protected][email protected]/node_modules/typedoc/dist/lib/converter/types.js:254:57)
    at Object.convertType (/Users/neil/Documents/mds.nosync/mds-core/node_modules/.pnpm/[email protected][email protected]/node_modules/typedoc/dist/lib/converter/types.js:96:34)
    at Converter.convertType (/Users/neil/Documents/mds.nosync/mds-core/node_modules/.pnpm/[email protected][email protected]/node_modules/typedoc/dist/lib/converter/converter.js:61:24)
    at Object.createSignature (/Users/neil/Documents/mds.nosync/mds-core/node_modules/.pnpm/[email protected][email protected]/node_modules/typedoc/dist/lib/converter/factories/signature.js:37:41)
    at convertVariableAsFunction (/Users/neil/Documents/mds.nosync/mds-core/node_modules/.pnpm/[email protected][email protected]/node_modules/typedoc/dist/lib/converter/symbols.js:441:21)
    at Object.convertVariable (/Users/neil/Documents/mds.nosync/mds-core/node_modules/.pnpm/[email protected][email protected]/node_modules/typedoc/dist/lib/converter/symbols.js:404:16)

Thanks for all your great work on this project and for the quick response, we used to use TypeDoc for generating our project documentation (older version of Typescript & pre-0.20.0 TypeDoc), and our team is eager to be able to use TypeDoc again!
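
For anyone trying the memory bump described above, it can help to confirm which heap limit the Node.js process actually ends up with. A small sketch, assuming it runs in the same Node.js environment as TypeDoc; `heapLimitMb` is a made-up helper name, while `getHeapStatistics` and its `heap_size_limit` field come from Node's built-in `v8` module.

```typescript
import { getHeapStatistics } from "v8";

// Hypothetical helper: report the V8 heap limit of the current process, in MB.
function heapLimitMb(): number {
  return getHeapStatistics().heap_size_limit / (1024 * 1024);
}

console.log(`heap limit: ${heapLimitMb().toFixed(0)} MB`);
```

A limit like the ~32GB one mentioned above would typically be set with `NODE_OPTIONS=--max-old-space-size=32768` (the value is in MB) before invoking the TypeDoc command.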

avatarneil avatar Jun 23 '21 16:06 avatarneil

Unfortunately the convertType there means that you still haven't gotten through conversion, so TypeDoc is still working on converting one of the projects.

I think fixing this requires doing a couple things:

  1. Refactor such that resolution happens in stages, and after resolution, no references to compiler nodes or symbols are kept. This is necessary since if we don't do this, then we have to keep every program in memory until all resolution is complete
  2. Refactor to create only one ts.Program at a time

This is a rather large change... it's possible, but by no means trivial.
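
To make the two steps above concrete, here is a rough sketch of the idea: create one program at a time, copy what is needed into plain data, and let the program go before starting the next one. Everything in it (`CompilerProgram`, `ConvertedPackage`, `convertOneAtATime`) is a hypothetical stand-in and not TypeDoc or TypeScript API.

```typescript
// Hypothetical stand-ins; none of these are TypeDoc or TypeScript APIs.
interface CompilerProgram {
  packageName: string;
  symbols: { name: string; kind: string }[]; // stands in for compiler symbols
}

// Plain, serializable result that holds no reference back to the program.
interface ConvertedPackage {
  name: string;
  exports: { name: string; kind: string }[];
}

function convertOneAtATime(
  packageNames: string[],
  createProgram: (name: string) => CompilerProgram,
): ConvertedPackage[] {
  const results: ConvertedPackage[] = [];
  for (const name of packageNames) {
    // Stage 1: build the (expensive) program for this package only.
    const program = createProgram(name);
    // Stage 2: copy what is needed into plain data.
    results.push({
      name: program.packageName,
      exports: program.symbols.map((s) => ({ name: s.name, kind: s.kind })),
    });
    // `program` is not referenced after this iteration, so it can be
    // garbage collected before the next package's program is created.
  }
  return results;
}

// Usage with a fake program factory (package names borrowed from the report above).
const created: string[] = [];
const converted = convertOneAtATime(["mds-types", "mds-agency"], (name) => {
  created.push(name);
  return { packageName: name, symbols: [{ name: "example", kind: "function" }] };
});
console.log(JSON.stringify(converted));
```

The point of the sketch is only the lifetime: peak memory is bounded by the largest single program plus the accumulated plain results, rather than by all programs together.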

Gerrit0 avatar Jun 26 '21 23:06 Gerrit0

I've run into what looks like the same bug. Medium-sized mono-repo using project references to build, publishes an acyclic graph of packages. Uses a yarn workspace and lerna.

Runs out of memory in packages mode. It does however complete if I give it a list of entrypoints in resolve mode. It doesn't seem to make any difference if I give a list of packages entrypoints in packages mode.

This project was only recently ported to project references (from a pure lerna setup), and I remember my pre-project-references setup working correctly. I foolishly didn't keep that branch, but if you're interested I can recreate it.

danni avatar Jan 15 '22 09:01 danni

Any workarounds besides increasing memory?

Is it possible to build each project’s typedoc individually and then merge them at the end?

Giving it a list of entrypoints in resolve mode did not work for me.

jlarmstrongiv avatar Jun 17 '22 16:06 jlarmstrongiv

> Is it possible to build each project’s typedoc individually and then merge them at the end?

Once 0.23 is out, this is what I plan to work on next, and is how I'd like packages mode to always work, but it's not there yet.

Gerrit0 avatar Jun 18 '22 14:06 Gerrit0

Cool! Looking forward to it

That’s similar to how merging jest coverage reports with nyc works

jlarmstrongiv avatar Jun 18 '22 16:06 jlarmstrongiv

I reported a test in #1971, in reply to a message there; you can read the full comment here: https://github.com/TypeStrong/typedoc/issues/1971#issuecomment-1185655181

Long story short: I tested allocating 25 GB for my monorepo, without success.

I'd like to help with this, but I don't know where to start.

matteobruni avatar Jul 15 '22 15:07 matteobruni

Fixing this is going to involve quite a few changes:

  1. We need to be able to deserialize JSON emitted with --json (requires a review of all current serialized properties to ensure that important ones are included)
  2. Events need to be revisited/reworked. Currently, plugins listening to conversion events don't have a way to tell when TypeDoc should be finished with a ts.Program and thus should throw away all references to it, so they do it on conversion end.
  3. (Not strictly required, but I think will make 1 easier) rework any reflection/type models which store a reference to any ts.* object (exception: ts.__String) to not. The big one here is ReferenceType needs to not store a ts.Symbol... not sure what this looks like quite yet.
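
As a purely illustrative take on item 3, a reference could carry a small serializable identifier instead of a live compiler symbol, and be resolved later through a lookup table. The shapes below (`SymbolId`, `ReferenceLike`, `ReflectionRegistry`) and the file path are hypothetical and are not TypeDoc's actual models; this is one possible shape among many, not a description of how the change should be made.

```typescript
// Hypothetical shapes for illustration; not TypeDoc's real model classes.
interface SymbolId {
  fileName: string;
  qualifiedName: string;
}

interface ReferenceLike {
  name: string;
  target: SymbolId; // plain data instead of a live compiler symbol
}

interface ReflectionLike {
  id: number;
  name: string;
}

const keyOf = (id: SymbolId): string => `${id.fileName}#${id.qualifiedName}`;

class ReflectionRegistry {
  private readonly byKey = new Map<string, ReflectionLike>();

  register(id: SymbolId, reflection: ReflectionLike): void {
    this.byKey.set(keyOf(id), reflection);
  }

  // Resolve a reference after the compiler objects are gone.
  resolve(ref: ReferenceLike): ReflectionLike | undefined {
    return this.byKey.get(keyOf(ref.target));
  }
}

// Usage: the path and names below are made up.
const registry = new ReflectionRegistry();
const vehicleId: SymbolId = {
  fileName: "packages/mds-types/index.ts",
  qualifiedName: "Vehicle",
};
registry.register(vehicleId, { id: 1, name: "Vehicle" });

const ref: ReferenceLike = { name: "Vehicle", target: { ...vehicleId } };
// Because the reference is plain data, it survives a JSON round trip.
const roundTripped = JSON.parse(JSON.stringify(ref)) as ReferenceLike;
console.log(registry.resolve(roundTripped)?.name);
```

Since nothing in `ReferenceLike` points at compiler state, it can be serialized with `--json`-style output and deserialized again, which is what item 1 needs.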

Gerrit0 avatar Jul 17 '22 23:07 Gerrit0

Bit of a progress update - the beta branch today has working deserialization code - still needs a lot of work in the sense that TypeDoc doesn't know what to do with this, but.... it's progress!

Next steps:

  1. Support .json files as entry points, which get deserialized as a previously converted project
  2. Create a merge projects function
  3. Update packages mode to run TypeDoc in each folder, emitting JSON to a cache location, and then merge projects
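
A toy version of steps 2 and 3, to show the shape of the idea rather than the real implementation: each package's output is cached as JSON, read back, and merged so that every previously converted project becomes one module of the combined project. `ProjectJson`, `MergedProject`, and `mergeProjects` are hypothetical and far simpler than TypeDoc's real JSON output.

```typescript
// Hypothetical, heavily simplified project shape; real TypeDoc JSON output
// has many more fields.
interface ProjectJson {
  name: string;
  children: { id: number; name: string }[];
}

interface MergedProject {
  name: string;
  modules: {
    id: number;
    name: string;
    children: { id: number; name: string }[];
  }[];
}

function mergeProjects(name: string, projects: ProjectJson[]): MergedProject {
  let nextId = 1;
  const modules = projects.map((project) => ({
    // Each previously converted project becomes one module of the result.
    id: nextId++,
    name: project.name,
    // Re-number children so IDs from separate runs cannot collide.
    children: project.children.map((child) => ({ id: nextId++, name: child.name })),
  }));
  return { name, modules };
}

// Usage: strings stand in for JSON files written to a cache location.
// Package and symbol names are made up for the example.
const cached: string[] = [
  JSON.stringify({ name: "mds-types", children: [{ id: 1, name: "Vehicle" }] }),
  JSON.stringify({ name: "mds-agency", children: [{ id: 1, name: "AgencyApi" }] }),
];
const merged = mergeProjects(
  "mds-core",
  cached.map((text) => JSON.parse(text) as ProjectJson),
);
console.log(merged.modules.map((m) => `${m.id}:${m.name}`).join(", "));
```

Both inputs use `id: 1` on purpose: independent runs will hand out overlapping IDs, so a merge step has to re-number (and, in a real implementation, rewrite any references that point at the old IDs).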

Gerrit0 avatar Nov 07 '22 02:11 Gerrit0

I don't know if this is helpful, but we're running into the same issue. We currently have 6 packages in our monorepo (https://github.com/covid-projections/act-now-packages). It's honestly not a ton of code (some of the packages are trivial). What I notice is that:

$ yarn typedoc packages/*/src/index.ts runs in 10 seconds and only needs a 1G node heap.

$ yarn typedoc --entryPointStrategy packages packages/* runs in 50 seconds and needs 5G of node heap.

So not only is there a big memory hit for using the monorepo support, but there's also a big performance hit (even though I'm on an M1 Max MacBook Pro with lots of cores). And because of https://github.com/TypeStrong/typedoc/issues/1835 the first command actually results in working cross-linking, while the second one does not. Of course the big "downside" is that the generated docs show file paths (e.g. "metrics/src") instead of actual package names (e.g. "@actnowcoalition/metrics") and don't show per-module readme files, etc.

So I find myself wanting the speed / memory usage of not using monorepo support, but the doc formatting of the monorepo support. No idea if that's feasible.

mikelehen avatar Nov 15 '22 16:11 mikelehen

Hopefully some of this information is helpful: we have run into the same issue with aws-sdk-js-v3. Special thanks to @eduardomourar for a draft PR that can provide a reproduction: https://github.com/aws/aws-sdk-js-v3/pull/4139

After allocating 250gb of RAM to the build, it continued to crash with a (different!) error: std::bad_alloc

std_alloc note: the number after the line 1 reference changes on every attempt

An OS max_map_count mmap limit was causing the std::bad_alloc failure, and after increasing the max_map_count, we were able to get a build to succeed after 161 minutes, with >200gb of RAM utilization.
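
For reference, a small sketch for checking the current limit from Node.js. It assumes a Linux host, where the setting is exposed at `/proc/sys/vm/max_map_count`; the `readMaxMapCount` helper is made up for this example and simply returns `null` on platforms without that file.

```typescript
import { existsSync, readFileSync } from "fs";

// Assumption: Linux exposes the limit at this path; other platforms do not.
const MAX_MAP_COUNT_PATH = "/proc/sys/vm/max_map_count";

// Hypothetical helper: read the mmap region limit, or null if unavailable.
function readMaxMapCount(path: string = MAX_MAP_COUNT_PATH): number | null {
  if (!existsSync(path)) return null;
  const value = Number.parseInt(readFileSync(path, "utf8").trim(), 10);
  return Number.isNaN(value) ? null : value;
}

const maxMapCount = readMaxMapCount();
console.log(
  maxMapCount === null
    ? "max_map_count is not available on this platform"
    : `max_map_count: ${maxMapCount}`,
);
```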

MYoung25 avatar Dec 15 '22 20:12 MYoung25

I'm not confident it's fully correct yet, but I just released 0.24.0-beta.1, which completely reworks how packages mode works. It will now effectively run TypeDoc in each package folder, then merge the JSON created for each package.

docs: not written yet, out of time for this week
changelog: https://github.com/TypeStrong/typedoc/blob/beta/CHANGELOG.md

I ran this on the tsParticles repo, which previously required >25gb to work with packages mode, and TypeDoc finished in 120 seconds, using a maximum of 970mb RAM. (One difference: typedoc-plugin-missing-exports was not enabled) diff

Gerrit0 avatar Mar 06 '23 00:03 Gerrit0

@Gerrit0 Nice work! I'm using 0.24.0-beta.8 and it's working great!

stuft2 avatar Apr 05 '23 09:04 stuft2

v0.24.0 has released for real now :)

Gerrit0 avatar Apr 08 '23 22:04 Gerrit0

@Gerrit0 works like a breeze with my project! The only thing I noticed was a warning about lodash typings in a module where I created a utils wrapper around lodash utils. I'm not sure whether this has something to do with exclude not working well with entryPointStrategy: packages. If you don't have any idea off the top of your head, I'll find time to file a bug.

But really appreciate your effort man 🙏

akphi avatar Apr 08 '23 23:04 akphi

Please do open an issue, I thought I'd ironed out all of the cross-module issues, but it's certainly possible I missed something

Gerrit0 avatar Apr 08 '23 23:04 Gerrit0