Crash on huge D files
We have an auto-generated D file with around 7000 definitions, one of which is a ubyte array of about 10MB. The whole file is 51MB. When that file is present in the project tree, DCD crashes with a cryptic "Killed" message.
Replacing the D file with its di file (same number of definitions, but "only" 61KB) solves the problem.
I can see several ways to solve this issue, and I am willing to help with implementation, but the project's maintainer should be the one to decide which solution to use:
- None of the definitions in this file are particularly interesting, so a way to define a per-project ignore list would be helpful. One option is a .dcdrc file or a dcd.conf file in the project's root that says which files to ignore.
- When there is a .di file with the same name as a .d file, index the .di instead. This solution has the advantage of requiring no additional configuration from the user, but it may mess up the jump-to-definition feature, since it will jump to the declaration in the .di file when you usually want to jump to the implementation in the .d file.
- Skip files that are too large for DCD to handle.
- Try to locate and solve the bug.
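To make the second and third options concrete, here is a minimal sketch of how a file could be chosen for indexing. It is not DCD code: the function name `pickFileToIndex` and the `maxIndexedBytes` threshold are hypothetical, and a real implementation would have to fit into however DCD walks the import paths.

```d
import std.file : exists, getSize;
import std.path : setExtension;

// Hypothetical threshold for illustration only; not a DCD setting.
enum ulong maxIndexedBytes = 5 * 1024 * 1024;

/// Returns the path that should be indexed for `dPath`,
/// or null if the module should be skipped entirely.
string pickFileToIndex(string dPath, ulong limit = maxIndexedBytes)
{
    // Option 2: prefer a sibling .di interface file when one exists.
    immutable diPath = dPath.setExtension("di");
    if (diPath.exists)
        return diPath;

    // Option 3: otherwise skip sources that exceed the size limit.
    return getSize(dPath) <= limit ? dPath : null;
}
```

With this sketch, a 51MB .d file with no .di next to it yields null, while the same file with a 61KB .di next to it yields the .di path.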
c.c. @shachar
Just checked in dmesg: dcd-server was killed by the out-of-memory daemon:
[14821.410450] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
# ...................
[14821.410594] [24798] 1000 24798 3436613 3054377 5985 16 0 0 dcd-server
[14821.410595] Out of memory: Kill process 24798 (dcd-server) score 749 or sacrifice child
[14821.410598] Killed process 24798 (dcd-server) total-vm:13746452kB, anon-rss:12217508kB, file-rss:0kB, shmem-rss:0kB
[14821.710052] oom_reaper: reaped process 24798 (dcd-server), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
The problem is that a huge number of AST nodes are generated for your 10 MB array. To be exact, each element of an array literal is 5 class instances, which means that the whole ArrayInitializer in the AST is made of 50_000_000 instances.
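The numbers can be checked against the dmesg output with some quick arithmetic. The sketch below assumes roughly one array element per byte of the 10 MB array, and, as a deliberate overestimate, attributes all of the reported anon-rss to those AST instances.

```d
import std.stdio : writefln;

// Assumption: ~10 MB of ubyte data means ~10 million array elements.
enum ulong elements = 10_000_000;
// Figure quoted in this thread: 5 class instances per array-literal element.
enum ulong instancesPerElement = 5;
enum ulong totalInstances = elements * instancesPerElement;

// anon-rss reported by the OOM killer in the dmesg output, in kB.
enum ulong anonRssKiB = 12_217_508;

// Upper bound: treats every resident byte as belonging to these instances.
enum ulong bytesPerInstance = anonRssKiB * 1024 / totalInstances;

void main()
{
    writefln("%s instances, at most about %s bytes each",
             totalInstances, bytesPerInstance);
}
```

That works out to an upper bound of about 250 bytes per instance, which is in a plausible range for small class instances plus allocator overhead. So the 12GB of resident memory is consistent with the AST of that one literal.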
Another workaround that avoids polluting the AST is to use an import expression (import("array.dat")) instead of the massive (I imagine) array literal. This is also the more idiomatic way to put something in the data segment.
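A minimal sketch of that approach, assuming the generator writes the raw bytes to a file named array.dat (a placeholder name) and that the directory containing it is passed to the compiler with -J, or listed under stringImportPaths when building with dub:

```d
// Build with something like: dmd -J<directory containing array.dat> app.d
// "array.dat" is a placeholder; it must exist at compile time.

// The import expression yields the file's contents as a string literal,
// so the parser sees a single node instead of one subtree per byte.
immutable ubyte[] data = cast(immutable(ubyte)[]) import("array.dat");

void main()
{
    import std.stdio : writeln;
    writeln("embedded ", data.length, " bytes");
}
```

The generated D file then only needs this one declaration for the blob, while the remaining definitions stay as they are.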
FWIW, as of https://github.com/dlang-community/dsymbol/pull/114 , DCD will no longer load .d files just because they're somewhere under the import path, unless they're actually imported from edited source files, or you invoke the symbol search feature.
> on of which is about 10MB ubyte array
Any reason you can't use an import expression instead?