mold icon indicating copy to clipboard operation
mold copied to clipboard

increase file limit when hit the ulimit

Open karuboniru opened this issue 1 year ago • 6 comments
trafficstars

Fixes #1362 #851

It seems that with llvm lto we don't have much way to fix the issue than increasing the ulimit. (I strace'd bfd and it is calling setprlimit when it gets any EMFILE)

karuboniru avatar Oct 21 '24 13:10 karuboniru

Well I don't think any program should automagically increase the ulimit. Can you file a bug to LLVM instead? If GCC can handle the same thing without hitting the ulimit, LLVM should be able to do the same thing without hitting it, too.

rui314 avatar Oct 21 '24 13:10 rui314

If GCC can handle the same thing without hitting the ulimit, LLVM should be able to do the same thing without hitting it, too.

Stracing ld.bfd with llvm lto shows it is calling setprlimit. Maybe I can try to see what happens with lld.

$ strace -f ld.bfd @link.txt |& grep EMFILE -A5 -B5
lseek(6, 4096, SEEK_SET)                = 4096
lseek(6, 4096, SEEK_SET)                = 4096
lseek(6, 4096, SEEK_SET)                = 4096
lseek(6, 4096, SEEK_SET)                = 4096
lseek(6, 4096, SEEK_SET)                = 4096
openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4XNNTotal.cc.o", O_RDONLY) = -1 EMFILE (打开的文件过多)
prlimit64(0, RLIMIT_NOFILE, NULL, {rlim_cur=1024, rlim_max=512*1024}) = 0
prlimit64(0, RLIMIT_NOFILE, {rlim_cur=512*1024, rlim_max=512*1024}, NULL) = 0
openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4XNNTotal.cc.o", O_RDONLY) = 1024
fstat(1024, {st_mode=S_IFREG|0644, st_size=161208, ...}) = 0
mmap(NULL, 161208, PROT_READ, MAP_PRIVATE, 1024, 0) = 0x7fc465ba7000

LLD seems to follow the pattern of [open] -> [mmap] -> [close] without keeping too mant fds.

openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4CollisionNNToNDelta1950.cc.o", O_RDONLY|O_CLOEXEC) = 3
mmap(NULL, 225128, PROT_READ, MAP_PRIVATE|MAP_NORESERVE, 3, 0) = 0x7fd7caa50000
close(3)                                = 0
openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4CollisionNNToNDeltastar.cc.o", O_RDONLY|O_CLOEXEC) = 3
mmap(NULL, 235624, PROT_READ, MAP_PRIVATE|MAP_NORESERVE, 3, 0) = 0x7fd7caa16000
close(3)                                = 0
openat(AT_FDCWD, "source/CMakeFiles/G4processes.dir/processes/hadronic/models/im_r_matrix/src/G4CollisionNNToNNstar.cc.o", O_RDONLY|O_CLOEXEC) = 3
mmap(NULL, 583212, PROT_READ, MAP_PRIVATE|MAP_NORESERVE, 3, 0) = 0x7fd7ca987000
close(3)                                = 0

I could imagine the issue can't be easily fixed on llvm side, as changing the ownership of fd would be a breaking change?

karuboniru avatar Oct 21 '24 14:10 karuboniru

I just tried LTO with Clang, and it seems it works just fine without closing the FDs. Can you try again with the following change?

diff --git a/src/lto-unix.cc b/src/lto-unix.cc
index 7d839c87..f009478e 100644
--- a/src/lto-unix.cc
+++ b/src/lto-unix.cc
@@ -637,13 +637,8 @@ ObjectFile<E> *read_lto_object(Context<E> &ctx, MappedFile *mf) {
                << " please make sure you are using the same compiler of the"
                << " same version for all object files";

-  // It looks like GCC doesn't need fd after claim_file_hook() while
-  // LLVM needs it and takes the ownership of fd. To prevent "too many
-  // open files" issue, we close fd only for GCC. This is ugly, though.
-  if (!is_llvm(ctx)) {
-    MappedFile *mf2 = mf->parent ? mf->parent : mf;
-    mf2->close_fd();
-  }
+  MappedFile *mf2 = mf->parent ? mf->parent : mf;
+  mf2->close_fd();

   // Create a symbol strtab
   i64 strtab_size = 1;

rui314 avatar Oct 22 '24 07:10 rui314

I tried the same and the link succeed, however the code for LLVMGOLD suggested that it is using the fd information to track the orgin of file if multiple "handle" is from the same file. I believe it will cause the backend treat all object file as they have same source.

Sorry but I know nearly nothing about how thinLTO works, so I don't know if this will case any potenial issue.

https://github.com/llvm/llvm-project/blob/6e0b0038cd65ce726ce404305a06e1cf33e36cca/llvm/tools/gold/gold-plugin.cpp#L580-L585

karuboniru avatar Oct 22 '24 11:10 karuboniru

Do you mind if I ask you to try to fix the issue on the LLVM plugin side rather than on the linker side? If you don't have enough bandwidth, I can take a look later. But it seems you have already started reading the source code of the plugin.

rui314 avatar Oct 23 '24 05:10 rui314

I've no idea how should I test a change in llvm, so I'd appreciate it if you can take time fix the issue in LLVM side.

But from my understanding of the code in llvmgold I could tell the leading handle thing only wants uniqueness and this could be done by using file path instead of FD. (since if we close the file the fd will no longer be unique)

With the change the FD could be closed after https://github.com/llvm/llvm-project/blob/6e0b0038cd65ce726ce404305a06e1cf33e36cca/llvm/tools/gold/gold-plugin.cpp#L546

karuboniru avatar Oct 23 '24 08:10 karuboniru