embedmongo-maven-plugin icon indicating copy to clipboard operation
embedmongo-maven-plugin copied to clipboard

occasional "text file busy" error

Open jumarko opened this issue 10 years ago • 11 comments

Hi,

we're struggling with occasional strange "text file busy" error: [ERROR] Failed to execute goal com.github.joelittlejohn.embedmongo:embedmongo-maven-plugin:0.1.8:start (start) on project ***: Unable to start the mongod: Cannot run program "/tmp/extract-4938eb39-e49d-4dff-bfa9-7e48da910665mongod": error=26, Text file busy -> [Help 1] [ERROR]

This error occurs on our jenkins machines (Linux) quite frequently (6 occurrences from 100 job executions in last 14 days) . I was not able to reproduce this issue on my local computer (Mac OS X).

I thought that this issue is related to the fact that we are running maven build in parallel (mvn -T 4C) and plugin reuse /tmp/extract-* directory. However, as far as I understand the process there is a new "extract-" directory for each plugin run. Therefore it does not make any sense.

Have you got any idea what can be the root cause of this problem?

jumarko avatar Jul 30 '13 12:07 jumarko

I've never experienced this issue, although I must admit we don't run a lot of parallel builds.

My best guess: could this be caused by many processes attempting to write to the same log file? Do you have the logging configured to go into a file? (and if so, is it the same file for all Maven modules?). I assume you're using the randomPort feature that we added, maybe it would be worth configuring the logFile path to include this value, e.g.:

<logFile>embedmongo${mongo.port}.log</logFile>

EDIT: In fact after a little research, it appears that this is more likely caused by attempting to replace the mongod binary whilst it's in use. The UUID that appears in the extract path should surely be protecting against this. Hmmm

joelittlejohn avatar Jul 30 '13 13:07 joelittlejohn

I can't still understand why the mongod should fail to start.

My observation is that each maven module's integration tests use its own unique temp extract dir, for example: "/tmp/extract-4938eb39-e49d-4dff-bfa9-7e48da910665mongod". Therefore, there should be no interferences between different modules. However, mongod fails to start, this looks like something is still writing to the extract dir when plugin tries to start mongod.

jumarko avatar Jul 30 '13 15:07 jumarko

I wonder if there is some static state being reused by the embedmongo API. It's possible that although multiple extract dirs are created, each module could be using the same binary. Are you able to capture all the processes that exist during the build time and see if they really are using their own, separate, extracted mongod?

joelittlejohn avatar Jul 30 '13 15:07 joelittlejohn

We have only three modules which are using embedmongo-maven-plugin and the issue is always in one of the two, call it "tools/1 and tools/2". If I run both tools/1 and tools/2 then I can clearly see two distinct processes from 'ps aux':

       90926   0.8  0.6  2568720  46408 s012  S+    6:10PM   0:00.10 /var/folders/wk/46yhmyp16pj4sqknz6grnw880000gn/T/extract-9890a696-4cf9-4698-bf04-79c48050c01fmongod -v --port 57434 --dbpath /var/folders/wk/46yhmyp16pj4sqknz6grnw880000gn/T/embedmongo-db-d54fc9dd-d611-42ce-bc70-1c91ee958c23 --noprealloc --nohttpinterface --smallfiles --nojournal --noauth
       90966   9.5  0.3  2525432  28820 s012  S+    6:10PM   0:00.05 /var/folders/wk/46yhmyp16pj4sqknz6grnw880000gn/T/extract-2dd15b9b-5fca-4b95-8e9f-5bc4fcc4cc51mongod -v --port 57442 --dbpath /var/folders/wk/46yhmyp16pj4sqknz6grnw880000gn/T/embedmongo-db-84220683-2311-4be4-a645-cfb50b26a7f5 --noprealloc --nohttpinterface --smallfiles --nojournal --noauth

jumarko avatar Jul 30 '13 16:07 jumarko

I was searching through the net and finds similar issue: http://stackoverflow.com/questions/6345660/java-executing-bash-script-error-26-text-file-busy

I've checked StartEmbeddedMongoMojo, de.flapdoodle.embed.process.runtime.Starter#prepare and subsequent calls but haven't found any unclosed FileOutputStreams but I still believe that this can be some problem with unclosed file handler.

jumarko avatar Jul 30 '13 16:07 jumarko

@jumarko is this something that still causes problems for you? Did you ever get any further with your diagnosis?

joelittlejohn avatar Nov 04 '13 23:11 joelittlejohn

I also stumbed across this, more or less reproducible. All test cases are at least executing 4 parallel tests with an embedded mongo, only then it seems to happen. There are lots of ones with three parallel tests, but these never fail curiously. I haven't had time to look into this in more details since the time it takes to reproduce and fix it probably is never saved by executing a few tests in parallel =)

It only happens on Linux boxes, a windows box (local build) never had this problem until now.

The stack trace is:

java.io.IOException: Cannot run program "/tmp/extract-5cc2b23e-4a9d-4b43-a3de-b90c58344a74mongod": error=26, Text file busy
    at de.flapdoodle.embed.process.runtime.ProcessControl.start(ProcessControl.java:202)
    at de.flapdoodle.embed.process.runtime.AbstractProcess.<init>(AbstractProcess.java:72)
    at de.flapdoodle.embed.mongo.AbstractMongoProcess.<init>(AbstractMongoProcess.java:52)
    at de.flapdoodle.embed.mongo.MongodProcess.<init>(MongodProcess.java:49)
    at de.flapdoodle.embed.mongo.MongodExecutable.start(MongodExecutable.java:47)
    at de.flapdoodle.embed.mongo.MongodExecutable.start(MongodExecutable.java:35)
    at de.flapdoodle.embed.process.runtime.Executable.start(Executable.java:94)

elm- avatar Dec 10 '13 20:12 elm-

I fixed this now within my test setup by locking the call to #start with a semaphore, that works without problems. I'm not sure if that makes sense to include as a generic fix?

elm- avatar Jan 08 '14 15:01 elm-

@elm- Could you give me a code snippet so I can understand exactly what you mean?

joelittlejohn avatar Jan 08 '14 15:01 joelittlejohn

I get a lot of: -rwxrw-r-- 1 user user 33787094 Fev 10 15:01 extract-035621b9-9711-4815-b4b5-7c1e0928dc12mongod* -rwxrw-r-- 1 user user 33787094 Fev 10 14:50 extract-05046b12-83bb-41c7-adcc-df48deb82537mongod* -rwxrw-r-- 1 user user 33787094 Fev 10 14:29 extract-08065d88-a67f-46ed-9e2a-00d7fec9e447mongod* -rwxrw-r-- 1 user user 33787094 Fev 10 14:48 extract-0d2bcaf6-1709-490e-bf12-31f009e44dd1mongod* -rwxrw-r-- 1 user user 33787094 Fev 10 15:19 extract-108c65f2-6659-4eab-b04d-b4842dcae20cmongod* -rwxrw-r-- 1 user user 33787094 Fev 10 14:30 extract-1444649c-5782-4a55-a512-4f1464b061d1mongod*

on folder /tmp/

I've discovered this because i don't have much space on my work environment (only 30Gb) and started having disk full warnings and failures with each build. It is possible that other users may be experiencing conflicts as the directory contains so much 'garbage'.

Could you add some code to eliminate this tmp files on the next release?

rochajoel avatar Feb 10 '17 17:02 rochajoel

See https://issues.jenkins.io/browse/JENKINS-48258?focusedCommentId=324590#comment-324590

The issue here appears to be caused by a Java bug, https://bugs.openjdk.java.net/browse/JDK-8068370.
The issue is possible when multiple threads open a file for writing, close it, and then execute them (each thread using its own file).
Even if all files are closed "properly", due to how file handles work around fork/exec, a child process in one thread may inherit the handle to anther thread's open file, and thus break that thread's later subprocess call.

The two usual ways to avoid this issue is to either open files with O_CLOEXEC so they get auto-closed in child processes, or by closing open file descriptors on the child side of fork().
Java appears to do neither of those things, and a simple test program like the ones in the linked openjdk bug show that the problem is still there, on at least OpenJDK Java 8 and Oracle Java 9.

philippe-granet avatar Mar 27 '21 09:03 philippe-granet