
8294899: Process.waitFor() throws IllegalThreadStateException when a process on Windows returns an exit code of 259

Open RogerRiggs opened this issue 2 years ago • 3 comments

Process.waitFor() throws IllegalThreadStateException when a process returns an exit code of 259. As described in the bug report, waitFor() should not be sensitive to the exit value. Previously, it erroneously threw IllegalThreadStateException. Added a test to verify.
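
For readers who want to observe the behavior, here is a minimal reproducer sketch. It is not the test added by this PR. The class name ExitCode259Repro and the helper exitCommand are hypothetical, and the sketch assumes that cmd.exe (on Windows) or sh (elsewhere) is on the PATH. On non-Windows systems the exit status is usually truncated to 8 bits, so the problem is not expected to show up there.

```java
import java.io.IOException;
import java.util.List;
import java.util.Locale;

final class ExitCode259Repro {

    // Hypothetical helper: builds a shell command that exits with the given code.
    static List<String> exitCommand(int code, boolean windows) {
        return windows
                ? List.of("cmd.exe", "/c", "exit " + code)
                : List.of("sh", "-c", "exit " + code);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        boolean windows = System.getProperty("os.name")
                .toLowerCase(Locale.ROOT)
                .startsWith("windows");

        Process p = new ProcessBuilder(exitCommand(259, windows))
                .inheritIO()
                .start();
        try {
            // Use the value returned by waitFor() directly.
            int code = p.waitFor();
            System.out.println("waitFor() returned " + code);
        } catch (IllegalThreadStateException e) {
            // Behavior described in the bug report for Windows before this change.
            System.out.println("waitFor() threw " + e);
        }
    }
}
```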


Progress

  • [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • JDK-8294899: Process.waitFor() throws IllegalThreadStateException when a process on Windows returns an exit code of 259

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10680/head:pull/10680
$ git checkout pull/10680

Update a local copy of the PR:
$ git checkout pull/10680
$ git pull https://git.openjdk.org/jdk pull/10680/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 10680

View PR using the GUI difftool:
$ git pr show -t 10680

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10680.diff

RogerRiggs avatar Oct 12 '22 16:10 RogerRiggs

:wave: Welcome back rriggs! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Oct 12 '22 16:10 bridgekeeper[bot]

@RogerRiggs The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Oct 12 '22 16:10 openjdk[bot]

Webrevs

mlbridge[bot] avatar Oct 12 '22 16:10 mlbridge[bot]

@RogerRiggs This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8294899: Process.waitFor() throws IllegalThreadStateException when a process on Windows returns an exit code of 259

Reviewed-by: alanb, jpai

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 395 new commits pushed to the master branch:

  • 34a499de8edc9a6b750ae7af356fa9cb1d2a0748: 8294033: x86_64: libm stubs are missing
  • f0b648bc5cea0014e85e16b14c285618c4b94661: 8296758: [BACKOUT] Revert 8296115
  • 7f587e5a5cc1b71ced1cd27f748201c6662040bd: 8296872: gtest is built with the build-jdk
  • 819c6919ca3067ec475b5b268f54e10700eec039: 8295867: TestVerifyGraphEdges.java fails with exit code -1073741571 when using AlwaysIncrementalInline
  • ced88a2fd9a35e0e027661ef1f3c5ea3a5fff9e0: 8296733: JFR: File Read event for RandomAccessFile::write(byte[]) is incorrect
  • 87b809a2cb43d8717105ece5b812efc11ec5c539: 8296229: JFR: jfr tool should print unsigned values correctly
  • e7c2a8e60e35da0919119e919ed162217049e89f: 8295214: Generational ZGC: Guard nmethods from cross modifying code
  • d4d183edfea70a330cc5a092590f8b724fbb4259: 8296301: Interpreter(RISC-V): Implement -XX:+PrintBytecodeHistogram and -XX:+PrintBytecodePairHistogram options
  • f75484063f116fce6f8546b381d90fe46a0ef7e1: 8296773: G1: Factor out hash function for G1CardSet
  • fdabd3796098c0ef0f528847da2cd98256443877: 8293696: java/nio/channels/DatagramChannel/SelectWhenRefused.java fails with "Unexpected wakeup"
  • ... and 385 more: https://git.openjdk.org/jdk/compare/94a9b048afef789e5c604201b61b86ace5c9af67...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot] avatar Oct 16 '22 06:10 openjdk[bot]

Hello Roger,

The Process.waitFor() documentation states:

Causes the current thread to wait, if necessary, until the process represented by this {@code Process} object has terminated.

So the implementation is expected to wait until the process has terminated. The Windows implementation of waitFor calls the JNI function waitForInterruptibly. The JNI implementation of waitForInterruptibly is:

HANDLE events[2];
events[0] = (HANDLE) handle;
events[1] = JVM_GetThreadInterruptEvent();

if (WaitForMultipleObjects(sizeof(events)/sizeof(events[0]), events,
                          FALSE,    /* Wait for ANY event */
                          INFINITE)  /* Wait forever */
   == WAIT_FAILED)
   win32Error(env, L"WaitForMultipleObjects");

So it calls a Windows native function called WaitForMultipleObjects and passes it 2 handles to wait on - one is the process handle and another for thread interrupt event. FALSE is passed as the bWaitAll param, which effectively means wait for either the process exit or the thread interrupt event.

The documentation for the Windows native API WaitForMultipleObjects (https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitformultipleobjects) states:

If bWaitAll is FALSE, the return value minus WAIT_OBJECT_0 indicates the lpHandles array index of the object that satisfied the wait. If more than one object became signaled during the call, this is the array index of the signaled object with the smallest index value of all the signaled objects.

In our JNI implementation of waitForInterruptibly, we check only for WAIT_FAILED and do not check which handle satisfied the wait. Do you think it's possible that the process had not yet terminated and the thread interrupt event was signalled instead? Should waitForInterruptibly do those additional checks? That might explain why a subsequent call to the Windows native GetExitCodeProcess function returns STILL_ACTIVE (the 259 exit code). That function's documentation (https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getexitcodeprocess) states:

If the process has not terminated and the function succeeds, the status returned is STILL_ACTIVE
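
The additional check asked about here can be modeled in plain Java. This is only a sketch of the decoding rule quoted from the WaitForMultipleObjects documentation, not code from the JDK. The class WaitResultModel, the Outcome enum, and classify are hypothetical names. The two constants mirror the Win32 values of the same names, and the indices follow the order of the events array in the JNI snippet.

```java
final class WaitResultModel {

    // Values mirror the Win32 constants of the same names.
    static final long WAIT_OBJECT_0 = 0x00000000L;
    static final long WAIT_FAILED = 0xFFFFFFFFL;

    // Positions in the events[] array of the JNI snippet.
    static final int PROCESS_INDEX = 0;
    static final int INTERRUPT_INDEX = 1;

    enum Outcome { PROCESS_EXITED, INTERRUPTED, FAILED, OTHER }

    // Decodes a wait result for a two-handle wait with bWaitAll == FALSE.
    static Outcome classify(long waitResult) {
        if (waitResult == WAIT_FAILED) {
            return Outcome.FAILED;
        }
        // "The return value minus WAIT_OBJECT_0 indicates the lpHandles array index."
        long index = waitResult - WAIT_OBJECT_0;
        if (index == PROCESS_INDEX) {
            return Outcome.PROCESS_EXITED;
        }
        if (index == INTERRUPT_INDEX) {
            return Outcome.INTERRUPTED;
        }
        return Outcome.OTHER; // for example a timeout, which cannot occur with INFINITE
    }

    public static void main(String[] args) {
        System.out.println(classify(0L));
        System.out.println(classify(1L));
        System.out.println(classify(WAIT_FAILED));
    }
}
```

Because the documentation says the smallest signaled index is reported, a process handle at index 0 takes precedence when both the process exit and the interrupt event are signaled during the same call.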

jaikiran avatar Oct 17 '22 05:10 jaikiran

The reporter of the issue provided additional details that it was their own application/program which was returning that exit value:

I encountered it while prototyping an idea involving a Java application spawning a process running a C++ application that returned an exit value indicating the number of items it processed.

So this appears to be a case where this change would help. I haven't found any conclusive/official Windows documentation which forbids user applications from returning this exit value (which represents STILL_ACTIVE).

jaikiran avatar Oct 21 '22 06:10 jaikiran

@jaikiran There is advice against exiting with STILL_ACTIVE. The GetExitCodeProcess documentation says:

The GetExitCodeProcess function returns a valid error code defined by the application only after the thread terminates. Therefore, an application should not use STILL_ACTIVE (259) as an error code (STILL_ACTIVE is a macro for STATUS_PENDING (minwinbase.h)). If a thread returns STILL_ACTIVE (259) as an error code, then applications that test for that value could interpret it to mean that the thread is still running, and continue to test for the completion of the thread after the thread has terminated, which could put the application into an infinite loop.
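
The hazard described in that paragraph can be illustrated with a small model. This is a sketch under assumptions, not Windows or JDK code: PollingHazard and pollsUntilExit are hypothetical names, the IntSupplier stands in for a repeated GetExitCodeProcess query, and the loop is bounded by maxPolls so that the example terminates where a naive poller would not.

```java
import java.util.function.IntSupplier;

final class PollingHazard {

    static final int STILL_ACTIVE = 259;

    /**
     * Polls the exit code until it differs from STILL_ACTIVE.
     * Returns the number of polls needed, or -1 if maxPolls was exhausted.
     */
    static int pollsUntilExit(IntSupplier exitCodeQuery, int maxPolls) {
        for (int i = 1; i <= maxPolls; i++) {
            if (exitCodeQuery.getAsInt() != STILL_ACTIVE) {
                return i;
            }
        }
        return -1; // an unbounded loop would spin forever here
    }

    public static void main(String[] args) {
        // A process that has exited with 0 is detected on the first poll.
        System.out.println(pollsUntilExit(() -> 0, 10));
        // A process that has exited with 259 is indistinguishable from a running one.
        System.out.println(pollsUntilExit(() -> STILL_ACTIVE, 10));
    }
}
```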

RogerRiggs avatar Nov 08 '22 15:11 RogerRiggs

Though waitForInterruptibly(handle) doesn't check the handle, the code in the caller ProcessImpl.waitFor() calls Thread.interrupted() and throws the InterruptedException if needed. So the exit code is returned only if the process had exited.
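
The flow described here can be sketched as a simplified model. It is not the JDK source. NativeProcess is a hypothetical stand-in for the native layer, waitForOld assumes that the pre-change waitFor() delegated to exitValue() (as the discussion implies), and waitForNew assumes that the change returns the raw exit code once the wait has completed without an interrupt.

```java
final class WaitForModel {

    static final int STILL_ACTIVE = 259;

    /** Hypothetical stand-in for the native calls used by the Windows implementation. */
    interface NativeProcess {
        void waitForInterruptibly();
        int getExitCodeProcess();
    }

    // exitValue(): treats STILL_ACTIVE as "not exited".
    static int exitValue(NativeProcess p) {
        int code = p.getExitCodeProcess();
        if (code == STILL_ACTIVE) {
            throw new IllegalThreadStateException("process has not exited");
        }
        return code;
    }

    // Assumed pre-change shape: the result goes through the STILL_ACTIVE check.
    static int waitForOld(NativeProcess p) throws InterruptedException {
        p.waitForInterruptibly();
        if (Thread.interrupted()) {
            throw new InterruptedException();
        }
        return exitValue(p);
    }

    // Assumed post-change shape: no interrupt means the process handle was signaled,
    // so the raw exit code is returned, whatever its value.
    static int waitForNew(NativeProcess p) throws InterruptedException {
        p.waitForInterruptibly();
        if (Thread.interrupted()) {
            throw new InterruptedException();
        }
        return p.getExitCodeProcess();
    }

    /** Builds a model process that has already exited with the given code. */
    static NativeProcess exited(int code) {
        return new NativeProcess() {
            @Override public void waitForInterruptibly() { /* returns at once */ }
            @Override public int getExitCodeProcess() { return code; }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(waitForNew(exited(STILL_ACTIVE)));
        try {
            waitForOld(exited(STILL_ACTIVE));
        } catch (IllegalThreadStateException e) {
            System.out.println("old path threw: " + e.getMessage());
        }
    }
}
```

In this model exitValue() still throws for an exit code of 259, so a caller is better off using the value returned by waitFor().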

RogerRiggs avatar Nov 08 '22 15:11 RogerRiggs

Thank you Roger for the reference to that paragraph in the documentation. I had read that page previously but I think I overlooked that section. Also thank you for the detail about the thread interrupt handling. What you state makes sense.

jaikiran avatar Nov 10 '22 12:11 jaikiran

One final question - Now with this change, Process.waitFor() won't throw the IllegalThreadStateException for such programs that return STILL_ACTIVE exit code. However, looking at the code a subsequent Process.exitValue() call on that same process instance will still throw this exception. Should we be changing that too?

jaikiran avatar Nov 10 '22 12:11 jaikiran

One final question - Now with this change, Process.waitFor() won't throw the IllegalThreadStateException for such programs that return STILL_ACTIVE exit code. However, looking at the code a subsequent Process.exitValue() call on that same process instance will still throw this exception. Should we be changing that too?

To fix that we'd have to find another Windows API to determine if the process had exited. From the Windows API description, GetExitCodeProcess seems to be the recommended way.

Also if exitValue() returned STILL_ACTIVE it would still be confusing to the program/programmer about whether the process had exited and where that exitValue came from.

I think the current Windows admonition against returning STILL_ACTIVE ("don't do that") is sufficient.

RogerRiggs avatar Nov 11 '22 21:11 RogerRiggs

/integrate

RogerRiggs avatar Nov 14 '22 14:11 RogerRiggs

Going to push as commit 9c399326724dc47eae90076d1237ff582b783863. Since your change was applied there have been 403 commits pushed to the master branch:

  • 3f401b309124eecef7a39aac663bb5e8808a4476: 8296670: G1: Remove unused G1GCPhaseTimes::record_preserve_cm_referents_time_ms
  • 68301cdecae861ecb6c910aeb89465a787184454: 8296665: IGV: Show dialog with stack trace for exceptions
  • 277f0c24a2e186166bfe70fc93ba79aec10585aa: 8296821: compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java fails after JDK-8262901
  • 34d10f19f5321961bdeea8d1c9aff7ca89101d1f: 8296243: [IR Framework] Fix issues with IRNode.ALLOC* regexes
  • 8eb90e2d9c4ab5975f4301dbfdb0a6d9fa036af3: 8296797: java/nio/channels/vthread/BlockingChannelOps.testSocketChannelWriteAsyncClose failed with ClosedChannelException
  • a2cdcdd65dbbc6717c363fc4e22d9b16a4dea986: 8296630: Fix SkipIfEqual on AArch64 and RISC-V
  • 657a0b2f1564e1754dbd64b776c53a52c480c901: 8295865: Several issues with os::realloc
  • ff2c987669523613f3e5dc19493a41f849f882f6: 8294378: URLPermission constructor exception when using tr locale
  • 34a499de8edc9a6b750ae7af356fa9cb1d2a0748: 8294033: x86_64: libm stubs are missing
  • f0b648bc5cea0014e85e16b14c285618c4b94661: 8296758: [BACKOUT] Revert 8296115
  • ... and 393 more: https://git.openjdk.org/jdk/compare/94a9b048afef789e5c604201b61b86ace5c9af67...master

Your commit was automatically rebased without conflicts.

openjdk[bot] avatar Nov 14 '22 14:11 openjdk[bot]

@RogerRiggs Pushed as commit 9c399326724dc47eae90076d1237ff582b783863.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk[bot] avatar Nov 14 '22 14:11 openjdk[bot]