jdk icon indicating copy to clipboard operation
jdk copied to clipboard

8342449: reimplement: JDK-8327114 Attach in Linux may have wrong behavior when pid == ns_pid

Open larry-cable opened this issue 1 year ago • 4 comments

the implementation I originally provided does not in fact solve the issue!

the attach protocol initiation "handhake" requires that the "attacher" (the caller of this code) and the "attachee"(the target JVM to be "attached" to) must share a "/tmp" in common in order to rendezvous on the "attach" socket (created in the /tmp filesystem).

"attacher" and "attachee" JVM processes are guaranteed to share a common "/tmp" directory/filesystem when thy occupy the same "mount namespace", this is the environment in which such "peers" exist, and the attach handshake (initiated by the attacher) can use "/tmp" to establish the socket connection with the attachee.

with the advent of "containers" (implemented in Linux via various namespaces, e.g.: pid & mount) another "attacher" and "attachee" relationship exists, that of "attacher" (ancestor) and "attachee" (descendant) (w.r.t their pid and mount namespace(s)).

in this environment it is possible (and highly probable) that the "attacher" and the "attachee" do not share a "/tmp" directory in common!

In this scenario the "attacher" must resort to handshaking with the attachee via /proc in order to access the "attachee's" /tmp from the "attacher" via the symlink in /proc//root.

In order to achieve this rendezvous, the "attachee" must occupy a descendant, or same, (pid) namespace of, or as, the "attacher" (this relationship is required to ensure that the "attachee" 'root' is accessible via its /proc/<pid>root symlink in the namespace of its ancestor (attacher).

since (pid) namespaces are hierarchical, a descendant process (in its own descendent pid namespace) will also occupy all its ancestor (pid) namespaces (between it and the 'root' or 'host' pid namespace) with a unique pid in each of those "interstitial" (pid) namespace(s).

thus the "attachee" "/tmp" is accessible, via an "ancestor's" (or peer's) /proc filesystem using the pid of the "attachee" that is associated with it in that (pid) namespace.

thus an "ancestor" "attacher" can handshake with a descendant "attachee" in this fashion.

therefore an "attacher" has two choices when attempting to attach:

  • use the /proc//root/tmp path to the "attachee's" /tmp

    • this works with both peers and descendants
  • use /tmp

    • this only works if the "attacher" and "attachee" share a /tmp in common

the obvious choice is to default to /proc//root/tmp however there is an issue with this; should the attachee have elevated privileges, the attacher may not have r/w permission on the attachee's /proc//root path which is required in order to establish the 'attach' protocol transport (a UDS)

In these circumstances, the "attacher" can only resort to /tmp which may or may not be in common with the "attachee".

the logic required for an "attacher" to determine if an "attachee" /tmp is in common, unfortunately requires read access (from the attacher) to the "attachee's" /proc filesystem (namely to read namespace identities and to also 'stat(2)' the /proc//root/tmp directory to obtain device and inode values for comparison in order to determine if "/tmp" is in common.

when the "attachee" has elevated privileges, this access is not available to the "attacher".

therefore in such circumstances, the "attacher" has no choice but to attempt to use /tmp and simply timeout the handshake if in fact the "attachee" does not share a /tmp in common

this fix proposes to use /proc/... by default unless it is not writeable by the 'attacher' in which case it defaults to /tmp

this introduces an issue, if /tmp is not in common, then the attach cannot occur, but this is not determinable in the elevated case.

there is a possibility that if another process, in the 'attacher' pid namespace shares a pid with the actual 'attachee', then the attacher could mistakenly signal the wrong process, which could be fatal, if SIGQUIT is not caught.

the fix includes some code to 1st introspect the 'attachee' to determine if it is catching 'SIGQUIT', prior to sending it, and throwing if the 'attachee' does not do so.


Progress

  • [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [ ] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • JDK-8342449: reimplement: JDK-8327114 Attach in Linux may have wrong behavior when pid == ns_pid (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21572/head:pull/21572
$ git checkout pull/21572

Update a local copy of the PR:
$ git checkout pull/21572
$ git pull https://git.openjdk.org/jdk.git pull/21572/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 21572

View PR using the GUI difftool:
$ git pr show -t 21572

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21572.diff

larry-cable avatar Oct 17 '24 21:10 larry-cable

:wave: Welcome back larry-cable! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Oct 17 '24 21:10 bridgekeeper[bot]

❗ This change is not yet ready to be integrated. See the Progress checklist in the description for automated requirements.

openjdk[bot] avatar Oct 17 '24 21:10 openjdk[bot]

@larry-cable The following label will be automatically applied to this pull request:

  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Oct 17 '24 21:10 openjdk[bot]

⚠️ @larry-cable This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

openjdk[bot] avatar Oct 23 '24 21:10 openjdk[bot]