cf ssh - container does not have app's environment (same issue 3 years later)
When I run `cf ssh <app>`, the terminal session does not have the same environment that the application process has: my `$PATH` etc. is not set up for the interpreter provided by the buildpack. I must always remember to run `/tmp/lifecycle/launcher "app" "$SHELL" ""` immediately after `cf ssh`.
When I raised this in Feb 2016 (3 years ago) in cloudfoundry/cloud_controller_ng#754, the answer was something about Windows support.
Can we please fix this now so that `cf ssh` always sets up the environment to match the application process from the buildpack?
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/165518892
The labels on this github issue will be updated when the story is started.
@tcdowney @ssisil @zrob do you folks know whether the issues described above were addressed in v3 `cf ssh`? (@emalm who responded to the initial GitHub issue)
@drnic can you create an issue on https://github.com/cloudfoundry/cloud_controller_ng/issues since this is primarily a CAPI/Diego/Buildpacks Team (via the rootfs and launcher) interaction? We'll be able to track it better there. @cwlbraa has some thoughts on what we might be able to do.
Edit: Never mind for this part, Abby moved it. ^^
Also going to tag @heyjcollins, @sunjayBhatia, @shanks3012 for awareness and Diego/Buildpacks context.
I feel like this might be a rootfs change to add a login `.profile` for vcap that does the equivalent of https://github.com/cloudfoundry/buildpackapplifecycle/blob/master/launcher/launcher_unix.go
... the missing env vars, afaict, are the ones that buildpacks write into droplets in the files under `/home/vcap/.profile.d/*`
... it's not necessarily straightforward for CAPI to pull those out of the droplet after staging, though that's another possible way of getting at them.
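A minimal sketch of what such a vcap login profile might do, assuming the scripts live under `/home/vcap/.profile.d` as described above (`load_profile_d` and its directory argument are hypothetical, for illustration only):

```shell
# Hypothetical helper illustrating the rootfs .profile idea: source every
# buildpack-written script under a profile.d directory, roughly mirroring
# what launcher_unix.go does when the app process starts.
load_profile_d() {
  dir="${1:-/home/vcap/.profile.d}"
  for f in "$dir"/*.sh; do
    # Skip the literal glob pattern when the directory has no *.sh files.
    [ -r "$f" ] && . "$f"
  done
  return 0
}
```

In a real rootfs change this loop would live in vcap's login `.profile`; the concerns about re-running buildpack-provided scripts raised later in this thread would still apply.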
Thank you @tcdowney, appreciate the help. Moving this over to the CAPI github.
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/165577320
The labels on this github issue will be updated when the story is started.
@drnic FYI, the CF docs at https://docs.cloudfoundry.org/devguide/deploy-apps/ssh-apps.html#ssh-env refer to a `/tmp/lifecycle/shell` helper binary that can replace the `/tmp/lifecycle/launcher "app" "$SHELL" ""` command.
I recall that after we encountered the port-mapping issues on Windows 2012R2 containers (cf-dev context at https://lists.cloudfoundry.org/g/cf-dev/topic/6332994), we also realized that there's a theoretical problem with running the `launcher` command more than once: since it always invokes the `.profile.d` shell scripts, and since those scripts are arbitrary buildpack-provided content, there's no guarantee that they're idempotent or re-entrant. Ideally those scripts are purely local, in that they're modifying only the environment of the process that sources them, but they could be modifying the mutable contents of the app in arbitrary ways (such as moving files or modifying their contents). For example, the Staticfile buildpack `mv`s and then `erb`s the provided nginx.conf file in https://github.com/cloudfoundry/staticfile-buildpack/blob/da1ba710b62633a9a922037e0bd63536203cf413/src/staticfile/finalize/data.go#L22-L23, and while I'd guess that this setup is idempotent, I think it may not be re-entrant without a mutex around those two lines.
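To make the re-run hazard concrete, here is a hypothetical `profile.d`-style step (not the actual Staticfile code) that moves a file into place: the first invocation succeeds, and every later one fails because the source file is gone.

```shell
# Hypothetical, non-idempotent profile.d-style step: mv an app-provided
# config into place. Running it a second time fails because the source
# no longer exists.
setup_conf() {
  mv "$1/nginx.conf" "$1/nginx-final.conf"
}

dir=$(mktemp -d)
echo "server {}" > "$dir/nginx.conf"
setup_conf "$dir"                        # first run: moves the file
if setup_conf "$dir" 2>/dev/null; then   # second run: source is gone
  rerun_result=ok
else
  rerun_result=failed
fi
```

A script like this run at every `cf ssh` would error (or, worse, half-apply) on each session after the first, which is exactly why blindly re-sourcing buildpack scripts is risky.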
Thanks for mentioning the slightly nicer `/tmp/lifecycle/shell`.
I'll follow up on Slack about the second paragraph. I'm not sure if you're writing good news or bad news.
Hey, @drnic, that second paragraph is likely bad news for previous ways we thought we could solve the environment duplication problem: they had involved running either the `diego-sshd` server or each one of the processes it spawns through the `launcher` binary that loads the `profile.d` scripts, but that's not always safe to do. Turns out we had some other discussion of this issue on https://github.com/cloudfoundry/diego-ssh/issues/29#issuecomment-276926719, too, but I don't think we've considered it much since then.
@emalm Is the issue above only for Windows? Could `cf ssh` have a different code path for Windows vs. Linux containers, and auto-run the `/tmp/lifecycle/shell` command only when `cf ssh`-ing into Linux?
No, the issue is with the potential danger in invoking the buildpack-provided `profile.d/*.sh` scripts more than once, which is independent of the OS. There was a separate issue on Windows 2012R2 that we were able to resolve.
Where is `/tmp/lifecycle/shell` implemented? Could we have a slightly different one, say `/tmp/lifecycle/ssh-shell`, that is invoked transparently during `cf ssh` and that doesn't reload the `profile.d/*.sh` scripts into the same env?
Actually, I'm still a bit confused about why it's recommended that I manually run `/tmp/lifecycle/shell` after `cf ssh`, but we can't have this done automatically for us. Damn. Sorry.
It's in the buildpackapplifecycle, at https://github.com/cloudfoundry/buildpackapplifecycle/blob/master/shell/shell.go. But it doesn't sound like you want a version that skips running the `profile.d/*.sh` scripts, because in the case of your Ruby app those scripts set up the Ruby-specific env vars (`BUNDLE_PATH`, `GEM_HOME`, `GEM_PATH`, etc.) that you need for `bundle` and so forth.
The platform currently has no idea, for a particular application, whether it's safe to run the `profile.d/*.sh` scripts repeatedly in its instances, because they are completely opaque to it. Even if all the system-provided buildpacks provide safe scripts, some app on the platform may have used an externally hosted one whose scripts aren't safe to re-run. The app developer is able to reason about those scripts, though: they can either know that the buildpack they have used for the app provides re-runnable ones, or they can `cf ssh` into the container and inspect all of them to make sure before deciding whether to invoke them via `/tmp/lifecycle/shell`. (To be honest, I think https://docs.cloudfoundry.org/devguide/deploy-apps/ssh-apps.html#ssh-env deserves a warning that app devs should do this inspection.) So for the time being we've left it opt-in for the sake of safety. I agree that this experience could be better, but naively re-running those scripts in all cases (whether at the time of launching the `diego-sshd` server or at the invocation of each one-off `cf ssh` command) sounds like a recipe for subtle, non-deterministic bugs that actually affect app behavior (such as crashing on startup or corrupting configuration or app content).
Also /cc @sclevine in case anything about Buildpacks v3 makes this environment-variable context easier to arrange.
What if there was a way for app devs to opt into running the `/tmp/lifecycle/shell` script using a flag on `cf ssh`? Maybe something like `cf ssh --environment=full` (open to better naming here). This would probably be more discoverable/intuitive for CLI users than the section in the docs Eric linked.
As a proof of concept, `ssh -p 2222 -t cf:<app-guid>/0@ssh.<domain> "/tmp/lifecycle/shell"` appears to do the right thing (using `cf ssh-code` to get the password).
What do you think @drnic @emalm? Will this still be an issue in the 4k8s world?
I like the idea of a flag, albeit I'd prefer that the default was this full-environment behavior, with a flag for getting the raw container instead.
I paid for Alfred so I could have clipboard shortcuts, and I paste in the magic line that enhances my ssh container. But that's just hiding the pain and confusion, and yes, this still needs some UX empathy and love.
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/174631297
The labels on this github issue will be updated when the story is started.
Workarounds:
- get envs from `/proc/<PID>/environ`
- start a "sidecar" server in the entrypoint on port `:9999` which responds with the environment it has, and then, when executing the task or ssh, run `eval $(curl localhost:9999)` before the actual command
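A sketch of the first workaround, assuming a Linux container and bash (`import_environ` is a hypothetical helper, not part of the lifecycle). Note that `/proc/<PID>/environ` holds NUL-separated `KEY=VALUE` pairs as they were at the process's exec time, so changes the app later makes to its own environment won't appear there.

```shell
# Hypothetical helper: export the environment of a running process (e.g.
# the app process, typically PID 1 in the container) into this shell.
# /proc/<PID>/environ is a NUL-separated list of KEY=VALUE entries;
# `read -d ''` (a bashism) splits on the NULs, so values containing
# newlines survive where a naive `tr '\0' '\n'` pipeline would mangle them.
import_environ() {
  while IFS= read -r -d '' kv; do
    export "$kv"
  done < "/proc/$1/environ"
}
```

Inside a `cf ssh` session this would typically be invoked as `import_environ 1`.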
The documented approach with `/tmp/lifecycle/shell` doesn't work for apps deployed with Docker images, because `/tmp/lifecycle/shell` is provided by the buildpacks. What about `diego-sshd` sourcing the env from `/proc/<PID>/environ`?