PCM deployment is overriding default behavior of SSM Sessions

Open demartinofra opened this issue 2 years ago • 5 comments

Hi,

When deploying pcluster-manager, the following substack updates the SSM-SessionManagerRunShell document, overriding the default document with a hardcoded version. As documented here, the SSM-SessionManagerRunShell document controls the default SSM session settings for the account at the region level.

I have the following concerns:

  1. The default SSM session settings are overwritten with a static default.
  2. The updated document changes the default SSM user on every node where the /opt/parallelcluster directory is found. Users expect the default ssm-user to be used, but instead they automatically land on the cluster nodes as the default cluster user. Also, if this customization is triggered on an arbitrary node where the /opt/parallelcluster directory happens to be present, the execution will simply fail.
  3. The command in the document relies on internal pcluster variables that might change at some point, with the risk of breaking SSM session access.
  4. [minor] The change persists even after the pcluster-manager stack is deleted.

Can you share details on why this is necessary and whether this configuration can be done at a more scoped level?

Cheers, Francesco

demartinofra avatar Oct 19 '22 15:10 demartinofra

A few notes here:

  • There's no way to specify the user to connect as when you run SessionManagerRunShell - the default is ssm-user
  • There's also no way to specify a separate SSM document when you do the shell connect. You need to use the default.
  • The script will only change the user if it detects the /opt/parallelcluster/cfnconfig file. If not it'll keep the default behaviour.

I wish there was a better way to do this. Maybe we could raise a feature request with the SSM team to accept either the user or the document as a parameter, but for now this is the best way to ensure the correct user is set.
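
For illustration, the check described in the last bullet can be sketched in Python. This is not the actual command shipped in the SSM document, only a rough model of its logic. The `cfn_cluster_user` key and the key=value layout of the cfnconfig file are assumptions, and `pick_session_user` is a hypothetical helper.

```python
from pathlib import Path

DEFAULT_USER = "ssm-user"
# Path mentioned in the thread; its contents are assumed to be key=value lines.
CFNCONFIG = Path("/opt/parallelcluster/cfnconfig")


def pick_session_user(cfnconfig: Path = CFNCONFIG) -> str:
    """Return the cluster user if cfnconfig names one, else the SSM default."""
    if not cfnconfig.is_file():
        # Not a cluster node: keep the default behaviour.
        return DEFAULT_USER
    for line in cfnconfig.read_text().splitlines():
        key, sep, value = line.partition("=")
        # "cfn_cluster_user" is an assumed key name (an internal pcluster variable).
        if sep and key.strip() == "cfn_cluster_user" and value.strip():
            return value.strip().strip("\"'")
    # File exists but does not name a user: fall back instead of failing.
    return DEFAULT_USER
```

The final fallback matters for the failure case raised in the issue: a node that has the directory or file for some unrelated reason keeps ssm-user rather than breaking the session.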

sean-smith avatar Oct 19 '22 17:10 sean-smith

Thanks Sean! What is driving the need to switch user before the session starts, rather than starting the session as the default ssm-user and then switching user?

demartinofra avatar Oct 20 '22 16:10 demartinofra

AWS has landing zone accelerators that also create or modify the default SSM-SessionManagerRunShell document to enable encryption, centralized logging, and a few other settings in line with security best practices. They also have SCPs that deny access to that document. If we deploy pcluster-manager in those environments, the stack creation fails because the lambda that modifies the document fails to execute. If the SCP is removed, the lambda updates the document and overrides all of those security best practices.

I have a couple ideas that I think might work:

  1. I think it would be better to create a whole new session document for pcluster-manager purposes and require users to use it when connecting through SSM.
  2. Update the lambda to read the document content first and append the linuxcmd parameters to it.
  3. Alternatively, the same linuxcmd could be appended to /etc/profile so it gets executed on any shell login. Any similar approach (init scripts) would also work.
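
A minimal sketch of idea 2, assuming the "linuxcmd" corresponds to the `inputs.shellProfile.linux` field of the session document. The helper `append_linux_profile`, the example document values, and the appended command are all hypothetical. Fetching and writing the document (for example with boto3's `get_document` and `update_document`) is left out so the merge logic stands on its own.

```python
import copy
import json


def append_linux_profile(document: dict, extra_command: str) -> dict:
    """Return a copy of a session document with extra_command appended to
    inputs.shellProfile.linux, leaving every other setting untouched."""
    merged = copy.deepcopy(document)
    inputs = merged.setdefault("inputs", {})
    profile = inputs.setdefault("shellProfile", {})
    existing = profile.get("linux", "") or ""
    # Skip the append if the command is already there, so reruns are harmless.
    if extra_command not in existing:
        profile["linux"] = f"{existing}\n{extra_command}" if existing else extra_command
    return merged


# Example values only: stands in for a document already hardened by a landing zone.
existing_doc = {
    "schemaVersion": "1.0",
    "description": "Session Manager preferences (example values)",
    "sessionType": "Standard_Stream",
    "inputs": {
        "cloudWatchLogGroupName": "example-log-group",
        "cloudWatchEncryptionEnabled": True,
        "shellProfile": {"linux": "", "windows": ""},
    },
}

# Placeholder command, not the one pcluster-manager actually uses.
updated = append_linux_profile(existing_doc, "echo hypothetical-user-switch")
print(json.dumps(updated, indent=2))
```

Because the function only touches the shell profile, logging and encryption settings written by other tooling survive the update. It does not help where an SCP denies access to the document altogether.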

joshvmaws avatar Nov 10 '22 22:11 joshvmaws

@joshvmaws Is there any way to point to a specific document when connecting? The link we're using is:

https://[region].console.aws.amazon.com/systems-manager/session-manager/[instance_id]?region=[region]

When we implemented this (Nov 2021) there wasn't a way to select a specific SSM document when connecting. Maybe this has changed?

sean-smith avatar Nov 10 '22 22:11 sean-smith

In the CLI there is a parameter you can pass: --document-name. There must be a corresponding parameter for the web session, but I haven't found it yet.
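
As a rough sketch of the CLI route, the Python snippet below builds an `aws ssm start-session` command that passes `--document-name`. The instance ID, region, and document name are placeholders, and `start_session_command` is a hypothetical helper; the document name in particular assumes a dedicated session document has been created.

```python
import shlex


def start_session_command(instance_id, document_name, region=None):
    """Build the argv for an SSM session that uses a specific session document."""
    cmd = [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", document_name,
    ]
    if region:
        cmd += ["--region", region]
    return cmd


# Placeholder values; "Hypothetical-PclusterSession" is not a real document.
command = start_session_command(
    "i-0123456789abcdef0", "Hypothetical-PclusterSession", "us-east-1"
)
print(shlex.join(command))
```

Running the resulting command requires the AWS CLI with the Session Manager plugin installed, so this only covers terminal access and not the console link discussed above.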

joshvmaws avatar Nov 10 '22 22:11 joshvmaws