Unable to dial connection to server: dial tcp :23760: connect: connection refused

Open dhulmul opened this issue 5 years ago • 11 comments

I am using AWS Systems Manager (SSM) to open a connection from my local machine to a backend server in a private network, via a jump server. In the amazon-ssm-agent logs on the jump server, I constantly see this error message:

ERROR [ssm-session-worker] [***] [DataBackend] [pluginName=Port] Unable to dial connection to server: dial tcp :23760: connect: connection refused

Restarting the jump server sometimes makes the error go away. Could anyone suggest what might be going wrong here?

dhulmul avatar Feb 22 '21 10:02 dhulmul

Can you please provide the following details to help us investigate the issue?

  • amazon-ssm-agent version
  • OS version and arch
  • Any relevant network configuration (e.g. jump server config)
  • Details on how you are opening the connection (e.g. request parameters)
  • Is this only occurring on one host, or is it happening when connecting from multiple hosts?
  • After restarting the jump server, how long before the problem starts happening again?

Thank you.

danr-amz avatar Feb 24 '21 22:02 danr-amz

  • amazon-ssm-agent version

  Name    : amazon-ssm-agent
  Arch    : aarch64
  Version : 3.0.655.0

  • OS version and arch

  NAME="Amazon Linux"
  VERSION="2"
  ID="amzn"
  ID_LIKE="centos rhel fedora"
  VERSION_ID="2"
  PRETTY_NAME="Amazon Linux 2"
  ANSI_COLOR="0;33"
  CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
  HOME_URL="https://amazonlinux.com/"

  • Any relevant network configuration (e.g. jump server config)

None I can think of, let me know if you need any specifics.

  • Details on how you are opening the connection (e.g. request parameters)

The following command runs on the local machine:

  CMD="'sudo socat -d -d TCP4-LISTEN:$REM_PORT,fork TCP4:$REMOTE_HOST'"
  INVOCATION_COMMAND_ID=$(aws ssm send-command --instance-ids "${INSTANCE_ID}" \
    --document-name 'AWS-RunShellScript' \
    --parameters "commands=$CMD,executionTimeout=$TIMEOUT" --output json | jq -r '.Command.CommandId')

  aws ssm start-session --target "${INSTANCE_ID}" \
    --document-name "AWS-StartPortForwardingSession" \
    --parameters "portNumber=$REM_PORT,localPortNumber=$LOCAL_PORT"

  • Is this only occurring on one host, or is it happening when connecting from multiple hosts?

This happens on multiple jump server instances, and when connecting to different backend systems, database and application servers alike.

  • After restarting the jump server, how long before the problem starts happening again?

It's erratic. If there are multiple connections in a short period, the problem appears very soon, within about a minute. Sometimes it works fine for hours, and then the problem persists for a long time.

Another observation: old sessions seem to linger when I go to AWS Systems Manager->Session Manager and check the sessions. I suspect adding the reuseaddr option to the socat command might fix that? Also, in the Session History tab, there are sessions that have been in the "Terminating" state for more than a week!
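
For the lingering sessions mentioned above, one possible cleanup is to list the sessions still reported as Active for the jump server and terminate them from the CLI. This is only a sketch: the function name terminate_active_sessions is made up for illustration, it assumes the AWS CLI is configured with permission to call describe-sessions and terminate-session, and it only targets sessions in the Active state, not ones already shown as "Terminating".

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the AWS CLI): terminate every session
# that Session Manager still reports as Active for one target instance.
terminate_active_sessions() {
  local instance_id="$1" session_id
  # --output text prints the session IDs separated by whitespace,
  # so plain word splitting is enough to iterate over them.
  for session_id in $(aws ssm describe-sessions --state Active \
      --filters "key=Target,value=${instance_id}" \
      --query 'Sessions[].SessionId' --output text); do
    echo "terminating ${session_id}"
    aws ssm terminate-session --session-id "${session_id}" >/dev/null
  done
}

# Usage (assumes INSTANCE_ID is set as in the commands above):
#   terminate_active_sessions "${INSTANCE_ID}"
```

Note that this ends every active session on that instance, including ones other people may be using, so it is probably best run only against a jump server you own.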

dhulmul avatar Feb 25 '21 10:02 dhulmul

@danr-amz any updates? I can still see sessions that have been in the "Terminating" state for more than 2 weeks in the Session History tab.

dhulmul avatar Mar 05 '21 00:03 dhulmul

Hey @dhulmul, the terminating session issue has been resolved. Can you confirm that on your side they are no longer in the terminating state?

Thor-Bjorgvinsson avatar Mar 30 '21 15:03 Thor-Bjorgvinsson

Hi @dhulmul, are you still seeing this issue?

VishnuKarthikRavindran avatar Apr 15 '21 17:04 VishnuKarthikRavindran

@Thor-Bjorgvinsson @VishnuKarthikRavindran I can't see any sessions in the terminating state as of now. Could you please tell us what the issue was? Thank you.

dhulmul avatar Apr 17 '21 20:04 dhulmul

I'm trying to set up port forwarding to another host (DocumentDB), but am running into some challenges.

May I ask which AMI you're using that has socat installed on it? I'm running my EC2 machine in a private subnet without an internet connection (by design), and can't simply yum install it.

The image I'm currently using is based on resolve:ssm:/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2 by the way.

ronkorving avatar Oct 20 '21 03:10 ronkorving

Did you manage to find a resolution for this issue?

kkrav3ts avatar Jan 21 '22 09:01 kkrav3ts

I've been able to resolve my issues 👍 (but I didn't create this issue)

ronkorving avatar Jan 31 '22 10:01 ronkorving

@ronkorving I am also facing the same issue. Could you please let me know the solution? It would be helpful for me. Thank you.

Khayuum avatar Jul 13 '22 06:07 Khayuum

@Khayuum AWS recently released the AWS-StartPortForwardingSessionToRemoteHost document. Look for that :)

e.g.:

aws ssm start-session \
  --target "${INSTANCE_ID}" \
  --document-name AWS-StartPortForwardingSessionToRemoteHost \
  --parameters "{\"host\":[\"${REMOTE_HOST}\"],\"portNumber\":[\"${REMOTE_PORT}\"], \"localPortNumber\":[\"${LOCAL_PORT}\"]}"
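
The hand-escaped JSON in --parameters is easy to get wrong, so here is a rough sketch that builds the same string in a small function. The names build_params and start_forwarding are invented for this example, and it assumes the host and port values contain no quotes or backslashes (plain hostnames and port numbers).

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit the --parameters JSON for
# AWS-StartPortForwardingSessionToRemoteHost.
# Arguments: remote host, remote port, local port.
# Assumption: the values need no JSON escaping.
build_params() {
  printf '{"host":["%s"],"portNumber":["%s"],"localPortNumber":["%s"]}' \
    "$1" "$2" "$3"
}

# Hypothetical wrapper around the start-session call shown above.
# Arguments: instance id, remote host, remote port, local port.
start_forwarding() {
  aws ssm start-session \
    --target "$1" \
    --document-name AWS-StartPortForwardingSessionToRemoteHost \
    --parameters "$(build_params "$2" "$3" "$4")"
}

# Usage:
#   start_forwarding "${INSTANCE_ID}" "${REMOTE_HOST}" "${REMOTE_PORT}" "${LOCAL_PORT}"
```

With this document the agent on the jump server connects to the remote host itself, so the separate socat relay from the earlier comment should not be needed.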

ronkorving avatar Jul 13 '22 10:07 ronkorving