terraform-aws-gitlab-runner icon indicating copy to clipboard operation
terraform-aws-gitlab-runner copied to clipboard

enable_manage_gitlab_token = false does not store runner token

Open rewb0rn opened this issue 3 years ago • 6 comments
trafficstars

Hello,

when starting the runner with enable_manage_gitlab_token = false the script is supposed to set the token in SSM. However there is an error due to a missing permission.

A workaround is to put the token value manually into SSM.

This is the error from the EC2 instance user-data:

aws ssm put-parameter --overwrite --type SecureString --name gitlab-runner-manager-runner-token --value=XXX --region eu-central-1

An error occurred (AccessDeniedException) when calling the PutParameter operation: User: arn:aws:sts::XXX:assumed-role/gitlab-runner-manager-instance/i-XXX is not authorized to perform: ssm:PutParameter on resource: arn:aws:ssm:eu-central-1:XXX:parameter/gitlab-runner-manager-runner-token because no identity-based policy allows the ssm:PutParameter action

rewb0rn avatar Jun 18 '22 22:06 rewb0rn

From the docs:

"Boolean to enable the management of the GitLab token in SSM. If `true` the token will be stored in SSM, which means the SSM 
property is a terraform managed resource. If `false` the Gitlab token will be stored in the SSM by the user-data script during 
creation of the the instance. However the SSM parameter is not managed by terraform and will remain in SSM after a `terraform 
destroy`."

I checked the code and found that enable_manage_gitlab_token steers the creation of the ssm:PutParameter and ssm:GetParameter authorization only. And as explained in the docs the token will be stored in an SSM parameter always. This is what you have seen in your logs.

Suggestion is to remove this enable_manage_gitlab_token variable as the policy is always needed and no switch is needed.

kayman-mk avatar Jun 19 '22 11:06 kayman-mk

Hmm I am not sure if there is a misunderstanding. There is a definitely a difference in the behavior in that the SSM parameter is not removed when enable_manage_gitlab_token is set to false. That allows to reuse the exact same runner token the next time terraform apply is called (contrary to registering a new runner when it's set to true). I do not agree that the parameter should be removed. It does however, to my understanding, not function correctly the first time terraform apply runs, because the token is not correctly stored by the user script when the token parameter in SSM is null. Thats when the error in my first post is logged and this can probably be fixed by adding the correct permission. After using the workarround of creating the SSM parameter manually, subsequent calls to terraform destroy / terraform apply function correctly and the SSM parameter is no longer touched, which is the desired behavior for enable_manage_gitlab_token = false

rewb0rn avatar Jun 20 '22 15:06 rewb0rn

All right, so let's check what's going on here. I searched in version 5.0.2 for usages of var.enable_manage_gitlab_token. There are 2 usages: main.tf line 472 and 482. The resources created there are policies. These policies are never referenced.

In gitlab-runner.tpl I found this code

# fetch Runner token from SSM and validate it
token=$(aws ssm get-parameters --names "${secure_parameter_store_runner_token_key}" --with-decryption --region "${secure_parameter_store_region}" | jq -r ".Parameters | .[0] | .Value")

valid_token=true
if [[ `echo $token` != "null" ]]
then
  valid_token_response=$(curl -s -o /dev/null -w "%%{response_code}" --request POST -L "${runners_gitlab_url}/api/v4/runners/verify" --form "token=$token" )
  [[ `echo $valid_token_response` != "200" ]] && valid_token=false
fi

if [[ `echo ${runners_token}` == "__REPLACED_BY_USER_DATA__" && `echo $token` == "null" ]] || [[ `echo $valid_token` == "false" ]]
then
  token=$(curl --request POST -L "${runners_gitlab_url}/api/v4/runners" \
    --form "token=${gitlab_runner_registration_token}" \
    --form "tag_list=${gitlab_runner_tag_list}" \
    --form "description=${giltab_runner_description}" \
    --form "locked=${gitlab_runner_locked_to_project}" \
    --form "run_untagged=${gitlab_runner_run_untagged}" \
    --form "maximum_timeout=${gitlab_runner_maximum_timeout}" \
    --form "access_level=${gitlab_runner_access_level}" \
    | jq -r .token)
  aws ssm put-parameter --overwrite --type SecureString  --name "${secure_parameter_store_runner_token_key}" --value="$token" --region "${secure_parameter_store_region}"
fi

Thus the instance always needs the policy to access the SSM (at least GetParameter and if the runner is not registered, it needs PutParameter too).

The runner token SSM parameter is created in line 12 and if you run a terraform destroy the SSM parameter will be removed. At least I do not see any special logic here which prevents it.

Am I overlooking something?

kayman-mk avatar Jun 20 '22 19:06 kayman-mk

Thanks for taking the time! I am new to terraform and AWS so I am not sure I can help with debugging. But per the description of the parameter, the desired functionality would be to not create or destroy the SSM parameter with terraform apply and terraform destroy and instead only read / write it in the EC2 instance, correct? At least how I understand the description, the desired behavior is that the token is stored and retrieved from SSM but not being removed on terraform destroy, thus being not managed by terraform but still being used for creating the runner. When terraform apply is called, the token parameter is currently created in SSM with a value of null, regardless whether enable_manage_gitlab_token is true or false. Would it help to grant the GetParameter and PutParameter policies to the instance but remove the creation of the SSM parameter from terraform apply when its false?

rewb0rn avatar Jun 21 '22 08:06 rewb0rn

According to the docs it should work the way you described it. But it looks totally different in the code.

Can you please shed some light on this, @npalm

kayman-mk avatar Jun 21 '22 21:06 kayman-mk

I also think we should remove old scenarios that were kept for backwards compatibility. I think we should only support to manage the token via SSM.

npalm avatar Jul 20 '22 21:07 npalm

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 15 days.

github-actions[bot] avatar Jan 01 '23 03:01 github-actions[bot]