cdk-github-runners icon indicating copy to clipboard operation
cdk-github-runners copied to clipboard

feature request: attach custom tags to the ec2

Open pharindoko opened this issue 1 year ago • 17 comments

Hey @kichik,

I need the possiblity to attach custom labels to an ec2. Background: I want to be able to track the costs in cost explorer based on certain labels.

  1. I imagine to have an optional step in the stepfunction in between to match labels for certain runners based on the data provided. Basically a lambda that I can provide to return back labels in a certain format.
  2. These labels will be attached additionally to the new created ec2.

br,

@pharindoko

pharindoko avatar Mar 17 '24 09:03 pharindoko

Do you mean tags?

kichik avatar Mar 17 '24 11:03 kichik

Do you mean tags?

Yes I mean tags btw 😄

pharindoko avatar Mar 17 '24 13:03 pharindoko

I believe it should already be copying the provider tags to the EC2 instance. So you should be able to use cdk.Tags.of(provider).add('tagName', 'tagValue'). Does that not work? Or do you need it even more dynamic than that?

kichik avatar Mar 17 '24 18:03 kichik

I believe it should already be copying the provider tags to the EC2 instance. So you should be able to use cdk.Tags.of(provider).add('tagName', 'tagValue'). Does that not work? Or do you need it even more dynamic than that?

I would need it more dynamic. Having a lambda function or some typescript function in between, that I can overwrite, would be nice. I need to do some mapping and match values.

pharindoko avatar Mar 17 '24 19:03 pharindoko

What data would this Lambda need? Requested labels? The entire webhook?

kichik avatar Mar 17 '24 19:03 kichik

What data would this Lambda need? Requested labels? The entire webhook?

The entire webhook to get things like org, repo, job run, step and the labels.

pharindoko avatar Mar 17 '24 20:03 pharindoko

I'll have to think about the best way to enable such a feature request.

For now you can use the environment variables already supplied by the actions runner code. GITHUB_REPOSITORY, GITHUB_RUN_ID, etc. are already there. Check out the integration test. It runs export to show them all. You can then give your instances permission to tag themselves. If you want to be real fancy, you can have a runner hook automatically take care of it. One of the discussion threads has a usage example.

kichik avatar Mar 17 '24 21:03 kichik

Will do. Thanks for the hint @kichik.

pharindoko avatar Mar 17 '24 22:03 pharindoko

I'll have to think about the best way to enable such a feature request.

For now you can use the environment variables already supplied by the actions runner code. GITHUB_REPOSITORY, GITHUB_RUN_ID, etc. are already there. Check out the integration test. It runs export to show them all. You can then give your instances permission to tag themselves. If you want to be real fancy, you can have a runner hook automatically take care of it. One of the discussion threads has a usage example.

I had some time to really understand the solution you propose. Yes this definitely will work but I fear that this could be hard to debug, monitor and to modify.

Could it be another option to provide some kind of pre-deployment step in the stepfunction that can be overwritten to select and add tags that can be attached e.g. to the ec2 or other components ?

pharindoko avatar Apr 03 '24 21:04 pharindoko

I still don't have a good idea of what something like that would look like. For example, one of the big problems is that it can't be generic. Settings tags is only supported natively on EC2 and through some hacks on ECS (RunTask doesn't have an option for tags).

Maybe some sort of an escape hatch is in place. Even that I'm not sure how to do elegantly as task objects don't have any option to override until they're rendered and then you have to deal with the entire step function JSON. We would have to take out all of the parameters into an object, pass it to a custom lambda, and then use the results as input for the actual step. But then when this is not used, we just skip the pass and directly pass the parameters?

kichik avatar Apr 05 '24 14:04 kichik

I still don't have a good idea of what something like that would look like. For example, one of the big problems is that it can't be generic. Settings tags is only supported natively on EC2 and through some hacks on ECS (RunTask doesn't have an option for tags).

Maybe some sort of an escape hatch is in place. Even that I'm not sure how to do elegantly as task objects don't have any option to override until they're rendered and then you have to deal with the entire step function JSON. We would have to take out all of the parameters into an object, pass it to a custom lambda, and then use the results as input for the actual step. But then when this is not used, we just skip the pass and directly pass the parameters?

  1. Hmm got it. Yes so to do it in a generic manner doesn`t make sense.

But to do it specifically for the ec2 provider could be an option. e.g. here: https://github.com/CloudSnorkel/cdk-github-runners/blob/f64d5a4c0f23abbe98cfc4dc6532c107b33d56f6/src/providers/ec2.ts#L415

  • The lambda for the step function in that case has to be injected as Lambda.function.
  /**
   * Determine custom tags to assign to new runners.
   *
   * @default null
   */
  readonly customTagsFunction?: lambda.Function;

}
  • optional step will be added
getStepFunctionTask(parameters: RunnerRuntimeParameters): stepfunctions.IChainable {
    // we need to build user data in two steps because passing the template as the first parameter to stepfunctions.JsonPath.format fails on syntax

    const params = [
      stepfunctions.JsonPath.taskToken,
      this.logGroup.logGroupName,
      parameters.runnerNamePath,
      parameters.githubDomainPath,
      parameters.ownerPath,
      parameters.repoPath,
      parameters.runnerTokenPath,
      this.labels.join(','),
      parameters.registrationUrl,
    ];

    if (this.customTagsFunction) {
      const customTags = new stepfunctions_tasks.LambdaInvoke(
        this,
        'Get custom tags',
        {
          lambdaFunction: this.customTagsFunction,
          payloadResponseOnly: true,
          resultPath: '$.tags',
          payload: stepfunctions.TaskInput.fromObject({
            runnerName: stepfunctions.JsonPath.stringAt('$$.Execution.Name'),
            owner: stepfunctions.JsonPath.stringAt('$.owner'),
            repo: stepfunctions.JsonPath.stringAt('$.repo'),
            installationId: stepfunctions.JsonPath.numberAt('$.installationId'),
            error: stepfunctions.JsonPath.objectAt('$.error'),
          }),
        },
      );
    };

    const passUserData = new stepfunctions.Pass(this, `${this.labels.join(', ')} data`, {
      parameters: {
        userdataTemplate: this.ami.os.is(Os.WINDOWS) ? windowsUserDataTemplate : linuxUserDataTemplate,
      },
      resultPath: stepfunctions.JsonPath.stringAt('$.ec2'),
    });

pharindoko avatar Apr 06 '24 23:04 pharindoko

It just occurred to me that assigning the tags before a job is assigned to the runner may result in unexpected results. Especially if you're using organization level registration. There is no guarantee the runner will pick up the job you think it will if the labels are shared.

If the labels are not shared, each provider can use its own static tags. This won't work right now as the tags come from the launch template which is generated by the image builder. I think that part might be resolved by fleets (#518). My current plan is to move EC2 provider to using fleets with SSM document. It would require the provider to create its own launch template. Once it's the provider creating the launch template, using cdk.Tags.of(provider).add(...) would work as expected.

I still can't get behind a function specifically for tags specifically for EC2 provider. If we could think of a way to make a generic escape hatch where you have a way to edit the step function after being generated, I think that would make more sense. You can technically already do it by overriding the entire JSON. But that's not a realistic method as you'd have to recreate everything.

kichik avatar Apr 11 '24 13:04 kichik

I haven`t thought about this - but auto tagging should help me ... https://medium.com/@mithun.kadyada/lambda-function-for-auto-tagging-ec2-instances-cff38c30bdc8

One thing to mention: it would be nice if you could add organization and repository statically to each ec2 machine created by the ec2-provider. Is this possible ? @kichik

pharindoko avatar Apr 18 '24 19:04 pharindoko

How would the auto-tagging Lambda have access to the webhook info needed to properly tag the instance?

it would be nice if you could add organization and repository statically to each ec2 machine created by the ec2-provider. Is this possible ?

Not sure what you mean by that. Are you asking to have access to the repository or organization where the instance was registered? As in some tag that the Lambda function could later read?

kichik avatar Apr 19 '24 17:04 kichik

How would the auto-tagging Lambda have access to the webhook info needed to properly tag the instance?

it would be nice if you could add organization and repository statically to each ec2 machine created by the ec2-provider. Is this possible ?

Not sure what you mean by that. Are you asking to have access to the repository or organization where the instance was registered? As in some tag that the Lambda function could later read?

Yes I would like to have a simple way to read out the relation between the runner and the github org + repo when the runner is started.

pharindoko avatar Apr 20 '24 00:04 pharindoko

Might this information be available in the state machine log?

kichik avatar Apr 24 '24 13:04 kichik

Yes that's true but hard to query for each machine I would assume.

I would try to get the information when a new instance is started. And I assume it's easier to read the information from the ec2's instance tags.

pharindoko avatar Apr 24 '24 15:04 pharindoko

Hey @kichik,

I would like to get more overview about the costs. it would be still nice to have the tags for owner + repo or set all of the params as tags. Attaching tags directly is the easiest way. I understand that not all providers support tags, but it would be nice if we can activate it at least for ec2s. Query the logs and attach the tags seems to me like a workaround.

The params are already there: https://github.com/CloudSnorkel/cdk-github-runners/blob/f0aebce56e38e7e7c37603db2313a2f0b61483a7/src/providers/ec2.ts#L407

Would be great to append them here: https://github.com/CloudSnorkel/cdk-github-runners/blob/f0aebce56e38e7e7c37603db2313a2f0b61483a7/src/providers/ec2.ts#L447

  TagSpecifications: [ 
    { 
      ResourceType: "instance",
      Tags: [ // TagList
        { // Tag
          Key: "STRING_VALUE",
          Value: "STRING_VALUE",
        },
      ],
    },
  ],

I could create a PR and test it. Will I have your support ?

pharindoko avatar May 27 '24 15:05 pharindoko

Turns out all resources do support tags at deploy time. So we should be able to tag with the provider construct path and its labels. The problem is AWS tags don't allow commas. Runner labels don't seem to have any restrictions besides commas, so I'm not sure what's the best solution for that would be.

As for tagging with the owner and repo, I still don't see how we can do that in a generic and/or non-clunky way. We can't trust what the step function registers it as because of organization level registration. Adding a separate Lambda to query it from GitHub once the runner is assigned sounds as clunky as reading the log. And it will have limited provider support. At that point I would still prefer the self-tagging route.

kichik avatar May 29 '24 13:05 kichik

Would #582 help? Would it be enough?

kichik avatar May 29 '24 23:05 kichik

Turns out all resources do support tags at deploy time. So we should be able to tag with the provider construct path and its labels. The problem is AWS tags don't allow commas. Runner labels don't seem to have any restrictions besides commas, so I'm not sure what's the best solution for that would be.

As for tagging with the owner and repo, I still don't see how we can do that in a generic and/or non-clunky way. We can't trust what the step function registers it as because of organization level registration. Adding a separate Lambda to query it from GitHub once the runner is assigned sounds as clunky as reading the log. And it will have limited provider support. At that point I would still prefer the self-tagging route.

Thanks that you took the time to write. You're right. In this case I would even prefer a query to Github once the runner is assigned or as you mentioned self-tag the instance. There was a hook available right ? Would mean I could add some custom shell script to self-tag.

pharindoko avatar May 30 '24 07:05 pharindoko

Would #582 help? Would it be enough?

This was basically what I meant but I would need more information. :) and you already delivered the reason why it's a bad idea.

pharindoko avatar May 30 '24 07:05 pharindoko

I guess I would try the github way first. It's easy to change it as stepfunction. Can you at least add the organization tag to the instance in #582 ?

pharindoko avatar May 30 '24 11:05 pharindoko

In this case I would even prefer a query to Github once the runner is assigned or as you mentioned self-tag the instance. There was a hook available right ? Would mean I could add some custom shell script to self-tag.

Yeah. The image builder will set ACTIONS_RUNNER_HOOK_JOB_STARTED pointing to a script that it adds. The script will self-tag.

Can you at least add the organization tag to the instance in https://github.com/CloudSnorkel/cdk-github-runners/pull/582 ?

Sadly, no. Organization is not available during deployment time. We don't know which app/organization/repository/etc. will register with the orchestrator after it's deployed.

kichik avatar May 30 '24 13:05 kichik

#583 worked for me to assign the owner tag to the ec2 instance.

pharindoko avatar May 31 '24 07:05 pharindoko

I would have a generic proposal:

Regarding the hook script solution: (https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/running-scripts-before-or-after-a-job)

A basic script runner-started.sh or runner-started.ps1 is created for linux/windows when a new image will be built with aws imagebuilder. This script outputs default env variables like the GITHUB_RUN_ID, GITHUB_REPOSITORY, GITHUB_JOB etc.(https://docs.github.com/en/actions/learn-github-actions/variables#default-environment-variables).

This script will be referenced at provider startup when the runner configuration will be done e.g. for linux export ACTIONS_RUNNER_HOOK_JOB_STARTED="/opt/runner/runner-started.sh"

If somebody wants to overwrite the script (like me to attach ec2-tags :) !) it can be done adding a custom addComponent function to the image builder setup in cdk.

builder.addComponent(
  RunnerImageComponent.custom({
    name: 'Overwrite started-runner.sh',
    commands: [
      'set -ex',
      'echo "..." > /opt/runner/runner-started.sh',
    ],
  }),
);

pharindoko avatar May 31 '24 08:05 pharindoko

You should already be able to do it. The following worked for me.

const builder = Ec2RunnerProvider.imageBuilder(stack, 'builder');
builder.addComponent(
  RunnerImageComponent.environmentVariables({
    ACTIONS_RUNNER_HOOK_JOB_STARTED: '/home/runner/runner-started.sh',
  }),
);
builder.addComponent(
  RunnerImageComponent.custom({
    name: 'Self-tag',
    commands: [
      'echo "#!/bin/bash" > /home/runner/runner-started.sh',
      'echo \'TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 60")\' >> /home/runner/runner-started.sh',
      'echo \'INSTANCE_ID=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id)\' >> /home/runner/runner-started.sh',
      'echo \'aws ec2 create-tags --resources "$INSTANCE_ID" --tags Key=owner,Value=$GITHUB_REPOSITORY_OWNER Key=repository,Value=$GITHUB_REPOSITORY\' >> /home/runner/runner-started.sh',
      'chmod +x /home/runner/runner-started.sh',
    ],
  }),
);

const provider = new Ec2RunnerProvider(stack, 'EC2', {
  labels: ['self-tag-provider'],
  imageBuilder: builder,
});
provider.grantPrincipal.addToPrincipalPolicy(new iam.PolicyStatement({
  actions: ['ec2:CreateTags'],
  resources: ['*'],
  conditions: {
    StringEquals: {
      'aws:Resource': 'instance/${ec2:InstanceID}',
    },
  },
}));

kichik avatar May 31 '24 23:05 kichik

I'll try it out and will close it for now

pharindoko avatar Jun 02 '24 12:06 pharindoko