Fargate tasks not starting with "ResourceInitializationError: unable to pull secrets or registry auth"
I'm trying to use the new Fargate approach in eu-central-1. (The same test repo has been used with Artillery Pro in eu-west-2 before.)
I've confirmed I have a VPC in the region, 3 public subnets, and that Artillery is correctly automatically using those subnets, so I don't think networking per se is the problem.
I've set up a user with the permissions as documented today. I created what's in the docs as a policy and attached this directly to a user group which my user is in – I wasn't totally clear how a role should be used if created.
It seems the ecr:GetAuthorizationToken permission is part of a policy that Artillery itself sets up via a worker role, and that's why it isn't in the documented policy to be set up manually in AWS. But I'm not sure what to try now to get it to run.
Version info:
Artillery: 2.0.0-37
Node.js: v18.18.0
OS: darwin
Running this command:
artillery run-fargate --count 1 --region eu-central-1 --overrides '{\"config\": {\"phases\": [{\"duration\": 1, \"arrivalRate\": 1}]}}' --output reports/report.json --record api-donations.yaml
I expected to see this happen:
A test run on Fargate
Instead, this happened:
Test stopped with:
Launching workers... [14:23:04]
Waiting for Fargate... [14:23:05]
Waiting for workers to start: deprovisioning: 1 [14:23:37]
[
{
attachments: [ [Object] ],
attributes: [ [Object] ],
availabilityZone: 'eu-central-1a',
clusterArn: 'arn:aws:ecs:eu-central-1:[AWS_ACCT_ID]:cluster/artilleryio-cluster',
connectivity: 'CONNECTED',
connectivityAt: 2023-10-07T13:23:09.090Z,
containers: [ [Object] ],
cpu: '4096',
createdAt: 2023-10-07T13:23:05.779Z,
desiredStatus: 'STOPPED',
enableExecuteCommand: false,
executionStoppedAt: 2023-10-07T13:23:15.947Z,
group: 'family:artilleryio-loadgen-worker_fargate_artilleryio-cluster_8fa978b3a50ce517e081ee7c126a354204807b1b_155552',
healthStatus: 'UNKNOWN',
lastStatus: 'STOPPED',
launchType: 'FARGATE',
memory: '8192',
overrides: {
containerOverrides: [Array],
inferenceAcceleratorOverrides: [],
taskRoleArn: 'arn:aws:iam::[AWS_ACCT_ID]:role/artilleryio-ecs-worker-role'
},
platformVersion: '1.4.0',
platformFamily: 'Linux',
stopCode: 'TaskFailedToStart',
stoppedAt: 2023-10-07T13:23:39.068Z,
stoppedReason: 'ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve ecr registry auth: service call has been retried 1 time(s): AccessDeniedException: User: arn:aws:sts::[AWS_ACCT_ID]:assumed-role/artilleryio-ecs-worker-role/1c0acd7ed5f84dff888dd1811f2922ce is not authorized to perform: ecr:GetAuthorizationToken on resource: * because no identity-based policy allows the ecr:GetAuthorizationToken action status code: 400, request id: ed9fb8dd-d720-4607-986d-8790c14d35b9',
stoppingAt: 2023-10-07T13:23:25.972Z,
tags: [],
taskArn: 'arn:aws:ecs:eu-central-1:[AWS_ACCT_ID]:task/artilleryio-cluster/1c0acd7ed5f84dff888dd1811f2922ce',
taskDefinitionArn: 'arn:aws:ecs:eu-central-1:[AWS_ACCT_ID]:task-definition/artilleryio-loadgen-worker_fargate_artilleryio-cluster_8fa978b3a50ce517e081ee7c126a354204807b1b_155552:1',
version: 4,
ephemeralStorage: { sizeInGiB: 20 }
}
]
Error: Worker init failure, aborting test
Error: Worker init failure, aborting test
at waitForTasks2 ([project-dir]/node_modules/@artilleryio/platform-fargate/lib/commands/run-cluster.js:14:19311)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async [project-dir]/node_modules/@artilleryio/platform-fargate/lib/commands/run-cluster.js:13:873
Error: Worker init failure, aborting test
at waitForTasks2 ([project-dir]/node_modules/@artilleryio/platform-fargate/lib/commands/run-cluster.js:14:19311)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async [project-dir]/node_modules/@artilleryio/platform-fargate/lib/commands/run-cluster.js:13:873
Cleaning up... [14:23:47]
⠼ Error: error sending test data to Artillery Cloud
Test report may be incomplete
Request ID: a8b4c10f-6219-49a3-b989-0a389ad947ff
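For anyone triaging a similar failure, the useful part of the output above is buried in the stoppedReason string. A minimal sketch, assuming the AccessDenied wording matches the message shown in this report, that pulls out the principal, the denied action and the resource. The helper name parse_denied is hypothetical and the sample text is abbreviated from the log above:

```python
import re

# Matches the AccessDenied detail that ECS embeds in a task's stoppedReason.
# Assumption: the wording is the same as in the report above.
DENIED = re.compile(
    r"User: (?P<principal>\S+) is not authorized to perform: "
    r"(?P<action>[\w-]+:\w+) on resource: (?P<resource>\S+)"
)


def parse_denied(stopped_reason):
    """Return (principal, action, resource), or None if no AccessDenied detail is found."""
    m = DENIED.search(stopped_reason)
    return (m["principal"], m["action"], m["resource"]) if m else None


# Abbreviated copy of the stoppedReason from the report.
reason = (
    "ResourceInitializationError: unable to pull secrets or registry auth: "
    "AccessDeniedException: User: "
    "arn:aws:sts::[AWS_ACCT_ID]:assumed-role/artilleryio-ecs-worker-role/"
    "1c0acd7ed5f84dff888dd1811f2922ce "
    "is not authorized to perform: ecr:GetAuthorizationToken on resource: * "
    "because no identity-based policy allows the ecr:GetAuthorizationToken action"
)

print(parse_denied(reason))
```

Here the principal is the assumed artilleryio-ecs-worker-role, which suggests the missing permission belongs on the worker role rather than on the IAM user running the CLI.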
Thanks @NoelLH! Looking into it - that permission should be added automatically without you needing to do anything.
Hi, we encountered the same issue today. We resolved it by manually removing all resources on AWS which forced artillery to recreate everything again, so it looks like some old setting got cached somewhere.
thanks for chiming in @peldax! @NoelLH - could you try one of:
- running the test in a different AWS account, or
- removing the old Artillery Pro CloudFormation stack, and then trying again
Everything is working as expected on my end, I've not been able to reproduce the issue.
Thanks both!
I'm tight for time at the moment so trying to avoid setting up a distinct AWS account for this if possible @hassy.
I first removed all CloudFormation stacks I could find in all relevant regions & waited for the resource deletions (there was stuff from Artillery Pro and also old Serverless Artillery experiments), but this seemed to make no difference.
I then deleted the IAM role "artilleryio-ecs-worker-role", which had no permissions attached, and that changed the AccessDenied detail to:
authorized to perform: iam:CreatePolicy on resource: policy artilleryio-ecs-worker-policy because no identity-based policy allows the iam:CreatePolicy action
Each time, it seems to create the worker role again OK but not any permissions/policies for it.
I think I've sorted this for our account.
I believe the problems were a combination of the all-or-nothing approach to the worker role creation, and 2 errors in the Artillery docs for Fargate which meant some of the required permissions weren't there when enough of the IAM resources were repeatedly deleted for Artillery to attempt their recreation:
- arn:aws:iam::123456789000:policy/ecs-worker-policy should be arn:aws:iam::123456789000:policy/artilleryio-ecs-worker-policy
- iam:AttachRolePolicy is required for resource arn:aws:iam::123456789000:role/artilleryio-ecs-worker-role, not [just] for the policy
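To make the two corrections concrete, here is a rough sketch that prints only the affected IAM statements. It is not the full documented policy: the statement layout is an assumption, pairing iam:CreatePolicy with the policy ARN is inferred from the AccessDenied message earlier in the thread, and 123456789000 is the placeholder account ID from the docs.

```python
import json

ACCOUNT_ID = "123456789000"  # placeholder account ID, replace with your own

# Correction 1: the policy name carries the "artilleryio-" prefix.
policy_arn = f"arn:aws:iam::{ACCOUNT_ID}:policy/artilleryio-ecs-worker-policy"
# Correction 2: iam:AttachRolePolicy must also cover the worker role itself.
role_arn = f"arn:aws:iam::{ACCOUNT_ID}:role/artilleryio-ecs-worker-role"

# Assumed layout: one statement per action, for illustration only.
statements = [
    {"Effect": "Allow", "Action": ["iam:CreatePolicy"], "Resource": [policy_arn]},
    {
        "Effect": "Allow",
        "Action": ["iam:AttachRolePolicy"],
        "Resource": [role_arn, policy_arn],
    },
]

print(json.dumps({"Version": "2012-10-17", "Statement": statements}, indent=2))
```

The printed statements would need to be merged into the policy attached to the user or group that runs artillery run-fargate. The rest of that policy is unchanged and omitted here.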
I am unable to use fargate now with the new task definitions that have parameter store secrets. I was able to run in fargate a few months ago. This is what I am getting as a reason for task stopping.
ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve secrets from ssm: service call has been retried 1 time(s): invalid ssm parameters: /artilleryio/ARTIFACTORY_AUTH,/artilleryio/ARTIFACTORY_EMAIL,/artilleryio/NPMRC,/artilleryio/NPM_REGISTRY,/artilleryio/NPM_SCOPE,/artilleryio/NPM_SCOPE_REGISTRY,/artilleryio/NPM_TOKEN'
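A small sketch, assuming the error text has the same shape as the message above, that lists which Parameter Store entries the task could not find. The helper name missing_ssm_parameters is hypothetical, and the values each parameter should hold are not stated here, so the code only extracts the names:

```python
def missing_ssm_parameters(stopped_reason):
    """Return the parameter names listed after 'invalid ssm parameters:' (empty list if absent)."""
    marker = "invalid ssm parameters: "
    _, found, tail = stopped_reason.partition(marker)
    if not found:
        return []
    # Names are comma-separated; the message may end with a stray quote.
    names = (name.strip(" '") for name in tail.split(","))
    return [name for name in names if name]


# Abbreviated copy of the stoppedReason quoted above.
reason = (
    "ResourceInitializationError: unable to pull secrets or registry auth: "
    "execution resource retrieval failed: unable to retrieve secrets from ssm: "
    "service call has been retried 1 time(s): invalid ssm parameters: "
    "/artilleryio/ARTIFACTORY_AUTH,/artilleryio/ARTIFACTORY_EMAIL,/artilleryio/NPMRC,"
    "/artilleryio/NPM_REGISTRY,/artilleryio/NPM_SCOPE,/artilleryio/NPM_SCOPE_REGISTRY,"
    "/artilleryio/NPM_TOKEN'"
)

for name in missing_ssm_parameters(reason):
    print(name)
```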
Attempting use of Artillery for the first time in an AWS account, and experiencing the same thing as @zeeshanpolaris above.
Looks like there is a function, ensureParameterExists, that is likely intended to do this conditional parameter creation. But I see no code references invoking it.
Perhaps this was missed in testing of the migration of the fargate support code in https://github.com/artilleryio/artillery/pull/2297 ? ( parameters already existing in test environment? )
@RobMullen @zeeshanpolaris apologies, fix incoming
Thank you. Appreciate it.
Thanks again for reporting the issue @zeeshanpolaris @RobMullen
Fix is in this PR: https://github.com/artilleryio/artillery/pull/2354
A canary version of Artillery will be published once we merge to main, which you can try to check if running a test works. (You can install the canary with npm install -g artillery@canary.) Will also publish v2.0.3 later today.
Thanks. I added those default values manually and got it working. However, I had to add these two additional permissions for cloudwatch logs in the policy used by the role.
{
  "Effect": "Allow",
  "Action": [
    "logs:CreateLogStream",
    "logs:CreateLogGroup"
  ],
  "Resource": [
    "arn:aws:logs:RegionHiddenForSecurity:AccountNumberHiddenForSecurity:log-group:artilleryio-log-group/*"
  ]
}
Thank you very much, @hassy, for jumping on this quickly!! I too have worked around this via manual creation of the default parameter store entries. Will remove the parameter store entries and try out the canary when it becomes available.