spinnaker icon indicating copy to clipboard operation
spinnaker copied to clipboard

Best practice advice for externally managed SQS

Open deadlysyn opened this issue 1 year ago • 6 comments

Issue Summary:

Observed SQS bootstrap log messages despite having externally-managed SQS/SNS configured in ~/.hal/default/profiles/echo-local.yml.

Cloud Provider(s):

AWS

Environment:

  • Spinnaker 1.32.3
  • Halyard stable
  • Running on AWS/EKS with SQS/SNS

Feature Area:

Echo / Pubsub

Description:

Our goal is clean separation between "infra" and "service" management.

  • Followed https://spinnaker.io/docs/setup/other_config/triggers/amazon
  • Created SNS topic and SQS queue externally (all infra managed in IaC)
  • Used the following echo-local.yml:
pubsub:
  enabled: true
  amazon:
    enabled: true
    subscriptions:
    - name: sqs
      topicARN: arn:aws:sns:us-foo-1:012345678901:spinnaker
      queueARN: arn:aws:sqs:us-foo-1:987654321098:spinnaker
      messageFormat: CUSTOM
      templatePath: /mnt/deploy.json
  • templatePath comes from a configmap, seems to work... can trigger test pipeline via SNS.
  • Upgraded spinnaker
  • Saw the following "SQS bootstrap" logs (here) with above config in place:
2023-12-12 18:32:49.470  INFO 1 --- [           main] c.n.s.e.p.aws.SQSSubscriberProvider      : Bootstrapping SQS for SNS topic: arn:aws:sns:us-foo-1:012345678901:spinnaker
2023-12-12 18:32:49.470  INFO 1 --- [           main] c.n.s.e.p.aws.SQSSubscriberProvider      : Using template: /mnt/deploy.json for subscription: sqs
  • Ran terraform plan
  • Saw config drift, various queue settings adjusted
  • Drift appears related to spinnaker defaults for queue creation e.g. this

I updated terraform to match what Spinnaker is doing to avoid plan noise, but it raised the question of why it tried to bootstrap an existing queue.

The pubsub docs say "Spinnaker will do this part for you, provided that you specify what would be a valid name for the QueueARN in the echo-local.yml."

From reading the feature, "If the queue you've specified as a subscriber doesn't exist, Spinnaker will create it."

The docs make it sound like creating will be tried if any valid name is provided, and the code seems to say it is created as provided only if it doesn't exist. It exists, but we still saw metadata (timeouts, policy) adjusted during the upgrade which caused the drift.

Steps to Reproduce:

Unfortunately I haven't been able to repro. I've reverted the terraform changes, re-ran hal deploy, but see no spinnaker SQS bootstrap logs or terraform drift. I haven't found existing github issues with similar reports. The only suspect piece to me (and I am not a java developer so most likely missed something) is that ensureQueueExists takes more than ARNs as input (made me think it was more of a diff of queue config vs just name/exists check), but stopped digging there and it may just be expected coupling between check/create.

Additional Details:

Based on the slight disparity between docs/code and inability to find similar answer or example config in existing issues, I'd like to verify best practice when using both externally managed SQS and SNS. What should echo-local.yml look like in this case? Would we simply drop queueARN?

pubsub:
  enabled: true
  amazon:
    enabled: true
    subscriptions:
    - name: sqs
      topicARN: arn:aws:sns:us-foo-1:012345678901:spinnaker
      messageFormat: CUSTOM
      templatePath: /mnt/deploy.json

deadlysyn avatar Jan 05 '24 22:01 deadlysyn

This issue hasn't been updated in 45 days, so we are tagging it as 'stale'. If you want to remove this label, comment:

@spinnakerbot remove-label stale

spinnakerbot avatar Feb 22 '24 18:02 spinnakerbot

This issue is tagged as 'stale' and hasn't been updated in 45 days, so we are tagging it as 'to-be-closed'. It will be closed in 45 days unless updates are made. If you want to remove this label, comment:

@spinnakerbot remove-label to-be-closed

spinnakerbot avatar Apr 07 '24 18:04 spinnakerbot

This issue is tagged as 'stale' and hasn't been updated in 45 days, so we are tagging it as 'to-be-closed'. It will be closed in 45 days unless updates are made. If you want to remove this label, comment:

@spinnakerbot remove-label to-be-closed

spinnakerbot avatar May 23 '24 00:05 spinnakerbot

This issue is tagged as 'stale' and hasn't been updated in 45 days, so we are tagging it as 'to-be-closed'. It will be closed in 45 days unless updates are made. If you want to remove this label, comment:

@spinnakerbot remove-label to-be-closed

spinnakerbot avatar Jul 07 '24 12:07 spinnakerbot