Atlantis apply all after a failed apply; outputs Ran Apply for 0 projects

I have a repo that uses the default workspace but there are a number of different project folders.

Atlantis version: 0.8.3
Terraform version: v0.12.8

version: 3
projects:
  - name: qa
    dir: qa_acct/qa_env
    terraform_version: v0.12.8
    autoplan:
      when_modified: ["../../projects/*", "*.tf*", "../../modules/*"]
      enabled: false
  - name: staging
    dir: prod_acct/staging_env
    terraform_version: v0.12.8
    autoplan:
      when_modified: ["../../projects/*", "*.tf*", "../../modules/*"]
      enabled: false
  - name: prod
    dir: prod_acct/prod_env
    terraform_version: v0.12.8
    autoplan:
      when_modified: ["../../projects/*", "*.tf*", "../../modules/*"]
      enabled: false

Plans are generated for all three projects as normal after commenting exactly atlantis plan. Immediately afterward, commenting atlantis apply attempts to apply all three environments as expected. In this case, there was an apply error due to a misconfigured AWS IAM policy, and the plans were not successfully applied. A commit was pushed to fix this issue and another atlantis apply was submitted. Note that there was no second atlantis plan after the fix commit was pushed. Atlantis behaved as if it had forgotten about the failed plans and assumed they had been applied successfully when, in fact, they had not been. I believe the expected behavior should be to reject the apply, since new commits were made, and force another plan to be run. Is that correct?

The result was the following:

Ran Apply for 0 projects:

Automatically merging because all plans have been successfully applied.

Locks and plans deleted for the projects and workspaces modified in this pull request:

* dir: `prod_acct/prod_env` workspace: `default`
* dir: `prod_acct/staging_env` workspace: `default`
* dir: `qa_acct/qa_env` workspace: `default`

Sep 11 '19 01:09 mlehner616

Yeah it's a bug. If autoplan had been enabled then there would have been new plans generated and the apply wouldn't have worked.

Sep 14 '19 00:09 lkysow

@lkysow Thanks for the confirmation. This bug is killing us right now. We want people to be able to see non-locking plans being run (in our normal CI pipeline) before approvals are submitted, so they can actually validate their code before blocking other development. If we wanted to dig into solving this, where would be a good place to start looking? I took a really quick glance through the repo and nothing jumped out at me.

Thank you for building this tool by the way, I really appreciate the work that went into this.

Sep 16 '19 21:09 mlehner616

After re-reading the ticket, this isn't technically a bug (although for your use-case it may as well be). Atlantis is just doing what you told it: it's up to you to run atlantis plan if you've pushed a new commit and don't have autoplan running. The real issue here is the combination with automerge. If you didn't have automerge you'd quickly realize that you hadn't re-run plan and there wouldn't be an issue.

Also if you were running with the -d or -p flags you'd get an error that "the plan doesn't exist for that project, please run plan". When we added the apply-all command (i.e. atlantis apply) we didn't replicate the behaviour. I'm not sure if it ever makes sense to not give an error in this case but I'd at least like to add a flag that lets you keep the old behaviour in case people were relying on it.

If we were to add some functionality to detect this case, it would be here: https://github.com/runatlantis/atlantis/blob/master/server/events/project_command_builder.go#L204 after Atlantis has found no pending plans. It could then exit with an error in this case.

I think a path forward may be:

* new flag --allow-no-plan-apply which defaults to false now (breaking change)
* thread that flag through and then check it at the line above
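
The check proposed here could look roughly like the following. This is a Python sketch of the logic only, not Atlantis code (Atlantis is written in Go). `PendingPlan`, `NoPendingPlansError` and `build_apply_all_commands` are hypothetical names, and the `allow_no_plan_apply` parameter simply mirrors the proposed `--allow-no-plan-apply` flag, which does not exist at the time of this comment.

```python
from dataclasses import dataclass
from typing import List


class NoPendingPlansError(Exception):
    """Raised when apply-all finds no saved plans to apply."""


@dataclass
class PendingPlan:
    # Hypothetical stand-in for a saved plan found for the pull request.
    repo_rel_dir: str
    workspace: str


def build_apply_all_commands(
    pending_plans: List[PendingPlan],
    allow_no_plan_apply: bool = False,
) -> List[str]:
    """Return one apply command description per pending plan.

    With the proposed default (allow_no_plan_apply=False), an empty list
    of pending plans is an error rather than "Ran Apply for 0 projects".
    """
    if not pending_plans and not allow_no_plan_apply:
        raise NoPendingPlansError(
            "no pending plans found for this pull request; run plan first"
        )
    return [
        f"apply dir={plan.repo_rel_dir} workspace={plan.workspace}"
        for plan in pending_plans
    ]
```

Passing `allow_no_plan_apply=True` keeps the old behaviour (an empty result and no error), which is the escape hatch suggested for anyone relying on it.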

Sep 19 '19 19:09 lkysow

Well, I actually think your original interpretation made sense to me. To clarify, we would never want atlantis to apply without having the most up to date plan saved and locked.

What we’re doing instead is just running validate, fmt, lint and terraform plan --lock=false in vanilla GitLab CI. Devs open an MR and need to fix any issues there, and get all approvals first, before the atlantis plan. The problem we were solving by doing it this way was autoplan opening locks too early in the process and thus blocking other MRs that were ready to be applied.

I still think this is a bug. Yes I wanted autoplan disabled but that just means I want the developers to run it if and only if all the pre-apply requirements are met. I would expect the apply step to run the same validation that the plan is locked and up to date and apply based on that. Turning off autoplan shouldn’t affect those checks. What seems to be happening with autoplan disabled is the apply is ignoring the plans and ultimately just applies nothing.

I can confirm that plans and locks are created when they are supposed to be. It appears that the atlantis apply step is just ignoring those if a second apply is run after the first one fails. Expected behavior would be for the apply step to either force a replan if the MR was updated, or attempt to re-apply the original plan. It’s doing neither of these right now.

Sep 19 '19 20:09 mlehner616

One thing I did notice was that if the apply does fail, the saved plans are deleted but the locks are left open (this may be the actual bug here). If we removed those after a failed apply, that would basically force the plan step. I don’t know if that’s the best solution but I think it would work.
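
A rough sketch of that idea, in Python for illustration only (Atlantis itself is written in Go). `ProjectState`, `cleanup_after_failed_apply` and `can_apply` are hypothetical names, and the sketch assumes, per the observation above, that a failed apply discards the saved plan.

```python
from dataclasses import dataclass, replace


@dataclass
class ProjectState:
    # Hypothetical per-project bookkeeping for one pull request.
    has_saved_plan: bool
    is_locked: bool


def cleanup_after_failed_apply(state: ProjectState) -> ProjectState:
    """Discard the saved plan and release the lock together.

    Keeping the two in sync means a later apply never sees a lock
    without a matching plan, so a fresh plan has to be run first.
    """
    return replace(state, has_saved_plan=False, is_locked=False)


def can_apply(state: ProjectState) -> bool:
    # Apply is only allowed when a plan exists and its lock is held.
    return state.has_saved_plan and state.is_locked
```

Whether releasing the lock is preferable to keeping the plan for a retry is a design choice this sketch does not settle.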

Sep 19 '19 20:09 mlehner616

> it's up to you to run atlantis plan if you've pushed a new commit and don't have autoplan running.

Our team has autoplan on, but pushing a new commit doesn't cause Atlantis to redo the plan (because Bitbucket).

> One thing I did notice was that if the apply does fail, the saved plans are deleted but the locks are left open (this may be the actual bug here). If we removed those after a failed apply, that would basically force the plan step. I don’t know if that’s the best solution but I think it would work.

I agree we should either have Atlantis not delete the plans, or error if an apply is attempted without any plans.

@lkysow - what's the reason for Atlantis to delete the plans after a failed apply? It could have failed because of a transient provider issue, and re-running apply on the same plan would later succeed.

Sep 20 '19 17:09 kipkoan

Hi everyone, I hit this issue too. Is there any work in progress to fix this bug? I removed the locks as mentioned above and re-ran "atlantis plan". It still shows "Ran Plan for 0 projects:"

Dec 31 '20 09:12 ishallbethat

Running plan on the same PR after a failed apply should not be any different than if Atlantis did not delete the plan; it is just an extra step.

But if someone in another PR modifies the environment you are planning against, you will have a problem no matter what, and by re-running plan you could actually find the drift.

I do not think this is a bug. It is a bit annoying to run plan again, but since Terraform is idempotent it should only apply the difference.

Dec 31 '20 20:12 jamengual

I can run atlantis plan again and I am still getting the output "Ran Plan for 0 projects:"

If I run with atlantis plan -p *-production it will apply.

Apr 06 '22 09:04 evanstachowiak

With autoplan, you need to define every directory you want autoplan on/off for in your atlantis.yaml, otherwise it does not work. Is that what you are doing?

If this were a bug, no one would be using Atlantis, so I want to find out whether this is specific to a multi-dir structure etc. For that, we need to see the atlantis.yaml files and dir structure so we can have a better idea.

This could be as simple as better documentation of autoplan with some examples.

Apr 06 '22 18:04 jamengual

@jamengual I am using an atlantis.yaml that was previously working. I think around v0.19.* this started breaking. It is about 50 projects, each with its own project name so that the -p wildcard flag can be used. The pattern for the naming is ${service_name}-${environment}.

I discovered that if I run atlantis apply -p *-environment, then the command will run, but it will run for ALL projects, regardless of what files have changed.
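
The difference between selecting projects by name and selecting them by changed files can be illustrated with a small Python sketch. It is not how Atlantis implements `-p`: glob-style matching via `fnmatch` is an assumption made for illustration, and the function names and the project names and directories used with them are hypothetical.

```python
from fnmatch import fnmatch
from typing import Dict, List


def select_by_name(projects: Dict[str, str], pattern: str) -> List[str]:
    """Select project names matching a glob pattern, ignoring changed files."""
    return sorted(name for name in projects if fnmatch(name, pattern))


def select_by_changes(projects: Dict[str, str], changed_files: List[str]) -> List[str]:
    """Select project names whose directory contains at least one changed file."""
    return sorted(
        name
        for name, directory in projects.items()
        if any(path.startswith(directory.rstrip("/") + "/") for path in changed_files)
    )
```

With projects named `api-production`, `web-production` and `api-staging`, the pattern `*-production` selects both production projects even when only one of their directories was touched, which matches the behaviour described above.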

I have autoplan on, but if I run atlantis plan manually, it doesn't seem to make a difference.

Also of note, I am using custom workflows, not sure if that makes a difference.

Apr 14 '22 08:04 evanstachowiak

@evanstachowiak Please test with the pre-release image, we did some bug fixes there and I wonder if that could be the issue:

docker pull ghcr.io/runatlantis/atlantis:v0.19.3-pre.20220408

Apr 14 '22 15:04 jamengual

is this still an issue with v0.19.8?

Aug 26 '22 02:08 jamengual

Hello @jamengual I was able to reproduce this issue on v0.19.8, using the testdrive repository.

It only happened when using pre workflow hooks, such as the following:

---
repos:
  - id: /.*/
    pre_workflow_hooks:
    - run: echo "hello world"

The server logs for the execution:

{"level":"info","ts":"2022-09-22T13:58:58.502-0300","caller":"server/server.go:869","msg":"Atlantis started - listening on port 4141","json":{}}
{"level":"info","ts":"2022-09-22T13:58:58.502-0300","caller":"scheduled/executor_service.go:46","msg":"Scheduled Executor Service started","json":{}}
{"level":"info","ts":"2022-09-22T13:59:09.305-0300","caller":"events/events_controller.go:533","msg":"parsed comment as command=\"apply\" verbose=false dir=\"\" workspace=\"\" project=\"\" flags=\"\"","json":{"gh-request-id":"X-Github-Delivery=dfb30ec0-3a97-11ed-9f80-6ecf217e25c6"}}
{"level":"info","ts":"2022-09-22T13:59:14.712-0300","caller":"events/working_dir.go:225","msg":"creating dir \"/home/gus/workspace/opensource/apply-for-0-projects-test/atlantis_linux_amd64/data/repos/GusAntoniassi/atlantis-example/1/default\"","json":{"repo":"GusAntoniassi/atlantis-example","pull":"1"}}
{"level":"info","ts":"2022-09-22T13:59:15.360-0300","caller":"runtime/pre_workflow_hook_runner.go:50","msg":"successfully ran \"echo \\\"hello world\\\"\" in \"/home/gus/workspace/opensource/apply-for-0-projects-test/atlantis_linux_amd64/data/repos/GusAntoniassi/atlantis-example/1/default\"","json":{"repo":"GusAntoniassi/atlantis-example","pull":"1"}}

Sep 22 '22 17:09 GusAntoniassi

yes it's still an issue @jamengual

Sep 23 '22 21:09 evanstachowiak

I wonder if this is related to this : https://github.com/runatlantis/atlantis/pull/1633

Sep 23 '22 21:09 jamengual

pre_workflow_hooks run before any atlantis.yaml file is parsed.

After that, if no atlantis.yaml is defined, it will do nothing.

Sep 26 '22 19:09 jamengual

This issue is stale because it has been open for 1 month with no activity. Remove stale label or comment or this will be closed in 1 month.

Mar 18 '23 01:03 github-actions[bot]

@jamengual Hello! I recently reproduced this problem on v0.25.0. I'm also using pre-workflow hooks as described above. Is it possible to reopen this issue to fix this bug?

Sep 06 '23 13:09 scytem

can you describe the steps you took to reproduce it?

Sep 06 '23 14:09 jamengual

Sure!

atlantis-0:/$ atlantis version
atlantis v0.25.0 (commit: a12823e) (build date: 2023-08-11T20:51:19.440Z)

Repos config:

repos:
  - id: "/.*/"
    branch: "/.*/"
    workflow: check
    allow_custom_workflows: true
    allowed_overrides: [workflow, delete_source_branch_on_merge]
    apply_requirements: [approved]
    pre_workflow_hooks:
      - run: python3 code/atlantis_config_merge.py # script for generating atlantis.yaml
workflows:
    check:
      plan:
        steps:
        - run: echo "check passed"
    terragrunt-tst:
      plan:
        steps:
        - env:
          ...
        - run: |
            if [ ! -d "/tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM" ]; then
              mkdir -p /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM
            fi
        - run: terragrunt run-all plan -out ./plan.tfplan --terragrunt-non-interactive &> /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/output.txt || cat /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/output.txt
        - run: terragrunt run-all show -json ./plan.tfplan --terragrunt-non-interactive 2> /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/show_stderr.txt 1> ./plan.json || cat /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/show_stderr.txt
        - run: /tmp/infracost breakdown --path=. --format=json --log-level=info --out-file=./infracost.json --project-name=$REPO_REL_DIR 2>> /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/output.txt 1>> /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/output.txt || cat /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/output.txt
        - run: /tmp/infracost output --path=./infracost.json --format=json --out-file=./infracost-report.json 2>> /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/output.txt 1>> /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/output.txt || cat /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/output.txt
        - run: |
            /tmp/infracost comment gitlab --repo $BASE_REPO_OWNER/$BASE_REPO_NAME \
              --merge-request $PULL_NUM \
              --path ./infracost-report.json \
              --gitlab-token $ATLANTIS_GITLAB_TOKEN \
              --behavior new \
              --show-all-projects
        # script for output formatting. Not sure if it's relevant for this issue. Just to share
        - run: python3 /opt/terragrunt_output_formatter.py --file /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/output.txt --output-file /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/fmt_output.txt && cat /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM/fmt_output.txt
        - run: rm -rf /tmp/$BASE_REPO_OWNER-$BASE_REPO_NAME-$PULL_NUM
      apply:
        steps:
        - env:
          ...
        - run: terragrunt run-all apply ./plan.tfplan --terragrunt-non-interactive

atlantis.yaml example:

projects:
- autoplan:
    when_modified:
    - '**/*.hcl'
    - '*.hcl'
  dir: accounts/...
  name: ...
  workflow: terragrunt-tst

As a result, I have an MR message:

Ran Apply for 0 projects:

atlantis apply -p ... solves the problem, but it's inconvenient to use it every time

Sep 06 '23 14:09 scytem

This issue is stale because it has been open for 1 month with no activity. Remove stale label or comment or this will be closed in 1 month.

Oct 08 '23 01:10 github-actions[bot]
