Stale/Old Plans Issue

Open srlightbody opened this issue 4 years ago • 8 comments

We're running into an interesting case with our current configuration and would like some input on how to resolve it. Atlantis version is 0.14.0. We have repos set up with the following files in the root of the repo: variables_daily.tf, variables_staging.tf, variables_production.tf

These correspond to distinct Terraform workspaces, which is how we delineate the environments. For those repositories our atlantis.yaml looks like this -

version: 3
projects:
- dir: ./
  autoplan:
    when_modified: ["./variables_daily.tf","/modules/*/*.tf"]
  workspace: daily
- dir: ./
  autoplan:
    when_modified: ["./variables_staging.tf"]
  workspace: staging
- dir: ./
  autoplan:
    when_modified: ["./variables_production.tf"]
  workspace: production
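To make the autoplan behaviour of this config concrete, here is a minimal Python sketch of how a set of changed files maps to workspaces. It is not Atlantis code: the PROJECTS table and the workspaces_to_plan helper are hypothetical names, the handling of "./" and a leading "/" is an assumption, and Python's fnmatch lets "*" match across "/" as well, which may differ from the matcher Atlantis uses.

```python
import posixpath
from fnmatch import fnmatchcase

# Mirrors the atlantis.yaml above: workspace -> when_modified patterns.
PROJECTS = {
    "daily": ["./variables_daily.tf", "/modules/*/*.tf"],
    "staging": ["./variables_staging.tf"],
    "production": ["./variables_production.tf"],
}


def _normalize(path):
    # Assumption: "./x", "/x" and "x" all name the same repo-relative path.
    return posixpath.normpath(path).lstrip("/")


def workspaces_to_plan(changed_files):
    """Return the workspaces with at least one pattern matching a changed file."""
    changed = [_normalize(f) for f in changed_files]
    return sorted(
        ws
        for ws, patterns in PROJECTS.items()
        if any(fnmatchcase(f, _normalize(p)) for f in changed for p in patterns)
    )


# First commit touches staging and production; the follow-up only staging.
print(workspaces_to_plan(["variables_production.tf", "variables_staging.tf"]))
print(workspaces_to_plan(["variables_staging.tf"]))
```

Under these assumptions the second call returns only staging, so nothing in the follow-up commit triggers a new production plan.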

The issue we're seeing is this:

  1. A developer creates a PR that has changes to variables_production.tf and variables_staging.tf
  2. Atlantis then plans both staging and production as expected
  3. The developer realizes they did not mean to include variables_production.tf and adds a commit to the PR that removes that change, so the PR now only includes a change to variables_staging.tf
  4. Atlantis re-plans due to the commit, but since variables_production.tf is not in the PR, it only re-plans staging.
  5. The developer then runs atlantis apply thinking they have safely removed production from the PR
  6. The initial plan that was created for production is still valid, and is applied, even though the PR no longer contains a production change
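The six steps above can be reproduced with a toy model. This is only a sketch of the behaviour as described in this issue, not of Atlantis internals: PullRequestPlans and its methods are hypothetical names, and plans are represented simply by the commit they were generated from.

```python
class PullRequestPlans:
    """Toy model of the pending plans kept for one pull request."""

    def __init__(self):
        self.pending = {}  # workspace -> commit the plan was generated from

    def autoplan(self, commit, matched_workspaces):
        # Only matched workspaces are re-planned; other entries are left alone.
        for ws in matched_workspaces:
            self.pending[ws] = commit

    def apply_all(self):
        # A bare "atlantis apply" applies every pending plan for the PR.
        applied = dict(self.pending)
        self.pending.clear()
        return applied


pr = PullRequestPlans()
pr.autoplan("commit-1", ["staging", "production"])  # steps 1-2
pr.autoplan("commit-2", ["staging"])                # steps 3-4
applied = pr.apply_all()                            # steps 5-6
print(applied)  # {'staging': 'commit-2', 'production': 'commit-1'}
```

The production entry still points at commit-1, which is the stale plan that ends up being applied.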

One way to mitigate this might be to flip the 'disable apply all' option to true, which would force developers to be more explicit about what they apply. There is some resistance to this, as developers like being able to bulk apply to daily and staging together.

The other good option seems to be forcing atlantis to destroy existing plans on any new commit to the PR. At the moment I do not see a direct way to do that, so I'm thinking of using a custom workflow to always run atlantis unlock before any plan. Would that do the trick? And, if not, would a "destroy-on-plan" flag that set atlantis's behavior to trash all existing plans on a re-plan be a useful feature?
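As a rough illustration of what the proposed "destroy-on-plan" behaviour would change, the sketch below compares a re-plan with and without discarding existing plans. The replan function and its discard_existing switch are hypothetical stand-ins for the proposal; they are not an existing Atlantis option or API.

```python
def replan(pending, commit, matched_workspaces, discard_existing=False):
    """Return the pending plans after a new commit is pushed.

    discard_existing is a hypothetical switch standing in for the proposed
    "destroy-on-plan" behaviour; it is not an existing Atlantis option.
    """
    # Start from an empty set of plans when discarding, else keep the old ones.
    plans = {} if discard_existing else dict(pending)
    for ws in matched_workspaces:
        plans[ws] = commit
    return plans


before = {"staging": "commit-1", "production": "commit-1"}

# Behaviour described in the issue: the production plan survives the re-plan.
print(replan(before, "commit-2", ["staging"]))

# Proposed behaviour: only plans generated from the latest commit remain.
print(replan(before, "commit-2", ["staging"], discard_existing=True))
```

With the switch on, a bulk apply after the second commit could only touch staging, while daily and staging could still be applied together when both are planned.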

srlightbody avatar Jul 21 '20 17:07 srlightbody

that setup sounds super confusing, but I've hit the same issue in a different manner so I'm curious as to a solution.

grimm26 avatar Jul 24 '20 02:07 grimm26

We're using Atlantis/Terraform to control the entire stack for a bunch of services, including the k8s deployments, which is kind of how we ended up here. So far it has worked really well other than running into some weird cases like this one. It lets us be pretty specific about required approvals for different environments using codeowners, and also lets developers do basic infrastructure updates for their apps without having to get too in depth into the terraform code (mostly k8s image tag bumps).

It doesn't look like this is getting much attention, so I'm guessing it's not an issue a ton of people are running into. It's definitely a pretty specific use case I'd say.

srlightbody avatar Aug 03 '20 17:08 srlightbody

I just stumbled onto this issue, and I'm also concerned. This is definitely a bug that could cause major issues.

ghostsquad avatar Sep 17 '20 18:09 ghostsquad

This has happened to us as well on several occasions. It can be VERY dangerous, as a mistake that was deleted/fixed during the development flow of a PR can still be applied and cause unexpected problems.

I will try to investigate exactly how Atlantis works under the hood and try to find a proper solution. One quick solution that comes to mind is to delete all previously existing plan files after each commit.

@lkysow are you aware of this bug?


UPDATE: According to this line it uses this function, which does exactly what the comment below states:

		// Any generated plans should be untracked by git since Atlantis created
		// them.

In order to fix this bug we need to remove the .tfplan files that exist inside the folders that have been modified, similar to what we are using to find the modifiedFiles (like here)
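A minimal sketch of that cleanup idea, reading it as deleting leftover plan files from every folder that contains a modified file. This is an illustration in Python, not the Atlantis (Go) implementation: remove_stale_plans is a hypothetical helper, and the ".tfplan" suffix is taken from the comment above.

```python
from pathlib import Path


def remove_stale_plans(repo_root, modified_files, suffix=".tfplan"):
    """Delete plan files that sit in the same directory as a modified file.

    Returns the repo-relative paths of the plan files that were removed.
    """
    root = Path(repo_root)
    # Directories touched by the pull request, derived from the modified files.
    touched_dirs = {(root / f).parent for f in modified_files}
    removed = []
    for directory in sorted(touched_dirs):
        if not directory.is_dir():
            continue  # e.g. the directory itself was deleted in the PR
        for plan in sorted(directory.glob(f"*{suffix}")):
            plan.unlink()
            removed.append(plan.relative_to(root).as_posix())
    return removed
```

In the scenario from the original post, all three variables files live in the repo root, so a commit that only touches variables_staging.tf would still clear a leftover production plan stored in that directory.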

angeloskaltsikis avatar Nov 26 '20 21:11 angeloskaltsikis

@lkysow Is this something that can be fixed? We would like to see this fixed and can help with a PR if you provide some guidance.

Symbianx avatar Feb 04 '21 10:02 Symbianx

We would also like this to be solved, and we are willing to help with the PR as well. @Symbianx what do you think about designing the solution together?

angeloskaltsikis avatar Feb 04 '21 10:02 angeloskaltsikis

I like the suggestion of the OP of "The other good option seems to be forcing atlantis to destroy existing plans on any new commit to the PR." Adding at least an option for an implied unlock when a new commit is pushed would solve this.

grimm26 avatar Feb 04 '21 20:02 grimm26

Hey guys, just for awareness: I have created a PR that should fix this bug. If anyone wants to take a look and make suggestions, you are more than welcome.

angeloskaltsikis avatar Feb 15 '21 07:02 angeloskaltsikis

Closed due to inactivity. If this is still needed, comment and we will reopen, but check the latest docs/features first. Thanks.

jamengual avatar Oct 09 '22 05:10 jamengual