Stale/Old Plans Issue
Running into an interesting case with our current configuration and would like some input on how to resolve it. Atlantis version is 0.14.0. We have repos set up with the following files in the root of the repo: variables_daily.tf, variables_staging.tf, variables_production.tf.
These correspond to distinct Terraform workspaces, which is how we delineate the environments. For those repositories, our atlantis.yaml looks like this:
version: 3
projects:
- dir: ./
  autoplan:
    when_modified: ["./variables_daily.tf","/modules/*/*.tf"]
  workspace: daily
- dir: ./
  autoplan:
    when_modified: ["./variables_staging.tf"]
  workspace: staging
- dir: ./
  autoplan:
    when_modified: ["./variables_production.tf"]
  workspace: production
The issue we're seeing is this:
- A developer creates a PR that has changes to variables_production.tf and variables_staging.tf
- Atlantis then plans both staging and production as expected
- The developer realizes they did not mean to include variables_production.tf and adds a commit to the PR that removes that change, so the PR now only includes a change to variables_staging.tf
- Atlantis re-plans due to the commit, but since variables_production.tf is not in the PR, it only re-plans staging.
- The developer then runs atlantis apply, thinking they have safely removed production from the PR
- The initial plan that was created for production is still valid, and is applied, even though the PR no longer contains a production change
One way to mitigate this might be to flip the 'disable apply all' option to true, which would force developers to be more explicit about what they apply. There is some resistance to this, as developers like being able to bulk apply to daily and staging together.
The other good option seems to be forcing Atlantis to destroy existing plans on any new commit to the PR. At the moment I do not see a direct way to do that, so I'm thinking of using a custom workflow to always run atlantis unlock before any plan. Would that do the trick? And, if not, would a "destroy-on-plan" flag that sets Atlantis's behavior to trash all existing plans on a re-plan be a useful feature?
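To make the proposal concrete, here is a toy Python model of the sequence described above. It is not Atlantis code: `PullRequestPlans`, `on_commit`, `apply_all`, and the `destroy_on_plan` flag are hypothetical names used only for illustration, and the when_modified rules are simplified to exact filenames.

```python
# Toy model of the stale-plan scenario; not Atlantis's actual implementation.
# All names here (PullRequestPlans, destroy_on_plan, ...) are hypothetical.

# Simplified stand-in for the when_modified rules in atlantis.yaml.
WORKSPACE_FOR_FILE = {
    "variables_daily.tf": "daily",
    "variables_staging.tf": "staging",
    "variables_production.tf": "production",
}


class PullRequestPlans:
    def __init__(self, destroy_on_plan=False):
        self.destroy_on_plan = destroy_on_plan
        self.plans = {}  # workspace -> commit the plan was generated from

    def on_commit(self, commit, changed_files):
        """Autoplan: plan only the workspaces whose files are in the PR diff."""
        if self.destroy_on_plan:
            # Proposed behavior: drop every earlier plan before re-planning.
            self.plans.clear()
        for name in changed_files:
            workspace = WORKSPACE_FOR_FILE.get(name)
            if workspace is not None:
                self.plans[workspace] = commit

    def apply_all(self):
        """Bare `atlantis apply`: apply every pending plan, then discard them."""
        applied = sorted(self.plans)
        self.plans.clear()
        return applied


def run_scenario(destroy_on_plan):
    pr = PullRequestPlans(destroy_on_plan=destroy_on_plan)
    # Commit 1 touches staging and production.
    pr.on_commit("c1", ["variables_staging.tf", "variables_production.tf"])
    # Commit 2 reverts the production change; only staging is left in the diff.
    pr.on_commit("c2", ["variables_staging.tf"])
    return pr.apply_all()


print(run_scenario(destroy_on_plan=False))  # ['production', 'staging']
print(run_scenario(destroy_on_plan=True))   # ['staging']
```

Without the flag, the production plan from the first commit survives the second commit and is picked up by the bulk apply. With it, only the workspaces planned for the latest commit remain.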
That setup sounds super confusing, but I've hit the same issue in a different way, so I'm curious about a solution.
We're using Atlantis/Terraform to control the entire stack for a bunch of services, including the k8s deployments, which is kind of how we ended up here. So far it has worked really well other than running into some weird cases like this one. It lets us be pretty specific about required approvals for different environments using codeowners, and also lets developers do basic infrastructure updates for their apps without having to get too in depth into the terraform code (mostly k8s image tag bumps).
It doesn't look like this is getting much attention, so I'm guessing it's not an issue a ton of people are running into. It's definitely a pretty specific use case I'd say.
I just stumbled onto this issue, and I'm also concerned. This is definitely a bug that could cause major issues.
This happened to us as well on several occasions. It can be VERY dangerous, as it seems that a mistake which has been deleted/fixed during the development flow of a PR will still run and can cause unexpected problems.
I will try to investigate exactly how Atlantis works under the hood and try to find a proper solution. One quick solution that comes to mind is to delete all previously existing plan files after each commit.
@lkysow are you aware of this bug?
UPDATE: According to this line, it uses this function, which does exactly what the comment below states:
// Any generated plans should be untracked by git since Atlantis created
// them.
In order to fix this bug, we need to delete the .tfplan files that exist inside the folders that have been modified, similar to what we are using to find the modifiedFiles (like here).
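As a rough illustration of that idea (removing leftover plan files from the folders a commit touched), here is a minimal Python sketch. Atlantis itself is written in Go, so this is not its implementation; the function name `delete_stale_plans`, the `*.tfplan` pattern, and the example file names are all assumptions made for the example.

```python
# Sketch only: remove leftover plan files from the directories a commit touched.
# The "*.tfplan" pattern and the function name are assumptions for illustration.
import tempfile
from pathlib import Path


def delete_stale_plans(repo_root, modified_files, pattern="*.tfplan"):
    """Delete plan files found in the directories of the modified files."""
    root = Path(repo_root)
    # Directories containing at least one modified file.
    modified_dirs = {(root / f).parent for f in modified_files}
    deleted = []
    for directory in sorted(modified_dirs):
        if not directory.is_dir():
            continue  # the directory itself may have been removed by the commit
        for plan in sorted(directory.glob(pattern)):
            plan.unlink()
            deleted.append(plan.relative_to(root).as_posix())
    return deleted


# Usage: a throwaway directory with made-up plan file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "modules" / "app").mkdir(parents=True)
    (root / "staging.tfplan").touch()
    (root / "production.tfplan").touch()
    (root / "modules" / "app" / "daily.tfplan").touch()

    removed = delete_stale_plans(root, ["variables_staging.tf"])
    remaining = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.tfplan"))

print(removed)    # ['production.tfplan', 'staging.tfplan']
print(remaining)  # ['modules/app/daily.tfplan']
```

Because the modified file lives in the repo root, every plan file in the root is removed, including the production one, while plans in untouched folders are left alone.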
@lkysow Is this something that can be fixed? We would like to see this fixed and can help with a PR if you provide some guidance.
We would also like this to be solved, and we are willing to help with the PR as well. @Symbianx what do you think about designing the solution together?
I like the OP's suggestion: "The other good option seems to be forcing atlantis to destroy existing plans on any new commit to the PR." Adding at least an option for an implied unlock when a new commit is pushed would solve this.
Hey guys, just a heads-up that I have created a PR that should fix this bug. If anyone wants to take a look and make suggestions, you are more than welcome.
Closed due to inactivity. If this is still needed, comment and we will reopen, but check the latest docs/features first. Thanks.