terragrunt
terragrunt copied to clipboard
Restrict Locals Evaluation with Include Dirs
This PR is an attempt to address #1640.
We have a Terragrunt project that has a rather large number of modules (close to the order of 100) and we've noticed that Terragrunt is very slow to start when using the run-all command. This was discovered to be due to the fact that Terragrunt is running over all modules in the working directory and evaluating their locals, which was causing a few modules to be initialized during this phase as some locals were/are referencing dependencies.
It was discovered that this behavior is the same even when specifying --terragrunt-strict-include. It is my belief that Terragrunt shouldn't need to evaluate locals of any modules outside of those specified in the --terragrunt-include-dir flags (and subsequent module dependencies).
This PR attempts to resolve this by changing the implementation of FindConfigFilesInPath under the config package, which originally walks over the entire working directory looking for Terragrunt modules. The implementation has been changed to only walk over the directories specified by --terragrunt-include-dir flags. Two tests have been added to the test suite to test these changes.
I have verified that all tests pass under the config package with these changes, and I have also tested the build with these changes on our large Terragrunt project and can see significant startup time improvement.
@yorinasub17 I think you may be best to review this one.
Believe I have found a small issue with the code in this PR, will take a look a bit later and see if I can address it. Reverting this PR to draft/WIP status for now.
This implementation makes sense to me, and would also provide a workaround for https://github.com/gruntwork-io/terragrunt/issues/1033
@celestialorb please @ mention me when this PR is ready for formal review!
@yorinasub17 This is ready for a formal review again. I was right about the small issue I mentioned previously in that this original implementation did not account for globs in the --terragrunt-include-dir flags. I have added another small test case and re-implemented FindInConfigFiles with the use of zglob.Glob to account for this.
A small oddity I encountered is that the ordering of the file paths returned by zglob.Glob didn't seem to be determinate, so I added a sort function to the end of FindInConfigFiles, which meant that I had to reorder the expected results of a few test cases. I'm assuming the ordering of the file paths returned by FindInConfigFiles has no effect and thus does not matter.
Ah looks like the build failed for TestIncludeDirsStrict (in test folder). Here are the test logs:
%~> go test -v -run TestIncludeDirsStrict$ . [0]
=== RUN TestIncludeDirsStrict
=== PAUSE TestIncludeDirsStrict
=== CONT TestIncludeDirsStrict
integration_test.go:3251: [TestIncludeDirsStrict] Full contents of validate-all stdout:
integration_test.go:3254: [TestIncludeDirsStrict]
integration_test.go:3251: [TestIncludeDirsStrict] Full contents of validate-all stderr:
integration_test.go:3254: [TestIncludeDirsStrict]
integration_test.go:4089:
Error Trace: integration_test.go:4089
integration_test.go:1964
Error: Received unexpected error:
Could not find any subfolders with Terragrunt configuration files
Test: TestIncludeDirsStrict
--- FAIL: TestIncludeDirsStrict (0.00s)
FAIL
FAIL github.com/gruntwork-io/terragrunt/test 2.781s
FAIL
Ah looks like the build failed for
TestIncludeDirsStrict(intestfolder). Here are the test logs:%~> go test -v -run TestIncludeDirsStrict$ . [0] === RUN TestIncludeDirsStrict === PAUSE TestIncludeDirsStrict === CONT TestIncludeDirsStrict integration_test.go:3251: [TestIncludeDirsStrict] Full contents of validate-all stdout: integration_test.go:3254: [TestIncludeDirsStrict] integration_test.go:3251: [TestIncludeDirsStrict] Full contents of validate-all stderr: integration_test.go:3254: [TestIncludeDirsStrict] integration_test.go:4089: Error Trace: integration_test.go:4089 integration_test.go:1964 Error: Received unexpected error: Could not find any subfolders with Terragrunt configuration files Test: TestIncludeDirsStrict --- FAIL: TestIncludeDirsStrict (0.00s) FAIL FAIL github.com/gruntwork-io/terragrunt/test 2.781s FAIL
Seems this is stemming from this line:
includedModulesWithNone := runValidateAllWithIncludeAndGetIncludedModules(t, testPath, []string{}, true)
which is running a validate all with strict include and no given modules, and is thus expecting no modules to be returned...however that test is also checking to ensure that Terragrunt does not throw any errors; which it will in this case now since it's not detecting any Terragrunt configurations in FindInConfigFiles anymore.
I think the best course of action here might be to change that error to warning since it should be expected to not find any Terragrunt configuration files with strict include on and no include directories specified.
Hmm I think the error message is actually correct. I would want terragrunt to give me a non-zero exit code when it finds no terragrunt.hcl config that it should run on.
Can you actually just remove that test condition? I think using strict-include without include-dir should be considered an error given the intention of strict-include, so we shouldn't be testing that condition.
Hmm I think the error message is actually correct. I would want terragrunt to give me a non-zero exit code when it finds no terragrunt.hcl config that it should run on.
Can you actually just remove that test condition? I think using strict-include without include-dir should be considered an error given the intention of strict-include, so we shouldn't be testing that condition.
What about cases where a script or CI system might generate a Terragrunt command with strict include based off of a set of changed modules? That set might be empty and could generate and run a Terragrunt command with --terragrunt-strict-include and no --terragrunt-include-dir flags. I don't think I'd consider that to be an error.
Personally, I think the CI system or script should handle the case where there are no directories and not let terragrunt figure it out. That's a really simple array count check in bash, or whatever language it's in.
In fact, I would argue that terragrunt should really be erroring much earlier with a clear error message if --terragrunt-strict-include was passed in without --terragrunt-include-dir.
As a user, I would want terragrunt to complain to me that --terragrunt-strict-include expects --terragrunt-include-dir and that it's undefined behavior if you pass in --terragrunt-strict-include without the other.
Personally, I think the CI system or script should handle the case where there are no directories and not let terragrunt figure it out. That's a really simple array count check in bash, or whatever language it's in.
In fact, I would argue that terragrunt should really be erroring much earlier with a clear error message if
--terragrunt-strict-includewas passed in without--terragrunt-include-dir.As a user, I would want terragrunt to complain to me that
--terragrunt-strict-includeexpects--terragrunt-include-dirand that it's undefined behavior if you pass in--terragrunt-strict-includewithout the other.
You might be interested in taking a look at #1600 then.
Ah you got me. I missed that, and if @brikis98 made that decision, then I'm fine going forward with this.
I would still argue that this should be handled early in the pipeline though, as a special check in https://github.com/gruntwork-io/terragrunt/blob/f99b3fc2fb3ac7e4e5b5046fa126c80f15a4bc87/configstack/stack.go#L114 . That is probably the cleanest, and would still cause an error if you run terragrunt in a dir without any terragrunt.hcl file.
I am also interested to see this merged, is there a specific reason why it is stuck for over a year?
I am also interested to see this merged, is there a specific reason why it is stuck for over a year?
Shortly after this my team ditched Terragrunt in favor of plain Terraform, and thus I lost motivation to follow through on this.