terragrunt icon indicating copy to clipboard operation
terragrunt copied to clipboard

Terragrunt cache fails to update modules which have been updated.

Open gtmtech opened this issue 2 years ago • 2 comments

I am using terragrunt version v0.36.6

My terragrunt setup looks a little like this:

I have a terragrunt.hcl which includes a main.hcl:

# terragrunt.hcl

include "feature" {
  path   = find_in_parent_folders("main.hcl")
  expose = true
}

I have a main.hcl which takes a virtually empty terraform root module, and generates some modules dynamically inside it:

# main.hcl

terraform {
  source = "/path/to/my/modules//empty_placeholder"
}

generate "modules" {
  path      = "modules.tf"
  if_exists = "overwrite_terragrunt"
  contents  = local.module_blocks
}

The contents of modules.tf end up being one or more (usually many) module blocks which each call one of many different modules.

# .terragrunt-cache/xxx/xxx/empty_placeholder/modules.tf

# Generated by Terragrunt. Sig: xxxxxxxxxxxxx
module "foo__bar" {
  # Providers
  providers = {
    aws = aws.foo__bar__us-east-1
  }

  # Source
  source = "/path/to/my/modules//features/some-feature"
}

Finally these modules inside them have terraform code which calls other modules:

# /path/to/my/modules//features/some-feature/main.tf

module "baz" {
  source = "../../components/baz"
}

All this works great most of the time - terragrunt init works, terragrunt plan works, terragrunt apply works.

The problem happens then when I realise I need to fix something in my terraform modules - for example, I had a simple error on plan:

Error: Unsupported argument
│ 
│   on .terraform/modules/foo__bar/features/some-feature/main.tf line 4, in module "baz":
│    4:   myattribute = false
|
│ An argument named "myattribute" is not expected here

Ah I think, yeah I forgot I dont need myattribute any more when calling baz, so I need to remove it. I remove it from the some-feature/main.tf and then I rerun terragrunt plan.

Error: Unsupported argument
│ 
│   on .terraform/modules/foo__bar/features/some-feature/main.tf line 4, in module "baz":
│    4:   myattribute = false
|
│ An argument named "myattribute" is not expected here

However I would expect terragrunt to realise that this module has now changed, and not use a cached old version.

Although in my codebase I can see I have no reference to myattribute whatsoever, in my .terragrunt-cache folder its a different story:

cd .terragrunt-cache/xxx/xxx/empty_placeholder
cd .terraform/modules/
grep -r 'myattribute' *

...
foo__bar/features/my-feature/main.tf:  myattribute = false

Somehow terragrunt hasn't figured out that my source module has changed, and so it needs to invalidate the cache and reget the modules from source.

Is this a common experience? Is there a workaround or a fix? Any information would be useful in solving this problem as I currently have to destroy the .terragrunt-cache folder every time I make a change to the code to force terragrunt to pick it up, which is a bit annoying as all the amazing speed gains of terragrunt are then missed.

gtmtech avatar Jun 30 '22 20:06 gtmtech

Hello, I think detection of new changes is not working because it is done through generate block from included HCL, can be tried to run terragrunt --terragrunt-source-update flag to refresh dependencies

I will look more maybe can find more details

denis256 avatar Jul 01 '22 09:07 denis256

We've been experiencing this since upgrading from 0.36.0 to 0.38.1. The difference is that our code isn't generated. It's just that changes in our local module code aren't being reflected in the cache automatically anymore, while they were before.

EDIT: What I'm experiencing is #2171.

geekofalltrades avatar Jul 13 '22 19:07 geekofalltrades

Hi @denis256 @geekofalltrades and anyone else.

We tracked this caching problem down to a terraform issue and not a terragrunt issue.

When terraform checks out many modules of the same source, it internally symlinks one, and copies the rest. It does this due to hashicorp saying that some people actually generate files using terraform code (example being archive_file I guess), and that if you generate files in the local directory in one module, it shouldn't be appearing in another module, just because they are the same source location.

So it doesn't look as though there's any fix coming from terraform anytime soon. Anyway, this means when we update our source location for the modules, if its on local disk, then only the first module will pick it up (due to the symlink) and the others wont.

The way we got around this problem was to add :

  after_hook "after_hook" {
    commands = ["init"]
    execute  = ["${get_repo_root()}/bin/fix-cache.sh"]
  }

With fix-cache.sh doing something like this:

cache_dir=".terraform"
if [[ ! -d "${cache_dir}" ]]; then
  echo cache dir not found.
  exit
fi

for module_dir in $(find "${cache_dir}/modules" -depth 1 -type d); do
  echo "replacing ${module_dir} with symlink"
  rm -rf "$module_dir" && ln -s "$SOURCE" "$module_dir"
done

Where for us $SOURCE represents the local location on disk where the modules are picked up from.

This cleans up the terraform symlink problem, and thus the cache problem we were experiencing.

I think it would be good if terragrunt could do this work, but perhaps as an opt-in flag, in case some users really do want to be generating files locally as part of their terraform runs, and not be specifying absolute paths to a temp directory to do so.

gtmtech avatar Dec 29 '22 13:12 gtmtech