Implement TF_MODULE_CACHE_DIR
OpenTofu Version
1.6.1
Use Cases
When running init on a large number of projects that use modules, the storage used by module copies in the .terraform dir can quickly grow to hundreds of megabytes, frequently only temporarily.
A special case is CD pipelines, where these modules are downloaded and stored multiple times.
Attempted Solutions
None really effective so far.
Proposal
We already have TF_PLUGIN_CACHE_DIR to efficiently cache provider binaries.
We also need TF_MODULE_CACHE_DIR to do the same for modules.
The objective is to also cache modules, limiting both the network traffic generated (to download those copies) and the storage needed (to temporarily hold those copies).
A side effect is that fewer computational and communication resources are needed, and execution is quicker.
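For reference, the provider cache can already be configured via the TF_PLUGIN_CACHE_DIR environment variable or the plugin_cache_dir setting in the CLI configuration file (~/.tofurc or ~/.terraformrc). A module cache could hypothetically be exposed the same way; the module_cache_dir name below is only illustrative and does not exist today.

```hcl
# CLI configuration file (~/.tofurc or ~/.terraformrc): existing provider cache setting.
plugin_cache_dir = "$HOME/.terraform.d/plugin-cache"

# Hypothetical analogue for modules as proposed here (not an existing setting):
# module_cache_dir = "$HOME/.terraform.d/module-cache"
```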
UPDATE: TF_MODULE_CACHE_DIR cannot be implemented "in a trivial way" because modules can write to any file.
References
n/a
Note
#1086 may be a duplicate
Thanks for submitting this issue!
If I understand correctly, you are trying to solve two problems:
- Dedupe module storage to save space
- Don't pull down the same modules multiple times on the same system to save bandwidth between multiple projects / instances
I think 1 is pretty well captured in https://github.com/opentofu/opentofu/issues/1086, but 2 is unique and interesting.
This could be quite useful on a busy CI/CD system, and I don't think the implementation would be terribly difficult. I'll try to have the core team take a look at this.
If I understand correctly, you are trying to solve two problems:
- Dedupe module storage to save space
- Don't pull down the same modules multiple times on the same system to save bandwidth between multiple projects / instances
You understood correctly.
I think 1 is pretty well captured in #1086, but 2 is unique and interesting.
I am not sure whether the linked issue relates to a single "workspace"/project (just dedupe inside .terraform directory) or to the "execution" system (implement a global cache as seen for providers).
I am thinking of the latter.
This could be quite useful on a busy CI/CD system, and I don't think the implementation would be terribly difficult. I'll try to have the core team take a look at this.
Exactly, both regarding your comment about usefulness and your assessment of the difficulty. Of course, in case it is not clear, my idea is to make this behavior non-default, to avoid breaking compatibility (if any).
I think that TF_PLUGIN_CACHE_DIR solves both problems only for providers. A more general solution would solve the same problems for everything, IMHO.
Correct, #1086 only applies to deduping a single project.
In my understanding, TF_MODULE_CACHE_DIR would be where modules are downloaded system-wide. An instance would then be copied into $PWD/.terraform/ for use in the current project.
They can't just be linked to TF_MODULE_CACHE_DIR/... as writes can happen to items within path.module. We would need a writable instance in the local .terraform directory.
Edit: Providers are also a bit different, as they are (theoretically) read-only binaries and cannot self-modify during execution, whereas modules can (for better or worse)
I wasn't aware there could be writes into the .terraform/modules subdirs during a plan/apply/destroy ... my bad!
In which cases?
This doesn't mean that caching modules is not doable or useful. You'd "just" make a copy into the project dir from the cache instead of downloading it once again over the network. Maybe not good for storage, but good for network traffic.
Anyway, deduping within the project is a good thing.
Any module can write files to or manipulate files in its path.module as it sees fit; it's why #1086 is so difficult. Any data script or provider can be given path.module and manipulate it as they wish.
Would you mind updating the original issue to point at #1086 for the deduplication and have this one only focus on TF_MODULE_CACHE_DIR?
I've added a needs-rfc label here so that we can design and discuss possible approaches.
We're happy to accept RFCs from both the core team and community members, and we feel that it's a valuable way to ensure that we're all on the same page before development begins.
It may also be worth keeping an eye on this issue from @brikis98 in the terragrunt repo here.
When we do discuss how we want to do this, I personally think it would be worth making sure we align with an approach that works for terragrunt's run-all functionality too.
Any module can write files to or manipulate files in its path.module as it sees fit; it's why #1086 is so difficult. Any data script or provider can be given path.module and manipulate it as they wish.
Would you mind updating the original issue to point at #1086 for the deduplication and have this one only focus on TF_MODULE_CACHE_DIR?
Hey @cam72cam, I'm curious about this. Is it intrinsic to how modules work that this has to be allowed? I haven't found a need to do this as a module author, so I'm curious whether it would break anything outright if modules were cloned and then had their filesystem permissions set to read-only immediately afterwards.
It seems like, if it doesn't explicitly break anything in OpenTofu internals, it might be possible to use a persistent module store for module cloning when users of tofu don't expect the module to do any filesystem updates. I think some folks would also find having cloned modules be read-only a plus from a security perspective.
A module can write, for example with a local-exec provisioner in a null_resource or just with a local_file resource.
Of course, this can fail due to read-only or full file systems, but that would be a different matter.
And once the module writes, the "cache" is not valid any more.
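For illustration, a minimal sketch of the two write patterns just mentioned; resource names, file names, and contents are made up for the example.

```hcl
# Illustrative module content only: not taken from any real module.

resource "local_file" "marker" {
  # Writes a file directly into the module's own source directory.
  filename = "${path.module}/generated-marker.txt"
  content  = "written at apply time"
}

resource "null_resource" "writer" {
  provisioner "local-exec" {
    # Shell command that also writes inside path.module.
    command = "echo written-by-local-exec > '${path.module}/provisioner-output.txt'"
  }
}
```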
It's a tricky problem as laid out in https://github.com/opentofu/opentofu/issues/1086. Supposedly, many modules exist today that change functionality depending on whether they are local, remote, or remote+for/count. It's a legacy mess that will at the very least need to be well documented, if not feature-flagged at some point.
I solved the module caching problem a few years ago by using Git submodules.
(I manage dozens of shared modules that are used across hundreds of projects, so I've definitely run into this problem before.)
Rather than pointing to a remote URL for the module, I configure a Git submodule pointing to a Git tag for the module, and I invoke the module using source = <local-path>. The modules are only downloaded once, and re-used over and over again.
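For example, assuming the shared module has been vendored as a Git submodule under ./vendor/modules/network (the path and the variable are illustrative), the call then looks like a plain local module:

```hcl
module "network" {
  # Local path into the Git submodule checkout instead of a remote/registry source.
  source = "./vendor/modules/network"

  # Module inputs as usual (example variable).
  cidr_block = "10.0.0.0/16"
}
```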
You lose the "magic" of being able to do fuzzy version requirements like ~> 1.2, but that's arguably something that people generally should not be relying on anyway for production services (e.g., you should always know precisely what you're shipping to production).
You could probably do something Dependabot-like by writing a script that can pro-actively check for newer versions, and open a PR (or equivalent) for you when there's a new release that matches your expected version range.
Side-note: Many people freak out when I mention Git submodules because they've experienced horror stories. Having said that, if you sit down and read the docs, and experiment with rolling the HEAD forward and backward in the .gitmodules file, you learn it's really not that complicated. It just works a little differently from other systems.
@skyzyx I like the git submodule idea also, and that was actually my first idea when exploring options. It's really a little more seamless. The reason I went with the "vendor" config I did was to be able to pin explicit versions and have Dependabot update them. There's been a feature request open for years now for Dependabot to update git submodule versions only for tagged releases, but unfortunately it's had no movement.
(Though, once I had the vendor config, I realized I could also use it to pin terraform and provider versions, populate the provider cache, and generate a single lock file I can use through the whole project. So I think in the end, it works better for my use case anyway.)
Assigning myself to write the RFC
Is there some way it could optimistically use a symlink, and then, in the uncommon case where the module does use a local_file or local-exec provisioner, make a copy on apply, ideally with something like --reflink=auto where supported?
And maybe even do better than that, since I suspect in many cases the local_file or local provisioner only needs a folder to store something in, and could just use an empty directory. Although figuring out if that is the case would likely be difficult. Maybe there should be something like module.scratch_path that points to an initially empty directory specific to each instance of the module where it can write files.
Hi @tmccombs!
Some time ago I created https://github.com/opentofu/opentofu/pull/2049 proposing a new way to create temporary files in an actually-temporary location instead of modifying the source directory of the module itself. I didn't actually find this issue when I was initially researching that, so I didn't create a backlink to here until today.
One of the open questions in that RFC is whether we can find a suitable heuristic to decide whether a particular module is safe to treat as immutable or not. Unfortunately it's possible for any provider to write files to disk if written to do so, and so I'm not sure that just sniffing for uses of the hashicorp/local provider and local-exec provisioner would be sufficient, but perhaps it is! I think we'd need to do some more research to understand better what patterns are being used for temporary files in modules today and then decide which of them we want to support.
A last-resort idea I discussed in the proposal was to ask module package authors to explicitly opt in to being treated as immutable via something entirely new that no existing module package would contain. Of course, the key drawback of that is that no existing module would opt-in, and it's unclear whether there's a strong enough incentive for new module authors to add something to their module proactively (rather than waiting for their users to complain about it).
Unfortunately it's possible for any provider to write files to disk if written to do so, and so I'm not sure that just sniffing for uses of the hashicorp/local provider and local-exec provisioner would be sufficient
Ah, yeah, that's a good point. Perhaps providers could indicate if a feature doesn't write to the filesystem, or if it does, which paths it would write to. Of course that wouldn't help for existing provider versions.
ask module package authors to explicitly opt in to being treated as immutable via something entirely new that no existing module package would contain
FWIW, I would find that useful. In my case I consume a lot of modules where my team is responsible for maintaining them, so opting in to that would be worth it for me.