terraform-provider-template
Directory management with the template or other provider
Terraform Version
0.12-dev
Affected Resource(s)
template provider, template_dir data source
Expected behavior
This is a proposal to improve the way user files are managed in Terraform.
One user story
User story: "I would like to deploy static websites to S3, using only Terraform." Problem description:
- Fact nr.1: the current aws_s3_bucket_object resource is capable of uploading a well-defined file to S3.
- Fact nr.2: Using count (or the soon-to-be for_each) it is possible to create a set of aws_s3_bucket_object resources to upload a set of files. This requires a list-type variable that can be iterated. (See the sketch after this list.)
- Problem nr.1: there is no Terraform-native way to read a list of files from a directory. (As mentioned in hashicorp/terraform/issues/16697 and worked around with https://github.com/saymedia/terraform-s3-dir)
- Problem nr.2: lists are not supported as Terraform input parameters (at least -var didn't support them a few versions ago).
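To illustrate Fact nr.2, here is a minimal sketch of the count-based approach, assuming the list of file names is supplied by hand (the variable and bucket names are placeholders):

variable "website_files" {
  type    = list(string)
  default = ["index.html", "style.css"]
}

resource "aws_s3_bucket_object" "file" {
  count  = length(var.website_files)                # one object per listed file
  bucket = "my-website-bucket"                      # placeholder bucket name
  key    = var.website_files[count.index]
  source = "${path.module}/website/${var.website_files[count.index]}"
}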
So if I have a directory with a bunch of files (not individually identified by name for Terraform), I have no way of iterating through them.
The one exception is if they are templates and I need them rendered on the same machine: in that case the template_dir resource in the template provider will do it for me (see the sketch below).
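For reference, a minimal sketch of the existing template_dir resource (the directory paths and variables are placeholders):

resource "template_dir" "website" {
  source_dir      = "${path.module}/website"    # read templates from here
  destination_dir = "${path.cwd}/rendered"      # write rendered files here

  vars = {
    name = "mywebsite"
  }
}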
Generic user story
It seems that there are a few cases where people want to iterate through a directory. An additional one is hashicorp/terraform/issues/6065, where the user wants to use the file provisioner to keep a directory up to date, but the file provisioner doesn't keep track of changes. It seems that a provider iterating through directories would be useful.
Originally, I thought this should be a custom directory provider and that I would just write it for my use-case. Then I thought: why not try to do it in a Terraform-native way and integrate with the current providers?
The closest provider that iterates through directories is the already-mentioned template_dir resource. Unfortunately, if I only want to read the folders but not write them, I'm out of luck. As was raised in #34, there is no template_dir data source, only a resource.
Proposed solution nr.1 (already done)
New data source: template_dir. This can be considered a non-breaking improvement since it didn't exist before. Based on how the template_dir resource is defined (and trying to keep in line with it), here's how it would look:
data "template_dir" "solution1" {
  source_dir = "website"     # Same as in the resource
  exclude    = "\\.tmp$"     # Improvement: regular expression applied to the relative path

  vars = {                   # Same as in the resource
    name = "mywebsite"
  }

  render = true              # Improvement: details below (consider files as templates)
}
The rendered output attribute (same as in the resource) would be a map where the keys are the file names and the values are, optionally, the rendered file contents (in the case of templates), or empty if render = false. (You can create a list of files using keys(data.template_dir.solution1.rendered) and iterate through it, for example to upload them to S3, as sketched below.)
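A hypothetical sketch of how the proposed data source could feed aws_s3_bucket_object (the data source does not exist yet, so the attributes follow the proposal above; the bucket name is a placeholder):

locals {
  solution1_files = keys(data.template_dir.solution1.rendered)
}

resource "aws_s3_bucket_object" "file" {
  count   = length(local.solution1_files)
  bucket  = "my-website-bucket"
  key     = local.solution1_files[count.index]
  content = data.template_dir.solution1.rendered[local.solution1_files[count.index]]
}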
So, the idea is to stuff this functionality into the template provider, even though the template provider is a bit more than just a file-system reader.
I already programmed this solution for 0.11 because I needed it for my use-case. Unfortunately, because of a bug in 0.11 (hashicorp/terraform/issues/19258), this will only work correctly in 0.12. I'm in the process of upgrading it and I'll share afterwards.
Pro: in line with current expectations of how Terraform works; a non-breaking change.
Con: it's abusing the template provider to add functionality.
Proposed solution nr.2
New provider: io, or built-in functions.
It seems that generic file reads (and writes) are in demand. In hashicorp/terraform/issues/16697 there is a discussion about a set of builtin functions to manage files in a directory. It seems that those discussions didn't amount to anything, but they might still be good ideas.
So the request: is it possible to get a bit of brainstorming together for an overarching solution? Is it better to use providers or built-in functions? I'm willing to look into whichever makes sense, because I would rather (eventually) have a Terraform-native solution to this problem than a custom provider.
Until a better idea emerges, I'll keep working on solution nr.1.
In case anyone comes across this thread and is desperate for a way to iterate a directory, I threw together https://github.com/jakexks/terraform-provider-glob
(edit:) I should've started with this: thanks for your contribution, I'm sure that it will help a lot of people struggling with the problem.
One of the reasons this didn't really get far is that the Terraform state file can get very big if we list a bunch of file contents in it. I'm guessing that's the case with your solution too, based on the glob_contents_list property.
Did you try it with, say, a 5 MB set of files? How big did the Terraform state file get?
It'll certainly get massive, and probably won't work well (or at all) for binary files! Definitely a "use at your own risk" thing.
My use case is Terraform Enterprise, as runs happen in a container that throws away any local files. A module I wanted to use relies on template_dir to write some templates to disk that I expect to be there on the next run, but they aren't. HashiCorp support recommended refactoring to outputs 🤷‍♂️
I imagine you could do something with count = "length()" and the file() interpolation function to avoid storing too much in the state file, but you'd have to stay aware of the issue.
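A rough sketch of that idea, assuming a data source that exposes only relative file names rather than contents (the data source and attribute names here are hypothetical, as is the bucket name):

resource "aws_s3_bucket_object" "file" {
  count   = length(data.glob.site.file_names)     # hypothetical data source and attribute
  bucket  = "my-website-bucket"
  key     = data.glob.site.file_names[count.index]
  content = file("${path.module}/website/${data.glob.site.file_names[count.index]}")
}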
Sorry for the long silence here, @greg-szabo and everyone else. Thanks for documenting this use-case!
This use-case is close to my heart because in my previous job (before I was on the Terraform team at HashiCorp) I had this very same need and used it as the motivation for a proposal I opened at the time, in hashicorp/terraform#3310.
We've been gradually making incremental progress towards a different approach on this than I originally made there, using a combination of different smaller features. The Terraform 0.12 release has laid some groundwork, but there are still some parts to fill in. Once we get there, the configuration might look something like this, assuming the goal is to just upload files from disk as-is, without any local processing:
locals {
  source_dir = "${path.module}/htdocs"

  source_files = {
    for fn in filelist("${local.source_dir}/**") :
    pathrel(fn, local.source_dir) => fn
  }
}
resource "aws_s3_bucket_object" "file" {
  for_each = local.source_files

  bucket = var.s3_bucket_name
  key    = each.key
  source = each.value
}
This is a combination of the for_each feature discussed in hashicorp/terraform#17179 and the filelist and pathrel functions we were discussing in hashicorp/terraform#16697. The first of these needed some internal redesign to support, which we've already completed in master, so work on this should be unblocked after the 0.12.0 release (though given that we've been focused on config language improvements for so long, we are likely to need to take a break to catch up on some other Terraform subsystems for a while first). I'd also asked that we hold off on implementing new built-in functions until after 0.12.0, but they are a lot more straightforward than for_each, so hopefully they won't take long to get done.
Combining this with template rendering would of course make life a little more complex, since a subset of the files would need to have their content passed through templatefile rather than just read directly from disk in aws_s3_bucket_object, but it should be doable with a suitable file naming convention to allow recognizing the ones that need to be rendered as templates and dealing with them separately:
locals {
  source_dir = "${path.module}/htdocs"

  source_files = {
    for fn in filelist("${local.source_dir}/**") :
    pathrel(fn, local.source_dir) => fn
  }

  template_files = {
    for k, fn in local.source_files : k => fn
    if length(fn) >= 5 && substr(fn, length(fn) - 5, 5) == ".tmpl"
  }

  rendered_files = {
    for k, fn in local.template_files :
    k => templatefile(fn, local.template_vars)
  }

  template_vars = { /* whatever variables the templates expect */ }
}
resource "aws_s3_bucket_object" "file" {
  for_each = local.source_files

  bucket  = var.s3_bucket_name
  key     = each.key
  source  = contains(keys(local.rendered_files), each.key) ? null : each.value
  content = contains(keys(local.rendered_files), each.key) ? local.rendered_files[each.key] : null
}
The templatefile function in Terraform 0.12 is aiming to supersede the template_file data source by allowing templates from files to be rendered directly where they are needed. template_dir was created primarily as a workaround to allow inserting dynamic data from Terraform into a zip file before uploading it to AWS Lambda, but environment variables are now a better option, and so its original purpose is no longer relevant either (though I know that some folks have found other uses for it). The main idea here is to make templates first-class in the Terraform language, because manipulation of template strings is such a common operation when combining different components into a working system.
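As a small illustration of templatefile rendering a template directly where it is needed, here is a minimal sketch (the template path, its contents, and the variables are placeholders):

# Renders ${path.module}/greeting.tmpl, which might contain: Hello, ${name}!
output "greeting" {
  value = templatefile("${path.module}/greeting.tmpl", {
    name = "world"
  })
}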
Another use-case for a filelist() function is the kubernetes_config_map resource.
Currently, if you want to add multiple files to a ConfigMap you need to do so manually, which is error-prone and time-consuming.
Example:
resource "kubernetes_config_map" "nginx-cfgmap" {
  metadata {
    name = "nginx-cfgmap"
  }

  data = {
    "nginx.conf"   = file("configs/nginx/nginx.conf")
    "fastcgi.conf" = file("configs/nginx/fastcgi.conf")
    # ...
  }
}
How it could look:
resource "kubernetes_config_map" "nginx-cfgmap" {
  metadata {
    name = "nginx-cfgmap"
  }

  data = {
    for fn in filelist("${path.root}/configs/nginx/**") :
    pathrel(fn, "${path.root}/configs/nginx/") => file(fn)
  }
}