Extending Terraform with custom functions

Current Terraform Version

0.14

Use-cases

Instead of writing a provider, there is some functionality that is best suited for a custom function.

Attempted Solutions

object({}) module: too complicated; at this point it's better to just extend Terraform with Go.

provider, too complicated, a function that takes one or two inputs and returns a single value is too simple for a provider.

Proposal

I cannot find any documentation on extending terraform with custom functions; is it possible to do this?

Feb 05 '21 19:02 ghost

I've worked around this with sub-modules. Pass inputs to the submodule (parameters), do some processing (with locals), then provide an output (return). As long as it's something that Terraform already supports.
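A minimal sketch of the sub-module workaround described above, assuming a hypothetical module directory `./modules/join_name`; the variable, local, and output names are illustrative and not taken from the thread. Variables play the role of parameters, locals the function body, and the output the return value.

```hcl
# modules/join_name/main.tf (hypothetical module acting as a "function")
variable "prefix" {
  type = string
}

variable "parts" {
  type = list(string)
}

locals {
  # "Function body": any expression Terraform already supports.
  joined = join("-", concat([var.prefix], var.parts))
}

output "result" {
  # "Return value"
  value = lower(local.joined)
}

# Caller (root module)
module "join_name" {
  source = "./modules/join_name"
  prefix = "app"
  parts  = ["Web", "01"]
}

output "name" {
  value = module.join_name.result # "app-web-01"
}
```

Every additional "call" needs its own module block (or a for_each on the block), which is the overhead debated later in the thread.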

Feb 09 '21 00:02 skyzyx

Yea, I found a way around this using yamlencode with a submodule that provides output. But, it's really too simple to be a module imo.

Feb 09 '21 00:02 ghost

Defining a whole module just so you can re-use code is way more work than should be necessary. It should be possible to define a function in locals and call it, just like any other builtin function. The body of the function would just have to be a bunch of variable expressions like you find in locals, but for the difference that the function inputs would automatically be variables local to the function, and the return value would be a subset of the locals, something like this:

locals {
  func myfunc(var1: string, var2: list(string)) => map(string) {
     var3 = ...expression involving var1, var2, and any local.*...
     var4 = ...expression involving previous vars...
     return var3
  }
}

This basically just means that while in the function, the locals is extended with a few vars that get reset every time the function is called. Doesn't seem too crazy.

Mar 26 '21 13:03 schollii

The function should not be able to cause any side effects; it should only return a value.

Apr 12 '21 09:04 troisdiz

They should also be able to be defined in providers, where if you are using a specific provider it adds custom provider specific functions to be available to be called directly.

May 14 '21 00:05 AdamCoulterOz

A use-case where I find myself wanting custom functions: when defining multiple resources with count or for_each, locals don't work well to transform derived properties. Inline logic works acceptably, but I would like to see whether something better is possible.

They should also be able to be defined in providers, where if you are using a specific provider it adds custom provider specific functions to be available to be called directly.

Or a module. Perhaps a new resource type "library"?

Jul 09 '21 16:07 AubreySLavigne

Firstly, I think user-defined functions are a good idea for users, so I'm not arguing with anyone about that. Even more than that, I love the idea of Provider-provided custom functions.

But the language in this thread suggests that there's a belief that creating a module is a bunch of extra work/overhead, which is quite simply false.

@withernet said:

Yea, I found a way around this using yamlencode with a submodule that provides output. But, it's really too simple to be a module imo.

I don't view a submodule as anything more complicated than just some Terraform in a subdirectory. You're already going to write the logic; does it matter where it lives?

Modules aren't complicated. IMO, the majority of .tf files should be written as modules because there are a lot of benefits in exchange for mild overhead.

@schollii said:

Defining a whole module just so you can re-use code is way more work than should be necessary. It should be possible to define a function in locals and call it, just like any other builtin function.

The issue I take is with your statement: "Defining a whole module […]". Defining a module is not complicated; it's not a whole big thing — it's a tiny, little thing.

Unless HashiCorp were to implement user-defined functions in an unexpected way that completely blows my mind, I can't imagine the code you'd write to implement and execute user-defined functions would be very different from exposing parameters (as variables) and return values (as outputs) from a *.tf file in a subdirectory. Many of the primitives are already there (see other issues I've filed for a wishlist of new primitives I'd love to see added to Terraform).

@AdamCoulterOz said:

They should also be able to be defined in providers, where if you are using a specific provider it adds custom provider specific functions to be available to be called directly.

Empowering Providers to include context-specific functions would be amazing. Then again, this is also addressed with a vendor providing both a Provider as well as a Module.

Importing a "function" that someone else wrote?

module "imported_function" {
    source = "…"

    var1 = "abc"
    var2 = 123
}

# Return values are always a map/object (in the programming sense) on the module.
my_result = module.imported_function.my_return

When writing a "function", you use the variable, locals, and output primitives, which are the same as parameters, function body, and return concepts.

Instead of importing with go.mod/pip/npm/Composer/Bundler/NuGet/maven/whatever, you import via URL and git tag. You pass parameters, then access the result.

Could the syntax be simpler? Maybe/Probably. But probably not much simpler.

Jul 09 '21 20:07 skyzyx

@skyzyx I don't think anyone would dispute that importing a function that someone else wrote is easy. In my experience this is a rare need; rather, I find I often would like to refactor expressions (not 20 or 100 lines, just part of a line) that I use in a few places as part of transformations. For that, a module is way more work, and just does not scale well. If something is a lot of work, people won't use it and you end up with repetition.

If HCL supported user-defined functions:

1. I wish I could refactor this expression into a function, because I use it in several places in this loop.
2. Move cursor up and write this:

locals {
    myfunc(a type, b type) -> type = {
      ...code that uses a, b...
      return something
    }
}

3. Move cursor back down to where you want to use the function and replace the expression with local.myfunc(a, b).
4. DONE

With modules:

1. I wish I could refactor this expression into a function, because I use it in several places in this loop.
2. Navigate to your filesystem in IDE or shell and create a folder.
3. Create main.tf in that folder.
4. Define one variable entry per parameter.
5. Write the same code as ...code that uses a, b... in the previous workflow.
6. Define one output.
7. Navigate back to the window from step 1.
8. Add a module "myfunc" block with values for params, which BTW is N+3 LOC for every "invocation".
9. If you want to "call" the module in a loop, you need to add a for_each line in the module invocation, and in many cases you will have to rewrite your loop entirely so that the values to be computed can be used in a for_each (you will be duplicating the loop logic).
10. Move cursor back down to where you want to use the "function" and replace the expression with module.myfunc.something.
11. DONE

Jul 09 '21 21:07 schollii

Here is another syntax, using the similarities between a function body and a locals:

myfunc {
   args = object({a=string, b=string}) // "args" is reserved keyword in anything but "locals"
   var1 = ...use attributes of args (eg myfunc.args.a), local.whatever...
   var2 = ...use attributes of args, local.whatever...
   return = myfunc.var2 // "return" is reserved keyword in anything but "locals"
} -> type

so defining and using a one-liner function could look like this:

myfunc {
  args = object({a=string, b=string})
  return = "${myfunc.args.a} + ${myfunc.args.b}"
} -> string

locals {
  var1 = {
      for k, v in var.map1: k => myfunc(a, b)
  }
}

New syntax is minimal:

a "locals" block named something other than "locals"
a function call on the block (HCL just needs to copy the arguments, if there are any, into the block's args "fields", and copy the value of the block's return field to the caller)
a return type after the block (this makes it really obvious that it is a function and allows for type constraints / checking)
support args = object({a=string, b=string}) outside of "variable" block (but only in function block, not locals block)

To export a function from a module you could use

output "myfunc" {value = myfunc}

although for a module that acts as a library of functions this is onerous, better have a convention like in Go (hide by default, capitalize to export) or Python (export by default, prepend underscore to hide).

Jul 09 '21 21:07 schollii

I think we're forgetting the reasoning why it's bad to use a module for a function; modules are not robust (for example, Go lang robustness), and therefore any kind of logic that may need to be maintained for any amount of time longer than a one-off function will become unmaintainable. For example, an unbearable dumpster fire of yamlencode is an unacceptable solution. Yes it works, but, the reality is that it's a hack to accommodate for a lack of functionality.

The initial reason why I opened this issue is that I wanted to export a function that was multi-cloud: to allocate resources between AWS, GCP, and AliCloud for ZooKeeper, Pulsar, and BookKeeper clusters. Can any one of you, with a straight face, attest to how much of a solution a module would be for this?

While it's cool to suggest interim methods to deal with this problem; I'd rather the focus be on why this is essential functionality instead of how to come up with ways to just "work with how it is now".

Jul 09 '21 22:07 ghost

Before making a fool of myself, please correct me if I am missing something. I am very new to Terraform and I may be using an antipattern or missing some other intended functionality.

That said, here is another example for why this is needed: I want to do some more complex validation on inputs to one of my modules. Validation blocks may not call other modules. The input is an object with some values that are optional(number) and some that are number. If the optional values are not null (and always for the required values), they must be integers, and they must be within a range that is different for each field. Currently, I have to copy-paste the same validation code multiple times. Here is a small excerpt of the condition: (Note: the "block" terminology in this code snippet is not related to Terraform blocks)

    condition = (
      (
        can(parseint(var.block_numbers.private, 10)) &&
        var.block_numbers.private >= 0 &&
        var.block_numbers.private <= 15
      ) && (
        can(parseint(var.block_numbers.public, 10)) &&
        var.block_numbers.public >= 64 &&
        var.block_numbers.public <= 79
      ) && (
        var.block_numbers.kubernetes == null ? true : (
          can(parseint(var.block_numbers.kubernetes, 10)) &&
          var.block_numbers.kubernetes >= 96 &&
          var.block_numbers.kubernetes <= 111
          )
      )

This code seems absurd, to the point that I considered using regex to handle this, but that just seemed even worse and much less explicit.

It may make sense to give each value its own variable instead of being in an object, but that does not get rid of the code duplication.
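One way to write the range check only once in current Terraform, sketched under assumptions: the `min`, `max`, and `required` fields and the error message are hypothetical names introduced here, and `optional(number)` needs a Terraform version that supports optional object attributes. The condition loops over a table of bounds instead of repeating the expression per field, and still refers only to the variable being validated.

```hcl
variable "block_numbers" {
  type = object({
    private    = number
    public     = number
    kubernetes = optional(number)
  })

  validation {
    # A single check expression, applied to every field listed in the table.
    condition = alltrue([
      for name, bounds in {
        private    = { min = 0, max = 15, required = true }
        public     = { min = 64, max = 79, required = true }
        kubernetes = { min = 96, max = 111, required = false }
      } :
      # A null value is acceptable only for fields that are not required.
      var.block_numbers[name] == null ? !bounds.required : (
        floor(var.block_numbers[name]) == var.block_numbers[name] &&
        var.block_numbers[name] >= bounds.min &&
        var.block_numbers[name] <= bounds.max
      )
    ])
    error_message = "Each block number must be an integer within its allowed range."
  }
}
```

This removes the copy-paste within one variable, but the expression still cannot be shared with other variables or modules, which is the gap a user-defined function would fill.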

Aug 06 '21 17:08 aidan-mundy

@schollii: I was thinking about things in terms of how Terraform functions at its core today. What are things that could be small adjustments to how Terraform works under the covers that could expose the function-y functionality.

I don't work for HashiCorp and can't speak for them, but I've spent plenty of time poking around at the internals and have written a fair amount of code using their hclsyntax library and I feel like I have a decent understanding where Terraform exists at this point in history.

@aidan-mundy: While it seems like using modules for this is frowned upon by others, if you're looking for a solution in current Terraform, I literally use modules for this problem of "shared validation".

Essentially, I don't use the built-in validation {} block for this. Instead, I collect the variables into a list/map in locals, then pass each of them through a module using module-level for_each. The module's one job is to (a) do nothing (if OK), or (b) fail with an error. After I pass the variables through the module, if I'm still alive, then everything passed validation.
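A rough sketch of this "validator module" pattern. The thread does not say how the module fails, so this sketch assumes a variable validation rule inside the module; the module path `./modules/assert_in_range` and the `check` object are hypothetical names.

```hcl
# modules/assert_in_range/variables.tf (hypothetical validator module)
variable "check" {
  type = object({
    value = number
    min   = number
    max   = number
  })

  validation {
    # Value, minimum, and maximum travel together in one object because a
    # validation rule in older Terraform versions may only refer to its own variable.
    condition     = var.check.value >= var.check.min && var.check.value <= var.check.max
    error_message = "Value is outside the allowed range."
  }
}

# Root module: collect the checks in locals, then run each through the module.
locals {
  range_checks = {
    private = { value = 3, min = 0, max = 15 }
    public  = { value = 70, min = 64, max = 79 }
  }
}

module "assert_in_range" {
  source   = "./modules/assert_in_range"
  for_each = local.range_checks

  check = each.value
}
```

If every instance plans without error, all values passed; a failing value stops the plan with the module's error message.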

Now, I'm supporting a lot of user-provided values, and I need to be able to access more than one variable at a time in order to validate.

@withernet: I'm going to boldly say that multi-cloud isn't a real thing. Now let me explain what I mean.

The APIs for each of these cloud providers are different, and even in equivalent services, the functionality of those services is often different or incomplete when comparing one against the other. I believe that this is why those "fog" libraries from 10–15 years ago all failed. For example, Google Storage straight-up copied the Amazon S3 APIs at the time. But since then, the features of those services have diverged a bit.

Because of the way that the providers expose resources, I don't expect that building a single set of code (whether it's a re-usable module, or any other Terraform code) which supports multiple clouds at the same time based on something like cloud = "aws" is possible as Terraform exists today. In reality, you'd need to write three sets of HCL code to support three different cloud providers — assuming you're looking for standing up a stack in one cloud, then turning around and standing up the same stack in a different cloud. Alternatively, there is no issue with standing up a stack with resources in different clouds if you need to.

But having a single set of code where the public interfaces (variables + outputs) are identical across clouds? I am extremely skeptical about someone being able to pull that off in a meaningful, production-ready way.

All: Lastly, there's a question of "maybe this isn't the right tool for the job?" By which I don't mean Terraform, but rather HCL. Back when I started working on some of my modules, it was with Terraform 0.9 and Terraform 0.10. It wasn't powerful enough at the time to do the things I needed it to do, so I switched to using a programming language to generate the HCL I needed.

Even now, there is a case in one of my modules (for New Relic monitoring) where the module itself can't deduce certain information on its own. So I wrote a small Go program to hit some APIs, look up the data I need, generate a map of that data, use hclwrite to generate an HCL tree from that data, and write it to disk inside the module directory. Whenever I go to run the test suite or do a release build of the module (that other teams consume), I re-run the script to pick up the latest data and add it to the module's repo.

“When all you have is a hammer, everything looks like a nail.”

Perhaps looking elsewhere in your toolbox will allow you to find a better tool for the job so that you can get your work done, rather than being bothered that Terraform isn't the tool you want it to be, all by itself.

¯\_(ツ)_/¯

Aug 06 '21 18:08 skyzyx

@aidan-mundy there is actually a bug (or feature requirement) to make it more useful that I've reported. Basically, in its current state it's more of a problem than it's worth: #28344.

@skyzyx yes, the cloud APIs have different implementations. However, I'm not talking about making a single "universal" provider with a "single" universal function that has support between all cloud provider APIs. We have provision requirements that rest between the provision and configuration layers; so no cloud API requirements. A function that can be used between all of them that I can integrate into our configuration would be "meta" and usable between cloud platforms. You can think of this as additional "vendor" support. It's unrealistic to think vendors will only extend terraform as a provider.

Aug 06 '21 21:08 ghost

@withernet Unfortunately, this does not appear to be related to my problem. I want the default to be null, and that ticket appears to only be for nested objects, which I do not have in this use case.

@skyzyx I appreciate the suggestion, and may look into it in the future if this becomes a bigger pain point. That said, it seems horribly clunky and is not explicit about intent. To have to use a totally different functionality in the language to validate input variables when there is a "validation" block specifically for input variables doesn't make too much sense.

Aug 07 '21 15:08 aidan-mundy

Looks like the underlying DSL used here, HCL (HashiCorp Configuration Language), actually supports user-defined functions -- so it might just be a matter of integrating this with Terraform itself (as the calling application).
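For context, the HCL feature referred to here appears to be the optional `userfunc` extension shipped in the hashicorp/hcl repository. A sketch of the block syntax that extension describes is shown below; the function name `add` is only an illustration. Terraform does not enable this extension, so the block is not valid in a .tf file.

```hcl
# HCL userfunc extension syntax (not accepted by Terraform itself).
function "add" {
  params = [a, b] # parameter names, visible as variables inside "result"
  result = a + b  # expression evaluated on each call
}
```

A calling application that decodes such blocks could then expose add(1, 2) to expressions elsewhere in the same configuration.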

Dec 01 '21 05:12 prologic

@prologic hmmm, that looks deceptively easy.......

Dec 01 '21 06:12 aidan-mundy

Just wanted to say, that without Custom User-defined functions in Terraform, this is the type of butt ugly type thing I've had to do to drive the input of a resource from the output of another:

resource "swarm_cluster" "cluster" {
  dynamic "nodes" {
    for_each = concat(
      digitalocean_droplet.manager,
      digitalocean_droplet.worker,
      digitalocean_droplet.storage,
    )
    content {
        hostname = nodes.value.name
        tags = {
          "role" = contains(nodes.value.tags, "role:manager") ? "manager" : "worker",
          "labels" = join(
            "&",compact(
              [
                for tag in nodes.value.tags : (
                  contains(split(":", tag), "label") ? format("%s=%s", split(":", tag)[1], split(":", tag)[2]) : ""
                )
              ]
            )
          )
        }
        public_address  = nodes.value.ipv4_address
        private_address = nodes.value.ipv4_address_private
    }
  }
  lifecycle {
    prevent_destroy = false
  }
}

I hope this highlights the importance of this feature, which AFAICT is already baked into HCL itself.
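Without user-defined functions, one partial mitigation is to compute the label string once in a local map and look it up inside the dynamic block. The sketch below uses hypothetical sample data in place of the droplet attributes. It is also slightly stricter than the original expression: it only accepts tags of the exact form label:key:value.

```hcl
locals {
  # Hypothetical sample data standing in for each droplet's name and tags.
  node_tags = {
    "manager-1" = ["role:manager", "label:zone:a", "label:disk:ssd"]
    "worker-1"  = ["role:worker"]
  }

  # The label-building expression written once, keyed by node name.
  node_labels = {
    for name, tags in local.node_tags : name => join("&", [
      for tag in tags :
      format("%s=%s", split(":", tag)[1], split(":", tag)[2])
      if split(":", tag)[0] == "label" && length(split(":", tag)) == 3
    ])
  }
}
```

Inside the content block, the labels entry could then read local.node_labels[nodes.value.name], assuming the map is built from the same droplets.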

Dec 13 '21 00:12 prologic

For any kind of table-driven configuration it is an absolute must-have.

Oct 21 '22 13:10 redbaron

Any updates on this? We have repeating routines that must be in a custom function to keep the code clean and easily maintainable:

  subdomain_name = trimsuffix(substr(replace(var.infra_name, "_", "-"), 0, 64 - length(var.tld) - 2), "-") # FQDN can not be longer than 64 chars because of SSL cert https://docs.aws.amazon.com/acm/latest/APIReference/API_RequestCertificate.html
  srv1_subdomain_name = trimsuffix(substr(replace("srv1-${var.infra_name}", "_", "-"), 0, 64 - length(var.tld) - 2), "-")   
  srv2_subdomain_name = trimsuffix(substr(replace("${var.srv2_name}-${var.infra_name}", "_", "-"), 0, 64 - length(var.tld) - 2), "-")
  srv3_subdomain_name = trimsuffix(substr(replace("${var.srv3_name}-${var.infra_name}", "_", "-"), 0, 64 - length(var.tld) - 2), "-")

Ideally, this should look like this:

  subdomain_name = custom_function_trim_domain_name(var.infra_name)
  srv1_subdomain_name = custom_function_trim_domain_name("srv1-${var.infra_name}")   
  srv2_subdomain_name = custom_function_trim_domain_name("${var.srv2_name}-${var.infra_name}")
  srv3_subdomain_name = custom_function_trim_domain_name("${var.srv3_name}-${var.infra_name}")
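Until something like that exists, a workaround in plain Terraform is to put the raw names in a map and apply the trimming expression once in a for expression. This is only a sketch: the local names `raw_names` and `subdomain_names` are hypothetical, and the variable declarations are assumed from the snippet above.

```hcl
variable "infra_name" {
  type = string
}

variable "tld" {
  type = string
}

variable "srv2_name" {
  type = string
}

variable "srv3_name" {
  type = string
}

locals {
  # Untrimmed names, one entry per subdomain.
  raw_names = {
    subdomain = var.infra_name
    srv1      = "srv1-${var.infra_name}"
    srv2      = "${var.srv2_name}-${var.infra_name}"
    srv3      = "${var.srv3_name}-${var.infra_name}"
  }

  # The trimming expression from the snippet above, written a single time.
  subdomain_names = {
    for key, raw in local.raw_names :
    key => trimsuffix(substr(replace(raw, "_", "-"), 0, 64 - length(var.tld) - 2), "-")
  }
}
```

Individual results are then read as local.subdomain_names["srv1"] and so on. This only helps when all call sites can be gathered into one map.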

Dec 23 '22 05:12 speller

@skyzyx

But the language in this thread suggests that there's a belief that creating a module is a bunch of extra work/overhead, which is quite simply false.

Sorry for the late response, but this is worth discussion. I think you're being disingenuous. Frankly it is more work and overhead by way of making the code less maintainable.

The reasons why are relatively well documented in the software world with regards to languages generally:

Execution in the Kingdom of Nouns: this blog post explains, in the context of Java, why trying to express everything as nouns (Java classes / Terraform modules) and banishing verbs (functions) leads to truly horrible code.
Go To Statement Considered Harmful is legendarily famous and yet surprisingly few have read it. The primary thrust is: there's an inherent gap between our imagination of the running process and the real thing. This gap is bad and should be reduced where possible. Our imagination is tightly bound to the written code. So if the structure of the code does not map well to the flow of process execution, we, as humans, lose the thread of its behaviour all too easily.

Simply: functions and objects have two completely different use cases and trying to express one as the other leads to code that works but becomes unmaintainable too easily.

I'm not passing comment on your own code, @skyzyx. YMMV.

My own experience is that adding excessive modules and defining everything in expressions instead of step-by-step functions leads to complexity² (complexity squared).

In an attempt to give a glimpse into what I mean:

foo(bar(baz(bob(bonno))))

The fact this is concise isn't the main advantage. This code shows a clear execution order at a glance. The full wiring is easy to understand. Obviously you can go the wrong way here and single expressions rapidly get too hard to read also.

Putting the same thing in modules and maintaining it for a while:

module "c" {
  source = "../functions/bar"
  c = module.f.y
}

module "a" {
  source = "../functions/bob"
  a = bonno
}

module "d" {
  source = "../foo"
  d = module.c.z
}

module "b" {
  source = "../functions/baz"
  f = module.c.z
}

module "f" {
  source = "../functions/baz"
  f = module.a.x
}

At first glance you have next to zero knowledge of what this does or what the execution order is.

It's hard to even notice that a "spare" module got left behind by previous maintenance. Could you even find the "spare" in only a couple of seconds now I've told you?

Jan 25 '23 13:01 couling

Not sure why I didn't link #28339 to this, here it is now.

It pushes my comment https://github.com/hashicorp/terraform/issues/27696#issuecomment-808201319 a little further, and that comment got a lot of upvotes indicating that maybe #28339 is a good way to advance discussion?

Jan 25 '23 20:01 schollii

Looks like the underlying DSL used here, HCL (HashiCorp Configuration Language), actually supports user-defined functions -- so it might just be a matter of integrating this with Terraform itself (as the calling application).

Are there any examples of using this method?

Mar 15 '23 17:03 celik0311

So is this article making stuff up? https://www.devopsschool.com/blog/detailed-guide-for-how-to-write-a-custom-function-in-terraform/

Or is it now possible to define custom functions in our terraform code?

May 02 '23 18:05 es-cs1

I think it is rubbish. Seems like the author based this on a chatGPT hallucination.

Plus the mechanism doesn't actually make sense as described: it says to put the code in a .tf file and that any language can be used, yet there is no mention of the language used in the .tf file shown as example. Also see https://developer.hashicorp.com/terraform/language/functions which is the latest, "The Terraform language does not support user-defined functions". So the only way this could be true is yet-to-be-announced functionality.

I posted a comment, I'll see if it gets rejected.

May 02 '23 21:05 schollii

Indeed, that article is describing language features that do not exist and never have existed.

May 03 '23 00:05 apparentlymart

I reached out to the above website and the author agreed to take down the page, claiming it was "half cooked content and work was in progress". IMHO it looked like AI-generated content. For future readers curious about the article's content, it can still be found via the "Wayback Machine - Internet Archive".

May 06 '23 20:05 couling

Is this now finally coming?

Oct 12 '23 06:10 Satak

@Satak It is being explored, but there is not a final design for it yet. Thanks!

Oct 16 '23 21:10 crw

Closed via #34394

Mar 07 '24 17:03 jbardin

@jbardin Correct me if I'm wrong, but that "only" adds provider-defined functions to Terraform. Sounds great, but that's not really what was discussed here, is it? The original intent, it seems to me, was to define functions in the Terraform code itself.

I would love to use functions as well, that's why I subscribed. But I am not able to write a terraform provider myself sadly.
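For readers landing here: provider-defined functions, available from Terraform 1.8, are called with the provider::<provider name>::<function name> syntax. The sketch below assumes a hypothetical provider named `strutil` that exports a hypothetical `trim_domain_name` function, like the helper requested earlier in the thread; neither is a real registry entry.

```hcl
terraform {
  required_version = ">= 1.8.0"

  required_providers {
    # Hypothetical provider; the source address is a placeholder.
    strutil = {
      source = "example/strutil"
    }
  }
}

variable "infra_name" {
  type = string
}

variable "tld" {
  type = string
}

locals {
  # Call syntax: provider::<local provider name>::<function name>(arguments...)
  subdomain_name = provider::strutil::trim_domain_name(var.infra_name, var.tld)
}
```

The function body itself still has to be written in the provider's implementation language, so this does not cover functions defined directly in Terraform code.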

Mar 07 '24 17:03 martinrohrbach