Terraform: / in a string within an interpolated function call is not interpreted properly

Open Sayrus opened this issue 1 year ago • 2 comments

Name of the lexer: Terraform

Code sample

locals {
        example   = "${function_call("", "", "/")}"
        project_name = project_name
}

https://rouge.jneen.net/v4.5.1/terraform/bG9jYWxzIHsKICAgICAgICBleGFtcGxlICAgPSAiJHtmdW5jdGlvbl9jYWxsKCIiLCAiIiwgIi8iKX0iCiAgICAgICAgcHJvamVjdF9uYW1lID0gcHJvamVjdF9uYW1lCn0

Additional context: The function call has correct syntax, function_call(string, string, string), and is enclosed in the interpolation sequence ${...}. (The code sample won't pass a linter; you can use example = "${function_call("", "", "/")}/interpolated_string" instead.)

Expected behavior: image

Sayrus, Nov 12 '24 17:11

After looking into it a bit more, the root cause is that "/ (%r/"\//) is considered the start of :regexps, introduced in https://github.com/rouge-ruby/rouge/pull/1490. However, as far as I can tell, regex strings are not a special object in Terraform.
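To make the root cause concrete, here is a minimal Python sketch. It is not Rouge's code; it only applies the same two-character pattern ("/) to the line from the code sample, to show where a forward-only lexer would decide that a regex begins. The names REGEX_START, line and remainder are made up for this illustration.

```python
import re

# Toy stand-in for the rule quoted above: a double quote immediately
# followed by a slash. Same pattern only; this is NOT Rouge's lexer.
REGEX_START = re.compile(r'"/')

# The line from the code sample in this issue.
line = 'example   = "${function_call("", "", "/")}"'

match = REGEX_START.search(line)

# Everything from the match onward is what a regex state would try to
# consume. There is no closing /" in it, so nothing ends the "regex".
remainder = line[match.start():]
print(repr(remainder))
```

The match lands on the third argument, which is an ordinary one-character string, and the remaining text never contains a closing /" sequence.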

The visual test uses the following example (note that GitHub highlighting considers this a string):

## Object with regular expression
resource "aws_cloudfront_distribution" "s3_distribution" {
  aliases = ["www.${replace(var.domain_name, "/\\.$/", "")}"]
}

The documentation for replace says the following:

If substring is wrapped in forward slashes, it is treated as a regular expression, using the same pattern syntax as [regex](https://developer.hashicorp.com/terraform/language/functions/regex).

The issue from https://github.com/rouge-ruby/rouge/issues/1304 was that $ was highlighted as an error when the entire thing should have been a string literal. I tried looking for a definition of what could be a Regexp in HCL (https://github.com/hashicorp/hcl/blob/main/hclsyntax/spec.md) but I couldn't find much.

I see several things that can be done:

  • Remove regexps and ensure strings do not pop for $, but that may break some users' use cases.
  • Ensure a single or double quote can get us out of the expression and backtrack, or use a lookahead to ensure the regex has an end.
  • Find a way to tokenize a Regex first, then dig into it.

Unfortunately, I am not familiar enough with the project to know if the last two are actually implementable or if the lexer only goes forward.
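As a rough illustration of the lookahead option, here is a Python sketch (not Rouge code, and not tested against the lexer). It only treats "/ as the start of a regex when a closing /" appears later on the same line, before any unescaped double quote. The pattern, the helper name opens_regex, and the restriction to double quotes are all assumptions made for this example.

```python
import re

# Hypothetical pattern: "/ only counts as a regex start if the lookahead
# can reach a closing /" without crossing an unescaped double quote or a
# newline. Backslash escapes (\\. in the pattern) are skipped as pairs.
REGEX_START_WITH_END = re.compile(r'"/(?=(?:[^"\\\n]|\\.)*/")')


def opens_regex(line: str) -> bool:
    """Return True if the line contains a "/ that has a matching /"."""
    return REGEX_START_WITH_END.search(line) is not None


# Example from the visual test: a real /.../ pattern inside a string.
with_regex = r'aliases = ["www.${replace(var.domain_name, "/\\.$/", "")}"]'

# Example from this issue: "/" is just a one-character string.
plain_string = 'example = "${function_call("", "", "/")}"'

print(opens_regex(with_regex), opens_regex(plain_string))
```

With this check the replace() example would still be recognized, while the "/" argument from this issue would fall through to normal string handling. Whether the same idea fits the way the Terraform lexer's states are organized is something I have not verified.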

Sayrus, Nov 28 '24 15:11

Regular expressions are also broken with single quotes: https://rouge.jneen.net/v4.5.1/terraform/cmVzb3VyY2UgImF3c19lbGIiICJ3ZWIiIHsKICBuYW1lID0gInRlcnJhZm9ybS1leGFtcGxlLWVsYiIKCiAgIyBUaGUgc2FtZSBhdmFpbGFiaWxpdHkgem9uZSBhcyBvdXIgaW5zdGFuY2VzCiAgYXZhaWxhYmlsaXR5X3pvbmVzID0gWyIke2F3c19pbnN0YW5jZS53ZWIuKi5hdmFpbGFiaWxpdHlfem9uZX0iXQoKICBsaXN0ZW5lciB7CiAgICBpbnN0YW5jZV9wb3J0ICAgICA9IDgwCiAgICBpbnN0YW5jZV9wcm90b2NvbCA9ICJodHRwIgogICAgbGJfcG9ydCAgICAgICAgICAgPSA4MAogICAgbGJfcHJvdG9jb2wgICAgICAgPSAiaHR0cCIKICB9CgogICMgVGhlIGluc3RhbmNlcyBhcmUgcmVnaXN0ZXJlZCBhdXRvbWF0aWNhbGx5CiAgaW5zdGFuY2VzID0gWyIke2F3c19pbnN0YW5jZS53ZWIuKi5pZH0iXQp9CiMjIE9iamVjdCB3aXRoIHJlZ3VsYXIgZXhwcmVzc2lvbgpyZXNvdXJjZSAiYXdzX2Nsb3VkZnJvbnRfZGlzdHJpYnV0aW9uIiAiczNfZGlzdHJpYnV0aW9uIiB7CiAgYWxpYXNlcyA9IFsid3d3LiR7cmVwbGFjZSh2YXIuZG9tYWluX25hbWUsICIvXFwuJC8iLCAiIil9Il0KICBhbGlhc2VzID0gWyJ3d3cuJHtyZXBsYWNlKHZhci5kb21haW5fbmFtZSwgJy9cXC4kLycsICIiKX0iXQp9Cgpsb2NhbHMgewogICAgICAgIGV4YW1wbGUgICA9ICIke2Z1bmN0aW9uX2NhbGwoIi8iKX0iCiAgICAgICAgcHJvamVjdF9uYW1lID0gcHJvamVjdF9uYW1lCn0

I'll open a Merge Request with the removal of Regexps and the proper escape.

Sayrus, Nov 28 '24 15:11