bicep Module resource outputs can't be reference outside of the module

Module resource outputs can't be reference outside of the module

Open jkotalik opened this issue 2 years ago • 11 comments

Bicep version run bicep --version via the Bicep CLI, az bicep version via the AZ CLI or via VS code by navigating to the extensions tab and searching for Bicep

On main with https://github.com/Azure/bicep/pull/4971 enabled.

Describe the bug

Currently output that are resources can't be used to access properties on the resource.

This is a contrived example which doesn't necessarily work, but gets the point across.

resource myresource 'Microsoft.Sql/servers@2021-08-01-preview' = {
  name: 'myothersql'
  properties: {
    administratorLogin: mymodule.outputs.sql.properties.administratorLogin
  }
}

module mymodule 'test.bicep' {
  name: 'mymodule'
}

// test.bicep
resource sql 'Microsoft.Sql/servers@2021-08-01-preview' = existing {
  name: 'mysql'
}

output sql

Error: stage app failed: resources.DeploymentsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="InvalidTemplate" Message="Deployment template validation failed: 'The template resource 'myresource' at line '1' and column '268' is not valid: The template function 'reference' is not expected at this location. Please see https://aka.ms/arm-template-expressions for usage details.. Please see https://aka.ms/arm-template-expressions for usage details.'." AdditionalInfo=[{"info":{"lineNumber":1,"linePosition":268,"path":"properties.template.resources[0]"},"type":"TemplateViolation"}]

Additional context

The main problem here is that there seems to be a double reference call when you access mymodule.outputs.sql.properties.administratorLogin

Feb 24 '22 22:02 jkotalik

We discussed an outlining solution to this problem as a way to make resource-typed outputs more deterministic for the deployment engine to reason about. Here's an example of what I mean by outlining:

// my-module.bicep
resource sql 'Microsoft.Sql/servers@2021-08-01-preview' = existing {
  name: 'my-db'
  properties: {
  }
}

output resource sql = sql

// main.bicep
module mod './my-module.bicep' = {
  name: 'my-module'
}

output string hostname = mod.outputs.sql.properties.hostname

The output expression in main.bicep right now looks like:

[reference(reference('mod').properties.outputs.sql).properties.hostname]

This is the double-reference that's causing heck in the deployment engine. Since the thing inside the reference reference('mod').properties.outputs.sql is complex, the deployment engine can't reason about it.

An outlining solution would simplify this expression inside main.bicep to something like:

[reference(resourceId('Microsoft.Sql/servers', 'my-db').properties.hostname]

This is a much simpler expression, and the hope is that the deployment engine can reason about it. The limitations brought about by outlining are twofold:

The expression being outlined (the resource id of the resource being output) must be deterministic (no randomness)
The expression being outlined must have the same meaning in the context of the outer module - or we need to be able to rewrite it to have the same meaning

I think we fail this second test in some really common cases.

For example consider what happens when my-module operates at a different scope or on a different resource group. We need to account for that.

The really vexing cases involve some commonly-used patterns for resource naming. Here's an example of a change we could make to my-module.bicep.

// my-module.bicep
resource sql 'Microsoft.Sql/servers@2021-08-01-preview' = existing {
  name: 'my-db-${uniqueString(resourceGroup().id)}'
  properties: {
  }
}

output resource sql = sql

If we outline the name expression, we end up with:

[reference(resourceId('Microsoft.Sql/servers', concat('my-db-', uniqueString(resourceGroup().id))).properties.hostname]

The problem of course is that a function like resourceGroup() may not be valid (subscription-scoped deployment) or my not have the same meaning in the outer module. It means we have to treat all possible input to the name of the resource being output as deploy-time constants.

Apr 08 '22 17:04 rynowak

The really vexing cases involve some commonly-used patterns for resource naming. Here's an example of a change we could make to my-module.bicep.
// my-module.bicep
resource sql 'Microsoft.Sql/servers@2021-08-01-preview' = existing {
  name: 'my-db-${uniqueString(resourceGroup().id)}'
  properties: {
  }
}

output resource sql = sql
If we outline the name expression, we end up with:
[reference(resourceId('Microsoft.Sql/servers', concat('my-db-', uniqueString(resourceGroup().id))).properties.hostname]
The problem of course is that a function like resourceGroup() may not be valid (subscription-scoped deployment) or my not have the same meaning in the outer module. It means we have to treat all possible input to the name of the resource being output as deploy-time constants.

If we are dealing with a module that is also written in Bicep, the compiler could update the outputs of the nested deployment to account for resource properties accessed in the parent template. An "output inlining" solution might look like this:

Module template:

{
  ...
  "outputs": {
    "sql": { 
      ...
    },
    "sql_properties_hostname": {
      "type": "string",
      "value": "[reference('sql').hostname]"
    }
  }
}

Parent template:

{
  ...
  "outputs": {
    "hostname": {
      "type": "string",
      "value": "[reference('mod').outputs.sql_properties_hostname]"
    }
  }
}

This is essentially what a user would do in a template directly authored in ARM JSON, and Bicep could make this pattern easier. However, I see two major drawbacks to the "output inlining" solution:

The compiled output of template is no longer fully deterministic, but could be altered by a larger compilation context. This would be especially problematic for modules stored in the registry, given that they are published as compiled JSON.
What would the compiler do about modules that are already in ARM JSON?

I'm not sure this issue could be fully solved without making resource a first-class parameter and output type in ARM, not just in Bicep.

Jun 02 '22 17:06 jeskew

The compiled output of template is no longer fully deterministic, but could be altered by a larger compilation context. This would be especially problematic for modules stored in the registry, given that they are published as compiled JSON.

I think you could make this work if you could do the same transformation for each case. We already include enough metadata that the semantic model for an ARM-JSON template knows about resource-typed-outputs.

The bigger problem is secrets since outputs are not supposed to contain them, or array access inside a loop.

Jun 02 '22 17:06 rynowak

This idea is half-baked at the moment, but I wonder if there should be a type-level distinction here? Perhaps resource would not be a supported output type, but resourceId would:

// my-module.bicep
resource sql 'Microsoft.Sql/servers@2021-08-01-preview' = existing {
  name: 'my-db'
  properties: {
  }
}

output resourceId sql = sql.id // or maybe just `= sql`

Both resource and resourceId would be allowable as parameter types, and passing a resourceId-typed symbol as the value for a resource-typed parameter would result in an implicit conversion:

// my-other-module.bicep
param sql resource

var hostname = sql.properties.hostname

// main.bicep
module mod './my-module.bicep' = {
  name: 'my-module'
}

module otherMod './my-other-module.bicep' = {
  name: 'my-other-module'
  params: {
    sql: mod.outputs.sql // totally legit! no warnings or errors
  }
}

output string hostname = mod.outputs.sql.properties.hostname // error! resourceId type has no `properties` property

That sounds like a lot of complexity to encode in the type system, but it would surface the distinction between a resource module output that's used in the parent template vs how that type can be used when it's passed into another module.

Jun 02 '22 19:06 jeskew

That sounds like a lot of complexity to encode in the type system, but it would surface the distinction between a resource module output that's used in the parent template vs how that type can be used when it's passed into another module.

I'm not sure from your description why the distinction is valuable to users 🤔 . The goal of the work I started was to just use resources everywhere, whether your goal is to access the id or access the resource body. If we can nail that promise then it seems overall simpler.

Jun 02 '22 21:06 rynowak

I'm not sure from your description why the distinction is valuable to users

The distinction is valuable only in the sense that it turns the runtime error reported by the OP into a compile-time error. The ARM engine doesn't allow lazy evaluation of the resource ID, so a template can't directly use a resource output by a module. Taking a resource output from one module and providing it as a parameter for another module is allowed, though, because it doesn't result in a reference(reference(... expression. A resourceId type would differ from a simple string in that could allow users to dereference resource properties that can be derived from the ID (name, type, id, maybe apiVersion depending on how the type is declared), but I don't see how a parent template could access resource body properties of a module output without changes to the deployment engine.

Jun 02 '22 21:06 jeskew

@jeskew - I think I see what you're suggesting now, but it feels like a workaround to me without solving the problem of dynamic lookups in the deployment engine. I think what you're proposing is reasonable, coherent, and could be implemented.

All of the scenarios where I've wanted to use resource-typed-outputs require accessing the resource body. If there are scenarios that just require the identifiers and we want to unblock them then it makes sense - to me it just feels underwhelming, but I just might be missing of the use cases for (name, type, id, etc). I know that there was discussion in the past on a "typed resource id" idea, and this would hit that mark.

If we're able to implement dynamic lookups in the future would we regret implementing "typed resource ids" now?

Alternative if we're able to implement "output secrets" then the output outlining approach you described might work really well.

Jun 04 '22 01:06 rynowak

It's very important to consider the impact to what-if and preflight here. I hypothesize that the 99% case is users wanting to output resources from a module whose identifying characteristic are static w.r.t the outer template, and would certainly want the deployment engine to be able to deterministically calculate the deployment graph at the start of the deployment - in order to support what-if & preflight.

If we come up with a solution that allows 'unsafe' behavior w.r.t predictability, I think we need to surface this to users so that they can consider the tradeoffs. Think SQL cursors - powerful but most of the time worth avoiding as they impact the query plan.

Jun 06 '22 15:06 anthony-c-martin

We discussed this yesterday as a team, and the consensus was that we should block resource-typed module outputs while we explore updating the ARM deployment engine to support nested reference expressions.

The behavior currently supported under the BICEP_RESOURCE_TYPED_PARAMS_AND_OUTPUTS_EXPERIMENTAL -- passing resources between sibling modules -- would still be possible should a user output a resource name from one module and use an existing reference in a sibling module:

// my-module.bicep
resource sql 'Microsoft.Sql/servers@2021-08-01-preview' = existing {
  name: 'my-db'
}

output dbName string = sql.name

// my-other-module.bicep
param dbName string

resource sql 'Microsoft.Sql/servers@2021-08-01-preview' = existing {
  name: dbName
}

output hostname = sql.properties.hostname

// main.bicep
module mod './my-module.bicep' = {
  name: 'my-module'
}

module otherMod './my-other-module.bicep' = {
  name: 'my-other-module'
  params: {
    sql: mod.outputs.sql
  }
}

output string hostname = otherMod.outputs.hostname

resource-typed parameters would be supported and could be removed from under a feature flag.

Jun 09 '22 13:06 jeskew

While working on a proof-of-concept of the outlining approach proposed above, I encountered a few cases where the ARM code generated would be less than ideal.

In ARM JSON templates, parameters are dereferenced by string, so expressions like [parameters(parameters('foo'))] are legal. Given the following template:

{
    ...
    "parameters": {
        "foo": {
            "type": "string,
        },
        "bar": {
            "type": "string",
            "defaultValue": "[dataUri('snap')]"
         },
        "baz": {
            "type": "int",
            "defaultValue": "[add(mul(2, 2), 2)]"
        }
    },
    "resources": [
        {
            "type": "Microsoft.Sql/servers",
            "apiVersion": "2021-08-01-preview"
            "name": "[parameters(parameters('foo'))]"
            ....
        }
    ],
    "outputs": {
        "sql": {
            "type": "string",
            "value": "[reference(parameters(parameters('foo')).id]",
            "metadata": {
                "resourceType": "Microsoft.Sql/servers@2021-08-01-preview"
            } 
        }
    }
}

and the following parent template:

param fizz string

module mod './nested.json' = {
  name: 'mod'
  params: {
    foo: fizz 
  }
}

output hostname = mod.outputs.sql.properties.hostname

Codegen in the parent template is obliged to generate an ARM expression like

[
    createObject(
        'foo', parameters('fizz'),
        'bar', dataUri('snap'),
        'baz', add(mul(2, 2), 2)
    )[parameters('fizz')]
]

with the generated code getting more and more unwieldy if the nested template declares additional parameters or uses complex default value expressions. I can see this easily running into ARM's expression length limit or pushing a template over the 4MB limit if a multi-KB expression needs to be repeated multiple times.

If a loop variable is used as part of a resource name, the generated code may need to covert the for expression (or copy expression if the nested template is in ARM JSON) to a map expression. Given the following template:

param num int

var arr = [for i range(0, 100): {
  foo: 'foo${i}'
}]

resource sql 'Microsoft.Sql/servers@2021-08-01-preview' = existing {
  name: 'server-${arr[num]}'
}

outputs resource sql = sql

and the following parent template:

param parentNum int

module mod './nested' = {
  name: 'mod'
  params: { 
    num: parentNum
  }
}

output hostname = mod.outputs.sql.properties.hostname

the generated code for hostname would look something like:

[
    format(
        'server-{0}',
        map(
            range(0, 100), 
            lambda(
                'i', 
                format('foo{0}', lambdaVariables('i'))
            )
        )[parameters('parentNum')]
   )
]

As with case 1, my main concern here is running into template limits for more complex examples or repeat inclusions.

Some cases along these lines could be simplified via constant folding once that lands in the compiler, but only the ARM runtime would be able to simplify expressions that contain a parameters(...) function expression, as parameter values cannot be predicted at compile time.

Jul 28 '22 17:07 jeskew

Given we already have some cases where customers are hitting the 4MB limit, I am not particularly concerned with making the problem worse. I would optimize for enabling the functionality and then later we can figure out how to generate more efficient JSON and/or raise the 4MB limit.

Aug 01 '22 13:08 alex-frankel

bicep bicep copied to clipboard

Module resource outputs can't be reference outside of the module

bicep
bicep copied to clipboard