Feature request: Support JSMEPath in `output_export_values`
Currently we need to use the Terraform functions to filter/manipulate the results from the output field. But the output always store the whole response, which increase the state file size.
The JSMEPath is a query language for JSON. refs: https://jmespath.org
There's a concern about backward compatibility, take the following json as an example:
{
"locations": {
"foo": "bar"
}
}
In the current design, when user specifies the locations.foo, the output will be
{
"locations": {
"foo": "bar"
}
}
But with JSMEPath, the result will be
bar
To avoid breaking changes, we might need to add a prefix for the JSMEPath query.
Awesome. Commenting to follow.
Approximately how far away is milestone v2.0.0 ?
Guessing more than one year ?
Love that we're doing this. Following this issue to see how the progress goes.
We might be able to create a provider flag @ms-henglu to toggle on/off whether the state file should store all the output and default it to not storing. Something along the lines of store_output_in_state = false?
Hi @stemaMSFT - I think it's not possible to allow user use the data and not store in state at same time.
@ms-henglu if we tie the behavior of the provider behind a flag, it will still need the full data? Could I not store the smaller data overhead in state, change the flag, and then let the provider fetch that data?
There's a problem about combing multiple JMES query results.
For example, the task is retrieving the IDs and names of the automation accounts.
Solution 1:
The current way is using azapi_resource_list to list all accounts and export everything in the response, then use HCL to query the data:
data "azapi_resource_list" "test" {
type = "Microsoft.Automation/automationAccounts@2023-11-01"
parent_id = "/subscriptions/subscription_id"
response_export_values = ["value"]
}
output "ids" {
value = data.azapi_resource_list.test.output.value[*].id
}
output "names" {
value = data.azapi_resource_list.test.output.value[*].name
}
But this solution stores everything in the response which makes the terraform state extremely large. So we introduce the JMES feature.
Solution 2:
data "azapi_resource_list" "test" {
type = "Microsoft.Automation/automationAccounts@2023-11-01"
parent_id = "/subscriptions/subscription_id"
response_export_values = ["jmes:value[*].name", "jmes:value[*].id"]
}
The problem is how to combine the multiple query results.
There're some options:
Option 1:
output = {
"jmes:value[*].name" = ["name1", "name2"]
"jmes:value[*].id" = ["id1", "id2"]
}
It's difficult to reference the values, for example, user needs to use ....output."jmes:value[*].name" to get the names. And when they update the expression, two places need to be updated.
Option 2:
output = {
"jmes1" = ["name1", "name2"]
"jmes2" = ["id1", "id2"]
}
It uses jmes1 and jmes2 as the keys, so the user could use ....output.jmes1 to reference the first jmes query result. However, when the user adds a new expression at the beginning of the response_export_values, the indexes are changed too.
Option 3:
Only support one jmes expression in response_export_values.
output = {
"jmes" = ["name1", "name2"]
}
The JMES expression is powerful, I think supporting only one expression should be enough and in the future, it's still possible to use option 2.
Hi @stemaMSFT , @grayzu , this is the summary of the issue mentioned today, WDYT?
Adding to above, Option 4: We can have one single key "jmes" mapped to a list of all values referenced by jmes. e.g.
output = {
"jmes" = [["name1","name2"], ["id1","id2"]]
}
When referencing in the output, we can use the index within this list to get needed data.
My thoughts on the matter, we could use a map[string] in a new input, and map[any] output attributes (to avoid breaking changes). The input map key identifies the query in the output, the map value is the JMES query:
data "azapi_resource_list" "test" {
type = "Microsoft.Automation/automationAccounts@2023-11-01"
parent_id = "/subscriptions/subscription_id"
response_queries = {
names = "value[*].name"
ids = "value[*].id"
}
}
output "ids" {
value = data.azapi_resource_list.test.response_query_results.names
}
output "names" {
value = data.azapi_resource_list.test.response_query_results.ids
}
JMES vs gjson
Please can I make an impassioned plea to use gjson as I think it's a much better choice:
accessing array members:
mylist.1.value
Equivalent splat notation:
mylist.#.name -> ["name1", "name2"]
@ms-henglu can you take a look at gjson? I’ll also take a look. Thanks for the idea @matt-FFFFFF
Thanks @matt-FFFFFF , this is fantastic!
@stemaMSFT , about gjson vs jsme path, I think I prefer jsme path, because the azure cli also uses JSME path to query the results. refs: https://learn.microsoft.com/en-us/cli/azure/use-azure-cli-successfully-query?tabs=concepts%2Cbash
I like Matt's idea and the azure cli uses a similar design: https://learn.microsoft.com/en-us/cli/azure/use-azure-cli-successfully-query?tabs=concepts%2Cbash#rename-properties-in-a-query. I think we can keep/deprecate the old response_export_values and introduce this new response_queries. WDYT? @stemaMSFT
cc @hqhqhqhqhqhqhqhqhqhqhq
@ms-henglu
Will the solution still mark sensitive values as such?
Hi @matt-FFFFFF - You mean mark the credentials in the output as sensitive? No, because the sensitive couldn't be set conditionally.
Refs: https://github.com/hashicorp/terraform-plugin-sdk/issues/736
Hi @matt-FFFFFF @stemaMSFT , also wanted to clarify about the final design. Should we:
- keep
response_export_values, and make the type dynamic so it can receive a list or map as value. Does both types of value support jmes, or does only the map value support jmes? Or - keep
response_export_valuesas it is (no jmes), also addresponse_queriesproperty that supports jmes.
My opinion is option 2, because the code for it will be cleaner & doesn't change the current behaviour What are your thoughts? cc @ms-henglu
Hi @hqhqhqhqhqhqhqhqhqhqhq , I just talked with @stemaMSFT , and for the 2.0 beta release, we'll go with option 1.
- Keep response_export_values, and make the type dynamic so it can receive a list or map as value.
- If the user uses the array, the behavior will be consistent with before and no jmes support for this case.
- If the user uses the map, all values will be JMES expression, so we don't need to add the
jmes:prefix.
Thanks
@ms-henglu @stemaMSFT regarding jmes vs. gjson, etc... have you considered JSONPath? currently its only one what has any JSON querying RFC, look for RFC9535.
Thanks @DariuszPorowski for the suggestion! But we decided to use the JSME Path because it's also used in azure cli.
This feature has been supported by https://github.com/Azure/terraform-provider-azapi/pull/591