terraform icon indicating copy to clipboard operation
terraform copied to clipboard

Performance Issue for Terraform Plan with Terraform v1.3 and Terraform Enterprise

Open juicybaba opened this issue 1 year ago • 3 comments

Terraform Version

1.3.3

Terraform Configuration Files

We are using Terraform Enterprise v202205-1 to manage our Azure environment. After changing workspaces to use terraform v1.3.3 from v1.1.9, we noticed that the performance of terraform plan got degraded significantly, so far the apply looks fine.

Also following logs can be seen in the Structured Run Output but not under the traditional output Console UI and there were no such logs with v1.1.9. And it seems like those logs are all related to data resources.

Is everything above an expected behavior?

image

Debug Output

...

Expected Behavior

There shouldn't be too much impact on performance since there is no change in code and target environment.

Actual Behavior

For my test, the plan take ~4 min to complete with terraform v1.3.3 compare to 1 min with terraform v1.1.9

Steps to Reproduce

queue a run manually from Terraform Enterprise UI.

Additional Context

No response

References

No response

juicybaba avatar Oct 26 '22 19:10 juicybaba

Hi @juicybaba,

I'm not sure what the difference in structured output is, but it's not surprising that there are some differences since v1.1. That might not be directly related to any changes in v1.3, have you compared v1.2 at all?

If this is significantly slower than v1.2, it's probably the same issue as #32071. Can you estimate the number of resource instances being managed by this configuration?

jbardin avatar Oct 26 '22 20:10 jbardin

@jbardin Thanks for your reply.

To your question, there are ~240 resources under my test workspace.

A quick update, I just uploaded 1.2.9 to TFE and did the same test (queue a run via UI manually) as earlier. I can see the same behaviors ( longer plan time and new logs in structured output)

Not sure if the difference in the structured output is related to the plan performance issue, but please see the screenshots.

1.1.9 image

1.2.9 and 1.3.3 the output is too long to upload, the difference is, apart from the logs in 1.1.9, a bunch of (maybe all) data resources are visible in logs. And the error (doesn't block apply) under data resource confuses me too.

image

juicybaba avatar Oct 26 '22 22:10 juicybaba

Thanks @juicybaba. 240 resource instances is not many, so I doubt there is any impact from that alone. The only difference I can recall in v1.2 is that Terraform does not build a test apply graph as part of the planning process, to catch cases where the proposed plan will fail immediately during apply. An additional 3min seems a bit much for this, but I can't rule it out.

Looking at the trace logs may be required to see what actual differences there are in the plan timing.

The structured run output errors I'm not familiar with, since those are generated by TFE. There may be a TFE update to handle new output from the CLI.

jbardin avatar Oct 27 '22 14:10 jbardin

@jbardin which particular should I look at? The plan with trace enabled is ~ 60MB with ~170k lines. 1.1.9 has less size and lines compare to 1.2.9

juicybaba avatar Oct 31 '22 22:10 juicybaba

I'm not sure yet where to look, mostly I was going to start by scanning the timestamps and see if there was any obvious place where more time was spent. The AzureRM provider is notoriously noisy, I should have mentioned you can instead use TF_LOG_CORE=trace to drop all the provider logs, which may give us something easier to compare.

jbardin avatar Nov 01 '22 13:11 jbardin

@jbardin I did a test with another workspace which only has ~100 resources and half of them are data resources, I can see a few reading entries for each data resource in 1.2.9 only. Not sure if this is related to the performance issue.

the plan time is 80s (1.2.9) vs 40s (1.1.9)

image

image

juicybaba avatar Nov 01 '22 18:11 juicybaba

IIRC that was part of a fix to show some missing CLI UI output for data sources, but had to functional difference, nor was it changed in the log output.

jbardin avatar Nov 01 '22 19:11 jbardin

@jbardin does this change impact the performance? The trace log has very similar entry # for both versions, is there someone who can provide a deeper look?

juicybaba avatar Nov 02 '22 00:11 juicybaba

The deeper look would be here. It's mostly just to make sure the UI and structured run output get correct notifications about the data source operations, and overall it reduced the number of calls. It's not something that I would expect to make any impact.

jbardin avatar Nov 02 '22 12:11 jbardin

@jbardin I was wondering if those data resource operations cause the issue, but who can do a deeper look at this performance issue in general? or is it expected for v1.2.9?

juicybaba avatar Nov 08 '22 23:11 juicybaba

@juicybaba, if you need assistance and are using Terraform Enterprise, you can file a ticket through the normal support channels. Without anything else to go on, comparing the complete TF_LOG_CORE=trace output from each run would be the place to start. That would give some data about the time intervals between calls, and possibly highlight where the difference is coming from.

jbardin avatar Nov 09 '22 20:11 jbardin

@juicybaba For the normal support channels: Terraform Cloud/Enterprise: please email [email protected] or open a new request. Thanks!

crw avatar Nov 09 '22 22:11 crw

Since we don't have usable example here, and Terraform is likely working as expected, I'm going to close this out for now. If you do come up with data showing what happens in your situation, feel free to post here or file a new issue.

Thanks!

jbardin avatar Dec 07 '22 16:12 jbardin

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot] avatar Jan 07 '23 02:01 github-actions[bot]