terraform-provider-null icon indicating copy to clipboard operation
terraform-provider-null copied to clipboard

Terraform Provider Null Crashed

Open grahamschuckman opened this issue 2 years ago • 2 comments

Terraform CLI and Provider Versions

Terraform v1.4.6 on linux_amd64

Terraform Configuration

NA

Expected Behavior

Plan to succeed.

Actual Behavior

Plan failed with GRPC provider crashing.

Steps to Reproduce

  1. if [ "${TERRAFORM_PLAN}" == "false" && "${TERRAFORM_NO_PLAN}" == "true" ]; then echo "############################################################" echo "# PLAN NOT RUN #" echo "# EXERCISE CAUTION #" echo "############################################################" else terraform plan -no-color $FLAG_REFRESH $PARALLELISM $TARGET fi

How much impact is this issue causing?

Medium

Logs

https://gist.github.com/grahamschuckman/04691d286d2e0e20774fc34ae5239b56

Additional Information

We have been encountering regular problems with the GRPC plugin timing out during a plan, but this is the first instance we've seen of the crash output.

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

grahamschuckman avatar Jul 20 '23 19:07 grahamschuckman

For context: we were running this on AWS CodeBuild using the image: aws/codebuild/amazonlinux2-x86_64-standard:4.0. We have 15 GB memory, 8 vCPUs on the build.

grahamschuckman avatar Jul 20 '23 19:07 grahamschuckman

Hi there @grahamschuckman 👋🏻 , thanks for reporting this and sorry you're running into this issue.

At first glance I don't see anything that points to a bug in the Terraform null provider itself, this seems more likely to be resource constraints on the CI machine, which is highly dependent on your configuration setup. In terms of where the resource limitation is happening, the debug output you provided points to an RPC (ValidateResourceConfig) that is called during the first stages of terraform plan which is validation.

In theory, you'd get the same resource limitation error reported by running just terraform validate if that was the actual problem. However, if there are multiple processes running on the AWS CodeBuild machine, that resource limitation could happen at any part during the terraform plan process as it makes multiple RPC calls, the validation RPC just happens to be the one reporting the error with a crash.

Are you able to isolate the null provider and reproduce the crashes with a smaller terraform config? Without more context, I don't think we'll be able to provide useful help outside of an "increase the amount of memory on the machine" tip, which doesn't feel helpful 🙁

Reference for other providers running into this problem: https://github.com/hashicorp/terraform-provider-aws/issues/20274

austinvalle avatar Jul 31 '23 15:07 austinvalle