gcs backend: fix race condition on locking

Open oprudkyi opened this issue 2 years ago • 12 comments

  • When Lock cannot write the lock file, it tries to read the existing lock file via lockError() and lockInfo().
  • There is some probability that the file no longer exists (deleted by another process), in which case lockInfo() and lockError() return an error without a populated Info field or Info.ID.
  • However, LockWithContext() expects either no error or a populated err.Info.ID.
  • For this case, the patch returns a dummy err.Info.ID so that waiting for the real lock continues.

Fixes #30149

Target Release

1.4.x

Draft CHANGELOG entry

BUG FIXES

  • Fixed a crash when many concurrent processes try to obtain the same lock

oprudkyi avatar Apr 20 '23 22:04 oprudkyi

CLA assistant check
All committers have signed the CLA.

hashicorp-cla avatar Apr 20 '23 22:04 hashicorp-cla

Thanks for this submission! I've notified the appropriate team.

crw avatar Apr 21 '23 22:04 crw

Hi @crw, could you please ask the team whether there is any way to get this PR reviewed and processed? The issue hurts us regularly.

oprudkyi avatar May 20 '23 18:05 oprudkyi

I have done so. The backends are maintained by each provider team, in this case the Google / GCP provider team. Backend maintenance is generally not prioritized as highly as provider work, so unfortunately it is difficult to predict when the team might be able to work this into their schedule. Thanks for your submission and persistence!

crw avatar May 22 '23 18:05 crw

Hi @crw, I have tried every possible way to reach the Google team, even via Google paid support, and there is no activity. Maybe it is possible to make an exception to the rules so that someone from HashiCorp can take a look? We hit this bug daily and it makes our lives miserable (it breaks pipelines, leaves lock files behind, etc.).

oprudkyi avatar Aug 08 '23 11:08 oprudkyi

I have been referring to the Google Cloud Provider team at HashiCorp, not a team at Google. I have raised it a number of times with that team. If you are a paying customer of HashiCorp Terraform Cloud or Enterprise, you can further raise the issue with paid support: https://support.hashicorp.com/hc/en-us. Thanks!

crw avatar Aug 23 '23 21:08 crw

@crw oh, now I understand why Google support was confused by my request. I now owe them an apology

oprudkyi avatar Aug 24 '23 06:08 oprudkyi

Yes, my apologies, I could have been clearer. In the future I will be clearer about that.

crw avatar Aug 24 '23 19:08 crw

Is there anything that can be done here to move it forward?

oprudkyi avatar Oct 12 '23 15:10 oprudkyi

@oprudkyi My apologies, I do not have much control over the review of backend PRs. I did ping the team again with your request.

crw avatar Oct 13 '23 22:10 crw

Fixed in another tool. Closing this now as no longer relevant.

oprudkyi avatar Aug 03 '24 07:08 oprudkyi

Re-opening so the GCS team can review this PR.

crw avatar Aug 23 '24 18:08 crw