gloo icon indicating copy to clipboard operation
gloo copied to clipboard

Add lock for translation

Open kdorosh opened this issue 3 years ago • 8 comments

Description

Updates validation server to use a shared translator that is synchronized which prevents data races across translation

Checklist:

  • [ ] I included a concise, user-facing changelog (for details, see https://github.com/solo-io/go-utils/tree/master/changelogutils) which references the issue that is resolved.
  • [ ] If I updated APIs (our protos) or helm values, I ran make -B install-go-tools generated-code to ensure there will be no code diff
  • [ ] I followed guidelines laid out in the Gloo Edge contribution guide
  • [ ] I opened a draft PR or added the work in progress label if my PR is not ready for review
  • [ ] I have performed a self-review of my own code
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have added tests that prove my fix is effective or that my feature works

BOT NOTES: resolves https://github.com/solo-io/gloo/issues/7202

kdorosh avatar Sep 21 '22 15:09 kdorosh

/skip-changelog for now

kdorosh avatar Sep 21 '22 15:09 kdorosh

/kick docker build flake? https://storage.googleapis.com/solo-public-build-logs/logs.html?buildid=3283b0be-0b04-4280-a5f1-bbc82ce7b19d

kdorosh avatar Sep 21 '22 15:09 kdorosh

This will definitely fix the problem with concurrent writes in plugins. I'm pretty confident that it won't deadlock. Are we also planning to share a single translator between validation and translation?

elcasteel avatar Sep 21 '22 16:09 elcasteel

@elcasteel I think that's a good idea and the right direction forward; torn on whether we want to make that specific change in our backport however. Still reasoning about the shared memory here

kdorosh avatar Sep 21 '22 16:09 kdorosh

Visit the preview URL for this PR (updated for commit 1ebcd00):

https://gloo-edge--pr7207-lock-translation-vsvuq7kv.web.app

(expires Fri, 30 Sep 2022 17:47:18 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

github-actions[bot] avatar Sep 21 '22 16:09 github-actions[bot]

@elcasteel I think that's a good idea and the right direction forward; torn on whether we want to make that specific change in our backport however. Still reasoning about the shared memory here

We shared the translator before we started reusing plugins so I don't believe there are any other issues with shared memory. I think if we are trying to avoid waiting on locks the best set up would be to give the validation webhook its own translator (and own validator) and then to have a separate validator for the proxy reconciler that can share a translator with translation. I think it's worth scale testing with a single translator for everything.

elcasteel avatar Sep 21 '22 16:09 elcasteel

Issues linked to changelog: https://github.com/solo-io/gloo/issues/7202

solo-changelog-bot[bot] avatar Sep 21 '22 19:09 solo-changelog-bot[bot]

/kick look into https://storage.googleapis.com/solo-public-build-logs/logs.html?buildid=25a10be6-8fc4-4a45-92ba-e3bad5e9d94c

kdorosh avatar Sep 22 '22 06:09 kdorosh

Issues linked to changelog: https://github.com/solo-io/gloo/issues/7202 https://github.com/solo-io/skv2/issues/375 https://github.com/solo-io/gloo/issues/7213

solo-changelog-bot[bot] avatar Sep 23 '22 15:09 solo-changelog-bot[bot]