Parallelize the way we read files from VFS

Open toninis opened this issue 3 years ago • 14 comments

In our deployments we have more than 150 instance groups. Reading all of their state takes an enormous amount of time. This is a short implementation that parallelizes the work.

PS I am working on a way to also parallelize the getCloudGroups function without hitting the AWS API limits.
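One common way to parallelize calls while staying under an API rate limit is to cap the number of calls in flight. The following is only a sketch under that assumption, not the code in this PR: `forEachLimited` and the group names are hypothetical, and the callback stands in for whatever per-group cloud call is being made.

```go
package main

import (
	"fmt"
	"sync"
)

// forEachLimited runs fn once per name, with at most limit calls in flight.
// It is a hypothetical helper for illustration, not a kops function.
func forEachLimited(names []string, limit int, fn func(string)) {
	if limit < 1 {
		limit = 1 // avoid blocking forever on an unbuffered semaphore
	}
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup
	for _, name := range names {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit calls are already running
		go func(n string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when the call finishes
			fn(n)
		}(name)
	}
	wg.Wait()
}

func main() {
	groups := []string{"nodes-a", "nodes-b", "nodes-c", "nodes-d"}
	forEachLimited(groups, 2, func(n string) {
		// Stand-in for a rate-limited cloud API call.
		fmt.Println("describing", n)
	})
}
```

The print order varies between runs, since the calls complete in whatever order the scheduler picks.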

toninis avatar Mar 16 '22 11:03 toninis

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: toninis / name: Antonis Stamatiou (ccc16b13d679697b223f4cd27b59ecc6a4e39e48)

Welcome @toninis!

It looks like this is your first PR to kubernetes/kops 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kops has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar Mar 16 '22 11:03 k8s-ci-robot

Hi @toninis. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 16 '22 11:03 k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

To complete the pull request process, please assign rifelpet after the PR has been reviewed. You can assign the PR to them by writing /assign @rifelpet in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Mar 16 '22 11:03 k8s-ci-robot

/ok-to-test

olemarkus avatar Mar 16 '22 18:03 olemarkus

/retest

toninis avatar Mar 17 '22 08:03 toninis

/test pull-kops-e2e-kubernetes-aws

toninis avatar Mar 17 '22 09:03 toninis

/test pull-kops-bazel-test

toninis avatar Mar 17 '22 10:03 toninis

@olemarkus Tests fail on //cmd/kops:go_default_test. I am not sure how my change affects those tests. There is something else going on that I am not able to figure out.

toninis avatar Mar 17 '22 11:03 toninis

It looks like the order of resources returned from VFS is now nondeterministic because the reads happen in parallel. Fixing this may be as simple as sorting the results, which may have a one-time impact on the ordering of the test outputs, so you'll need to run ./hack/update-expected.sh to update them all.
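A minimal sketch of sorting after a parallel read, assuming results are collected into a slice under a mutex. This is an illustration rather than the PR's code: `item`, `readOne`, and `readAllSorted` are hypothetical names, and `readOne` stands in for a single VFS read.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// item is a hypothetical stand-in for an object read from VFS.
type item struct {
	Name string
	Body string
}

// readOne is a hypothetical stand-in for a single VFS read.
func readOne(name string) item {
	return item{Name: name, Body: strings.ToUpper(name)}
}

// readAllSorted reads every name concurrently, then sorts the results by
// name so the output order does not depend on goroutine scheduling.
func readAllSorted(names []string) []item {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []item
	)
	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			it := readOne(n)
			mu.Lock()
			results = append(results, it) // arrival order is nondeterministic
			mu.Unlock()
		}(name)
	}
	wg.Wait()
	sort.Slice(results, func(i, j int) bool {
		return results[i].Name < results[j].Name
	})
	return results
}

func main() {
	for _, it := range readAllSorted([]string{"nodes-c", "master-a", "nodes-b"}) {
		fmt.Println(it.Name, it.Body)
	}
}
```

Sorting by name gives a stable order, but it may differ from the order the sequential code produced, which is why the expected test outputs would need a one-time regeneration.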

rifelpet avatar Mar 17 '22 15:03 rifelpet

I tried that and it did not work. But you are right: the expected manifests are inconsistent. I will dig a bit and try to mitigate this. Thanks 😄

toninis avatar Mar 18 '22 09:03 toninis

The easy way to sort is probably to move from a slice to a map, and then re-iterate over the names and return them in order.

We should decide whether we want to adopt generics (yet) - this code is the best possible in go1.17, but with generics we could produce something cleaner (and possibly more ... generic!)
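As a rough sketch of the slice-to-map idea combined with generics (so Go 1.18 or later, not go1.17): results are stored in a map keyed by name, then returned by walking the sorted names. `readAllOrdered` and its `read` callback are hypothetical names for illustration and are not part of kops.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// readAllOrdered calls read for each name concurrently, stores the results
// in a map keyed by name, and returns them in sorted-name order.
// Duplicate names collapse into a single entry.
func readAllOrdered[T any](names []string, read func(string) (T, error)) ([]T, error) {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		byName   = make(map[string]T, len(names))
		firstErr error
	)
	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			v, err := read(n)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err // keep only the first error seen
				}
				return
			}
			byName[n] = v
		}(name)
	}
	wg.Wait()
	if firstErr != nil {
		return nil, firstErr
	}

	// Re-iterate over the names in sorted order to get a deterministic result.
	keys := make([]string, 0, len(byName))
	for k := range byName {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := make([]T, 0, len(keys))
	for _, k := range keys {
		out = append(out, byName[k])
	}
	return out, nil
}

func main() {
	out, err := readAllOrdered([]string{"nodes-b", "master-a"}, func(n string) (int, error) {
		return len(n), nil // stand-in for reading and parsing one object
	})
	fmt.Println(out, err)
}
```

The type parameter lets the same helper serve any object kind; without generics, each kind would need its own copy or an `interface{}` return with type assertions at the call site.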

justinsb avatar Mar 31 '22 13:03 justinsb

/retest

toninis avatar Jul 15 '22 11:07 toninis

@toninis: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-kops-bazel-test | 23163732c3fbc6330dc225e954198778e13d5240 | link | true | /test pull-kops-bazel-test |
| pull-kops-e2e-cni-weave | 23163732c3fbc6330dc225e954198778e13d5240 | link | true | /test pull-kops-e2e-cni-weave |
| pull-kops-e2e-cni-calico-ipv6 | 23163732c3fbc6330dc225e954198778e13d5240 | link | true | /test pull-kops-e2e-cni-calico-ipv6 |
| pull-kops-e2e-cni-amazonvpc | 23163732c3fbc6330dc225e954198778e13d5240 | link | true | /test pull-kops-e2e-cni-amazonvpc |
| pull-kops-e2e-cni-cilium | 23163732c3fbc6330dc225e954198778e13d5240 | link | true | /test pull-kops-e2e-cni-cilium |
| pull-kops-e2e-cni-calico | 23163732c3fbc6330dc225e954198778e13d5240 | link | true | /test pull-kops-e2e-cni-calico |
| pull-kops-e2e-cni-flannel | 23163732c3fbc6330dc225e954198778e13d5240 | link | true | /test pull-kops-e2e-cni-flannel |
| pull-kops-e2e-cni-kuberouter | 23163732c3fbc6330dc225e954198778e13d5240 | link | true | /test pull-kops-e2e-cni-kuberouter |
| pull-kops-e2e-aws-karpenter | 0041f31ce49b2cd11d5b1102ad11b373b99c6bf1 | link | true | /test pull-kops-e2e-aws-karpenter |
| pull-kops-test | 0041f31ce49b2cd11d5b1102ad11b373b99c6bf1 | link | true | /test pull-kops-test |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

k8s-ci-robot avatar Aug 31 '22 08:08 k8s-ci-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 29 '22 09:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Jan 06 '23 21:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Feb 05 '23 22:02 k8s-triage-robot

@k8s-triage-robot: Closed this PR.

k8s-ci-robot avatar Feb 05 '23 22:02 k8s-ci-robot