[POC] Performance tuning measures for single node OpenShift

Open spaparaju opened this issue 4 years ago • 21 comments

Goals

Enable various levels of Single node OpenShift experience for CRC users based on the availability of resources on their laptops

Please check out this doc for instructions on trying it out and for documentation of the changes made as part of this POC.

spaparaju avatar Nov 25 '20 11:11 spaparaju

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: spaparaju

To complete the pull request process, please assign praveenkumar after the PR has been reviewed. You can assign the PR to them by writing /assign @praveenkumar in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

openshift-ci-robot avatar Nov 25 '20 11:11 openshift-ci-robot

Hi @spaparaju. Thanks for your PR.

I'm waiting for a code-ready member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci-robot avatar Nov 25 '20 11:11 openshift-ci-robot

/ok-to-test
/hold
/wip
/retest

Note: removed messages about missing labels.

gbraad avatar Nov 25 '20 12:11 gbraad

/retest

spaparaju avatar Nov 25 '20 17:11 spaparaju

/retest

spaparaju avatar Nov 25 '20 19:11 spaparaju

/retest

spaparaju avatar Nov 26 '20 14:11 spaparaju

/retest

spaparaju avatar Nov 27 '20 15:11 spaparaju

/retest

spaparaju avatar Nov 29 '20 13:11 spaparaju

/retest

spaparaju avatar Nov 30 '20 10:11 spaparaju

/retest

spaparaju avatar Dec 02 '20 05:12 spaparaju

I didn't review the patches but jumped straight into testing. Right now I can see that the console is not up and running.

$ oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.6.6     True        False         False      115m
cloud-credential                           4.6.6     True        False         False      9h
cluster-autoscaler                         4.6.6     True        False         False      9h
config-operator                            4.6.6     True        False         False      9h
console                                              Unknown     Unknown       Unknown    1s
csi-snapshot-controller                    4.6.6     True        False         False      8h
dns                                        4.6.6     True        False         False      122m
etcd                                       4.6.6     True        False         False      9h
image-registry                             4.6.6     True        False         False      8h
ingress                                    4.6.6     True        False         False      8h
insights                                   4.6.6     True        False         False      9h
kube-apiserver                             4.6.6     True        False         False      9h
kube-controller-manager                    4.6.6     True        False         False      9h
kube-scheduler                             4.6.6     True        False         False      9h
kube-storage-version-migrator              4.6.6     True        False         False      8h
machine-api                                4.6.6     True        False         False      9h
machine-approver                           4.6.6     True        False         False      9h
machine-config                             4.6.6     True        False         False      9h
marketplace                                4.6.6     True        False         False      121m
monitoring                                 4.6.6     True        False         False      8h
network                                    4.6.6     True        False         False      9h
node-tuning                                4.6.6     True        False         False      8h
openshift-apiserver                        4.6.6     True        False         False      117m
openshift-controller-manager               4.6.6     True        False         False      8h
openshift-samples                          4.6.6     True        False         False      9h
operator-lifecycle-manager                 4.6.6     True        False         False      9h
operator-lifecycle-manager-catalog         4.6.6     True        False         False      9h
operator-lifecycle-manager-packageserver   4.6.6     True        False         False      115m
service-ca                                 4.6.6     True        False         False      9h
storage                                    4.6.6     True        False         False      9h
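
The table above can be scanned mechanically. The following is a minimal sketch, assuming the output of `oc get co` has been captured as text; the function name `unavailable_operators` and the shortened `SAMPLE` table are hypothetical and not part of the PR. Because the VERSION column can be blank (as it is for console), the sketch indexes the status columns from the right rather than from the left.

```python
from typing import List


def unavailable_operators(table: str) -> List[str]:
    """Return names of cluster operators whose AVAILABLE column is not 'True'."""
    names = []
    for line in table.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything too short to hold the status columns.
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        # VERSION may be empty, so count from the right: the last four fields
        # are AVAILABLE, PROGRESSING, DEGRADED and SINCE.
        available = fields[-4]
        if available != "True":
            names.append(fields[0])
    return names


# Hypothetical excerpt of the table above, shortened for illustration.
SAMPLE = """\
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication   4.6.6     True        False         False      115m
console                    Unknown     Unknown       Unknown    1s
dns              4.6.6     True        False         False      122m
"""

if __name__ == "__main__":
    print(unavailable_operators(SAMPLE))
```

Run against the excerpt, this reports only the console operator, matching what is visible by eye in the full listing.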

Auth takes around 5-10 minutes to stabilize.

$ oc login -u kubeadmin -p mItim-MEKoX-di6Ru-SxpKX https://api.crc.testing:6443
The server uses a certificate signed by an unknown authority.
You can bypass the certificate check, but any data you send to the server could be intercepted by others.
Use insecure connections? (y/n): y

The connection to the server oauth-openshift.apps-crc.testing was refused - did you specify the right host or port?

<CRC_VM> $ crictl logs <auth_operator>
E1202 04:05:01.841821       1 base_controller.go:250] "OAuthRouteCheckEndpointAccessibleController" controller failed to sync "key", err: Get "https://oauth-openshift.apps-crc.testing/healthz": dial tcp 192.168.130.11:443: connect: connection refused
E1202 04:05:02.381506       1 base_controller.go:250] "OAuthRouteCheckEndpointAccessibleController" controller failed to sync "key", err: Get "https://oauth-openshift.apps-crc.testing/healthz": dial tcp 192.168.130.11:443: connect: connection refused
E1202 04:05:03.684121       1 base_controller.go:250] "OAuthRouteCheckEndpointAccessibleController" controller failed to sync "key", err: Get "https://oauth-openshift.apps-crc.testing/healthz": dial tcp 192.168.130.11:443: connect: connection refused
E1202 04:05:06.265778       1 base_controller.go:250] "OAuthRouteCheckEndpointAccessibleController" controller failed to sync "key", err: Get "https://oauth-openshift.apps-crc.testing/healthz": dial tcp 192.168.130.11:443: connect: connection refused
E1202 04:05:11.421934       1 base_controller.go:250] "OAuthRouteCheckEndpointAccessibleController" controller failed to sync "key", err: Get "https://oauth-openshift.apps-crc.testing/healthz": dial tcp 192.168.130.11:443: connect: connection refused
E1202 04:05:21.739874       1 base_controller.go:250] "OAuthRouteCheckEndpointAccessibleController" controller failed to sync "key", err: Get "https://oauth-openshift.apps-crc.testing/healthz": dial tcp 192.168.130.11:443: connect: connection refused
E1202 04:05:25.687774       1 base_controller.go:250] "OAuthRouteCheckEndpointAccessibleController" controller failed to sync "key", err: Get "https://oauth-openshift.apps-crc.testing/healthz": dial tcp 192.168.130.11:443: connect: connection refused
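
The timestamps in these log lines suggest that the controller retries at growing intervals for most of the sequence. As a rough sketch, assuming the lines use the usual klog prefix (severity letter, MMDD, then a time with microseconds), the gaps between consecutive "connection refused" errors can be computed as follows. The helper name `retry_gaps` and the abbreviated `SAMPLE_LOG` lines are hypothetical.

```python
import re
from datetime import datetime
from typing import List

# klog-style prefix: severity letter, MMDD, then HH:MM:SS.microseconds.
KLOG_TIME = re.compile(r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})")


def retry_gaps(lines: List[str]) -> List[float]:
    """Seconds between consecutive 'connection refused' log entries."""
    stamps = []
    for line in lines:
        match = KLOG_TIME.match(line)
        if match and "connection refused" in line:
            text = match.group(1) + " " + match.group(2)
            # The log carries no year, so strptime falls back to its default;
            # only differences between timestamps are used here.
            stamps.append(datetime.strptime(text, "%m%d %H:%M:%S.%f"))
    return [round((b - a).total_seconds(), 2) for a, b in zip(stamps, stamps[1:])]


# Timestamps copied from the log above; message text abbreviated for illustration.
SAMPLE_LOG = [
    "E1202 04:05:01.841821       1 base_controller.go:250] connect: connection refused",
    "E1202 04:05:02.381506       1 base_controller.go:250] connect: connection refused",
    "E1202 04:05:03.684121       1 base_controller.go:250] connect: connection refused",
    "E1202 04:05:06.265778       1 base_controller.go:250] connect: connection refused",
]

if __name__ == "__main__":
    print(retry_gaps(SAMPLE_LOG))
```

Steadily widening gaps like these usually mean the endpoint never came up during the window, rather than that it was flapping.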

Logs can't be fetched even though the node is in the Ready state. Might that be because this PR is not rebased on master?

$ oc get node
NAME                 STATUS   ROLES           AGE   VERSION
crc-s6vhx-master-0   Ready    master,worker   9h    v1.19.0+43983cd

$ oc logs oauth-openshift-85f74b7bd-ljcs6 -n openshift-authentication
error: You must be logged in to the server (the server has asked for the client to provide credentials ( pods/log oauth-openshift-85f74b7bd-ljcs6))

praveenkumar avatar Dec 02 '20 06:12 praveenkumar

@praveenkumar I will be sharing the instructions and CRC binary to test.

spaparaju avatar Dec 02 '20 06:12 spaparaju

@spaparaju Sure. Meanwhile, can you rebase it onto master? It looks like it has some conflicts.

praveenkumar avatar Dec 02 '20 06:12 praveenkumar

@spaparaju: PR needs rebase.

openshift-ci-robot avatar Dec 10 '20 23:12 openshift-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: spaparaju

To complete the pull request process, please assign praveenkumar after the PR has been reviewed. You can assign the PR to them by writing /assign @praveenkumar in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

openshift-ci[bot] avatar Jul 07 '21 14:07 openshift-ci[bot]

/retest

spaparaju avatar Jul 07 '21 15:07 spaparaju

/retest

spaparaju avatar Jul 07 '21 18:07 spaparaju

/retest

spaparaju avatar Jul 07 '21 20:07 spaparaju

/retest

spaparaju avatar Jul 08 '21 01:07 spaparaju

@spaparaju: The following test failed, say /retest to rerun all failed tests:

Test name         Commit                                     Details   Rerun command
ci/prow/e2e-snc   f681fbf8cf97a6aaa969d5a98e12af648f943263   link      /test e2e-snc

Full PR test history. Your PR dashboard.

openshift-ci[bot] avatar Jul 08 '21 01:07 openshift-ci[bot]

@spaparaju: PR needs rebase.

openshift-ci[bot] avatar Mar 02 '22 15:03 openshift-ci[bot]

Closing this, since some of the suggested performance changes have already been applied, and the rest can't be applied because they change cluster behavior and rely on alpha/beta APIs.

praveenkumar avatar Feb 15 '23 10:02 praveenkumar