
Bump master to next version

Open howardjohn opened this issue 1 year ago • 10 comments

Move https://github.com/istio/istio.io/pull/15555 to my fork

howardjohn avatar Aug 21 '24 21:08 howardjohn

/test doc.test.multicluster

craigbox avatar Aug 21 '24 23:08 craigbox

needs update to https://istio.io/latest/blog/2019/introducing-istio-operator/ (if not more things that refer to the install doc for the operator)

It might be best to leave those files in and we can address them in a follow up PR?

craigbox avatar Aug 21 '24 23:08 craigbox

/retest for funsies

craigbox avatar Aug 23 '24 04:08 craigbox

@craigbox: The /retest command does not accept any targets. The following commands are available to trigger required jobs:

  • /test doc.test.dualstack
  • /test doc.test.multicluster
  • /test doc.test.profile-ambient
  • /test doc.test.profile-default
  • /test doc.test.profile-demo
  • /test doc.test.profile-minimal
  • /test doc.test.profile-none
  • /test gencheck
  • /test lint

The following commands are available to trigger optional jobs:

  • /test update-ref-docs-dry-run

Use /test all to run all jobs.

In response to this:

/retest for funsies

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

istio-testing avatar Aug 23 '24 04:08 istio-testing

sigh, I was only being silly.

/retest for realsies.

craigbox avatar Aug 23 '24 08:08 craigbox

  • default: gateway API failure due to this fixed; now just authz-tcp flake
  • multicluster: Helm has failed to deploy an egress gateway? This merits debugging I have not found out how to do locally. Perhaps I could have three instances of kind?
  • none: webhooks aren't being deleted when revisions are removed: https://github.com/istio/istio/issues/36905

At least one test works around that like this:

kubectl get validatingwebhookconfiguration --no-headers=true | awk '/^istio/ {print $1}' | xargs kubectl delete validatingwebhookconfiguration
kubectl get mutatingwebhookconfiguration --no-headers=true | awk '/^istio/ {print $1}' | xargs kubectl delete mutatingwebhookconfiguration
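
A hedged variant of the workaround above, for illustration only: the two pipelines run `kubectl delete` even when no `istio`-prefixed object is found, and in that case `kubectl delete` is called with no names and fails. The sketch below keeps the same `awk` filter but skips the delete when nothing matches. The function names `istio_names` and `delete_istio_webhooks` are hypothetical and are not taken from the istio.io test suite.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of the istio.io tests.

# Print the first column of every input line that starts with "istio".
istio_names() {
  awk '/^istio/ {print $1}'
}

# Delete all istio-prefixed objects of the given kind,
# e.g. validatingwebhookconfiguration or mutatingwebhookconfiguration.
delete_istio_webhooks() {
  local kind="$1"
  local names
  names="$(kubectl get "$kind" --no-headers=true | istio_names)"
  if [ -z "$names" ]; then
    # Nothing matched: avoid calling "kubectl delete" with no object names.
    echo "no istio ${kind} found"
    return 0
  fi
  # Word splitting on $names is intentional: one argument per object name.
  # shellcheck disable=SC2086
  kubectl delete "$kind" $names
}
```

It would be called once per kind, for example `delete_istio_webhooks validatingwebhookconfiguration` followed by `delete_istio_webhooks mutatingwebhookconfiguration`.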

craigbox avatar Aug 23 '24 10:08 craigbox

Is this already sorted out, or is any help needed?

kfaseela avatar Aug 27 '24 10:08 kfaseela

There was some discussion on #docs. It's not sorted out.

howardjohn avatar Aug 27 '24 17:08 howardjohn

I've identified the things that I think are causing the flakes, and would love to have someone take a look at trying to fix them.

Some that are in separate issues already:

  • https://github.com/istio/istio.io/issues/15505
  • https://github.com/istio/istio.io/issues/15603
  • https://github.com/istio/istio/issues/36905

craigbox avatar Aug 28 '24 09:08 craigbox

/retest

kfaseela avatar Sep 02 '24 09:09 kfaseela

/test doc.test.multicluster

kfaseela avatar Sep 02 '24 13:09 kfaseela

@nshankar13: any idea about the multi-cluster failures?

kfaseela avatar Sep 02 '24 21:09 kfaseela

/test doc.test.profile-none

kfaseela avatar Sep 06 '24 11:09 kfaseela

@howardjohn @craigbox @dhawton I was finally able to run the multicluster test locally. Could you review the PR below and get it merged, so that we can rerun the multicluster test here? Some tests have been mixed up, and some steps that are supposed to run only in the non-Gateway API case seem to be running in the Gateway API test as well. Maybe we can fix that first and then rerun this.

https://github.com/istio/istio.io/pull/15662

Background for the fix: the failing gwapi test here says the egress gateway cannot be deployed, while the document clearly says those steps can be skipped for Gateway API.

https://storage.googleapis.com/istio-prow/pr-logs/pull/istio_istio.io/15595/doc.test.multicluster_istio.io/1830594623492329472/artifacts/tests-setup-multicluster-a9bbe6/TestDocs/setup/install/external-controlplane/gtwapi_test.sh/gtwapi_test.sh/_test_context/gtwapi_test.sh_output.txt

kfaseela avatar Sep 06 '24 20:09 kfaseela

Done

craigbox avatar Sep 06 '24 21:09 craigbox

/test doc.test.multicluster

kfaseela avatar Sep 06 '24 21:09 kfaseela

while the gwapi test passed this time, the webhook deletion failure is still there. let me debug that now

kfaseela avatar Sep 06 '24 21:09 kfaseela

while the gwapi test passed this time, the webhook deletion failure is still there. let me debug that now

That's a "known issue" in istioctl. We can work around it in the tests at least

craigbox avatar Sep 06 '24 21:09 craigbox

while the gwapi test passed this time, the webhook deletion failure is still there. let me debug that now

That's a "known issue" in istioctl. We can work around it in the tests at least

Do we have an issue raised for that already? Is this applicable only on master? I ask because the test passes on older versions.

kfaseela avatar Sep 06 '24 21:09 kfaseela

istio/istio#36905

craigbox avatar Sep 06 '24 21:09 craigbox

istio/istio#36905

That is a very old issue, and if you see, this test passed on my gwapi PR without any issues. Doesn't that mean the cleanup problem is happening only on master?

kfaseela avatar Sep 06 '24 21:09 kfaseela

cc @zirain , any pointers on the distributed tracing failure? https://prow.istio.io/view/gs/istio-prow/pr-logs/pull/istio_istio.io/15595/doc.test.profile-none_istio.io/1832017004517658624

kfaseela avatar Sep 06 '24 21:09 kfaseela

Pass, I never saw any suggestion it was fixed: changed and the behaviour is consistent. Didn't do any more research sorry.

craigbox avatar Sep 06 '24 21:09 craigbox

Pass, I never saw any suggestion it was fixed: changed and the behaviour is consistent. Didn't do any more research sorry.

The only webhook that fails to clean up is

  "mutatingWebhookConfigurations": [
    "istio-revision-tag-default"
  ],

and it does not seem to be deleted on the remote cluster, where the command used to delete it is

$ istioctl manifest generate -f remote-config-cluster.yaml --set values.defaultRevision=default | kubectl delete --context="${CTX_REMOTE_CLUSTER}" -f -
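
One way to confirm the observation above, and to clean up explicitly in a test, is sketched here. It assumes the leftover object is the `istio-revision-tag-default` MutatingWebhookConfiguration named in the failure output and that the remote context is passed in as an argument (for example `${CTX_REMOTE_CLUSTER}`). The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of the istio.io tests.

# Succeeds if the named MutatingWebhookConfiguration exists in the given context.
webhook_exists() {
  local ctx="$1"
  local name="$2"
  kubectl get mutatingwebhookconfiguration "$name" --context="$ctx" >/dev/null 2>&1
}

# Delete the revision-tag webhook if the manifest-based delete left it behind.
# The second argument is optional and defaults to the name seen in the failure.
cleanup_revision_tag_webhook() {
  local ctx="$1"
  local name="${2:-istio-revision-tag-default}"
  if webhook_exists "$ctx" "$name"; then
    kubectl delete mutatingwebhookconfiguration "$name" --context="$ctx"
  else
    echo "${name} not present in ${ctx}"
  fi
}
```

For example, `cleanup_revision_tag_webhook "${CTX_REMOTE_CLUSTER}"` after the `istioctl manifest generate ... | kubectl delete` step. This only hides the symptom in the test; it does not explain why the generated manifest omits the webhook.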

kfaseela avatar Sep 06 '24 22:09 kfaseela

Currently there is a failure at the helm install step. I'm not sure why the test uses istioctl for ingress and helm for egress. While both should work, let us first see if istioctl works.

helm install --set global.tag=1.24-alpha.89d73dc1f639d1625135923c381e4ec222247059 istio-egressgateway manifests/charts/gateway -n external-istiod --kube-context=kind-doc-cluster3 --set service.type=ClusterIP
kubectl get pod -l app=istio-egressgateway -n external-istiod --context=kind-doc-cluster3 -o 'jsonpath={.items[*].status.phase}'
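
When debugging this step locally, it can help to poll the same `kubectl get pod ... -o jsonpath` query until every matching pod reports `Running`, rather than checking once. The loop below is a minimal sketch under that assumption; `wait_for_running`, its argument order, and the `POLL_INTERVAL` variable are hypothetical and not taken from the test framework.

```shell
#!/usr/bin/env bash
# Sketch only: names and defaults are illustrative, not part of the istio.io tests.

# Poll until all pods matching the selector are in phase "Running".
# Usage: wait_for_running <namespace> <context> <selector> [attempts]
wait_for_running() {
  local ns="$1"
  local ctx="$2"
  local selector="$3"
  local attempts="${4:-30}"
  local i=0
  local phases rest
  while [ "$i" -lt "$attempts" ]; do
    phases="$(kubectl get pod -l "$selector" -n "$ns" --context="$ctx" \
      -o 'jsonpath={.items[*].status.phase}')"
    # Strip every "Running" and every space; anything left is a non-Running phase.
    rest="$(printf '%s' "$phases" | sed 's/Running//g; s/ //g')"
    if [ -n "$phases" ] && [ -z "$rest" ]; then
      return 0
    fi
    sleep "${POLL_INTERVAL:-2}"
    i=$((i + 1))
  done
  echo "pods matching ${selector} not Running after ${attempts} attempts: ${phases}" >&2
  return 1
}
```

With the values from the commands above, a call would look like `wait_for_running external-istiod kind-doc-cluster3 app=istio-egressgateway`. An empty result (no pods created yet) counts as not ready, so a Helm install that never creates the pod shows up as a timeout with an empty phase list.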

kfaseela avatar Sep 08 '24 20:09 kfaseela

/test doc.test.multicluster

kfaseela avatar Sep 09 '24 15:09 kfaseela

It looks like the telemetry test is just timing out, haven't looked closer into exactly where/why that's stalling...

panic: test timed out after 1h0m0s
	running tests:
		TestDocs (1h0m0s)
		TestDocs/tasks/observability/distributed-tracing/skywalking/test.sh (32m27s)
		TestDocs/tasks/observability/distributed-tracing/skywalking/test.sh/test.sh (32m27s) 

Maybe https://github.com/istio/istio.io/pull/15515 broke something where it's waiting for an older version of a resource that will never be created now, or did that not get backported to release/1.23 but should have been?

mikemorris avatar Sep 09 '24 19:09 mikemorris

It looks like the telemetry test is just timing out, haven't looked closer into exactly where/why that's stalling...

panic: test timed out after 1h0m0s
	running tests:
		TestDocs (1h0m0s)
		TestDocs/tasks/observability/distributed-tracing/skywalking/test.sh (32m27s)
		TestDocs/tasks/observability/distributed-tracing/skywalking/test.sh/test.sh (32m27s) 

Maybe #15515 broke something where it's waiting for an older version of a resource that will never be created now, or did that not get backported to release/1.23 but should have been?

That is in the master branch already.

dhawton avatar Sep 09 '24 20:09 dhawton

Able to reproduce the external controlplane issue locally, looks like there is some issue on 1.24 with external namespace usage for installation of istio components. Will debug further and post updates

kfaseela avatar Sep 10 '24 08:09 kfaseela

Able to reproduce the external controlplane issue locally, looks like there is some issue on 1.24 with external namespace usage for installation of istio components. Will debug further and post updates

https://github.com/istio/istio/pull/53073 hopefully should fix the external controlplane problem

kfaseela avatar Sep 10 '24 15:09 kfaseela