envoy
envoy copied to clipboard
configs: Fix default config
Signed-off-by: Ryan Northey [email protected]
Commit Message: configs: Fix default config Additional Description:
rather than proxying to the envoy website which often responds with a redirect this updates the default config to just post a link to the website and docs
Risk Level: Testing: Docs Changes: Release Notes: Platform Specific Features: [Optional Runtime guard:] [Optional Fixes #Issue] [Optional Deprecated:] [Optional API Considerations:]
cc @mattklein123 @alyssawilk
lgtm, //test/config_test:example_configs_test is failing due to changes in cluster config.
//test/config_test:example_configs_test is failing due to changes in cluster config
i thought i had fixed it, but obviously not, ill check again...
hmm, i see it wants a cluster to be configured
im wondering if its valid to have 0 clusters configured, certainly Envoy starts/works with the config in the PR
Are you still having trouble after the scheme header landed? If so I can try to help debug
yep
in my pr #16899 it loads the default config loads of times and does a grep test to check the server is responding with "default" output - ie envoy website content
the failure is a bit wierd - it always fails on the arm build, it usually fails on the x86 build, it very rarely (but it happens) fails on only some on the x86 build
testing locally it hardly ever fails, but does some times
ive come to the conclusion that there is some caching going on and perhaps different netlify endpoints respond differently accordingly
i havent tested with google again since you landed the scheme mangling but previously it was similar (i didnt test extensively)
there is another pr here - to test it in isolation (which is flakey) #17200
here is an example of it failing on azp - https://dev.azure.com/cncf/envoy/_build/results?buildId=81664&view=logs&j=f1a49366-51c0-540f-0294-f73c2bb311a9&t=7fd036c9-61ec-5b59-3645-e90950dc67a6&l=1760
its probs not that helpful as it doesnt show you the response/code but its trying to redirect
there is value in having an example of Envoy forwarding to another website.
what about we add a loopback proxy listener in the config
i quite like the idea that the example will work without a network connection, altho its a bit "Envoy is a proxy, but thats not what we are going to show you" i guess if it doesnt proxy at all
the other thing is dealing with historical versions which have the same problem (worse afaict) - this pr would be ~easy to backport i think
heres a more helpful failure on the packaging pr https://dev.azure.com/cncf/envoy/_build/results?buildId=82041&view=logs&j=7d299c4a-9a35-59b9-677d-f6996e8fc292&t=2e3bbaf1-9c4a-5c1a-ee33-a8cc5407b4d9&l=5227
you can see the logs and response there
Didn't squeeze it in this week but I've blocked off some time Monday to check it out.
i added a loopback, so there is a proxy in this example now
Looks like CI is still unhappy? I'm fine with envoy_demo being a local reply, but I think the point of *.proxy.yaml is to provide an example of proxying data from one place to another. I think we'd be better off switching websites to not-really-proxying towards envoyproxy.io
/wait
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
@mattklein123 @alyssawilk
i tested again today to see whether any changes at netlify had resolved the problem that this addresses, unfortunately it doesnt seem to have:
PackagesDistroChecker ERROR [distros] [debian_bullseye/envoy-1.20:proxy-responds] Test failed
Envoy did not respond correctly
Response was
Redirecting to https://www.envoyproxy.io/
not sure about the best way forward, but im thinking these are the options:
- fix whatever is breaking with this pr, and land this solution
- figure out what headers (?) are being sent to netlify that means sni is not working as expected
- accept that http -> https is probs not the best pattern to use in the default demo/config and figure out an alternative to proxy to (i guess reverting to google is an option, but im pretty sure ive seen it display similar redirect issues)
one way or another i kinda need to be able to test that envoy is responding correctly with the default config - the current situation means that it is 50/50 whether you get a redirect or the actual page
im also concerned that the problems netlify have had with redirect loops are caused by the default config in the docker image causing it to cache a redirect, and then that gets served to others - i may be wrong about this, but it is consistent with my testing.
personally, i quite like the solution here, and im guessing it wont be too hard to fix ci, but im really open to whatever we can come up with to fix the default config
CC @envoyproxy/dependency-shepherds: Your approval is needed for changes made to (bazel/.*repos.*\.bzl)|(bazel/dependency_imports\.bzl)|(api/bazel/.*\.bzl)|(.*/requirements\.txt)|(.*\.patch).
ci should be fixed
@alyssawilk @mattklein123 this is no longer required for my current work as the netlify issues appears to be fixed - so im happy to close
that said, i quite like the idea of the default config not requiring a network connection, especially in the context of an OS/package install
im also left wondering - if my theory that our default config was the cause of the redirect loop is correct - what does envoy's sni implementation expose that leaks the information about http that i guess is causing the redirect - my assumption is that an sni request originating from either http or https should look the same to the upstream, but maybe this is wrong
I find having a default config for a proxy actually do proxying more useful, so I'd be inclined to close this off than land it.
upstream http vs https should be exposed via x-forwarded-proto (though our default config overrides that AFIK) but that wasn't the bug with netlify because it was happening with Chrome and an https url which wouldn't be doing x-forward-proto: http.. So I am baffled about the netlify issue but glad it resolved itself!
alternately if you want best of both worlds, we could try to hit up envoyproxy.io and have a custom error rewrite handler which notes lack of network connectivity and sends a canned response? /wait
but that wasn't the bug with netlify because it was happening with Chrome and an https url which wouldn't be doing x-forward-proto: http.
i believe due to caching
it was on the days where i was triggering the problem that others (re-)reported the browser issue
(hence the redirect loop - ie http -> https request happens, redirect is cached for https response, new https request happens, fail)
alternately if you want best of both worlds, we could try to hit up envoyproxy.io and have a custom error rewrite handler which notes lack of network connectivity and sends a canned response?
wfm - ill update as i have some time
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
this is still an issue as for CI (and i guess end users) as the website can still be flakey
we could have default config be to proxy and retry config serve from cache? I wouldn't have CI test that we can reach envoyproxy.io. but I'd fix CI to not e2e test the internet, rather than replace a useful config.
we could have default config be to proxy and retry config serve from cache?
i guess that would be possible - iiuc then adding adding a warm cache to prevent ci from testing web connect
in this case we are testing what an end user is getting in the package - what CI is demonstrating is in a way end user experience - so i think there are 2 separate issues - website flakiness, and the config
re the config my current thought - altho i havent had time to pursue it was something like - a static page served by default with a proxied link to the website on a separate path/port - that way the default page is at least not proxying to web etc immediately but at the same time still demonstrates a proxy as part of the default config