emissary Only watch tls secrets

Only watch tls secrets

Open raffis opened this issue 1 year ago • 1 comments

Description

Emissary has quite some performance problems on our cluster. It's using 1 full core and over 7G memory without any requests. Basically on standby. Screenshot from 2022-09-22 09-48-43

Profiling: Screenshot from 2022-09-22 09-39-11

The problem is that ambassador basically ends up reconciling non stop because it watches quite a big resource list including secrets, ingresses, ep and so on. On a big cluster this can be quite cpu/mem heavy. In our case we have ~10.000 secrets on the cluster and emissary globally (not single namespace) will read all 10.000 secrets. On top of that some secrets change quite often which triggers basically an infinite reconcile loop.

The solution for this is:

Remove ingress access for emissary from the clusterrole (we only use ambassador mappings)
This pr, as ambassador should only read tls secrets. I don't see a reason why ambassador should read other secrets. (This reduces the number of secrets from 10.000 to like 30 in our case).

After those changes emissary uses like 120m/400Mi on average.

If non tls secrets are used for any reason there should be a possibility to declare a resource filter instead.

Note: I also updated the dev guide as the things there do not work as stated.

Checklist

[ ] I made sure to update CHANGELOG.md.

Remember, the CHANGELOG needs to mention:
- Any new features
- Any changes to our included version of Envoy
- Any non-backward-compatible changes
- Any deprecations
[x] This is unlikely to impact how Ambassador performs at scale.

Remember, things that might have an impact at scale include:
- Any significant changes in memory use that might require adjusting the memory limits
- Any significant changes in CPU use that might require adjusting the CPU limits
- Anything that might change how many replicas users should use
- Changes that impact data-plane latency/scalability
[ ] My change is adequately tested.

Remember when considering testing:
- Your change needs to be specifically covered by tests.
  - Tests need to cover all the states where your change is relevant: for example, if you add a behavior that can be enabled or disabled, you'll need tests that cover the enabled case and tests that cover the disabled case. It's not sufficient just to test with the behavior enabled.
- You also need to make sure that the entire area being changed has adequate test coverage.
  - If existing tests don't actually cover the entire area being changed, add tests.
  - This applies even for aspects of the area that you're not changing – check the test coverage, and improve it if needed!
- We should lean on the bulk of code being covered by unit tests, but...
- ... an end-to-end test should cover the integration points
[x] I updated DEVELOPING.md with any any special dev tricks I had to use to work on this code efficiently.
[ ] The changes in this PR have been reviewed for security concerns and adherence to security best practices.

Sep 22 '22 14:09 raffis

Thanks for the contribution!

My main concern is that this could end up breaking some users if they don't store tls secrets as kubernetes.io/tls

Also we merged #4488 which also helps with constant re-configurations and will be available in v3.2 and v2.4

Also note that support for certificate revocation lists loads the CRL configuration for the feature from from non-tls secrets. I believe we also do watch for opaque secrets for validation contexts.

Sep 22 '22 16:09 AliceProxy

Also we merged https://github.com/emissary-ingress/emissary/pull/4488 which also helps with constant re-configurations and will be available in v3.2 and v2.4

v3.2 does not fix it. It also consumes like 10G of memory just because it reads the entire list of secrets from the cluster.

Oct 03 '22 13:10 raffis

ah I see. Reviewing you're profile again, it's spending pretty much all of its time on the K8sWatcher side. The other fix addresses an issue on the other side - preventing each change from triggering a reconfiguration which can also take up a lot of memory. I'm curious what you set the batch interval to.

As for the idea to configure a filter, I think that works. In terms of where to place it, Module is a pretty old resource so we probably don't want to add new things in there. So I would probably prefer as an env var, to configure which secrets to watch based on label selectors? But that shouldn't be any different than AMBASSADOR_LABEL_SELECTOR https://www.getambassador.io/docs/emissary/latest/topics/running/environment/#ambassador_label_selector. I'll admit that we should make this option more visible in the performance and scaling section of the docs but have you tried setting this option and labeling your secrets to see if that resolves your issue?

Also we merged #4488 which also helps with constant re-configurations and will be available in v3.2 and v2.4

v3.2 does not fix it. It also consumes like 10G of memory just because it reads the entire list of secrets from the cluster.

ah ok. Reviewing your profile again, it's spending a lot of time on the K8sWatcher side due to watching the

Oct 03 '22 13:10 haq204

As for the idea to configure a filter, I think that works. In terms of where to place it, Module is a pretty old resource so we probably don't want to add new things in there. So I would probably prefer as an env var, to configure which secrets to watch based on label selectors? But that shouldn't be any different than AMBASSADOR_LABEL_SELECTOR https://www.getambassador.io/docs/emissary/latest/topics/running/environment/#ambassador_label_selector. I'll admit that we should make this option more visible in the performance and scaling section of the docs but have you tried setting this option and labeling your secrets to see if that resolves your issue?

I was actually not aware that env exists, this might solve it but it sounds unpractical since this involves labeling everything from mappings, services, secrets and so on right?

What would be nice is to optionally specify the resourcegroup and support multiple selectors. Something like:

AMBASSADOR_LABEL_SELECTOR=secrets.v1:mylabel=myvalue,services.v1:mylabel=test

In my case I could then label my two secrets and define this env like:

AMBASSADOR_LABEL_SELECTOR=secrets.v1:emissary-secret=yes

Otherwise I would need to change like 50 repos and label services and so on as well. This change might even be backwards compatible, didn't think that through yet though.

Oct 03 '22 14:10 raffis

in order to specify label selectors per resource, let's create a new env var idk call it AMBASSADOR_WATCHER_LABEL_SELECTOR and we can mark the AMBASSADOR_LABEL_SELECTOR as deprecated to be removed in a future version as the formats will be different.

We could have the same env var support the two different formats but I figure that would make the parsing logic more complicated so it might be more clear to have them as two separate env vars.

How does that sound?

Oct 03 '22 14:10 haq204

How does that sound?

Sounds good to me 👍🏻 , I'll have a look at the code tomorrow.

Oct 03 '22 14:10 raffis

New functionality implemented, docs here: https://github.com/datawire/ambassador-docs/pull/700 I also added the same for the field selector (which was not even documented before).

Two comments,

When I run tests with make gotest it takes ages and fails eventually, seeing some timeouts like testutil_fake_test_test.go:86: Get timed out! in the output and a lots of other output. This looks unrelated to my change. The tests added should work. They failed first time I implemented it and this was visible before all the follow up errors.
I see that the env AMBASSADOR_LABEL_SELECTOR is also used in python/*, honestly no clue what the python stuff is for, and ARCHITECTURE.md is empty where I hoped to get informed about this. Not sure if something needs to be changed there.

However I tested the new feature on our cluster and it works so far.

Oct 04 '22 15:10 raffis

New functionality implemented, docs here: datawire/ambassador-docs#700 I also added the same for the field selector (which was not even documented before).

Two comments,

When I run tests with make gotest it takes ages and fails eventually, seeing some timeouts like testutil_fake_test_test.go:86: Get timed out! in the output and a lots of other output. This looks unrelated to my change. The tests added should work. They failed first time I implemented it and this was visible before all the follow up errors.

I see that the env AMBASSADOR_LABEL_SELECTOR is also used in python/*, honestly no clue what the python stuff is for, and ARCHITECTURE.md is empty where I hoped to get informed about this. Not sure if something needs to be changed there.

However I tested the new feature on our cluster and it works so far.

I pulled down your changes to check. make gotest also runs integration tests (and also runs in CI which unfortunately doesn't run correctly for community contributions due to insufficient permissions but we're actively working to fix that).

In order to properly run them you will need to set the following env vars:

DEV_REGISTRY # Docker registry to push/pull local builds to
DEV_KUBECONFIG #path to a valid kubeconfig file. Integrations tests will spawn a k3s cluster container

For the python stuff, that is old legacy code. I don't think we need to update that.

Oct 04 '22 18:10 haq204

I pulled down your changes to check. make gotest also runs integration tests (and also runs in CI which unfortunately doesn't run correctly for community contributions due to insufficient permissions but we're actively working to fix that).

In order to properly run them you will need to set the following env vars:
DEV_REGISTRY # Docker registry to push/pull local builds to
DEV_KUBECONFIG #path to a valid kubeconfig file. Integrations tests will spawn a k3s cluster container
For the python stuff, that is old legacy code. I don't think we need to update that.

Thanks for the explanation 👍🏻

(I wonder why make gotest needs python then, because that command fails early if my env python3 does not point to python 3.9)

Oct 05 '22 08:10 raffis

I changed some bits and it supports now selectors without the apiversion for the resourcegroup.

Oct 05 '22 08:10 raffis

@haq204 can we run the ci job?

Oct 13 '22 11:10 raffis

Sorry to bother you but can I please 🙏🏻 have an update on this? @haq204 @AliceProxy

Oct 20 '22 09:10 raffis

Hi @raffis sorry for the delay got busy with other things.

So change wise it looks good to me but I brought this up to the rest of the Maintainers team and we were discussing it internally. It turns out that this might actually be a regression from all the way back to 1.14 Legacy Mode. As far as my understanding goes it's not supposed to watch for all secrets and only for those it cares about.

Given this information, this seems to suggest that the proper approach now is to, well fix the regression rather than adding another env var toggle.

@LukeShu @kflynn I think your thoughts would be constructive to this.

Oct 21 '22 14:10 haq204

emissary emissary copied to clipboard

Only watch tls secrets

Description

Checklist

emissary
emissary copied to clipboard