envoy icon indicating copy to clipboard operation
envoy copied to clipboard

Add support for intermediate CA as trusted_ca

Open LuyaoZhong opened this issue 2 years ago • 11 comments

Add support for intermediate CA as trusted_ca with X509_V_FLAG_PARTIAL_CHAIN

Risk Level: Low Testing: unit tests Docs Changes: ssl Release Notes: yes Fixes #19326

Signed-off-by: Luyao Zhong [email protected]

LuyaoZhong avatar Jul 21 '22 11:07 LuyaoZhong

Before proceeding with this, we need to conclude discussion in the issue to make sure this won't have any negative side effects.

/wait

ggreenway avatar Jul 21 '22 15:07 ggreenway

Yeah this sounds good to me, as a quick review, please add release note as behavior change.

lizan avatar Jul 22 '22 07:07 lizan

@ggreenway @lizan

I just tried to get the CI results (I thought it should not break any tests). Unfortunately, the cert verification test fail, I need to do some investigation.

LuyaoZhong avatar Jul 22 '22 10:07 LuyaoZhong

/wait

ggreenway avatar Jul 25 '22 16:07 ggreenway

@ggreenway @lizan @dnephin

Without X509_V_FLAG_PARTIAL_CHAIN, intermediate ca cert will not be treated as trust-anchors, it requires the trust-ca being a cert chain which containning root ca cert. max_verify_depth is config for the max number of intermediate certificates in chain that are parsed during verification. So it expects a failure when setting max_verify_depth no longer than root ca cert depth.

https://github.com/envoyproxy/envoy/blob/18212bb6395af308d895f75352f82df522b038b4/test/extensions/transport_sockets/tls/cert_validator/default_validator_integration_test.cc#L68-L73

After adding X509_V_FLAG_PARTIAL_CHAIN as a default flag, this test can not pass. Because the intermediate ca cert can be treated as trust-anchor with this flag, the verification will succeed.

Except above, I don't think there are other negative side effects. Two options to enable this feature,

  1. making this flag as default, and refine the testcases a little bit
  2. do not set X509_V_FLAG_PARTIAL_CHAIN if max_verify_depth is specified

which one do you prefer?

LuyaoZhong avatar Jul 26 '22 06:07 LuyaoZhong

  1. sounds reasonable to me.

lizan avatar Jul 26 '22 07:07 lizan

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/). envoyproxy/api-shepherds assignee is @adisuissa CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

:cat:

Caused by: https://github.com/envoyproxy/envoy/pull/22350 was synchronize by LuyaoZhong.

see: more, trace.

/retest

@adisuissa @ggreenway this looks ready for another round. Thanks!

jmarantz avatar Aug 15 '22 12:08 jmarantz

Retrying Azure Pipelines: Retried failed jobs in: envoy-presubmit

:cat:

Caused by: a https://github.com/envoyproxy/envoy/pull/22350#issuecomment-1214974927 was created by @jmarantz.

see: more, trace.

/wait

adisuissa avatar Aug 17 '22 13:08 adisuissa

Before proceeding with this, we need to conclude discussion in the issue to make sure this won't have any negative side effects.

What's the general mood on having this become part of main?

dynajoe avatar Sep 21 '22 16:09 dynajoe

Thanks for the update, I think it LGTM

This is probably ready for @ggreenway review again since the test was fixed https://github.com/envoyproxy/envoy/pull/22350#discussion_r952527040

soulxu avatar Oct 20 '22 12:10 soulxu

/retest

as there is no current CI result.

wrowe avatar Oct 26 '22 21:10 wrowe

Retrying Azure Pipelines:

:cat:

Caused by: a https://github.com/envoyproxy/envoy/pull/22350#issuecomment-1292668053 was created by @wrowe.

see: more, trace.

It seems this is blocked on CodeQL, merge main should unblock CI, @LuyaoZhong

wrowe avatar Oct 26 '22 21:10 wrowe

seems like CI is stuck and this needs a merge from main @LuyaoZhong

/wait

KBaichoo avatar Nov 01 '22 15:11 KBaichoo

/wait

This probably need maintainer feedback on this https://github.com/envoyproxy/envoy/pull/22350#discussion_r1001224029 before merge.

soulxu avatar Nov 07 '22 11:11 soulxu

This looks good overall. I think we should add a runtime flag (default enabled) to disable the new behavior in the off chance this causes an issue for someone. Other than that, I think this looks good.

@ggreenway done

LuyaoZhong avatar Nov 16 '22 13:11 LuyaoZhong

oh, wait, we should add test for runtime flag is false.

@soulxu For this feature, I don't think we need test for runtime flag is false, since the old code path(other than the code path that the runtime flag controls) was running for a long time and already tested before.

LuyaoZhong avatar Nov 24 '22 01:11 LuyaoZhong

oh, wait, we should add test for runtime flag is false.

@soulxu For this feature, I don't think we need test for runtime flag is false, since the old code path(other than the code path that the runtime flag controls) was running for a long time and already tested before.

we should keep the old test to ensure there is no regression in the future

soulxu avatar Nov 24 '22 02:11 soulxu

oh, wait, we should add test for runtime flag is false.

@soulxu For this feature, I don't think we need test for runtime flag is false, since the old code path(other than the code path that the runtime flag controls) was running for a long time and already tested before.

we should keep the old test to ensure there is no regression in the future

Sorry, what regression we might have? How to avoid regression by keeping old test?

LuyaoZhong avatar Nov 24 '22 02:11 LuyaoZhong

oh, wait, we should add test for runtime flag is false.

@soulxu For this feature, I don't think we need test for runtime flag is false, since the old code path(other than the code path that the runtime flag controls) was running for a long time and already tested before.

we should keep the old test to ensure there is no regression in the future

Sorry, what regression we might have? How to avoid regression by keeping old test?

Sorry, I may not describe it clearly. We use runtime flag for switch two logic. You add the tests for the new logic, but the old logic still needs to keep here. But if there is no test for the old logic, it could happen regression if someone missed change the code. Just like, you already test the new code today and the code works well, but it doesn't mean you can remove those tests for the new code, you still need to keep the tests for it in case we won't break the code in the future. In the day we deprecate the runtime flag, then we can remove the old logic/code and the old tests.

soulxu avatar Nov 24 '22 02:11 soulxu

Sorry, I may not describe it clearly. We use runtime flag for switch two logic. You add the tests for the new logic, but the old logic still needs to keep here. But if there is no test for the old logic, it could happen regression if someone missed change the code. Just like, you already test the new code today and the code works well, but it doesn't mean you can remove those tests for the new code, you still need to keep the tests for it in case we won't break the code in the future. In the day we deprecate the runtime flag, then we can remove the old logic/code and the old tests.

@soulxu Got it. Thanks.

LuyaoZhong avatar Nov 24 '22 06:11 LuyaoZhong

/retest

LuyaoZhong avatar Nov 25 '22 05:11 LuyaoZhong

Retrying Azure Pipelines: Retried failed jobs in: envoy-presubmit

:cat:

Caused by: a https://github.com/envoyproxy/envoy/pull/22350#issuecomment-1327027038 was created by @LuyaoZhong.

see: more, trace.