otp Add option to configure encryption seed for TLS 1.3 stateless tickets

This enables TLS 1.3 stateless session tickets to work across multiple server instances by allowing the user to configure the same encryption seed in each instance, through a newly added stateless_tickets_seed ssl server option.

Closes #5962.

May 11 '22 13:05 ahovgaard

CT Test Results

      2 files     64 suites 48m 21s :stopwatch:   738 tests   666 :heavy_check_mark:   71 :zzz: 1 :x: 3 497 runs 2 719 :heavy_check_mark: 777 :zzz: 1 :x:

For more details on these failures, see this check.

Results for commit 489c3a3f.

:recycle: This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

May 11 '22 14:05 github-actions[bot]

I think that one of the main use cases for stateless session tickets is to work across multiple server instances. So I think this looks interesting. I added a testing label to start with, and we will get back to you later.

Jun 01 '22 11:06 IngelaAndin

current status update in general the code looks fine, but we're wondering/checking 2 more areas:

adding some warning note about consequences of re-using random values with multiple server instance on different machines (such as anti_replay will not work if bloom filter structure is not shared for example, any others? relevance of time synchronization between machines/zones?)
RFC 8446 contains such description which relates to startup/restart of server (considering single server instance):

When implementations are freshly started, they SHOULD reject 0-RTT as long as any portion of their recording window overlaps the startup time. Otherwise, they run the risk of accepting replays which were originally sent during that period.

Today our bloom filter is accepting tickets before 1st rotation of data structure. Even though, when generating random values (as today) for a listen socket, we will reject a ticket from previous incarnation of ssl server. However I wonder if with your option used we should not improve bloom filter implementation as well. So that when same randoms are used before and after restart, we have some "warm-up" period lasting for 1 Window (?) when we treat all tickets as used. Such a "warm-up" would assure ticket received was create by current ssl server incarnation.

I've added some tests in https://github.com/erlang/otp/pull/6055 ticket_reuse_anti_replay_server_restart test will probably start failing if you combine it with option proposed ... maybe you can confirm/check that ?

Jun 13 '22 10:06 u3s

@u3s Sorry for the late response, I have been away from my computer. I plan to get back to this later this week.

Jun 27 '22 12:06 ahovgaard

#6055 is merged

Jun 28 '22 15:06 u3s

current status update in general the code looks fine, but we're wondering/checking 2 more areas:

adding some warning note about consequences of re-using random values with multiple server instance on different machines (such as anti_replay will not work if bloom filter structure is not shared for example, any others? relevance of time synchronization between machines/zones?)

Are you thinking about adding a warning to the ssl documentation (ssl.xml) together with the proposed stateless_tickets_seed option? I think that would make sense. Those are the two main issues that I can think of, as well.

RFC 8446 contains such description which relates to startup/restart of server (considering single server instance):

When implementations are freshly started, they SHOULD reject 0-RTT as long as any portion of their recording window overlaps the startup time. Otherwise, they run the risk of accepting replays which were originally sent during that period.

Today our bloom filter is accepting tickets before 1st rotation of data structure. Even though, when generating random values (as today) for a listen socket, we will reject a ticket from previous incarnation of ssl server. However I wonder if with your option used we should not improve bloom filter implementation as well. So that when same randoms are used before and after restart, we have some "warm-up" period lasting for 1 Window (?) when we treat all tickets as used. Such a "warm-up" would assure ticket received was create by current ssl server incarnation.

Yes, I think it makes sense to add this "warm-up" period. Do you want me to do that in this PR or can it be deferred to another PR?

I've added some tests in #6055 ticket_reuse_anti_replay_server_restart test will probably start failing if you combine it with option proposed ... maybe you can confirm/check that ?

The tests pass as they are right now, but ticket_reuse_anti_replay_server_restart does indeed fail when extended to configure the same ticket seed before and after the server restart. With the warm-up period added, we should be able to make it pass using the proposed option as well.

Jun 30 '22 14:06 ahovgaard

adding some warning note about consequences of re-using random values with multiple server instance on different machines (such as anti_replay will not work if bloom filter structure is not shared for example, any others? relevance of time synchronization between machines/zones?)

Are you thinking about adding a warning to the ssl documentation (ssl.xml) together with the proposed stateless_tickets_seed option? I think that would make sense. Those are the two main issues that I can think of, as well.

Yes, please - to make option users aware/remind of risks associated with it. Maybe use <warning> tag.

Today our bloom filter is accepting tickets before 1st rotation of data structure. Even though, when generating random values (as today) for a listen socket, we will reject a ticket from previous incarnation of ssl server. However I wonder if with your option used we should not improve bloom filter implementation as well. So that when same randoms are used before and after restart, we have some "warm-up" period lasting for 1 Window (?) when we treat all tickets as used. Such a "warm-up" would assure ticket received was create by current ssl server incarnation.

Yes, I think it makes sense to add this "warm-up" period. Do you want me to do that in this PR or can it be deferred to another PR?

I've added some tests in #6055 ticket_reuse_anti_replay_server_restart test will probably start failing if you combine it with option proposed ... maybe you can confirm/check that ?

The tests pass as they are right now, but ticket_reuse_anti_replay_server_restart does indeed fail when extended to configure the same ticket seed before and after the server restart. With the warm-up period added, we should be able to make it pass using the proposed option as well.

I think it would be preferred if this PR would include bloom filter adjustment, because after adding encryption seed option bloom filter missing implementation will be potentially exposed. With random being generated upon socket creation (before this PR), missing warm-up is not a problem as tickets from previous instance of server will not be usable.

This would be nice to have (pretty easy to implement I guess):

adjustment of tls_bloom_filter module - so it rejects all ClientHellos before it rotates twice since startup (2*WindowSize must pass)
new test case based on ticket_reuse_anti_replay_server_restart using encryption seed option

What will be more tricky is what happens with legacy tests after adding warm-up to tls_bloom_filter. I haven't got an idea on preferred solution for that one yet. But maybe you could add parts above first - and then we could see how big the problem is and discuss how we could solve it. Preferably we would like to avoid too many, unnecessary sleep instructions in tests ... maybe warm-up should work only when encryption option is specified ... I'm not sure yet. I need to think about. @dgud, @IngelaAndin - maybe you have some ideas?

Jun 30 '22 15:06 u3s

What will be more tricky is what happens with legacy tests after adding warm-up to tls_bloom_filter. I haven't got an idea on preferred solution for that one yet. But maybe you could add parts above first - and then we could see how big the problem is and discuss how we could solve it. (..)

@u3s Please review the pushed changes. As expected, the added warm-up breaks a a few of the test cases in ssl_session_ticket_SUITE at least.

Jul 07 '22 10:07 ahovgaard

@u3s I'm not sure what went wrong with the PR pipeline. Some build tasks failed with write /dev/stdout: no space left on device, while the ssl CT tests failed the test case openssl_session_ticket_SUITE:openssl_server_basic/1. However, this test case shouldn't be affected by the changes in this PR since the test uses the OpenSSL s_server and not the OTP ssl server. Besides, with the recent change, existing functionality shouldn't change unless the newly added option is specified.

Aug 05 '22 08:08 ahovgaard

@u3s Would you mind taking another look at this PR?

Sep 05 '22 08:09 ahovgaard

sorry for holiday period related delay. we will have a look on this soon.

Sep 08 '22 08:09 u3s

thanks for contribution!

Sep 21 '22 10:09 u3s

otp otp copied to clipboard

Add option to configure encryption seed for TLS 1.3 stateless tickets

CT Test Results

Artifacts

otp
otp copied to clipboard