
TLS session tickets get out of sync between servers

Open duke8253 opened this issue 2 years ago • 1 comments

While developing a new plugin for sharing the STEK (Session Ticket Encryption Key) across a colo, I discovered that some of the boxes in the colo I was testing could not be synced up. Upon further investigation, the problem exists even when a key file is set manually (https://docs.trafficserver.apache.org/admin-guide/files/records.config.en.html#proxy-config-ssl-server-ticket-key-filename) and uploaded to all the servers. This issue is also present at Apple, and @masaori335 has been testing on their side. However, the issue disappeared over the past weekend for no apparent reason.
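One cheap sanity check for the manual key file case is to confirm that every server really holds byte-identical key material. The following is a minimal sketch, not part of ATS: it assumes the key file contents have already been collected from each box by some other means, and the helper names `fingerprint` and `group_by_key` are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(key_bytes):
    """Return a SHA-256 hex digest so keys can be compared without printing them."""
    return hashlib.sha256(key_bytes).hexdigest()


def group_by_key(key_files):
    """Group server names by the fingerprint of their ticket key file.

    key_files maps a server name to the raw bytes of its key file
    (how those bytes are fetched is outside this sketch).
    """
    groups = defaultdict(list)
    for server, data in key_files.items():
        groups[fingerprint(data)].append(server)
    return dict(groups)
```

If this yields a single group, the files themselves match, which is consistent with the report that the split happens even though the key is distributed correctly.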

Below is the description of the problem, please add comments if I missed anything @bneradt @bryancall @masaori335:

  • When the STEK is shared among servers, the expected behavior is that these servers can reuse TLS sessions from each other, e.g. server X can resume sessions created on server Y, where X and Y are any of the servers in that colo.
  • The problem we're seeing now is that, even though the STEK is shared correctly among all servers in the same colo, sometimes a small group of servers cannot resume sessions created on the others.
  • The servers that cannot resume sessions from other servers can always resume sessions within their own group. This essentially splits the colo into two groups of servers that can resume sessions within their own group, but not across groups.
  • Of the two groups formed, the larger group usually contains 75% or more of the servers in that colo.
  • Whenever this problem is present, there are always exactly two groups of servers.
  • Within a given colo, the same servers always end up in the same group, although the affected servers differ from colo to colo, and the grouping persists after an ATS restart. E.g. on colo 1, it's always servers A, C, F, M that form a group, while on colo 2, it's always servers B, F, S that form a group.
  • All related OpenSSL API calls for session encryption/decryption (HMAC_Init_ex, EVP_EncryptInit_ex, EVP_DecryptInit_ex, RAND_bytes) return success with the same values across all servers.
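The group split described in the list above can be mapped out with a pairwise probe. The following is a rough sketch rather than the tooling used in this report: `can_resume` and `resumption_groups` are hypothetical helper names, the probe assumes each server is directly reachable by IP and serves the given hostname, and it pins TLS 1.2 so the ticket arrives during the handshake.

```python
import socket
import ssl
from itertools import combinations


def can_resume(origin_ip, target_ip, hostname, port=443, timeout=5.0):
    """Create a TLS 1.2 session on origin_ip, then offer it to target_ip."""
    ctx = ssl.create_default_context()
    # With TLS 1.2 the session ticket is delivered during the handshake.
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    with socket.create_connection((origin_ip, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=hostname) as tls:
            session = tls.session
    with socket.create_connection((target_ip, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=hostname, session=session) as tls:
            return bool(tls.session_reused)


def resumption_groups(servers, resumes):
    """Partition servers into groups whose members resume each other's sessions.

    resumes(x, y) must return True when y accepts a session created on x.
    Groups are returned largest first.
    """
    parent = {s: s for s in servers}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for x, y in combinations(servers, 2):
        if resumes(x, y) and resumes(y, x):
            parent[find(x)] = find(y)

    groups = {}
    for s in servers:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values(), key=len, reverse=True)
```

Passing something like `lambda x, y: can_resume(x, y, "example.com")` as `resumes` (the hostname is a placeholder) would return one group on a healthy colo and two groups when the problem is present.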

duke8253 avatar Mar 16 '22 16:03 duke8253

I checked this again, and we're still facing this issue. We have no clue yet about the cause.

We tried "touch & reload" with ssl_multicert.config and records.config, but it didn't work.

masaori335 avatar May 24 '22 00:05 masaori335

This issue has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.

github-actions[bot] avatar May 24 '23 01:05 github-actions[bot]