spire icon indicating copy to clipboard operation
spire copied to clipboard

SPIRE Agent now respects entry's TTL when issuing JWT-SVIDs

Open amartinezfayo opened this issue 3 years ago • 11 comments

Agents used to issue JWT-SVIDs with a TTL hardcoded to 5 minutes. When we refactored the Server APIs, the agents were changed to consume the new Server APIs. The NewJWTSVID RPC from the SVID API is now called by agents, specifying the entry ID. The server then mints the JWT with the TTL specified in the entry. This "new" behavior can be observed since SPIRE v0.11.0.

An example of how agents used to issue JWT-SVIDs can be seen here: https://github.com/spiffe/spire/blob/v0.10.2/pkg/agent/manager/manager.go#L168

Current implementation: https://github.com/spiffe/spire/blob/v1.1.3/pkg/agent/manager/manager.go#L197

We had some discussion in the past about how JWT-SVIDs TTL should be handled (#1438).

Having a hardcoded lifetime for JWT-SVIDs is decidedly not the greatest thing, but was the "less bad" option that we found considering the inherent replay-ability of JWT-SVIDs and the restrictions that we have in the current database schema. Using an aggressive value for the exp claim is an appropriate measure to try to mitigate possible replay attacks.

The change in the behavior to respect entry's TTL was not really intended. As such, we need to find what's the best course of action at this point, considering that (1) this change has been there for quite some time, (2) we had requests in the past to be more flexible than just having a hardcoded TTL, (3) we want to discourage the issuance of JWT-SVIDs with large TTLs, and (4) changes in the behavior need to be thought carefully to consider backwards compatibility.

amartinezfayo avatar Jan 27 '22 22:01 amartinezfayo

Since an updated token requires a round trip to the server, being able to configure this setting as appropriate seems to be a good way to alleviate extra load off of the opa servers.

kfox1111 avatar May 09 '22 17:05 kfox1111

We have a need to set longer JWT-SVID expiration times in some of our Trust Domains. While it's nice that we can currently set the TTL in the entry registration record, because it appears to be more of an accidental ability, we don't feel comfortable relying on it. Also, in cases where we're using the k8s-workload-registrar, that value is not set because the registrar doesn't support it.

SPIRE Server defaulting JWT-SVIDs to a 5 minute lifetime makes lots of sense, but it would be very helpful to us if there was an optional configuration value specifically for JWT-SVIDs.

default_svid_ttl      <--- existing config applicable to X509-SVIDs
default_jwtsvid_ttl   <--- proposed new config applicable to JWT-SVIDs, default to 5m
max_jwtsvid_ttl       <--- proposed new config applicable to JWT-SVIDs, default to value of default_jwtsvid_ttl

default_jwtsvid_ttl allows folks to set a default TTL for JWT-SVIDs.

max_jwtsvid_ttl allows folks to set an absolute max TTL for JWT-SVIDs.

When generating a new JWT-SVID, if there is a set ttl on the registration then the JWT-SVID lifetime will be min(registration_ttl, max_jwtsvid_ttl). If there is no set ttl on the registration then the JWT-SVID lifetime will be default_jwtsvid_ttl.

This approach allows the project to set reasonable defaults while also allowing folks to override those defaults if their specific use-case necessitates it. Documentation for these configuration values can clearly state that extended JWT lifetimes increase the risk of replay attacks but I think it's appropriate to let users override the defaults when needed.

If the community decides this is a reasonable path, I'd be happy to implement it.

dennisgove avatar Jun 07 '22 17:06 dennisgove

If we want to stay completely backward compatible with existing behavior, then the default value for max_jwtsvid_ttl can be set to the max(largest TTL in the registration table, default_jwtsvid_ttl).

Such a setup would allow users with already longer JWT-SVID lifetimes to upgrade to newer versions of SPIRE without having to explicitly set a value for max_jwtsvid_ttl. I don't think this is strictly necessary, but an approach to consider.

dennisgove avatar Jun 07 '22 17:06 dennisgove

Hey @dennisgove ! That is an interesting proposal and something similar has come up before. We're still digesting this all and trying to decide on a direction we're all comfortable with. We'll try and have something actionable soon.

azdagron avatar Jun 09 '22 19:06 azdagron

The maintainers discussed this issue and decided we would be in favor of a solution that allows users to configure TTLs of X.509-SVIDs and JWT-SVIDs at the entry level with new, separate properties on the Entry type. This would provide the most flexibility for defining JWT-SVID TTLs separately from X.509-SVID TTLs while providing a consistent user experience for how this configuration is provided across SVID types. Concretely, this would look something like:

  • New default X.509-SVID and JWT-SVID TTL configuration properties in Server
  • New TTL properties in the Entry type (DB, API definitions) for X.509-SVID and JWT-SVID that clearly distinguish their purpose, which is not clear from the existing ttl field in the Entry
  • Backwards-compatible behavior that respects the existing ttl field in an entry and existing default_svid_ttl Server configuration property as having precedence over the new entry + default TTL properties

Carrying out this solution would be a bit more involved than your original proposal, @dennisgove , since it will introduce a database migration and API changes, which need to be staggered across minor releases to guarantee N Agent -> N+1 Server upgrade compatibility. Given the increased scope, I wanted to check in to see if you'd be willing/able to contribute this?

rturner3 avatar Jun 14 '22 23:06 rturner3

Without a doubt. I'm happy to get started on this. Thanks!

dennisgove avatar Jun 14 '22 23:06 dennisgove

Hi @dennisgove, so assuming we can start this right away, a rough sequence and timeline for the changes would be:

  • Add database changes to RegisteredEntry model (v1.3.x)
  • Add new TTL fields to Entry type in API (v1.5.0)
  • API implementation persisting new fields into database (v1.5.0)
  • Add Server configuration properties for default X.509-SVID and JWT-SVID TTLs (v1.5.0)

Because we support manual database migrations, whenever we make database changes, we have to wait two minor releases before we can start using those new fields to give users one minor release to perform the migration. Our minor release cadence is every 3 months, so unfortunately that means the implementation changes can only be merged 4-6 months after the database changes are released, depending on the point release in which the database changes are introduced. If we can get the database PR merged in the next 3 weeks, that will make it into v1.3.2, and the subsequent changes would come as part of v1.5.0.

Just wanted to double check that you are ok with pushing the changes over that timeline since it goes a bit out into the future!

rturner3 avatar Jun 17 '22 00:06 rturner3

Hi @rturner3. I started looking at the required changes earlier today and came up with basically the same schedule, though was thinking v1.4 as I wasn't aware of the 2 minor requirement for db changes.

Schedule looks great, and I'll have a PR with the db changes in a few days.

Just so I'm clear, after the db changes in 1.3.2, all of the other code changes can go into 1.5.0? Or will anything need to spill into 1.6? Maybe removal of the old 'ttl' config option?

dennisgove avatar Jun 17 '22 00:06 dennisgove

Hey @dennisgove , that is a great question. Yes, I think we will also want to remove the old default_svid_ttl Server configuration parameter in v1.6.0. And yes, assuming the DB changes make it into v1.3.2., all the other code changes can go into v1.5.0.

rturner3 avatar Jun 17 '22 18:06 rturner3

I'm making steady progress on the implementation of this support, but there are a lot of places that need updates (I'm finding). If curious you can view ongoing changes here.

dennisgove avatar Sep 08 '22 02:09 dennisgove

I've submitted two PRs for this change.

  • https://github.com/spiffe/spire/pull/3445
  • https://github.com/spiffe/spire-api-sdk/pull/29

dennisgove avatar Sep 21 '22 20:09 dennisgove