iceberg-rust icon indicating copy to clipboard operation
iceberg-rust copied to clipboard

discussion: token refresh mechanism for rest client

Open TennyZhuang opened this issue 1 year ago • 9 comments

Background: #301

The token fetched from the token server may have a TTL, see TokenResponse::expires_in. In most implementations, the value is about several hours. Our catalog client is a long-lived object, which means that we should handle the token expiration event.

There are two ways:

  1. Spawn a background task, and setup a ticker with a specified time interval, e.g. expires_in * 0.9 seconds, and refetch the token when triggered.
  • Pros: Easy to implement
  • Cons: Must trust the local timer skew
  • Cons: Must introduce a timer, which means depending on a specified async runtime
  1. Call every methods with a retry wrapper. When meeting an unauthorized error, refetch the token and retry the method.
  • Pros: Consistent with iceberg-python
  • Pros: Does not rely on local clock and specific runtime
  • Cons: When expired, thousands of concurrent requests may fail, and then all of them will trigger a token refetch, which is not ideal.
    • This can be workaround by some concurrency control, to force only one request will refetch the token and others must wait for the result, but it introduced complexity.

TennyZhuang avatar Jul 05 '24 09:07 TennyZhuang

2. Call every methods with a retry wrapper. When meeting an unauthorized error, refetch the token and retry the method.

Our rest catalog client now has a authenticate function.

https://github.com/apache/iceberg-rust/blob/0937f19d4c4b4ea14d423e6c78731904b4c9012e/crates/catalog/rest/src/client.rs#L88

We can perform the expiration logic here.

Xuanwo avatar Jul 05 '24 09:07 Xuanwo

I also prefer 2. The iceberg community may deprecate current auth approach in future, and introduce true oauth client in future. I prefer to have a simpler solution without introducing too much extra dependency.

liurenjie1024 avatar Jul 05 '24 10:07 liurenjie1024

  1. Store (expired_at, token) pair in an async Mutex, and check whether it's close to expiration when needed. If it's close to expiration, lock the mutex, update the pair, and unlock. Then only one request will update the token.

TennyZhuang avatar Jul 05 '24 10:07 TennyZhuang

3. Store (expired_at, token) pair in an async Mutex, and check whether it's close to expiration when needed. If it's close to expiration, lock the mutex, update the pair, and unlock. Then only one request will update the token.

Great!

Xuanwo avatar Jul 05 '24 10:07 Xuanwo

  1. Store (expired_at, token) pair in an async Mutex, and check whether it's close to expiration when needed. If it's close to expiration, lock the mutex, update the pair, and unlock. Then only one request will update the token.

Sounds reasonable to me.

liurenjie1024 avatar Jul 05 '24 12:07 liurenjie1024

Then we need https://docs.rs/tokio/latest/tokio/sync/struct.RwLock.html, still depends on tokio :(

This is rust

TennyZhuang avatar Jul 05 '24 13:07 TennyZhuang

Then we need https://docs.rs/tokio/latest/tokio/sync/struct.RwLock.html, still depends on tokio :(

This is rust

How about futures Mutex?

liurenjie1024 avatar Jul 05 '24 13:07 liurenjie1024

The current implementation utilizes std::sync::Mutex with a precisely controlled locking scope.

https://github.com/apache/iceberg-rust/blob/85627083f0791629be3d73413af936165ebd38eb/crates/catalog/rest/src/client.rs#L88-L179

The worst thing that could happen here is that we might send multiple token refresh requests, which I think is fine for a catalog.

Xuanwo avatar Jul 05 '24 13:07 Xuanwo

It's acceptable, but not as expected. When a client request a token, the server may expect it will be used for several hours.

TennyZhuang avatar Jul 05 '24 13:07 TennyZhuang

as an alternative, can we simply expose a public refresh_token method that allows the caller to implement their own token invalidation (for my use case, via option 1, using tokio)

this avoids the downside of "thousands of concurrent requests may fail", and does not require the core library to take any additional dependencies

this change perhaps does not remove the need to implement option 2 in this library, but it does provide an option for downstream consumers to implement token expiry as they wish (where there currently is no such option)

cmcarthur avatar Jun 23 '25 18:06 cmcarthur

@liurenjie1024 I've implemented this here: https://github.com/apache/iceberg-rust/pull/1465

cmcarthur avatar Jun 24 '25 11:06 cmcarthur

@liurenjie1024 I've implemented this here: #1465

Thanks, we could close this for now.

liurenjie1024 avatar Jun 25 '25 09:06 liurenjie1024