sqlx icon indicating copy to clipboard operation
sqlx copied to clipboard

Support rotating passwords

Open Cocalus opened this issue 4 years ago • 6 comments

I suspect this is not a common use case. But I was looking at using https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/UsingWithRDS.IAMDBAuth.html

To authentic to an AWS Aurora Postgres server, instead of playing hide the keys in the cloud. But the generated tokens are only valid for 15min, which I'm pretty sure can't work with a static connection string. The best I can think of is to extend the Pool API to have a closure that generates the connection string, maybe for every connection or just if the connection attempt fails with a bad password.

Some code for generating such a token can be found in the comments for https://github.com/rusoto/rusoto/pull/1733

But there's some async calls in there (wrapped in block_on), so maybe a connection string generating closure should return a future instead of the string.

Cocalus avatar Jun 25 '20 00:06 Cocalus

.build() on the pool could have a variant that takes a closure. That sounds very nice. Perhaps:

let pool = PgPool::new_from(|| async move { ... }).await?;
let pool = PgPool::builder().build_from(|| async move { ... }).await?;

mehcode avatar Jun 25 '20 03:06 mehcode

An other API choice, would be such a closure and a Duration. The password(connection string) would get refreshed when a password is needed and the cached password is older than that duration. So I could tell it to just refresh if it's older than 14 minutes.

A hybrid would be the above plus try a password refresh if a connection attempt returns a bad password error. Since there might be some weird edge cases and a refresh on password error seems pretty robust. Also the timed refresh would avoid a lot of expected errors and latency caused by handling those errors.

There might be some interactions with such a timer and various timeouts. If the password is 13min old but the new connection timeout is set to 3 min, then the password could be old when the connection completes, but wasn't old when the connection was attempted.

Cocalus avatar Jun 25 '20 05:06 Cocalus

I think just a closure is enough to implement caches and timeouts from the application. I'm not sure how much we want of that in core.

Would you agree?

We can add the _from variants of the various constructors and provide support for a dynamic set of connection options where another crate can then use that to do a rotating password cache.

mehcode avatar Jul 19 '20 04:07 mehcode

I believe I can do time limited authentication caching with a closure. It'll probably need interior mutability since I expect the closure to be just a Fn instead of FnMut. Easiest for the user would be if the authentication closure is only called on authentication errors, but that has the costs of making extra connections to the DB.

There's a question of what should happen if I can't get authentication, like a onetime error or if the service that provides rotating keys is temporarily down. Should it return an error and let SQLx handle retries or should it do retries on it's own. Also what happens if the closure is slow and takes longer than the connect_timeout, (I would expect SQLx to enforce the timeout by canceling the future). On a side note The Pool/Pool::Builder docs should probably include the default settings.

Cocalus avatar Jul 20 '20 21:07 Cocalus

As an example, in the pgx connection pool golang library, they allow the user to define hooks that can modify the credentials used. BeforeConnect can modify the configuration for new connections to the pool, and BeforeAquire can invalidate a connection from the pool if the credentials changed.

https://pkg.go.dev/github.com/jackc/pgx/v4/pgxpool#Config

Yamakaky avatar Jan 03 '22 15:01 Yamakaky

Sometimes tokens themselves say when they are expected to expire. Ideally tokens should be refreshed in the background before that point (so that the user gets no latency hit from the token refresh). So this on its own points to something different to a fixed refresh period.

Also, any background refresh can't be assumed to have actually done any work (due to some environments being suspended when there is no active request). So one may still need a read-through cache for obtaining the password (probably async in case you need to fetch a new token).

I think this all points at leaving as much policy out of sqlx as possible, so a closure sounds good to me.

fiadliel avatar Jun 06 '22 20:06 fiadliel