flask-dance icon indicating copy to clipboard operation
flask-dance copied to clipboard

refresh_token and online vs offline access

Open daenney opened this issue 7 years ago • 23 comments

This is kind of a rehash of #137 and #111 and #102. I'm hoping to get some input here (and then maybe provide a documentaiton PR). This appears specific to the Google provider.

Accroding to Google's OAuth documentation you should request an access token of type offline if your applicaiton needs to do actions on behalf of the user while the user is offline. Say for example some kind of scheduled job that runs at some point in time, like backups.

Based on that documentation, if you're just writing an app where you're acting on behalf of the user while she is online, i.e sitting at their browser, it wouldn't occur to you to request an offline token. However, if the user opens up your app, does a thing, goes away for an hour (without closing the tab or browser) and then tries to use it again you'll hit the Missing required parameter: refresh_token because at this point the token has expired and you need to grab a new one.

So far what I've seen is that people just change the blueprint call to fetch an offline token instead, but that seems incorrect to me, since you're not really doing anything offline. This also doesn't really work for session storage b/c as far as I can tell the refresh token is never stored in the user session, it should be saved in a server side backend. The user is still there, it's just that they had other stuff to do and now the token has expired.

From my point of view the solution in #137 is correct, add an error handler to catch the InvalidClientIdError and redirect the user. This'll retrigger the OAuth flow and when users have already given consent won't actually do much of anything, essentially logging them back in with a fresh and valid token. No need to set offline=true in such cases.

The question here is, does this sound reasonable, and should we perhaps add documentation to the Google provider to reflect this?

daenney avatar Jul 07 '18 11:07 daenney

Coming back to this, is it intended that the refresh_token isn't stored in Flask's session object?

daenney avatar Jul 07 '18 17:07 daenney

is it intended that the refresh_token isn't stored in Flask's session object?

No, that's not intended. All of the information returned from the token response should be serialized to a dictionary, and then that dictionary should be stored in the token storage backend (which is Flask's session object by default).

So the next question is: does Google provide a refresh_token if the request does not specify offline=true? I'm not sure it does, but we'll need to verify this. Of course, if the request_token is not provided, then it won't be stored in the Flask session, because there's nothing to store.

Here's a breakdown of the abstraction levels to help you read and understand the code:

Should we perhaps add documentation to the Google provider to reflect this?

Yes! I'm always in favor of adding more useful documentation. Perhaps the process of writing that documentation will also help us figure out exactly where the problem is, and the best way to fix it.

singingwolfboy avatar Jul 09 '18 09:07 singingwolfboy

refresh_token does actually get stored in the session with offline=true. Without offline=true no refresh token is returned and hence not stored, so this all makes sense so far.

However, if you call session.clear() and then redirect to login, the next echange will no longer include the refresh token, so you end up with a session that's configured for offline but no refresh_token ends up in the session, only the id_token and access_token. So it does seem like you get the refresh token, but only once. That's surprising to me, I would expect to get that refresh token in this case.

<SecureCookieSession {u'google_oauth_token': {u'access_token': u'scrubbed', u'id_token': u'scrubbed', u'expires_in': 3600, u'expires_at': 1531138945.917256, u'token_type': u'Bearer', u'refresh_token': u'scrubbed'}}>
# Call session.clear(), this'll retrigger the flow
<SecureCookieSession {}>
127.0.0.1 - - [09/Jul/2018 13:25:28] "GET / HTTP/1.1" 302 -
127.0.0.1 - - [09/Jul/2018 13:25:28] "GET /login/google HTTP/1.1" 302 -
127.0.0.1 - - [09/Jul/2018 13:25:28] "GET /login/google/authorized?state=scrubbed HTTP/1.1" 302 -
<SecureCookieSession {u'google_oauth_token': {u'access_token': u'scrubbed', u'id_token': u'scrubbed', u'token_type': u'Bearer', u'expires_in': 3600, u'expires_at': 1531139128.778162}}>

The code for it is this:

from flask import Flask, redirect, url_for, session
from flask_dance.contrib.google import make_google_blueprint, google

app = Flask(__name__)
app.secret_key = "supersekrit"
blueprint = make_google_blueprint(
    client_id='scrubbed',
    client_secret='scrubbed',
    offline=True,
    scope=["profile", "email"],
)
app.register_blueprint(blueprint, url_prefix="/login")


@app.route("/")
def index():
    """Index route."""
    print session
    session.clear()
    print session
    if not google.authorized:
        return redirect(url_for("google.login"))
    print session
    resp = google.get("/oauth2/v2/userinfo")
    assert resp.ok, resp.text
    return "You are {email} on Google".format(email=resp.json()["email"])


if __name__ == "__main__":
    app.run()

(This'll get you in a loop since it continuously clears the session, but you'll no longer see the refresh_token show up)

daenney avatar Jul 09 '18 11:07 daenney

Alright. It sounds like we need a pull request to update the documentation for the Google provider. We should have a section that discusses on the offline parameter works, what the refresh_token is, and what the trade-off are about enabling offline mode. Some suggested code for how to automatically redirect the user in case of an InvalidClientIdError is probably a good idea, as well.

@daenney, would you like to make this pull request? Or is there more we need to investigate, first?

singingwolfboy avatar Jul 09 '18 11:07 singingwolfboy

I think I have the information I need to whip this up. W.r.t the refresh token only showing up on the first exchange, what should the docs say about that? Always recommend people use a proper persistant storage if offline=True, so at the least set Flask's session to persistent (though that's still probably not fool-proof, if someone where to clear their cache you'd still not get a refresh_token the next time).

daenney avatar Jul 09 '18 12:07 daenney

It sounds like you have more context around this specific situation than I do. Why don't you recommend whatever option makes the most sense to you, and I can read what you write and sanity-check it, to be sure that it makes sense to me?

singingwolfboy avatar Jul 10 '18 09:07 singingwolfboy

OK, I'm a little stumped. I can reproduce the Missing required parameter: refresh_token just fine in real life (just wait an hour for the token to expire). However, I'm failing to get it to trigger in tests.

I have a loggedin_app fixture with a session that stems from May the 4th 2018, by freezing time using freezegun. Then in a test I move time forward using freezegun.move_to by a day. Calling time.time() I correctly get 1525478400.0.

At this point I would expect my test to blow up because the call to google.post(....) would travel all the way down to oauthlib, which will run add_token which should realise that the token has expired, raising the TokenExpiredErrror(), which in turn should get us to https://github.com/requests/requests-oauthlib/blob/master/requests_oauthlib/oauth2_session.py#L332-L343 and the call to refresh_token() which, since we don't have a refresh token will be what ends up raising the CliendIdError.

Except. This doesn't happen. Under test it just works fine. Decorating it with a @responses.activate it also never complains about trying to connect to account.google.com to attempt to refresh the token, which it should.

What am I missing?

daenney avatar Jul 11 '18 16:07 daenney

Oh crap, found it. The pytest freezegun marker doesn't work for fixtures itself, needed to call freeze_time so my token was actually from the future, compared to the time frozen in the test itself. Now it works.

daenney avatar Jul 11 '18 17:07 daenney

There we go, raise #144 with what I think makes sense.

daenney avatar Jul 12 '18 10:07 daenney

According to Google's API Documentation for use on Server-Side implementations, you can request the offline_access=True scope if you wish to continually re-authenticate the user without the need of their consent, even if they're offline. So taking the user through the flask-dance is unnecessary, and would be more wise to have a function re-authenticate the user either periodically or upon request with an expired access token (this could be a configuration/blueprint setting). The API will only send one refresh token per Authorization URL, so you must store it in persistent storage (A secure DB) if you wish to make use of it. Refresh tokens never expire, however if more than one is requested with a single authorization URL, this should be treated as a malicious attack & the access/refresh tokens should be revoked.

Additionally, it specifies (both in the documentation as well as in the Cache-Control header, in the API's Access Token response) that the contents of that response are strictly no-cache, meaning that they should never be put in a cache or sent via URL. They must either be sent through HTTPS headers or immediately processed into persistent storage (like a backend Database). I'm less familiar with flask, but I'm imagining that its session storage acts as a sort of cache, since it persists requests if the user keeps the tab open?

tyrelkostyk avatar Jul 12 '18 23:07 tyrelkostyk

The problem isn't with when you request an offline token, that already works as advertised and it'll refresh for you because requests-oauthlib knows how to handle that.

The trick is when you don't request an offline token. If your application never acts on behalf of the user when they're not there, you should not request an offline token. This decreases the risk and responsibility your app takes on as you'll need to securely store and handle the refresh token. There are a lot fo cases where this just isn't needed. It's this that we're trying to figure out a nicer solution to, and at least provide documentation and samples for on how that oculd be handled.

daenney avatar Jul 13 '18 09:07 daenney

I see, thanks for the clarification.

One possible option would be to include documentation or sample code explaining how to integrate Google (or other providers) SSO with flask-login. I know that if you handle a login through the flask-login extension, it adds support to remember that user, even if he/she leaves the app & closes the browser.

So in the callback function you could create a user object, commit it to the db, then log the user in & remember them:

user = User.query.filter_by(email=email).first()
if not user:
    user = User(username=username, email=email, confirmed=True)
    db.session.add(user)
    db.session.commit()
login_user(user, True)

And I believe that this wouldn't need any offline access or Google re-authentication, not until the user clears their cookies or manually logs out. Feel free to correct me if I'm wrong, I'm only like 75-80% sure on this one haha. Just thought I'd add my input.

tyrelkostyk avatar Jul 13 '18 16:07 tyrelkostyk

That only works if you don't need their access token to then call Google APIs. If all you're doing is creating an identity for them in your application, then this is indeed enough.

However, say you have an app that then wants to call to the Google Plus API, or any Google Cloud Platform API, or any of the other ton of APIs Google makes available: https://developers.google.com/apis-explorer/#p/. At that point having a user object in your database isn't enough, you need their access token if you want your app to be able to interact with those APIs on the user's behalf.

daenney avatar Jul 13 '18 20:07 daenney

Hi folks, are you at a point yet where you can recommend a working example for Google OAuth when offline=False? Using the current master branch and the current Google example in the docs, i get a TokenExpiredError.

araichev avatar Aug 29 '18 00:08 araichev

I'm still working on getting that handled automatically as much as possible, but right now I would suggest you install an error handler for the TokenExpiredError, remove the token from the session and re-trigger the oauth flow. That'll get you a refreshed token and you can then do the request again.

Something along the lines of:

@app.errorhandler(oauthlib.oauth2.rfc6749.errors.TokenExpiredError)
def token_expired(_):
  ...

daenney avatar Aug 29 '18 07:08 daenney

Thanks @daenney . Um, could you flesh out that token_expired function for me? Flask Dance newbie here.

araichev avatar Aug 30 '18 00:08 araichev

Sure! I can show you what I did in a recent project: https://github.com/spotify/gimme/blob/master/gimme/views.py#L52-L94. There's the logout function that shows the token revocation, and there's the error handler. You'll need to catch a different error, but the principle is the same.

daenney avatar Aug 30 '18 10:08 daenney

Thanks @daenney , your code worked for my project.

araichev avatar Sep 02 '18 23:09 araichev

@daenney: any progress on this issue? Is there some way I can help?

singingwolfboy avatar Oct 16 '18 16:10 singingwolfboy

Sorry, not really. Life got a little busy with swapping jobs so this ended up on the back burner. If you want to have a go at the solution we discussed, feel free to do so. My schedule should start to clear up in a week or two.

daenney avatar Oct 20 '18 09:10 daenney

Hey folks, any progress on this issue?

araichev avatar Jan 30 '19 22:01 araichev

Hi I think I am facing the same issue. I have created a post in stackoverflow with the details. I think am able to catch TokenExpiredError but while I try to recover (based on https://github.com/spotify/gimme/blob/master/gimme/views.py#L52-L94) I get InvalidClientIdError. What is the part that I am missing? My flask dance version is 2.2.0

balderman1 avatar Aug 09 '19 06:08 balderman1

Unless you genuinely need to act on behalf of your use when they aren't logged in you should not be using offline=True. When they are logged in then a redirect to the url_for('google.login') call will re-authenticate the session without any action on the user's part.

Waiting for the ClientError causes unnecessary complications. What appears to work (at least for GET requests) is to detect the expired token in a flask.app.before_request function and delete it from the flask session state:

@app.before_request
def before_request():
    if (token := (s := flask.session).get('google_oauth_token')):
        print("Token expiring in", token['expires_at']-time.time())
        if time.time() >= token['expires_at']:
            del s['google_oauth_token']

This causes the application to go through the same dance it did when applying for the original access token, using the client credentials. This in turn requires google re-authentication, but as long as you remain logged in this happens without presenting any further request for credentials, leading to a seamless user experience.

In this case refresh tokens appear to be something of a red herring, as there is no need to use them.

My understanding of OAuth2 and the supporting libraries is still imperfect, so I would appreciate feedback from anyone who tries this remedy.

holdenweb avatar Apr 09 '23 16:04 holdenweb