securedrop icon indicating copy to clipboard operation
securedrop copied to clipboard

Unified server side sessions for JI and API

Open lsd-cat opened this issue 2 years ago • 14 comments

Status

  • [X] Design phase ended
  • [X] Coding pahse ended
  • [x] Test plan

Summary of behavioral changes

Reminder: the changes apply only to the Journalist Interface and Journalist API, NOT to the Source Interface

Change Before After Comment
GUI Session expiration 120 minutes - Renewable unlimited times upon any request 120 minutes - Renewable up to 5 times upon any request in the last half hour of validity
API Session expiration 8 hours - Nonrenewable 120 minutes - Renewable up to 5 times upon any request in the last half hour of validity Uses same backend and mechanism as the JI.It is renewed automatically, so the client does not need to issue any 'renew' or 'refresh' request. In the last half an hour of its validity, if any request is made, token life is extended server-side.
Session info storage JI Flask signed session cookie, the payload is fully client-side, no server state Signed cookie with a session identifier. The identifier matches a record in Resid with the actual session payload Sessions are now stored server-side with all the consequences of this.
Session info storage API The signed JWT token contains the signed payload, namely the authenticated user ID and the expiration. The token is now the signed session identifier with the same backend as the JI. API/JI session tokens are not exchangeable as they use different prefixes in Redis and different signing salts.
API token request The current API request to issue a token is not subject to any change. N/A The expiration is the initial 2 hours expiration. The behavior can be easily changed, but it seems not to be used by securedrop-client anyway.
Cookie signing Cookies are signed to ensure their integrity. Cookies are signed to prevent usage of non-issued server tokens (weird Redis payloads, leaked ids from Redis, etc). In general, this is not enforced by most server-side implementations. As it has little overhead, looks like a reasonable hardening.

Additional changes to the g object

See https://github.com/freedomofpress/securedrop/issues/6428

Description of Changes

This commit attempts to completely redesign the session mechanism for the Journalist Interface and the Journalist API in order to unify the two backends and behaviors, drop some dependencies, and have more control overall.

Currently, the JI relies on built-in Flask sessions for user authentication. They rely on using a session cookie with a signed payload using the app SECRET_KEY. These session cookies are renewable and non-revokable unless changing the SECRET_KEY. It is impossible to know how many sessions are there and destroy them server-side. Current logout functionality just destroys the cookie in the browser. The API uses issues a non-renewable JWT token upon login. Upon logout, the token is placed in a database table for revocation until its expiration. For each authenticated API endpoint, a decorator is used to enforce the check of such token. Upon internal discussions, we have concluded that we are not benefiting from any of the benefits of JWT as a delegation mechanism in the project but we are keeping the burden of maintaining two session backends. Furthermore, itsdangerous has deprecated its JWT signing code.

The changes here move all the sessions from the client-side to the server-side. Sessions are now stored in Redis, and the auth cookie only contains a random string. The session identifier is still signed server-side and verified at each request: this is to prevent that a read primitive in Redis could allow an attacker to steal sessions; with this hardening, the attacker would also need the knowledge of the SECRET_KEY (or of course steal a valid cookie from a logged-in user). No major changes for the JI, while for the API the decorator is dropped, as well as the whole JWT mechanism and now behaves the same as the JI. Authentication on protected endpoints is enforced via the same approach of the JI, which is overall less error-prone than the decorator. Both logouts now destroy sessions server-side. Before, on passphrase change, a nonce was incremented to invalidate past sessions for the affected user. That mechanism is dropped and sessions are simply deleted from the session store in that event.

Session lifetime is preset to 2 hours. Each session can be renewed a maximum of 5 times by simply using it 30 minutes before its expiration. Both values are configurable and can be tuned depending on the needs that arise from discussion or during testing.

No new dependencies are needed. The database migration drops the nonce column for journalists and the revoked_tokens table for JWT revocation as well as the corresponding models.

Fixes

Fixes #6428. Fixes #6224. Fixes #204 (only for the JI, NOT for the SI).

Testing

For manual testing, a lot of manual logins/perform tasks/logout as well as for the API. We should test compatibility with securedrop-client and check if changes there are needed. The token API endpoint has not changed, as well as the authenticator header so we are not expecting big breakage.

Deployment

The new sessions library has some configuration options that could be added to config.py. However, since it has been written ad hoc for this project, the default settings are already the proper ones.

Source

https://github.com/fengsp/flask-session has been used as a starting point, albeit very little of it remains.

Test Plan (draft)

Here is a quick deaft for the test plan. Will be updated this week as I go through the changes one last time and do some more manual testing.

Test Plan

Migration

  • Database migration work (deletes the revoked token table)
  • Token cleanup procedures do not run/exists ~~* Reboot of the server/restart of redis logs destroy all sessions~~ Persistant by Redis default config
  • Test new config with non standard values
  • Config migration works

Security

  • Session ids are random
  • Sessions id are signed
Invalid session id (non existant/wrong signature) gets ignored
  1. Login to the JI
  2. Tamper the cookie signature
  • [x] Confirm that user is logged out

Functional

JI

Journalists can Login
  1. Login to the JI
  • [x] Confirm that the user is logged
Sessions of multiple users are separate
  1. Login to the JI as user X in browser A
  2. Login to the JI as user Y in browser B
  • [x] Confirm that the two userss navigation does not interfere
Sessions are renewed in accordance with the new expiration/renewal policy
  1. Login to the JI
  2. Wait 2 hours
  • [x] Confirm that user is logged out automatically
  • [x] Confirm that an error message is shown
  1. Login to the JI
  2. Wait between 1.5h and 2h and refresh
  3. Wait 1h
  • [x] Confirm that user is still logged in
Logout works
  1. Login to the JI as user X in browser A
  2. Login to the JI as user Y in browser B
  3. Logout from user X in browser A
  • [x] Confirm that user A is logged out
  • [x] Confirm that user B is still logged in
Deleting a user destroy all sessions of the user
  1. Login to the JI as admin X in browser A
  2. Login to the JI as admin/user Y in browser B
  3. From admin X in browser A, delete user Y
  • [x] Confirm that user Y in browser B is logged out
Changing a user password destroy all sessions of the user
  1. Login to the JI as admin X in browser A
  2. Login to the JI as admin/user Y in browser B
  3. From admin X in browser A, change password of user Y
  • [x] Confirm that user Y in browser B is logged out
  1. Login to the JI as admin X
  2. Change admin X password
  • [x] Confirm that admin X is logged out on password change

API

Token are issues on succesful API login
  1. Login to the /token endpoint
  • [x] Confirm that the login is succesful
  1. Use the token for an authenticated operation
  • [x] Verify that the authenticated operation accepts the token
  1. Send a request to the logout endpoint
  • [x] Confirm that a message fo succesful logout is returned
  1. Retry the operation at point 2.
  • [x] Cionfirm that the operation at point 2 now requires a valid token

Integration

This changes should not affect the client, as the token format is exactly the same and token renewal will happen in the background transparently, however testing remains ncessary

SD Client

Login and sesison lifetime
  1. Login through the client
  • [x] Confirm that the client synchronizes succesfully
Session lifetime

We could need a bit of tuning here, what is going to happen is as follow:

  1. The server will issue a token valid for two hours

  2. As the client constantly tries to sync, as soon as the token hit ~1.30h of lifetime the token will be renewed for 2 additional hours.

  3. At the ~3.30h mark, the token will renew again, and again at ~5.30, ~7.30, ~9.30. After that 5 renewal, the token lifetime ends and the session is destroyed. (max duration time would be 12h) It is possible, that during testing phase we want to adjust this settings (5 renewals, 2h each, triggered at 30 minutes to session end)

  4. Login through the client

  • [x] Confirm that the token gets renewed and the session last the expected time.

lsd-cat avatar Apr 18 '22 08:04 lsd-cat

re, mypy, the biggest cause of warnings is save_session types session as a SessionMixin rather than RedisSession. Switching that gets rid of most of the errors. There are also (at least) 2 cases where a field is typed as Optional[T] so we need to check it's not null first. Here's my diff:

diff --git a/securedrop/journalist_app/sessions.py b/securedrop/journalist_app/sessions.py
index 3811ec687..a202cbc61 100644
--- a/securedrop/journalist_app/sessions.py
+++ b/securedrop/journalist_app/sessions.py
@@ -18,7 +18,7 @@ class ServerSideSession(CallbackDict, SessionMixin):
             self.modified = True
         CallbackDict.__init__(self, initial, on_update)
         self.sid = sid
-        self.token = None
+        self.token: Optional[str] = None
         self.is_api = False
         self.to_destroy = False
         self.to_regenerate = False
@@ -45,6 +45,8 @@ class SessionInterface(FlaskSessionInterface):
         return token_urlsafe(32)
 
     def _get_signer(self, app: 'Flask') -> 'URLSafeTimedSerializer':
+        if not app.secret_key:
+            raise RuntimeError('No secret key set')
         return URLSafeTimedSerializer(app.secret_key,
                                       salt=self.salt)
 
@@ -102,6 +104,8 @@ class RedisSessionInterface(SessionInterface):
             sid = self._get_signer(app).loads(sid)
         except BadSignature:
             return new_session(is_api)
+        if not sid:
+            return new_session(is_api)
 
         val = self.redis.get(self.key_prefix + sid)
         if val is not None:
@@ -112,7 +116,7 @@ class RedisSessionInterface(SessionInterface):
                 return new_session(is_api)
         return new_session(is_api)
 
-    def save_session(self, app: 'Flask', session: 'SessionMixin', response: 'Response') -> 'None':
+    def save_session(self, app: 'Flask', session: 'RedisSession', response: 'Response') -> 'None':
         '''This is called at the end of each request, just
         before sending the response.
         '''

That leaves me with:

securedrop/journalist_app/sessions.py:51: error: "SessionInterface" has no attribute "salt"
securedrop/journalist_app/sessions.py:81: error: Return type "Optional[SessionMixin]" of "open_session" incompatible with return type "None" in supertype "SessionInterface"
securedrop/journalist_app/sessions.py:154: error: Argument 2 to "set_cookie" of "BaseResponse" has incompatible type "Optional[str]"; expected "str"

The first issue goes away if the interfaces are merged, like I suggested inline. The second is because open_session() is defined in the parent as returning nothing, so we should get rid of our returns too.

The final one has to do with response.set_cookie(app.session_cookie_name, session.token,, where token is defined as Optional[str]. Should token be typed as str instead?

legoktm avatar Apr 18 '22 18:04 legoktm

Implemented most required changes, tests still in progress, thanks for the help!

The second is because open_session() is defined in the parent as returning nothing, so we should get rid of our returns too.

I am not sure how to resolve this one. Flask docs specifically says that open_session() should return a session object or None if impossible. https://tedboy.github.io/flask/interface_api.session_interface.html

Also here in the parent https://github.com/pallets/flask/blob/a03719b01076a5bfdc2c8f4024eda7b874614bc1/src/flask/sessions.py#L299 it indeed doe not return but it has the Optional return type and it is there just as a placeholder to extend. Do we have any other option?

lsd-cat avatar Apr 22 '22 15:04 lsd-cat

I am not sure how to resolve this one. Flask docs specifically says that open_session() should return a session object or None if impossible. https://tedboy.github.io/flask/interface_api.session_interface.html

The correct docs link is https://flask.palletsprojects.com/en/2.0.x/api/#flask.sessions.SessionInterface.open_session btw, but the return type seems to still be Optional[SessionMixin]. I'll try to poke at it, otherwise we can use # type: ignore to suppress mypy.

legoktm avatar Apr 25 '22 17:04 legoktm

This pull request introduces 2 alerts when merging d11545f90658f03fc8030c6a45839566b077f0d3 into f384d7cdeaab66b99534254277287d2609aee4b2 - view on LGTM.com

new alerts:

  • 2 for Unused import

lgtm-com[bot] avatar May 11 '22 22:05 lgtm-com[bot]

This pull request introduces 2 alerts when merging ec608131afb63bc3faf398a8f14624430ac6bee6 into 1c133e5e555908deb777917532703d695ffb33c4 - view on LGTM.com

new alerts:

  • 2 for Unused import

lgtm-com[bot] avatar May 23 '22 20:05 lgtm-com[bot]

I'm not seeing any issues, so +1 for me.

ghost avatar Aug 16 '22 11:08 ghost

Did a quick skim through, I think we're mostly ready to move this forward?

  • There are still a few unaddressed comments from @nabla-c0d3 that we should get to.
  • And then we need to squash this down somehow.
  • I explicitly listed some bugs that will be fixed by this as well.

legoktm avatar Aug 18 '22 19:08 legoktm

Does this actually close #204? That issue seems to relate only to the source interface. (Altho as an aside, I do like @garrettr's option no.2 here as a way of removing the codename from SI cookies altogether without storing anything server-side. Definitely more feasible than a PAKE-type solution.)

zenmonkeykstop avatar Aug 24 '22 20:08 zenmonkeykstop

As for https://github.com/freedomofpress/securedrop/issues/204 that issue is strictly related to the source interface which is unaffected by the changes here. Although technically the same problem existed in the journalist interface, the issue was very minimal as the session contained mostly just the journalist id and I see no security benefit in hiding it in general, even though now it will be stored server side anyway.

For context, this change affects only the JI for a multitude of reasons: first it unifies the auth/session scheme for both the frontend and the API. Second, we do not want to have the code names stored server side on redis on purpose, even if ephemeral.

I also did another review of the logout mechanism following @nabla-c0d3 comments. There are a few things that needs to be changed: for example as of now if a user change its own password, only his session will be destroyed and not the ones of others logged with the same credentials. That mass deletion will happen only if the admin force change a user password through the admin/edit endpoint. Furthermore, since when an admin tries to change its password from the admin panel, it gets redirected anyway to the account password change endpoint, currently the logout_user() function is not invoked at all in that case. I'll try to address this in the next days and add a relevant test for this since apparently we have a wanted behavior but do not test for it.

lsd-cat avatar Aug 26 '22 14:08 lsd-cat

Test plan moved to the top comment.

lsd-cat avatar Sep 12 '22 21:09 lsd-cat

The failing tests seem unrelated (getting errors retrieving the release pubkey from openpgp.org), rerunning.

zenmonkeykstop avatar Sep 19 '22 14:09 zenmonkeykstop

We should remove the "Rebase:" prefix from the commit messages, otherwise I have no other comments!

legoktm avatar Sep 21 '22 18:09 legoktm

Performed upgrade testing as follows:

  • set up 2.4.2 prod vms
  • built packages off this branch and installed them on prod using the upgrade test procedure
  • [x] upgrade was successful and db migration worked.

Tested session expiry using redis-cli {keys|get|ttl|expire}. There are a couple of issues with session lifetimes:

  • sessions persist across server reboots, as there's nothing stopping the keys being written to disk (our redis.conf is pretty much the default, so that would be tunable there but require provisioning logic). This seems counter to the test plan's stated Reboot of the server/restart of redis logs destroy all sessions
  • when a session is extended in the last 30 minutes of its lifetime, the extension is for 2hrs+the remainder. The feature description would seem to indicate that renewal should be just for 2hrs.

zenmonkeykstop avatar Sep 23 '22 16:09 zenmonkeykstop

Tested session expiry using redis-cli {keys|get|ttl|expire}. There are a couple of issues with session lifetimes:

* sessions persist across server reboots, as there's nothing stopping the keys being written to disk (our redis.conf is pretty much the default, so that would be tunable there but require provisioning logic). This seems counter to the test plan's stated `Reboot of the server/restart of redis logs destroy all sessions`

The non persistance in my testing environment was due to container themselves being not persistant, even though Redis in the early days used not to store snapshots by default, it indeed does now. Although the persistance of sessions is counterintuitive per se (most systems do not), it is not a major issue since the payload contains only the user id, the csrf token, the renewal count and the ttl. I'll amend the test plan to reflect this.

* when a session is extended in the last 30 minutes of its lifetime, the extension is for 2hrs+the remainder. The feature description would seem to indicate that renewal should be just for 2hrs.

As noted above, I will also amend the description to properly reflect this behavior, as we concluded during the development phase that the currently implemented is the intended on.

lsd-cat avatar Sep 23 '22 18:09 lsd-cat