jupyter_server
jupyter_server copied to clipboard
XSRF cookie not always set on first request
Description
With JupyterLab, a first GET request to http://localhost:8889/lab?token=the_token sets a _xsrf cookie. But if the first request is instead http://localhost:8889/api/contents/file.txt?token=the_token, where file.txt is an existing file, then no _xsrf cookie is set.
Reproduce
- Launch
jupyter lab --no-browser - Enter the following URL in a browser (with the appropriate
token):http://localhost:8889/api/contents/file.txt?token=the_token - See in the browser's network analyzer that no
_xsrfcookie is set in the response.
Expected behavior
Shouldn't the _xsrf cookie always be set on the first GET request (if not already present in the cookies)?
Context
- Operating System and version: Ubuntu 21.04
- Browser and version: Chrome 99.0.4844.74
- Jupyter Server version: 1.15.6
Hi @davidbrochart thanks for reporting this! I also think the cookie should be set in the first GET request, but I may be missing some context, so I'll bring this issue up in this week's Jupyter Server meeting and figure out next steps. Do you know if issue also exists in jupyter/notebook?
Do you know if issue also exists in
jupyter/notebook?
Good point, there is no issue with classic Notebook, i.e. a first GET request to http://localhost:8889/edit/file.txt?token=the_token sets the _xsrf cookie.
Hi @davidbrochart were you able to check if the change history/commit messages between jupyter/notebook and jupyter-server provided some context around this discrepancy?
Hey @mwakaba2, no I didn't find any reference to XSRF in Jupyter Notebook: https://github.com/jupyter/notebook/search?q=xsrf. It's actually using jupyter-server now I think, so I don't know why it behaves differently than JupyterLab.
I have looked into this issue.
First, even in the jupyter/notebook, the _xsrf cookie has not been set when http://localhost:8889/api/contents/file.txt?token=the_token has been accessed.
Checking the implementation, I found that the _xsrf cookie is set when this property is read.
In jupyter_server, it is read when the template is rendered or the XSRF cookie is explicitly checked.
Therefore, _xsrf is not set in other APIs, such as /api/sessions.
For example, I suggest the following fixes.
- Call
check_xsrf_cookie()where necessary, such as inNbconvertFileHandler. - Call
check_xsrf_cookie()inprepare()of APIHandler. - Call
check_xsrf_cookie()in the decorator@authorized.
I think that XSRF cookies should be checked because it is more secure than the current situation that files can be viewed via the API.
And I would like to add check_xsrf_cookie() in prepare() or decorator to reduce omissions.
(Such as CSPReportHandler, we can override it where it is not needed.)
However, for users who need to launch without setting a token, this could break compatibility, so please consider and review.
(Of course, those users can use this as ever by just enabling disable_check_xsrf.)
Sorry, I didn't examine the proposed fix enough.
After calling check_xsrf_cookie(), access to the API (e.g. http://localhost:8889/api/sessions) fails with "Blocking request from unknown origin".
It is not desirable because it is difficult to understand the intent of the code, but I have confirmed that the _xsrf cookie is set simply by accessing the variable as follows.
This may be preferable since cookies can be set without changing the existing behavior.
async def prepare(self):
await super().prepare()
self.xsrf_token
if not self.check_origin():
raise web.HTTPError(404)
If there is a better way, I would appreciate your advice.
Thanks for looking into this @reoono.
Indeed accessing self.xsrf_token sets the cookie. But I think what we want is to set it on the first request. If we set it at each request, then we are not checking that the request is originating from the same site, right?
So I think it has to be set when the user is authenticated, e.g. when the token cookie is set. If authentication is turned off, then I guess we also want to turn off XSRF checking.
But I think what we want is to set it on the first request.
That is certainly true. I misunderstood the issue as I proceeded with the investigation.
If we set it at each request, then we are not checking that the request is originating from the same site, right?
It seems _xsrf token is implemented with lazy initialization. Therefore, the token will be set on the first access and the same value will be used thereafter.
So I think it has to be set when the user is authenticated, e.g. when the token cookie is set.
Thank you for the advice. Based on the above, I set xsrf_token with login cookie and checked the behavior of cookies in the following sequence.
-
jupyter lab --no-browserhttp://localhost:8889/api/sessions: 403, no cookieshttp://localhost:8889/api/sessions?token=the_token: 200, with cookies (_xsrf and login cookie)- Reloading and accessing other APIs such as
http://localhost:8889/api/kernelsdoes not change the cookie value
-
jupyter lab --no-browser --ServerApp.token=""(turning off authentication)http://localhost:8889/api/sessions: 200, no cookies
If authentication is turned off, then I guess we also want to turn off XSRF checking.
On a different note, when I access http://localhost:8889/lab without authentication (--ServerApp.token=""), only the _xsrf cookie is set (there is no login cookie).
I would appreciate your comments on whether this one behaves as expected.
http://localhost:8889/api/sessions?token=the_token: 200, with cookies (_xsrf and login cookie)
When launching jupyter server I get a 500:
Traceback (most recent call last):
File "/home/david/mambaforge/envs/jupyter_server/lib/python3.10/site-packages/tornado/web.py", line 1683, in _execute
result = await result
File "/home/david/github/davidbrochart/jupyter_server/jupyter_server/base/handlers.py", line 708, in prepare
await super().prepare()
File "/home/david/github/davidbrochart/jupyter_server/jupyter_server/base/handlers.py", line 600, in prepare
user = self.identity_provider.get_user(self)
File "/home/david/github/davidbrochart/jupyter_server/jupyter_server/auth/identity.py", line 126, in get_user
user = handler.login_handler.get_user(handler)
File "/home/david/github/davidbrochart/jupyter_server/jupyter_server/auth/login.py", line 179, in get_user
cls.set_login_cookie(handler, user_id)
File "/home/david/github/davidbrochart/jupyter_server/jupyter_server/auth/login.py", line 110, in set_login_cookie
handler.xsrf_token
File "/home/david/mambaforge/envs/jupyter_server/lib/python3.10/site-packages/tornado/web.py", line 1422, in xsrf_token
if self.current_user and "expires_days" not in cookie_kwargs:
File "/home/david/mambaforge/envs/jupyter_server/lib/python3.10/site-packages/tornado/web.py", line 1340, in current_user
self._current_user = self.get_current_user()
File "/home/david/github/davidbrochart/jupyter_server/jupyter_server/base/handlers.py", line 152, in get_current_user
raise RuntimeError(msg)
RuntimeError: Calling `SessionRootHandler.get_current_user()` directly is deprecated in jupyter-server 2.0. Use `self.current_user` instead (works in all versions).
I am very sorry, I was working on branch 1.x. Please wait for a while while I will check with the main branch.
As the error message indicates, self.current_user was not set, so setting the xsrf cookie failed.
(Normally, it is set by prepare() of JupyterHandler.)
Therefore, it has been confirmed that JupyterLab works by implementing as follows.
@classmethod
def set_login_cookie(cls, handler, user_id=None):
"""Call this on handlers to set the login cookie for success"""
cookie_options = handler.settings.get("cookie_options", {})
...
handler.set_secure_cookie(handler.cookie_name, user_id, **cookie_options)
# To set _xsrf cookie at login
+ handler.current_user = handler._jupyter_current_user = user_id
handler.xsrf_token
return user_id
I am sorry for the repeated questions.
Although the _xsrf cookie has not been cleared so far, should it be cleared when logging out?
Just to be sure, I have reproduced the case where the _xsrf cookie was not deleted as follows and confirmed that the _xsrf cookie is not authentication information.
http://localhost:8889/api/sessions?token=the_token: 200, with cookies (_xsrf and login cookie)http://localhost:8889/api/sessions: 200, with cookies (_xsrf and login cookie)- Delete login cookie by Chrome DevTools
http://localhost:8889/api/sessions: 403, with cookies (_xsrf cookie)
Thanks @reoono for pushing on this. I am quite confused actually, and I'm not even sure the XSRF cookie should have anything to do with authentication. Maybe we should take a step back and look at the best practices on how it should be implemented?
I just wanted to show that not deleting the xsrf cookie on logout is not a particularly big problem as before. I am sorry for the confusion.
Maybe we should take a step back and look at the best practices on how it should be implemented?
Thanks for your kind attention. In other words, we wanted to make sure that
- Strictly speaking, some people argue that the xsrf cookie should be refreshed despite the low-security risk. (e.g., https://stackoverflow.com/questions/33502828/csrf-token-lifecycle-after-logout)
- As for this issue, is it sufficient only to set the xsrf cookie on the first request?
- Or should we also support deleting the xsrf cookie on logout?
I'm the maintainer of Tornado and I've been reviewing our XSRF protection recently. The PR https://github.com/reoono/jupyter_server/pull/1/files looks good to me - you need to access self.xsrf_token directly on the code path that sets the login cookie because you have a slightly unusual use case in which you set this cookie from an HTTP GET rather than a POST (it would probably not be appropriate to add new calls to check_xsrf_cookie in places that don't already have them - I don't think it makes sense to validate the xsrf cookie in GET requests)
However, there may be an even simpler solution. Setting samesite="lax" on your authentication cookie(s) provides XSRF protection that is just as strong as Tornado's xsrf_token in most cases. I'd appreciate your thoughts on this since Jupyter is an important user of Tornado (and it must support a variety of uncommon scenarios). The PR https://github.com/tornadoweb/tornado/pull/3226 includes documentation updates about this.