jupyter_server
jupyter_server copied to clipboard
Make preferred_dir content manager trait
Move the preferred_dir trait to content manager, but keep the old one for backwards compatibility. The new location should also cause the value to be read later in the init, allowing us to avoid the deferred validation. Also fixes an issue with escaping the root dir on Windows if an absolute path is passed.
@kevin-bates highlighted to check if a change needs to happen in JupyterLab to pick up the value from the new trait or not
Thanks for submitting this PR @vidartf!
I had a chance to run with your branch a bit and agree these related (and duplicated) trait scenarios are difficult!
If I run server with a preferred_dir that exists, but as a sibling to root_dir, that config violation does not appear to be detected until Lab goes to fetch preferred_dir from the contents service. For example, this command line:
$ jupyter lab --ServerApp.preferred_dir=~/dev --ServerApp.root_dir=~/notebooks
will produce this output...
...
To access the server, open this file in a browser:
file:///Users/kbates/Library/Jupyter/runtime/jpserver-59158-open.html
Or copy and paste one of these URLs:
http://localhost:8888/lab?token=911cbde8ca5eba77311ecd2019f2022a67d6df6461766d71
or http://127.0.0.1:8888/lab?token=911cbde8ca5eba77311ecd2019f2022a67d6df6461766d71
[W 2022-09-15 11:45:35.671 ServerApp] 404 GET /api/contents/dev?content=1&1663267535664 (::1): file or directory does not exist: 'dev'
[W 2022-09-15 11:45:35.672 ServerApp] wrote error: "file or directory does not exist: 'dev'"
[W 2022-09-15 11:45:35.672 ServerApp] 404 GET /api/contents/dev?content=1&1663267535664 (95b61f48f2ef4d63bb464bbf09ba3f32@::1) 1.50ms referer=http://localhost:8888/lab
with this dialog in Lab (which I know is unrelated but thought I'd include)...

However, setting preferred_dir on the CM (or FileContentsManager) via the CLI does catch things early...
$ jupyter lab --ContentsManager.preferred_dir=~/dev --ServerApp.root_dir=~/notebooks
...
[C 2022-09-15 11:50:30.021 ServerApp] Bad config encountered during initialization: No such directory: ''/Users/kbates/dev''
((I wonder if we should change this messaging to something like f"Preferred_dir {value} does not exist or is not a sub-directory or root_dir"?))
I'm also not seeing the deprecation warning under any circumstances - nor hitting a breakpoint in FileContentsManager._default_preferred_dir(). I do hit the validator in ServerApp when defining --ServerApp.preferred_dir and wonder if the contents manager's validator should just duplicate the one in ServerApp (and not defer to super()) and move the deprecation warning to that ServerApp._preferred_dir_validate()?
Codecov Report
Base: 80.01% // Head: 80.06% // Increases project coverage by +0.04% :tada:
Coverage data is based on head (
5fffaf7) compared to base (f52aa71). Patch coverage: 94.87% of modified lines in pull request are covered.
Additional details and impacted files
@@ Coverage Diff @@
## main #983 +/- ##
==========================================
+ Coverage 80.01% 80.06% +0.04%
==========================================
Files 68 68
Lines 8021 8040 +19
Branches 1587 1588 +1
==========================================
+ Hits 6418 6437 +19
Misses 1181 1181
Partials 422 422
| Impacted Files | Coverage Δ | |
|---|---|---|
| jupyter_server/services/contents/filemanager.py | 75.27% <91.30%> (+0.71%) |
:arrow_up: |
| jupyter_server/serverapp.py | 80.00% <100.00%> (-0.08%) |
:arrow_down: |
| jupyter_server/services/contents/fileio.py | 90.10% <100.00%> (+0.10%) |
:arrow_up: |
| jupyter_server/services/contents/manager.py | 86.74% <100.00%> (+0.41%) |
:arrow_up: |
| jupyter_server/services/kernels/kernelmanager.py | 81.11% <0.00%> (-0.31%) |
:arrow_down: |
| ...ter_server/services/kernels/connection/channels.py | 59.60% <0.00%> (-0.23%) |
:arrow_down: |
| jupyter_server/utils.py | 85.71% <0.00%> (+0.59%) |
:arrow_up: |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
@kevin-bates highlighted to check if a change needs to happen in JupyterLab to pick up the value from the new trait or not
Yes, a change to jupyterlab-server would be needed here: https://github.com/jupyterlab/jupyterlab_server/blob/4263852881fd2bd181712dcabac1bda176886cd5/jupyterlab_server/handlers.py#L90-L99
Thanks for the update @vidartf.
Lately, I've been hearing that it's not uncommon for a given installation to be using multiple instances (and types) of ContentsManager simultaneously. For references like that above in jupyterlab_server, would only the ContentsManager that is configured on ServerApp be considered the ContentsManager from which preferred_dir is used?
Lately, I've been hearing that it's not uncommon for a given installation to be using multiple instances (and types) of ContentsManager simultaneously. For references like that above in
jupyterlab_server, would only the ContentsManager that is configured on ServerApp be considered the ContentsManager from whichpreferred_diris used?
Lab server would read out the preferred dir from the main/root contents manager, and pass that path to JupyterLab. That path could ofc then be a path into any proxied "inner" contents manager. That would be an internal detail for them to sort out.
If I run server with a
preferred_dirthat exists, but as a sibling toroot_dir, that config violation does not appear to be detected until Lab goes to fetchpreferred_dirfrom the contents service.
Given that these traits are being deprecated, I think it will probably be okay if their validation behavior is not the most user friendly, as long as it is eventually correct. But I will have a look to see if I can force a validation at some point during startup if the deprecated traits are being set.
(I wonder if we should change this messaging to something like f"Preferred_dir {value} does not exist or is not a sub-directory or root_dir"?)
Makes sense.
I'm also not seeing the deprecation warning under any circumstances - nor hitting a breakpoint in
FileContentsManager._default_preferred_dir().
I'll have a look.
Please test it with https://github.com/jupyterlab/jupyterlab_server/pull/324
Thanks @vidartf - I'll try to take this (and 324) for a spin this afternoon.
@kevin-bates Thanks for the review. I pushed a new commit here, and one on the jupyterlab_server branch to help address some of the concerns. I didn't end up changing the warning stack level after all, since I figured it was better to be consistent with other deprecations:
<env dir>\lib\site-packages\traitlets\traitlets.py:1752: DeprecationWarning: ServerApp.token config is deprecated in jupyter-server 2.0. Use IdentityProvider.token
return self._get_trait_default_generator(names[0])(self)
[I 2022-10-15 14:58:48.075 ServerApp] notebook_shim | extension was successfully linked.
<env dir>\lib\site-packages\traitlets\traitlets.py:1752: DeprecationWarning: ServerApp.preferred_dir config is deprecated in jupyter-server 2.0. Use ContentsManager.preferred_dir with a relative path instead
return self._get_trait_default_generator(names[0])(self)
[I 2022-10-15 14:58:48.196 ServerApp] notebook_shim | extension was successfully loaded.
[I 2022-10-15 14:58:48.199 ServerApp] jupyter_server_terminals | extension was successfully loaded.
<env dir>\lib\site-packages\traitlets\traitlets.py:1752: DeprecationWarning: ServerApp.token config is deprecated in jupyter-server 2.0. Use IdentityProvider.token
return self._get_trait_default_generator(names[0])(self)
Thanks for the updates. I'm finding the test failures getting tangled between catching SystemExit and TraitError. It seems the exceptions are occurring from within the jp_configurable_serverapp fixture so I found it needed to be moved within the with pytest.raises blocks.
Here's the commit. I was hesitant to push it in case you wanted to take a different approach or this behavior is indicative of another issue.
@kevin-bates You commit looks good to me! Please push it onto the branch 👍
@vidartf - My commit has been pushed. The merge was more involved than I expected. The tests passed locally, but I'm hoping I didn't mess anything up. 😊
@kevin-bates I did a rebase anyway to pull in latest changes from main. I took the opportunity to clean up the commit order then.
I pushed a change to satisfy pre-commit. What's odd is that the set of pre-commit actions used in CI is different than what is used in the git hook. The latter doesn't call flake8-pyproject or doc8. I haven't looked into why/where that difference occurs.
The ones that are listed as hook-stage: manual are only run in CI by default, because they can't be fixed automatically, and we didn't want them to prevent pushing.
You can run pre-commit run --hook-stage manual locally
The latest merge from master highlights that I was missing a sync/async compat for the dir_exists call, which is good. However, I do not know how to do this inside a trait validator, since AFAIK, traitlets do not support asynchronous validators.
I do not know how to do this inside a trait validator, since AFAIK, traitlets do not support asynchronous validators.
I just spent a couple of hours with this. This is a problem since any async methods used to validate a trait cannot be used. In addition, because the server's initialization is essentially synchronous methods, moving validation to another location is also a non-starter. In essence. any traits reliant on async methods for validation cannot be validated.
The *nix tests were failing because the expected exception was no longer being raised due to its (necessary) removal. However, I noticed that these tests were relative to the preferred_dir not being a sub-directory of root_dir. I went ahead and added a validation method that simply checks if the preferred_dir starts with the root_dir value, and, if not, it's not a subdirectory. Of course, this does not validate that preferred_dir is a directory.
Question: What is the expected value of preferred_dir in the first place? The help string is:
"Preferred starting directory to use for notebooks as a path local to `root_dir`."
Does local in the case mean relative? Or is preferred_dir expected to be absolute like root_dir and local is indicating that it be a subdirectory of root_dir?
The validation I referred to above assumed that preferred_dir is also absolute. If it is supposed to be relative, then the validation I reference will yield a false negative - always - because a relative-oriented preferred_dir will never startswith(root_dir).
I'm going ahead and pushing the additional commit with the assumption it can be easily reverted if necessary. I believe this provides some benefit (assuming preferred_dir is absolute).
@kevin-bates I'm not sure I follow your change. This PR is supposed to move the code to have preferred_dir be a relative path (API path), so checking that is starts with the root dir runs counter to that (it will now always be relative to the root_dir which defines the root of the contents manager API paths).
This PR is supposed to move the code to have preferred_dir be a relative path (API path), so checking that is starts with the root dir runs counter to that (it will now always be relative to the root_dir which defines the root of the contents manager API paths).
I guess my assumption (repeated below) was incorrect. :blush: I did not realize that this PR was also changing preferred_dir to be relative to root_dir and not absolute. This is not mentioned in the opening description - I thought only its location was changing (to ContentsManager) - but, looking more closely at the deprecation messages, see that its value is also changing to be relative to root_dir (I'm sorry I missed that).
The validation I referred to above assumed that
preferred_diris also absolute. If it is supposed to be relative, then the validation I reference will yield a false negative - always - because a relative-orientedpreferred_dirwill neverstartswith(root_dir).
I'm going ahead and pushing the additional commit with the assumption it can be easily reverted if necessary. I believe this provides some benefit (assuming
preferred_diris absolute).
I apologize for getting in the way here. I will revert my last commit. I don't really know how to proceed other than having no validation of preferred_dir until its first use when a client application attempts to read from it.
@kevin-bates Sorry if I ended up wasting your time by imprecise communications. I think having preferred_dir be an API path is the only thing that makes sense from the perspective of a consumer (e.g. JupyterLab). That will allow consumers to treat it much more consistently, without making any assumptions about the underlying mapping to storage. We might still allow for the file-based managers to coerce absolute and relative paths to API paths without deprecations, but that will be an implementation detail.
In the current state of the PR, there is one test that is failing only on *nix, but not on Windows. I'd really appreciate any help in trying to figure out why that test is failing on *nix! I was hoping you could help figure that one out. It is one post-init access test that is expected to fail (the dir shouldn't exist?).
No worries at all @vidartf. I just pushed an import of HTTPError (since it had been removed in a "fix linting" commit separate from the one I reverted) and updated the branch. This should at least get us back to having Windows pass but not *nix (which probably requires some better understanding as well).
This should at least get us back to having Windows pass but not *nix (which probably requires some better understanding as well)
Using this failure as an example, why would Windows be raising an HTTPError when the directory should clearly exist since its a parent of root_dir? I think the *nix test is behaving correctly.
_______________ test_absolute_preferred_dir_not_root_subdir_set ________________
tmp_path = PosixPath('/tmp/pytest-of-runner/pytest-0/test_absolute_preferred_dir_no0')
jp_configurable_serverapp = <function jp_configurable_serverapp.<locals>._configurable_serverapp at 0x7f547962e790>
async def test_absolute_preferred_dir_not_root_subdir_set(tmp_path, jp_configurable_serverapp):
path = str(tmp_path / "subdir")
os.makedirs(path, exist_ok=True)
not_subdir_path = str(tmp_path)
app = jp_configurable_serverapp(root_dir=path)
app.contents_manager.preferred_dir = not_subdir_path
with pytest.raises(HTTPError) as error:
print(app.contents_manager.preferred_dir)
> await app.contents_manager.dir_exists(app.contents_manager.preferred_dir)
E Failed: DID NOT RAISE <class 'tornado.web.HTTPError'>
tests/test_serverapp.py:405: Failed
----------------------------- Captured stdout call -----------------------------
/tmp/pytest-of-runner/pytest-0/test_absolute_preferred_dir_no0
Seems like we're missing a validator in contents/filemanager.py that ensures that preferred_dir is not absolute and exists as a subdirectory to root_dir. That will at least cause these tests to pass, but doesn't address non-filesystem-based content managers.
Using this failure as an example, why would Windows be raising an HTTPError when the directory should clearly exist since its a parent of root_dir? I think the *nix test is behaving correctly.
A preferred dir that is a parent dir of root dir makes no sense. The directory exists on disk, but does not exist within the root of the contents manager, so you would expect CM.dir_exists() to fail (you are not supposed to be able to escape outside the root dir).
Seems like we're missing a validator in contents/filemanager.py that ensures that preferred_dir is not absolute and exists as a subdirectory to root_dir. That will at least cause these tests to pass, but doesn't address non-filesystem-based content managers.
To be explicit, this is the exception I'm expecting should raise: https://github.com/jupyter-server/jupyter_server/blob/e9735fcc8a4d21a623fe81fc5af5726a9a8df6ad/jupyter_server/services/contents/fileio.py#L258-L259
By relying on this check, we don't need a generic validator for preferred_dir, we simply rely on the known endpoint on the CM (dir_exists()) to do the check for us. If we were to make a generic validator, this is the method we would use, but we cannot since traitlets doesn't support async validation.
You could use run_sync in the validator.
Using this failure as an example, why would Windows be raising an HTTPError when the directory should clearly exist since its a parent of root_dir? I think the *nix test is behaving correctly.
Damn, I was looking at CM.dir_exists and thinking its os.path.dir_exists (or equivalent). You're right, it should be in the context of the Contents Manager.
However, what I still don't understand about these particular failures is why Windows is behaving differently than *nix? Filesystem differences aside, python handles existence checks and the like similarly, so it seems there must be something more systemic in play here to explain why the "validation paths" would be different.
Taking a closer look, to_os_path() on the line above, unconditionally prepends path with root, so the test that path.startswith(root) always succeeds. Not sure why Windows would be different. I suspect its because you don't have an "API path" there, but a fully-formed Windows path.
Here are the variable values from my debug session:
path = '/private/var/folders/t0/lmgg5t1j35dcftfz88jf1mqw0000gn/T/pytest-of-kbates/pytest-190/test_absolute_preferred_dir_no0'
root = '/private/var/folders/t0/lmgg5t1j35dcftfz88jf1mqw0000gn/T/pytest-of-kbates/pytest-190/test_absolute_preferred_dir_no0/subdir'
os_path = '/private/var/folders/t0/lmgg5t1j35dcftfz88jf1mqw0000gn/T/pytest-of-kbates/pytest-190/test_absolute_preferred_dir_no0/subdir/private/var/folders/t0/lmgg5t1j35dcftfz88jf1mqw0000gn/T/pytest-of-kbates/pytest-190/test_absolute_preferred_dir_no0'
I'm not sure what Windows would produce here, but this is why *nix is not raising the exception.
Prior to that AsyncFileContentsManager.dir_exists unconditionally strips "/" from the preferred_dir value because self._get_os_path(path=path) assumes an "API path" (which doesn't strike me as the correct way to do that).
I suspect it's dumb luck that Windows is passing here, probably due to separator differences that cause to_os_path to produce slightly different results. It seems this can all be prevented by introducing a validator to FileContentsManager, in which case preferred_dir's value would not be an absolute path because the validator would have caught it when preferred_dir is initialized (outside of the test's with block). The following works, but would require the test (and perhaps others) to move the initialization within the with block and change the expected exception to TraitError.
@validate("preferred_dir")
def _validate_preferred_dir(self, proposal):
"""Do a bit of validation of the preferred_dir."""
value = proposal["value"]
if os.path.isabs(value):
# If we receive an absolute path, make it relative to root_dir. If it doesn't
# start with root_dir, raise an exception since it's relative path will be
# invalid anyway.
if not value.startswith(self.root_dir):
raise TraitError(
f"The preferred_dir ({value}) must be a subdirectory of root_dir ({self.root_dir})"
)
value = os.path.relpath(value, self.root_dir)
value = os.path.join(self.root_dir, value)
if not os.path.isdir(value):
raise TraitError(f"The preferred_dir ({value}) is not a directory")
return value
You're right, other contents manager implementations would need to introduce similar validation since we can't do this generically due to the async/traitlets issue.
Here are the variable values from my debug session:
path = '/private/var/folders/t0/lmgg5t1j35dcftfz88jf1mqw0000gn/T/pytest-of-kbates/pytest-190/test_absolute_preferred_dir_no0' root = '/private/var/folders/t0/lmgg5t1j35dcftfz88jf1mqw0000gn/T/pytest-of-kbates/pytest-190/test_absolute_preferred_dir_no0/subdir' os_path = '/private/var/folders/t0/lmgg5t1j35dcftfz88jf1mqw0000gn/T/pytest-of-kbates/pytest-190/test_absolute_preferred_dir_no0/subdir/private/var/folders/t0/lmgg5t1j35dcftfz88jf1mqw0000gn/T/pytest-of-kbates/pytest-190/test_absolute_preferred_dir_no0'I'm not sure what Windows would produce here, but this is why *nix is not raising the exception.
~~Thanks, this is really helpful. I'm guessing the Windows test succeeds since the presence of a drive (e.g. C:) will force it to discard the parts before that. I'll sort this out (it needs fixing independently of fixing preferred_dir validation!).~~
The Windows test did indeed throw an HTTPError since it has a drive (there is a specific check). The Linux test will however only return False, and not throw (it will eventually be converted to a 404 when hit as an API though). I think the unit test should check that it either returns false, or throws a 404.
@blink1073 said:
You could use
run_syncin the validator.
~~I don't get it: Isn't run_sync async now? At least all the usage I'm seeing uses await run_sync(X).~~
I think I now understand that you meant ServerApp.io_loop._run_sync.
@blink1073 If you did mean the ioloop one, that raises:
jupyter_server\services\contents\manager.py:92: in _validate_preferred_dir
dir_exists = self.parent.io_loop.run_sync(self.dir_exists(value))
C:\cleanconda\envs\server-dev\lib\site-packages\tornado\ioloop.py:524: in run_sync
self.start()
C:\cleanconda\envs\server-dev\lib\site-packages\tornado\platform\asyncio.py:199: in start
self.asyncio_loop.run_forever()
C:\cleanconda\envs\server-dev\lib\asyncio\base_events.py:586: in run_forever
self._check_running()
[...]
E RuntimeError: This event loop is already running
No, this one: https://github.com/jupyter/jupyter_client/blob/1aa276b84c53e269dce309bb5cefaaf9bd14f8f6/jupyter_client/utils.py#L51