jupyter_server icon indicating copy to clipboard operation
jupyter_server copied to clipboard

Filtering request using OpenAPI spec

Open fcollonval opened this issue 3 years ago • 12 comments

Fixes https://github.com/jupyterlab/jupyterlab_server/issues/167

This is a PoC to add allowed and blocked request at runtime.

fcollonval avatar Dec 22 '21 15:12 fcollonval

@bollwyvl I created this PoC to filter request at runtime using OpenAPI specification for allowed and blocked routes.

For allowed routes, this sounds great. But for blocked routes (blocking only request matching that subset of rules), it may be too complicated. In particular you can not blocked easily any sub path by simply blocking a common root. The other issue I see is that OpenAPI does not support path parameter with slashes.

So... should I switch to something more trivial to define routes like Regex and list of methods?

fcollonval avatar Dec 22 '21 15:12 fcollonval

you can not blocked easily any sub path by simply blocking a common root.

Right: it's only convention that sub-routes are even related. Regexen would be a way to do it, but could even be applied as a class of patch. The key thing is still to end up with an actual, accurate OpenAPI at the end, whether created by JSON Patch, JSON-e, python, or otherwise.

bollwyvl avatar Dec 22 '21 16:12 bollwyvl

Thanks a lot for reviewing this one @bollwyvl

you can not blocked easily any sub path by simply blocking a common root.

Right: it's only convention that sub-routes are even related. Regexen would be a way to do it, but could even be applied as a class of patch. The key thing is still to end up with an actual, accurate OpenAPI at the end, whether created by JSON Patch, JSON-e, python, or otherwise.

Ok then let's keep OpenAPI as expected input format and let interested admin/dev design their tool to generate it. But that does not address the issue with path parameter with slashes that we are using a lot (like for the content API). The way I see it, is to allow to register path transformer that will escape slashes as needed to fulfill OpenAPI expectation. Something like:

/api/contents/path/to/file.ext :arrow_right: /api/contents/path%2Fto%2Ffile.ext

and :crossed_fingers: that openapi library will pick correctly the parameter

fcollonval avatar Dec 23 '21 10:12 fcollonval

Codecov Report

Merging #645 (ccac364) into main (db46446) will increase coverage by 0.19%. The diff coverage is 90.22%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #645      +/-   ##
==========================================
+ Coverage   77.88%   78.08%   +0.19%     
==========================================
  Files         110      113       +3     
  Lines       10405    10630     +225     
  Branches     1400     1429      +29     
==========================================
+ Hits         8104     8300     +196     
- Misses       1911     1937      +26     
- Partials      390      393       +3     
Impacted Files Coverage Δ
jupyter_server/serverapp.py 65.62% <70.83%> (+0.11%) :arrow_up:
jupyter_server/specvalidator.py 86.07% <86.07%> (ø)
jupyter_server/tests/extension/test_launch.py 94.11% <89.47%> (-1.34%) :arrow_down:
jupyter_server/tests/test_spec_validation.py 96.61% <96.61%> (ø)
jupyter_server/extension/application.py 75.22% <100.00%> (+0.90%) :arrow_up:
...pyter_server/tests/extension/mockextensions/app.py 94.44% <100.00%> (+0.50%) :arrow_up:
jupyter_server/tests/test_specvalidator.py 100.00% <100.00%> (ø)
jupyter_server/services/kernels/kernelmanager.py 80.63% <0.00%> (-1.91%) :arrow_down:
jupyter_server/base/handlers.py 65.31% <0.00%> (-0.21%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update db46446...ccac364. Read the comment docs.

codecov-commenter avatar Dec 28 '21 15:12 codecov-commenter

I gladly add some documentation about this feature. Could somebody advice me on the best place to do so?

fcollonval avatar Dec 28 '21 16:12 fcollonval

Maybe these docs?

blink1073 avatar Dec 28 '21 18:12 blink1073

I added to the developers section as it is not configurable. So it probably makes more sense there.

fcollonval avatar Dec 29 '21 10:12 fcollonval

@fcollonval and @bollwyvl , great stuff here!

Do you mind waiting to merge this until after we have our next Jupyter Server meeting (January 6th)? I'm hoping to get more folks to weigh-in on this feature since we've discussed this in the past.

For context, we added a way to add/remove whole "services" in Jupyter Server by overriding default_services attribute in the ServerApp: https://github.com/jupyter-server/jupyter_server/blob/3e78339014a0b871c4e25175637564d8f3d5e9d9/jupyter_server/serverapp.py#L741-L757

This PR goes a few steps further and allows developers to cherry-pick individual routes (i.e. within individual services), which is really slick.

Zsailer avatar Dec 31 '21 15:12 Zsailer

Following the discussion at yesterday meeting,

The authorization layer will definitely have overlap with this proposal as it can be seen as adding authorization policy to all users. At the functionality level, I see two main differences (compared to https://github.com/jupyter-server/jupyter_server/pull/165):

  • This PR enforces the rules on any request - it does not rely of having the handler decorated by tornado.web.authenticated. This is a technical detail but it could be important if you don't control the server extensions loaded within the application.
  • This is not a RBAC approach (in the opposite of #165). Authorization policy can take lots of forms (RBAC, ABAC, Open Policy Agent, XACML, Amazon Web Service IAM,...). They have their own logic on what should be defined to create a policy. And they allow different granularity of policy - it is one of the reason external services like the Open Policy Agent will thrive in large organizations but are likely too complex for smaller players.
    In particular expressing the rules for this approach is more demanding than what is required for #165 - admin sys will certainly needs to create their own scripts to generate the OpenAPI specs.

That said I can't formulate the best articulation between this PR and https://github.com/jupyter-server/jupyter_server/pull/165. It could make sense to have both simultaneously - although as pointed by @echarles it will likely be confusing.

fcollonval avatar Jan 07 '22 08:01 fcollonval

This PR enforces the rules on any request - it does not rely of having the handler decorated by tornado.web.authenticated. This is a technical detail but it could be important if you don't control the server extensions loaded within the application.

There have been discussions around the default behavior in case of absence of @authenticated decorator. The same discussion could happen for @authorized. If a developer does not decorate, should the authorization layer reject the request? I like hearing "security by default", so I would be in favor to reject if the developer does not decorate, but this is a discussion to have in https://github.com/jupyter-server/jupyter_server/pull/165

This is not a RBAC approach (in the opposite of Add authorization layer to server request handlers #165). Authorization policy can take lots of forms (RBAC, ABAC, Open Policy Agent, XACML, Amazon Web Service IAM,...).

True. This being said, I don't think https://github.com/jupyter-server/jupyter_server/pull/165 enforces RBAC approach. I have e.g. given a demo a few months ago during our developer meeting of a Open Policy Agent (OPA) implementation which nicely integrates with the proposed authorization layer.

echarles avatar Jan 07 '22 09:01 echarles

I like hearing "security by default", so I would be in favor to reject if the developer does not decorate, but this is a discussion to have in #165

:+1:

This is not a RBAC approach (in the opposite of Add authorization layer to server request handlers #165). Authorization policy can take lots of forms (RBAC, ABAC, Open Policy Agent, XACML, Amazon Web Service IAM,...).

True. This being said, I don't think #165 enforces RBAC approach. I have e.g. given a demo a few months ago during our developer meeting of a Open Policy Agent (OPA) implementation which nicely integrates with the proposed authorization layer.

Awesome that the current proposal is so flexible.

So does that means we could transform this PR as an concrete Authorizer implementation (in the sense of the authorization layer) that uses only the request to provide the allow/deny answer?

fcollonval avatar Jan 07 '22 14:01 fcollonval

So does that means we could transform this PR as an concrete Authorizer implementation (in the sense of the authorization layer) that uses only the request to provide the allow/deny answer?

This would be a nice convergence. Maybe have a a look at https://github.com/jupyter-server/jupyter_server/pull/165 where the Authorizer is a class that can be extended, or that you can inherit from something that will implement this PR features.

Not sure if the the deny by default is implemented in that PR, but it should be possible to make it work like that.

echarles avatar Jan 07 '22 14:01 echarles