PyWebScrapBook
Per book/user access permissions
I'm running wsb on my web host and would like to keep several scrapbooks for different users. E.g., if I create books as:
/srv/http/scrap/.wsb
/srv/http/scrap/myscrap
/srv/http/scrap/workscrap
/srv/http/scrap/mywifescrap
/srv/http/scrap/mychildscrap
/srv/http/scrap/sharedscrap
is it possible to limit access to these directories depending on the user, e.g:
me -> myscrap, sharedscrap, workscrap
wife -> mywifescrap, sharedscrap
kid -> mychildscrap
Or maybe I've missed something in the config settings?
P.S. Thanks a lot for your extension. I've missed such functionality since FF migrated to WE
No. This feature is much more complicated and is currently not implemented.
For a quick workaround, you can run PyWebScrapBook behind a reverse proxy and apply HTTP authorization rules for each sub-path at the reverse proxy server. But this cannot prevent sub-directories from being listed for unauthorized users.
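As a rough illustration of the per-sub-path rule such a reverse proxy would enforce, here is a minimal Python sketch. The user and book names come from the question above; `ACCESS`, `book_of`, and `status_for` are hypothetical names invented for this example and are not part of PyWebScrapBook or of any proxy's API. Note that the root stays reachable for everyone, which is why the sub-directories are still listed.

```python
import posixpath
from typing import Optional

# Hypothetical policy table, copied from the example in the question.
ACCESS = {
    "me": {"myscrap", "sharedscrap", "workscrap"},
    "wife": {"mywifescrap", "sharedscrap"},
    "kid": {"mychildscrap"},
}


def book_of(path: str) -> Optional[str]:
    """Return the first path segment (the book directory), or None for the root."""
    # Normalize first so that "/a/../b" is judged as "/b".
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    parts = [p for p in normalized.split("/") if p]
    return parts[0] if parts else None


def status_for(user: str, path: str) -> int:
    """Return the HTTP status a proxy applying this policy would answer with."""
    book = book_of(path)
    if book is None:
        # The root listing is not protected by a sub-path rule.
        return 200
    return 200 if book in ACCESS.get(user, set()) else 403
```

For example, `status_for("kid", "/sharedscrap/")` gives 403, while `status_for("wife", "/")` gives 200. The sketch only looks at the URL path; it does not address backend requests that are not scoped to a book directory.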
No problem. I just wanted to know whether such a feature is implemented or not. As a workaround I can run several instances of wsb on different ports and set up reverse proxies for them, like:
https://example.com/myscrap -> 127.0.0.1:8081 -> /srv/wsb/mybook
https://example.com/sharedscrap -> 127.0.0.1:8082 -> /srv/wsb/sharedbook
Not very elegant, but it should work.
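The mapping above amounts to a prefix lookup at the proxy. A minimal Python sketch of that lookup follows, assuming the proxy strips the matched prefix before forwarding; `BACKENDS` and `route` are hypothetical names used only for illustration.

```python
from typing import Optional, Tuple

# URL prefix -> (backend address, book root), copied from the mapping above.
BACKENDS = {
    "/myscrap": ("127.0.0.1:8081", "/srv/wsb/mybook"),
    "/sharedscrap": ("127.0.0.1:8082", "/srv/wsb/sharedbook"),
}


def route(path: str) -> Optional[Tuple[str, str]]:
    """Return (backend, forwarded path), or None if no instance serves the path."""
    for prefix, (backend, _root) in BACKENDS.items():
        # Match the prefix only on a segment boundary, so that
        # "/myscrapbook" is not mistaken for "/myscrap".
        if path == prefix or path.startswith(prefix + "/"):
            return backend, path[len(prefix):] or "/"
    return None
```

Each instance then only ever sees its own book root, so access control reduces to deciding who may reach which prefix.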
This also works, but the sidebar won't list all scrapbooks, and you'd have to modify the backend server URL in the options every time you want to switch between them.
For a quick workaround, you can run PyWebScrapBook behind a reverse proxy and apply HTTP authorization rules for each sub-path at the reverse proxy server. But this cannot prevent sub-directories from being listed for unauthorized users.
I have tried this approach with Apache as the reverse proxy, but when an unauthorized folder is selected, WebScrapBook unfortunately doesn't cope well with being rejected with a 40X status code:
- If you try to restrict access using the classic "Require user" approach, a 401 Unauthorized error will be returned, which unfortunately results in the browser presenting a basic authentication popup.
- If you let the reverse proxy reject with 403 Forbidden or 404 Not Found, the scrapbook tree window correctly displays a red message: Unable to init tree: 403. But the scrapbook select box gets greyed out with the forbidden sub-folder pre-selected! I didn't manage to get out of this state easily.
- Alternatively, I tried to redirect to the webscrapbook root URL if an unauthorized folder is selected. This redirect works, but makes webscrapbook think that it has successfully selected the forbidden folder.
Do you have any other suggestion for how the reverse proxy should react to unauthorized access? Or maybe the scrapbook select box could simply be reset to the first scrapbook, or at least not be greyed out, if an HTTP error occurs?
My Apache config looks like this:
<Location /webscrapbook>
ProxyPass http://127.0.0.1:8081
ProxyPassReverse http://127.0.0.1:8081
[...]
AuthType basic
AuthName "WebScrapBook"
AuthBasicProvider anon
AuthBasicAuthoritative off
Anonymous_MustGiveEmail off
Anonymous_LogEmail off
Anonymous user "*"
Require valid-user
</Location>
# reject with 401 Unauthorized:
# This will result in a basic auth popup window:
#<Location /webscrapbook/leo>
# Require user leo
#</Location>
# reject with 403 Forbidden
# This will result in a disabled select-box with no easy way to get out
#RewriteCond "%{REMOTE_USER}" !"^leo$"
#RewriteRule "^/webscrapbook/leo/" "-" [forbidden,last]
# reject with 404 Not Found
# This will result in a disabled select-box with no easy way to get out
#RewriteCond "%{REMOTE_USER}" !"^leo$"
#RewriteRule ^/webscrapbook/leo/ - [redirect=404,last]
# redirect instead of reject:
# This will let webscrapbook think it has successfully selected the scrapbook
#RewriteCond "%{REMOTE_USER}" !"^leo$"
#RewriteRule ^/webscrapbook/leo/ /webscrapbook [redirect=temp,last]
@leo-b I don't get you. Please elaborate what you're trying to achieve.
Fundamentally, an unauthorized user has no right to access the site and cannot access the scrapbooks through the browser extension, no matter how the UI presents the rejection.
@leo-b I don't get you. Please elaborate what you're trying to achieve.
This issue is about using the same PyWebScrapBook instance for different users while controlling access to the sub-directories at the reverse proxy.
My example shows several ways to achieve that at the reverse proxy level. However, the problem is how webscrapbook reacts to being rejected when it tries to access a sub-directory without permission for that path. (It is rather a UI problem.) Currently it correctly shows the HTTP status message as an error, but it keeps the unauthorized sub-folder selected in its scrapbook select box and disables (greys out) the select box, so the user has no way to get back to the sub-folders he actually has permission for.
That's why I say that per-book access control is not supported.
The current backend APIs are site-wide. When a user has full access to a book, he must have access to the APIs, and as a result he has access to every other book under the same site.
It would take a very large code rework to make the APIs support per-book access control.
If you really want some further access control, consider creating multiple webscrapbook servers that access the book folders through symlinks, with OS-level access control, such as:
/scrapbooks/book1/ # owned by user1, no access for other users
/scrapbooks/book2/ # owned by user2, no access for other users
/scrapbooks/shared/ # owned by admin, read-only for user1 and user2
/server1/ # host at localhost:8081 and proxied at https://my.example.com/user1/, accessible by user1
/server1/book1/ # link to /scrapbooks/book1
/server1/shared/ # link to /scrapbooks/shared
/server2/ # host at localhost:8082 and proxied at https://my.example.com/user2/, accessible by user2
/server2/book2/ # link to /scrapbooks/book2
/server2/shared/ # link to /scrapbooks/shared
However, there may still be many caveats, such as a conflict when multiple servers attempt to write to the same book simultaneously.
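The directory layout above could be created with a short script. The following is a minimal Python sketch under several assumptions: a POSIX system where symlinks are available, a `base` directory standing in for the filesystem root, and hypothetical names `LAYOUT` and `build_layout`. Ownership and permissions (the "owned by user1" part) are not set here, since changing them normally requires root.

```python
from pathlib import Path

# Server directory -> books it should expose, as in the layout above.
LAYOUT = {
    "server1": ["book1", "shared"],
    "server2": ["book2", "shared"],
}


def build_layout(base: Path) -> None:
    """Create the book directories and link them into each server root."""
    for server, books in LAYOUT.items():
        server_dir = base / server
        server_dir.mkdir(parents=True, exist_ok=True)
        for book in books:
            # The real data lives under <base>/scrapbooks/<book>.
            target = base / "scrapbooks" / book
            target.mkdir(parents=True, exist_ok=True)
            link = server_dir / book
            if not link.is_symlink():
                link.symlink_to(target, target_is_directory=True)
```

Calling `build_layout(Path("/tmp/wsb-demo"))` would create the tree under that directory; each server root then contains only links to the books its user is meant to see. Both servers still point at the same `shared` directory, so the concurrent-write caveat above still applies.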