Log all executed commands in a cell
Feature request: option to log all executed commands in a cell (security audits)
Related: https://github.com/jupyterlab/jupyterlab/issues/7411 https://github.com/jupyterhub/jupyterhub/issues/2562 https://stackoverflow.com/questions/54313101/audit-commands-run-in-jupyter-notebook
Does anyone know if there has been any movement to make this possible? I've been chasing down old issues and Stack Overflow posts looking for a way to log commands run in cells. It seems almost silly that this kind of logging would be missing...
Anyone? This is highly needed. Or does someone have another approach?
Any movement on this?
I've been trying to figure out how to get these logs in order to satisfy security audit requirements for one of my clients. Unfortunately, it doesn't look like they're included in the kernel logs. I tried tailing the output of the jupyterhub-singleuser process for my test user while running commands in a notebook, but the comm_msg events do not show the actual command:
[D 2023-05-08 01:38:52.909 SingleUserNotebookApp kernelmanager:417] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: status (busy)
[D 2023-05-08 01:38:52.911 SingleUserNotebookApp kernelmanager:417] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: status (idle)
[D 2023-05-08 01:38:52.972 SingleUserNotebookApp kernelmanager:417] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: status (busy)
[D 2023-05-08 01:38:52.974 SingleUserNotebookApp kernelmanager:417] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: status (idle)
[D 2023-05-08 01:38:53.042 SingleUserNotebookApp kernelmanager:417] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: status (busy)
[D 2023-05-08 01:38:53.043 SingleUserNotebookApp kernelmanager:419] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: comm_msg
[D 2023-05-08 01:38:53.044 SingleUserNotebookApp kernelmanager:417] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: status (idle)
[D 2023-05-08 01:38:53.046 SingleUserNotebookApp kernelmanager:417] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: status (busy)
[D 2023-05-08 01:38:53.047 SingleUserNotebookApp kernelmanager:419] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: comm_msg
[D 2023-05-08 01:38:53.049 SingleUserNotebookApp kernelmanager:417] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: status (idle)
[D 2023-05-08 01:38:53.050 SingleUserNotebookApp kernelmanager:417] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: status (busy)
[D 2023-05-08 01:38:53.052 SingleUserNotebookApp kernelmanager:419] activity on b2f0e0df-752d-4f47-8e64-e192a910e877: comm_msg
This functionality would be really useful for deploying JupyterHub in settings where there are audit requirements for access to sensitive data. If it exists, I haven't been able to get it working. I'm running JupyterHub 1.1.0 on Amazon EMR 6.1.1, for what it's worth.
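Since the server log above only shows message types, one possible workaround is to log from inside the kernel instead. The following is a minimal sketch, assuming an IPython kernel: it registers a callback on IPython's `pre_run_cell` event, which receives an object whose `raw_cell` attribute holds the submitted code. The logger name `cell_audit`, the JSON record layout, and the helper `make_audit_hook` are illustrative choices, not part of Jupyter.

```python
import getpass
import json
import logging
import time

audit_log = logging.getLogger("cell_audit")  # hypothetical logger name


def make_audit_hook(logger=audit_log, user=None):
    """Return a callback that logs the source of each cell before it runs."""
    user = user or getpass.getuser()

    def log_cell(info):
        # `info` is the object IPython passes to pre_run_cell callbacks;
        # raw_cell holds the code exactly as the user submitted it.
        record = {"ts": time.time(), "user": user, "code": info.raw_cell}
        logger.info(json.dumps(record))

    return log_cell


def load_ipython_extension(ipython):
    """Entry point IPython calls when this module is loaded as an extension."""
    ipython.events.register("pre_run_cell", make_audit_hook())
```

Because the hook runs inside the user's own kernel, the user can unregister it, so this is probably not sufficient on its own for a strict audit requirement. It also only covers IPython kernels.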
I am also interested in this functionality and want to know if others managed to find a solution for it.
For our self-hosted hub instance, I managed to create an audit logging plugin, but it was very hard to maintain.
Some of the problems I ran into:
- There are many different handlers for kernel interaction and file management. You can subclass them all with audit logging implemented, but then you have to maintain all of them
- There are numerous undocumented changes in the kernel handlers
- The different paths that can be traversed in any handler are hard to predict and not particularly well documented
I ended up with an extension that monkey patches the handlers of the kernel, the content manager and the file handler. However, we now cannot upgrade to the latest versions of the server and notebook, because the handlers there are now async.
It would be great if this functionality were built into the Jupyter server itself, so that you could guarantee an audit log without a hacky extension or changes to the kernel settings on the notebook side.
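For anyone maintaining a monkey-patching extension like the one described above, the sync-to-async change can be absorbed to some degree by a wrapper that checks whether the method it wraps is a coroutine function. This is only a rough sketch in plain Python: `audited`, `patch_methods`, and `FakeHandler` are hypothetical names, and the real Jupyter handler classes and method names would have to be substituted. It does not remove the burden of tracking upstream handler changes.

```python
import asyncio
import functools
import inspect
import logging

audit_log = logging.getLogger("handler_audit")  # hypothetical logger name


def audited(method):
    """Wrap a handler method so each call is logged, whether sync or async."""
    if inspect.iscoroutinefunction(method):
        @functools.wraps(method)
        async def async_wrapper(self, *args, **kwargs):
            audit_log.info("%s.%s args=%r", type(self).__name__, method.__name__, args)
            return await method(self, *args, **kwargs)
        return async_wrapper

    @functools.wraps(method)
    def sync_wrapper(self, *args, **kwargs):
        audit_log.info("%s.%s args=%r", type(self).__name__, method.__name__, args)
        return method(self, *args, **kwargs)
    return sync_wrapper


def patch_methods(cls, names):
    """Replace the named methods on cls with audited versions."""
    for name in names:
        setattr(cls, name, audited(getattr(cls, name)))


class FakeHandler:
    """Stand-in for a real handler class; not a Jupyter API."""

    def get(self, path):
        return "sync:" + path

    async def post(self, path):
        return "async:" + path


patch_methods(FakeHandler, ["get", "post"])
handler = FakeHandler()
sync_result = handler.get("a.ipynb")
async_result = asyncio.run(handler.post("b.ipynb"))
```

The same decorator then works before and after a handler switches from a plain method to `async def`, which is the specific break mentioned above.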