jupyter_server
Handling large number of files
At the moment, when a user opens a folder from Notebook or JupyterLab, jupyter_server reads every file inside the folder using os.lstat, which is very costly for a large number of files.
https://github.com/jupyter-server/jupyter_server/blob/51e3ec362b2b12af48f0e101959c4cbec9d5cb33/jupyter_server/services/contents/filemanager.py#L262-L271
This makes it effectively impossible to open a folder containing a large number of files: the backend freezes for a long time before becoming responsive again. And even when the backend does return the data, the frontend crashes while rendering all the files. See https://github.com/jupyterlab/jupyterlab/issues/8700
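The per-entry stat pattern described above can be sketched as follows. This is a simplified illustration of the problem, not the actual jupyter_server code:

```python
import os

def list_dir_stat(path):
    """List a directory, calling os.lstat once per entry.

    Simplified sketch of the pattern described above: with N entries
    this performs N lstat system calls before anything is returned,
    so the request blocks the server for the whole scan on large folders.
    """
    entries = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        try:
            st = os.lstat(full)  # one system call per entry
        except OSError:
            continue  # skip entries that vanish mid-scan
        entries.append({"name": name, "size": st.st_size})
    return entries
```

Because the whole list is built eagerly, the response time grows linearly with the folder size, and the frontend receives the entire listing in one payload.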
It would be nice to improve this architecture, using pagination or another method to read the directory contents partially.
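A paginated listing could look roughly like the sketch below. The `offset`/`limit` parameter names are hypothetical, not part of any existing jupyter_server API; the point is that os.scandir yields entries lazily, so entries outside the requested page are never stat'ed:

```python
import itertools
import os

def list_dir_page(path, offset=0, limit=100):
    """Hypothetical paginated directory listing.

    Only the entries in the requested [offset, offset + limit) window
    are stat'ed, so the cost per request is bounded by `limit` rather
    than by the total number of files in the folder.
    """
    page = []
    with os.scandir(path) as it:
        for entry in itertools.islice(it, offset, offset + limit):
            st = entry.stat(follow_symlinks=False)
            page.append({"name": entry.name, "size": st.st_size})
    return page
```

Note that os.scandir gives no ordering guarantee, so a real API would also need a stable sort or cursor to make pages consistent across requests.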
Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template, as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:
I created a draft pull request https://github.com/jupyter-server/jupyter_server/pull/539
Together with my other commit https://github.com/cnydw/jupyterlab/commit/6e615c058d9b9e27caeba405b7c3f32446d90214 on the JupyterLab frontend, it can open a folder containing 100,000 files without problems.

The two commits I made are just a POC; the API changes can certainly be improved. I think it makes sense to first make the backend API changes in jupyter_server, then propagate the frontend changes to JupyterLab and Jupyter Notebook accordingly.
@fcollonval @telamonian
Hi, any updates on getting this merged?