
Question: maintaining a library of notebooks

Open lheagy opened this issue 6 years ago • 2 comments

Context

This is a request for ideas. I am helping maintain a library of Jupyter notebooks that is used by multiple instructors for a variety of geophysics courses. Right now, we have one large repository of notebooks on GitHub. Each instructor uses only a subset of them for their course, and the notebooks are delivered to students through Binder or a university-managed JupyterHub.

I don't particularly want to encourage instructors to fork the repository and delete the notebooks they don't need, because that makes it harder to incorporate the improvements they make while using the notebooks in their course.

Sketch of a solution

Lightweight repos that download the requested notebooks

The instructor creates a lightweight repository that includes an index.ipynb with an overview of their course, along with a notebooks.py (or similar) that contains a list of the notebook URLs to download for the course. A simple install script then installs dependencies and downloads the listed notebooks.

We will need to be diligent about tagging and versioning the notebook library so that if an instructor wants to "freeze" the course to the version of notebooks available at the beginning of the course, they can.
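As a rough illustration of the lightweight-repository idea, here is a minimal sketch of what notebooks.py and the download step might look like. The repository name, tag, and notebook paths are hypothetical placeholders, and the helper names (raw_url, download_notebooks) are made up for this example. It assumes the library is hosted on GitHub and that notebooks are fetched from raw.githubusercontent.com at a pinned tag.

```python
"""Sketch of a notebooks.py plus download helper for a lightweight course repo."""
from pathlib import Path
from urllib.request import urlopen

# Hypothetical values: the library repository, the pinned tag, and the
# notebook paths are placeholders, not names from the actual project.
LIBRARY_REPO = "example-org/notebook-library"
LIBRARY_REF = "v1.0.0"  # a tag "freezes" the course; a branch name would track updates
NOTEBOOKS = [
    "notebooks/dc_resistivity/intro.ipynb",
    "notebooks/em/plane_waves.ipynb",
]


def raw_url(repo: str, ref: str, path: str) -> str:
    """Build the raw-file URL for one notebook at a given tag or branch."""
    return f"https://raw.githubusercontent.com/{repo}/{ref}/{path}"


def _fetch(url: str) -> bytes:
    """Default fetcher: read the response body over HTTP(S)."""
    with urlopen(url) as response:
        return response.read()


def download_notebooks(dest, repo=LIBRARY_REPO, ref=LIBRARY_REF,
                       paths=NOTEBOOKS, fetch=_fetch):
    """Download each listed notebook into dest and return the written paths.

    Files are saved under their base name only, so two notebooks with the
    same file name in different folders would overwrite each other.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for path in paths:
        target = dest / Path(path).name
        target.write_bytes(fetch(raw_url(repo, ref, path)))
        written.append(target)
    return written
```

An install script could then call download_notebooks("notebooks") after installing dependencies. Pinning LIBRARY_REF to a tag is what lets an instructor keep the course on the notebook versions available at the start of term; this only works if the library is tagged consistently.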

Instructors create a branch on the repo

The instructor creates a branch and removes the notebooks that they don't want. This is fairly simple from the deployment perspective, but bringing in updates could be more challenging: it can be done, but it requires that the instructors be somewhat comfortable with git and with switching branches.

Input

  • Do you see any major flaws or drawbacks to an approach like this?
  • Have you seen other similar projects that we should look to for ideas?

lheagy avatar Dec 19 '18 18:12 lheagy

Parallel conversation happening on Twitter: https://twitter.com/lindsey_jh/status/1075449492860633088

lheagy avatar Dec 19 '18 18:12 lheagy

Would git submodules work? I imagine creating multiple repositories, each containing one set of notebooks, which could be accessed and managed by an organization or a specific person. On top of that, the full library could be assembled from git submodules pointing to those repositories. The only important decision would be how to define good sets of notebooks.

Advantages:

  • You can track everything independently, since the full library would be just a set of pointers to the original repositories (or even to specific branches of those repositories).

Drawbacks:

  • The groups of notebooks would need to be defined carefully, since they determine the structure of the full library.

aladinoster avatar Dec 20 '18 05:12 aladinoster