nbgitpuller
[Exit Status 128] - Nbgitpuller Doesn't Allow Access to R.Datahub
Hello! We're using nbgitpuller to have students pull files from a remote repo, and multiple students have run into the following issue when trying to log onto r.datahub:

For nearly all of these students, the file lab/lab01/lab01.Rmd appears to be staged but never fully committed, so the pull fails with exit status 128. This also leaves multiple duplicated files throughout the students' datahub repos, in all the subdirectories.
I'm still not sure how to reproduce this issue; it seems to happen to different students at different times when they click the nbgitpuller link. It has yet to happen to me, but I'm really curious to see what exactly could be causing it.
Some insight on this issue would be super helpful! I can also provide additional details if needed. (Also please forgive me if I forgot something crucial; this is the first time I'm writing an issue report).
Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:
Thanks for opening this, @cdbeon! It's a well-written, useful issue, so thank you for putting effort into it :)
Do you have a link to the repo that's causing this? My first thought is that maybe both the upstream repo and the users somehow had modified the same lines - that might be causing the error. I know we test for this in nbgitpuller, but maybe this is an edge case. This might explain it being hard-ish to reproduce. And if both upstream and users are constantly working on the same file, might also explain the backup files strewn around.
Does this go away for a user by itself? Or is it repeatable per-user? Being able to reproduce this will help a lot in figuring out what's going on and how we can fix it.
Thanks for replying, yuvi :)
Here's the link to the repo
The simple fix that I've been using (while trying to find out what exactly the error is) is to go to the student's terminal, clean the repo using git clean -f, then commit the one file from the pull that didn't get committed (i.e. lab/lab01/lab01.Rmd). Until recently, students were able to access their Datahubs with no problem after that, but I've now had a student for whom the problem persisted despite the fix.
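For anyone who wants to script the manual workaround described above, here is a minimal sketch in Python. It is not part of nbgitpuller: the names workaround_commands and apply_workaround are made up for illustration, and the commit message is arbitrary. Because git clean -f permanently deletes untracked files, the sketch defaults to a dry run with git clean -n, which only lists what would be removed.

```python
import subprocess


def workaround_commands(stuck_file, dry_run=True):
    """Return the git commands for the manual fix as argv lists.

    Hypothetical helper for illustration, not an nbgitpuller API.
    """
    if dry_run:
        # -n only prints what would be removed; nothing is deleted.
        return [["git", "clean", "-n"]]
    return [
        # Remove untracked files (the duplicated copies). Destructive!
        ["git", "clean", "-f"],
        # Commit only the file that was left staged by the failed pull.
        ["git", "commit", "-m",
         f"Commit {stuck_file} left staged by failed pull",
         "--", stuck_file],
    ]


def apply_workaround(repo_dir, stuck_file, dry_run=True):
    """Run the commands inside repo_dir, stopping at the first failure."""
    for argv in workaround_commands(stuck_file, dry_run):
        subprocess.run(argv, cwd=repo_dir, check=True)
```

Calling apply_workaround with the student's repo directory and lab/lab01/lab01.Rmd first shows what would be cleaned; passing dry_run=False then performs the clean and the commit. The commit step assumes git can determine a user name and email in the student's environment.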
To my knowledge, the error persists until their server auto-shuts down (due to inactivity) or they go to the control panel and stop/start their own server. The next time the students are able to access their Datahubs, however, you can see all the duplicated files (even for files that aren't being touched, e.g. the README file in the main directory).
Something to add: the problem sometimes occurs even when students haven't touched lab01.Rmd for weeks, and the number of cases tends to spike after the staff (I) merge changes into origin/master on the ph142-fa20 remote repo.
oh, also the nbgitpuller link that we're using:
https://r.datahub.berkeley.edu/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fph142-ucb%2Fph142-fa20&urlpath=rstudio%2F&branch=master
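To make the parts of that link easier to see, here is a small Python sketch that rebuilds it from its pieces. The function name build_nbgitpuller_link is made up for illustration; the path /hub/user-redirect/git-pull and the parameter names repo, urlpath and branch are simply read off the link above.

```python
from urllib.parse import urlencode


def build_nbgitpuller_link(hub_url, repo, urlpath, branch):
    """Assemble a git-pull link like the one quoted above (illustrative helper)."""
    # urlencode percent-encodes each value, e.g. "/" becomes %2F and ":" becomes %3A.
    query = urlencode({"repo": repo, "urlpath": urlpath, "branch": branch})
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{query}"


link = build_nbgitpuller_link(
    "https://r.datahub.berkeley.edu",
    repo="https://github.com/ph142-ucb/ph142-fa20",
    urlpath="rstudio/",
    branch="master",
)
print(link)
```

With these inputs the printed link matches the one above character for character, which confirms that the link itself is well-formed: it points at the master branch of ph142-ucb/ph142-fa20 and opens RStudio afterwards.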
Update: one small fix that we've attempted is to delete lab01.Rmd on the remote repo in hopes that there would be no conflicts... this actually resulted in more cases among students, and additional files within that directory were deleted as well (lab01.Rmd uses some of the other files, e.g. the autograder function).
I just had a student have a similar (but apparently different) experience. I have not had it myself but have no idea why it happened to him. I just tried using the same link to the repo and am having no problems. So far, he is the only student to have this problem. I tried manually closing his server but when he clicked on the link again it was the same problem with the exit status 128.
Here is the stack trace:

It says fatal: not a git repository (or any of the parent directories): .git which is odd because, as I said, I was able to use the same link (which is given as a button on MS Teams to my users so they don't have to copy/paste anything).
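As a way to check what that message is complaining about, here is a rough Python sketch of the lookup git performs: it starts in the working directory and walks up through the parents looking for a .git entry. This is a simplified approximation (it ignores things like the GIT_DIR environment variable), and find_git_root is a made-up name. The demo uses a throwaway temporary directory, not the student's real home directory.

```python
import tempfile
from pathlib import Path


def find_git_root(start):
    """Return the nearest directory at or above start that contains .git, else None."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


# Demo in a temporary directory: a fake repo with a nested subfolder.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp).resolve() / "Workbook_02"
    (repo / ".git").mkdir(parents=True)
    nested = repo / "data" / "raw"
    nested.mkdir(parents=True)
    found_from_nested = find_git_root(nested) == repo
    found_from_repo = find_git_root(repo) == repo
```

If running find_git_root on the student's Workbook_02 folder returns None, the folder exists but is not (or is no longer) a git checkout, which would be consistent with the error above.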
And here is the nbgitpuller link: http://jupyter.rjc.local/user/cferster/git-pull?repo=https%3A%2F%2Fgithub.com%2Fconnorferster%2FWorkbook_02&urlpath=tree%2FWorkbook_02%2FWorkbook+2.ipynb&branch=main
Just adding that I have an issue that sounds very similar to @cdbeon's.
- it only occurs for a few students at a time, and only sporadically
- it seems to occur only when I make a change upstream
- the only pushes I ever make upstream are to add an entirely new folder to the repo, which shouldn't cause any issue AFAIK
- I cannot reproduce this on my own user account
- students in my class log in to our hub from Canvas using LTIAuthenticator
- when it happens, I notice afterward that there is a staged but uncommitted file (I think it's typically the same file that appears in the nbgitpuller link)
- when it happens, any time the student tries to log in they end up with yet another copy of basically every file in their repository
- I have attached the full nbgitpuller.log. It seems to indicate that nbgitpuller thinks that all of the upstream files have been renamed for some reason...?
- Here is what a student's directory looks like when the issue occurs for them: folder_structure.log (ignore the fact that the folder is called dsci-100-student-backup and not dsci-100-student; this is a backup folder I made of one instance when I was originally trying to figure out what was going on.)
- the repository that is involved is https://github.com/ubc-dsci/dsci-100-student (the current commit hash is dd4313e)
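Since the staged-but-uncommitted file comes up in both reports, here is a minimal Python sketch for spotting it in a student's checkout. It parses the output of git status --porcelain, where the first character of each line is the status in the index (staging area). The function names are made up for illustration, and the sample output below is invented, not taken from the attached logs.

```python
import subprocess


def staged_paths(porcelain_output):
    """Return entries whose index status shows a staged change."""
    staged = []
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue
        index_status = line[0]
        # " " means nothing staged, "?" untracked, "!" ignored.
        if index_status not in (" ", "?", "!"):
            # For renames the entry has the form "old -> new".
            staged.append(line[3:])
    return staged


def list_staged(repo_dir):
    """Run git status in repo_dir and return its staged entries."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return staged_paths(result.stdout)


# Invented sample: one staged file, one untracked copy, one unstaged edit.
sample = (
    "A  materials/tutorial_01/tutorial_01.ipynb\n"
    "?? materials/tutorial_01/tutorial_01_copy.ipynb\n"
    " M README.md\n"
)
print(staged_paths(sample))
```

Running list_staged on an affected student's folder right after the failure would show whether the staged file really is the one named in the nbgitpuller link.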
Here is an example of the kind of nbgitpuller link I'm using:
https://dsci-100-student.stat.ubc.ca/jupyter/hub/lti/launch?custom_next=/jupyter/hub/user-redirect/git-pull%3Frepo%3Dhttps%3A%2F%2Fgithub.com%2FUBC-DSCI%2Fdsci-100-student%26subPath%3Dmaterials%2Ftutorial_01%2Ftutorial_01.ipynb
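Because the git-pull URL is nested inside the custom_next parameter of the LTI launch link, it can be hard to read. The following Python sketch unpacks it in two steps using only the standard library; decode_lti_launch_link is a made-up name, and the parameter names are the ones visible in the link above.

```python
from urllib.parse import parse_qs, urlsplit


def decode_lti_launch_link(link):
    """Return (git-pull path, parameters) hidden inside custom_next."""
    # Step 1: decode the outer query to recover the redirect target.
    outer = parse_qs(urlsplit(link).query)
    next_url = outer["custom_next"][0]
    # Step 2: decode the query of the redirect target itself.
    inner = urlsplit(next_url)
    params = {key: values[0] for key, values in parse_qs(inner.query).items()}
    return inner.path, params


launch_link = (
    "https://dsci-100-student.stat.ubc.ca/jupyter/hub/lti/launch"
    "?custom_next=/jupyter/hub/user-redirect/git-pull"
    "%3Frepo%3Dhttps%3A%2F%2Fgithub.com%2FUBC-DSCI%2Fdsci-100-student"
    "%26subPath%3Dmaterials%2Ftutorial_01%2Ftutorial_01.ipynb"
)
path, params = decode_lti_launch_link(launch_link)
print(path)
print(params)
```

Decoded this way, the link carries a repo and a subPath parameter and no branch parameter.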
Here are versions of everything that I'm using. I know some things are currently outdated. My next attempt at resolving the issue is to upgrade various things (especially nbgitpuller, of course):
On the actual server:
pip3 list | grep jupyter
jupyter (1.0.0)
jupyter-client (6.1.7)
jupyter-console (6.2.0)
jupyter-core (4.6.3)
jupyter-telemetry (0.1.0)
jupyterhub (1.4.2)
jupyterhub-idle-culler (1.0)
jupyterhub-ltiauthenticator (0.4.0)
pip3 list | grep nb
nbconvert (5.6.1)
nbformat (5.0.7)
nbgrader (0.5.6)
widgetsnbextension (3.5.1)
Inside the docker container that runs when a student logs into the JupyterHub:
pip3 list | grep jupyter
jupyter-client 6.1.7
jupyter-core 4.6.3
jupyter-telemetry 0.0.5
jupyterhub 1.1.0
jupyterlab 2.2.5
jupyterlab-git 0.23.3
jupyterlab-pygments 0.1.1
jupyterlab-server 1.2.0
pip3 list | grep nb
nbclient 0.5.0
nbconvert 6.0.3
nbdime 2.1.0
nbformat 5.0.7
nbgitpuller 0.9.0.dev0
Hopefully helpful. I will update this post if I end up resolving the problem by upgrading my nbgitpuller et al versions.
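For comparing versions between the server and the student container after an upgrade, a small Python sketch like the following could stand in for the pip3 list | grep commands. It assumes Python 3.8 or newer (for importlib.metadata); installed_versions is a made-up name, and the package list is just a selection from the output above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    names = ["nbgitpuller", "jupyterhub", "jupyterlab", "jupyter-client"]
    for name, found in installed_versions(names).items():
        print(f"{name}: {found or 'not installed'}")
```

Running the same script in both environments gives two short reports that can be pasted side by side into an issue.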