nbstripout installation causes incorrect 'git status' results

Open jykim opened this issue 6 years ago • 19 comments

I see changes reported for notebooks I haven't touched since I installed nbstripout. Interestingly, when I uninstall nbstripout and install it again, the changes are gone. Besides the general performance issue, this prevents us from adopting nbstripout as a team.

 $ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)
... list of notebooks I haven't touched ...

----(after uninstall nbstripout and install it again)---

 $ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working tree clean

jykim avatar Aug 07 '17 18:08 jykim

What do you get as git diff when you hit the case of allegedly modified notebook files?

kynan avatar Aug 07 '17 21:08 kynan

I generally see nothing when I run git diff on those files, but I haven't tested them all. (I can't share them since they're my work notebooks anyway.) As I mentioned, it's especially interesting that these files disappear from the list when I uninstall nbstripout and install it again, which is weird.

jykim avatar Aug 11 '17 19:08 jykim

Today I found that these changes show up after 'git pull'. I looked at a few other diffs and found that they are diffs between (before) notebooks with output and (after) notebooks with output stripped. So git may be telling me to strip the output from notebooks that contain outputs from previous commits.

Does this make sense? But I still don't understand why these notebooks show up after 'git pull' and why they're gone after uninstall/reinstall.

jykim avatar Aug 11 '17 21:08 jykim

So what exactly does git status report in this situation? Are those notebooks modified? Or is it a mode change?

Also, could you run nbstripout --status before and after the reinstall and report if there's a difference?
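One way to capture that comparison is a small helper that prints both the filter status and the short git status under a label. This is only a sketch: the function name `report_filter_state` is made up for illustration, and it assumes nbstripout and git are on PATH and that it is run from inside the affected repository.

```shell
# Sketch: print the nbstripout filter status and the short git status under a label.
# Assumes nbstripout and git are on PATH and the current directory is inside the repo.
report_filter_state() {
    echo "== $1 =="
    # --status may exit nonzero when the filter is not installed; keep going either way
    nbstripout --status || true
    git status --short
}
```

Calling `report_filter_state "before reinstall"`, then running `nbstripout --uninstall` and `nbstripout --install`, then calling `report_filter_state "after reinstall"` gives two snapshots that can be compared side by side.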

kynan avatar Aug 21 '17 15:08 kynan

@jykim Any update?

kynan avatar Aug 04 '18 13:08 kynan

This might be related to an issue I posted about here. I still haven't figured it out, so any help is welcome.

stas00 avatar Aug 17 '18 18:08 stas00

I have the same problem: when nbstripout is enabled and I check out several files, the discarded changes are not reflected by git status (git status still shows the files as changed, although the changes are actually gone). If I do 'nbstripout --uninstall' for the repo and then do 'nbstripout --install' again, git status shows the correct state.

melaanya avatar Sep 13 '18 10:09 melaanya

@melaanya, try changing the configuration of the content filters as explained here; it'll help you debug the issue. You will be able to tell when nbstripout is activated.

stas00 avatar Sep 13 '18 16:09 stas00

@stas00 I've looked at your issue and am about as puzzled as you are. Thanks for documenting it so thoroughly. Guess we'll have to find a better git plumber than I am to explain this behavior.

kynan avatar Oct 16 '18 18:10 kynan

Currently, pretty much any time one of our developers forgets to enable the filter and commits w/o stripping the notebooks out, it breaks git pull for anybody who already had a checked out version. And other than doing a new clone, the only solution seems to be to disable the filter, git pull and re-enable the filter.

What we really need is a way to enforce content filters repo-wide, except git doesn't support that due to security reasons (overriding user's config is unsafe). I'm not sure what to do about it. It's a constant problem for us. Perhaps there is a need to re-design content filters so that they aren't configured via a private .gitconfig and can be automatically installed for everybody (note that our repo carries its own version of the nbstripout filter, so the culprit is just not being able to enforce the configuration). Unless perhaps I'm missing something and you have some insights of another way of configuring it, so that the user doesn't need to do anything.
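To make the per-clone nature of the problem concrete, here is a minimal sketch of the setup each contributor has to run once after cloning. It assumes the stock nbstripout executable is on PATH (not the repo-specific filter mentioned above) and that the repository commits a .gitattributes file mapping notebooks to the filter. The function name `enable_notebook_filter` is hypothetical.

```shell
# Sketch: per-clone filter setup that every contributor has to run once.
# Assumes the repo commits a .gitattributes containing the line:
#   *.ipynb filter=nbstripout
# and that an nbstripout executable is on PATH.
enable_notebook_filter() {
    git config --local filter.nbstripout.clean nbstripout &&
    git config --local filter.nbstripout.smudge cat &&
    git config --local filter.nbstripout.required true
}
```

The .gitattributes file travels with the repository, but the three `git config` entries live in each clone's private config, which is why a contributor who skips this step can commit unstripped notebooks.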

And of course, there are server-side hooks which could do the policing, but that doesn't sound like the best solution, particularly since, last I looked, github doesn't support those.

stas00 avatar Oct 16 '18 18:10 stas00

Yeah, to me this sounds like a Git issue rather than an nbstripout issue and I'm not sure what (if anything) we could fix in nbstripout to help with this. Have you considered raising the issue on the Git mailing list?

kynan avatar Oct 28 '18 08:10 kynan

I already did so to no avail.

If it weren't for github and we were running our own git backend then server-side hooks could do the policing, but since github doesn't support that, there is nothing one can do to enforce that.

The best "enforcing" solution I found is to run a CI job that flags this problem as it appears, and hopefully the developers watch for [x]'s in their commit status on github.

echo "Check we are starting with clean git checkout"
if [ -n "$(git status -uno -s)" ]; then echo "git status is not clean"; false; fi
echo "Trying to strip out notebooks"
tools/fastai-nbstripout -d dev_nb/*ipynb dev_nb/*/*ipynb docs_src/*ipynb docs_src/*/*ipynb
echo "Check that strip out was unnecessary"
git status -s # display the status to see which nbs need cleaning up
if [ -n "$(git status -uno -s)" ]; then echo -e "!!! Detected unstripped out notebooks\n!!!Remember to run tools/run-after-git-clone"; false; fi

and the best resolution I found is:

disable the filters
git pull
fix the problems
enable the filters
git commit -a
git push

which in the fastai project is done with a script (-d is for disable):

tools/trust-origin-git-config -d
git pull
tools/trust-origin-git-config
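For repos that use the stock nbstripout commands rather than the fastai scripts, the same disable, pull, re-enable sequence might look roughly like the sketch below. It assumes the filter was set up per repo with `nbstripout --install`; the function name `pull_with_filter_disabled` is made up for illustration.

```shell
# Sketch: pull with the nbstripout filter temporarily removed, then restore it.
# Assumes the filter was installed per repo via `nbstripout --install`.
pull_with_filter_disabled() {
    nbstripout --uninstall || return 1
    pull_rc=0
    git pull || pull_rc=$?
    # Re-enable the filter even if the pull failed
    nbstripout --install || return 1
    return "$pull_rc"
}
```

The filter is reinstalled regardless of the pull result so that a failed pull does not leave the clone without stripping enabled.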

stas00 avatar Oct 28 '18 19:10 stas00

FYI, we're discussing adding a dedicated nbstripout command to help with this flow in #108. Could be worth taking inspiration from the fastai scripts.

kynan avatar Nov 10 '19 22:11 kynan

I see the same problem, stopped using the tool after 2 days.

DanielRudnicki avatar Nov 10 '20 11:11 DanielRudnicki

@DanielRudnicki Reproduction steps for the failures you're seeing would help.

kynan avatar Apr 05 '21 15:04 kynan