Git history is getting huge due to Jupyter Notebooks
I just cloned the repo and noticed it took quite some time: it downloaded 550 MB for a fairly low-LOC repo.
I wonder if there's any merit in switching to a fork/squash-merge approach instead of the branch-based one? Realistically, that history is only ever going to grow.
Counterpoint:
- This repo is not really something many people other than devs clone, so this might be overengineering.
wdyt @giovp @LucaMarconato ?
I wonder whether what is proposed here would be an acceptable solution: https://stackoverflow.com/questions/1209999/how-to-use-git-to-get-just-the-latest-revision-of-a-project
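For reference, a minimal sketch of what that could look like, assuming the linked answer refers to a shallow clone via `git clone --depth`. The helper names and the example URL below are hypothetical and only there for illustration; this is not something the repo ships.

```python
import subprocess
from typing import List


def shallow_clone_command(url: str, dest: str, depth: int = 1) -> List[str]:
    """Build the git command for a shallow clone.

    --depth limits the download to the most recent `depth` commits,
    so older notebook revisions are not fetched.
    """
    return ["git", "clone", "--depth", str(depth), url, dest]


def shallow_clone(url: str, dest: str, depth: int = 1) -> None:
    """Run the shallow clone; raises CalledProcessError if git fails."""
    subprocess.run(shallow_clone_command(url, dest, depth), check=True)


if __name__ == "__main__":
    # Hypothetical URL and target directory, replace with the real ones.
    print(" ".join(shallow_clone_command("https://example.org/some/repo.git", "repo")))
```

This only reduces what a fresh clone downloads; the full history still lives on the remote, and a shallow clone can later be completed with `git fetch --unshallow` if needed.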
I don't think the extra 500 MB on disk is the issue (mildly inconvenient at best), but realistically this history will grow much larger very quickly if we keep working on it without squashing these Jupyter Notebook commits. The same goes for spatialdata-notebooks.
Maybe we could implement a "squash-and-merge-only" rule to counteract that a bit?
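One possible way to enforce this, sketched under assumptions: GitHub's REST endpoint for updating a repository accepts the `allow_squash_merge`, `allow_merge_commit` and `allow_rebase_merge` flags, so a maintainer with admin rights could switch the other merge methods off. The function name, owner, repo and token below are placeholders, and the same switches are also available in the repository settings UI.

```python
import json
import urllib.request

# Merge-method flags from GitHub's "update a repository" endpoint.
SQUASH_ONLY_SETTINGS = {
    "allow_squash_merge": True,
    "allow_merge_commit": False,
    "allow_rebase_merge": False,
}


def build_squash_only_request(owner: str, repo: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a PATCH request enabling squash merges only."""
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}",
        data=json.dumps(SQUASH_ONLY_SETTINGS).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; sending requires a token with admin access:
    #   urllib.request.urlopen(req)
    req = build_squash_only_request("some-org", "some-repo", "TOKEN")
    print(req.get_method(), req.full_url)
```

Note that this only limits how fast the history grows from now on: intermediate notebook revisions from a PR would no longer land on the main branch, but commits that are already there are not rewritten.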
That could be a good approach. @giovp wdyt?
It would be interesting to decrease the size of this repo by squashing commits, but I'm not sure how to do it. The solution you pointed to, @LucaMarconato, is indeed interesting from a dev perspective. And @timtreis, you are right that the whole point of keeping the notebooks repo separate is that we knew this would happen eventually.