
This repository's size is huge

Open mcara opened this issue 3 years ago • 7 comments

This repository includes directories for MAST downloads as well as CRDS reference files, bringing its size close to 4 GB. In my opinion this is not necessary, and all of these directories should be removed from the repository.

mcara avatar Jan 26 '22 03:01 mcara

In my recent clone of this repository (2022-02-15), it seems to be only about 1 GB (984 MB), and the vast majority of that (945 MB) is in the .git subdirectory. I'm guessing that images, FITS files, etc. were once under version control and have since been removed, but they live on in the .git history.
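For anyone who wants to check the same split on their own clone, here is a minimal sketch that compares the size of the whole working directory with the size of its .git subdirectory. The function names `dir_size_bytes` and `git_share` are made up for this illustration, and the numbers it reports will differ somewhat from what `du` shows, since it sums file sizes rather than disk blocks.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def git_share(repo_root):
    """Return (total_bytes, git_bytes) for a clone located at repo_root."""
    total = dir_size_bytes(repo_root)
    git_bytes = dir_size_bytes(os.path.join(repo_root, ".git"))
    return total, git_bytes


if __name__ == "__main__":
    total, git_bytes = git_share(".")
    print(f"total: {total / 1e6:.0f} MB, .git: {git_bytes / 1e6:.0f} MB")
```

If most of the total sits in .git while the checked-out files are small, that points to large files that were committed at some point and later deleted.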

nkerman avatar Feb 15 '22 15:02 nkerman

1 GB is also a lot of disk space for something that isn't cat videos.

pllim avatar Feb 15 '22 17:02 pllim

Reducing the repo size would require rewriting the git history.

larrybradley avatar Feb 15 '22 17:02 larrybradley

Only 500-ish commits to squash...

pllim avatar Feb 15 '22 17:02 pllim

Going forward, maintainers need to be strict in the review process and not allow big data files (or any data files) to be checked in.

pllim avatar Feb 15 '22 17:02 pllim

One major issue is not clearing notebook cells before committing. (I think I was guilty of this early on in a different repo.)
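As a rough illustration of what clearing a notebook amounts to, the sketch below empties the outputs of every code cell in an .ipynb file using only the standard library. It assumes the nbformat v4 JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`); the helper names `clear_outputs` and `clear_file` are hypothetical. Established tools such as nbstripout cover the same job more thoroughly.

```python
import json


def clear_outputs(notebook):
    """Empty outputs and execution counts of code cells, in place.

    Assumes an nbformat v4 style dict with a top-level "cells" list.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_file(path):
    """Rewrite the notebook at path with all code cell outputs removed."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    clear_outputs(notebook)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1, ensure_ascii=False)
        f.write("\n")
```

Embedded plots and images are stored as base64 text inside `outputs`, which is why uncleared notebooks inflate the history so quickly.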

nkerman avatar Feb 15 '22 17:02 nkerman

Proposal:

1. Push the current state of the repo to another repo that no one will ever use (for historical record keeping).
2. Clear all the notebooks and remove unwanted data here.
3. Squash all the commits and force-push to master.
4. ???
5. Profit!

pllim avatar Feb 15 '22 17:02 pllim