
Fix repository size

Open simonusher opened this issue 2 years ago • 3 comments

The repository size is unnecessarily large. If I'm correct it is primarily due to the gh-pages branch.

simonusher avatar Jun 28 '23 21:06 simonusher

Using the commands below, I checked that cloning only the main branch takes about 3 MiB, whereas fetching gh-pages afterwards downloads 1.53 GiB.

git clone git@github.com:srai-lab/srai.git --single-branch
Cloning into 'srai'...
remote: Enumerating objects: 1802, done.
remote: Counting objects: 100% (19/19), done.
remote: Compressing objects: 100% (18/18), done.
remote: Total 1802 (delta 3), reused 12 (delta 0), pack-reused 1783
Receiving objects: 100% (1802/1802), 3.12 MiB | 885.00 KiB/s, done.
Resolving deltas: 100% (1040/1040), done.


git fetch origin gh-pages
remote: Enumerating objects: 6257, done.
remote: Counting objects: 100% (441/441), done.
remote: Compressing objects: 100% (125/125), done.
remote: Total 6257 (delta 164), reused 441 (delta 164), pack-reused 5816
Receiving objects: 100% (6257/6257), 1.53 GiB | 9.92 MiB/s, done.
Resolving deltas: 100% (2910/2910), completed with 1 local object.
From github.com:srai-lab/srai
 * branch            gh-pages   -> FETCH_HEAD

Looking further, the largest objects on the gh-pages branch together take up more than 1 GiB. They can be listed with:

git ls-tree -r --long HEAD | sort -k 4 -n -r | less

100644 blob c6a29458acf330ff53d5d18bb261af47f9b49fc8 75558497   0.1.4/examples/loaders/osm_way_loader/index.html
100644 blob a15c4d883f4911616c29c3b493c8e89b9708193d 75556712   dev/examples/loaders/osm_way_loader/index.html
100644 blob 789db2383c9389f4e912d1576cb31a3de42f9828 75530291   0.2.0/examples/loaders/osm_way_loader/index.html
100644 blob aabe12d24bbc9d1619cb2854b25734a0dc465269 75447716   0.1.0/examples/loaders/osm_way_loader/index.html
100644 blob 559f46c8a9682340b50ae0aa9f676650f3324d1f 75447716   0.1.1/examples/loaders/osm_way_loader/index.html
100644 blob 4734654330d20bdac69da8b5c7389695e91efa1c 42931670   dev/examples/loaders/files/85f58fbd393c2d5f03d18fee77a7d13f67f2a202e3f932a96992088ead539cb9.osm.pbf
100644 blob bb9e31f5729c6ae945ba6424b94612ece9299d47 42883951   0.2.0/examples/loaders/files/b38dc00f9c72b58ea114e1001c230167ab0cfde4886d98504807f9c5aef892f0.osm.pbf
100644 blob fc03438658de7b8bbd51306fb534d1bd76522bc6 42745857   0.1.4/examples/loaders/files/b38dc00f9c72b58ea114e1001c230167ab0cfde4886d98504807f9c5aef892f0.osm.pbf
100644 blob 804d5410bbc75904461773a8f1e13b9feeb834bf 42122265   0.1.1/examples/loaders/files/b38dc00f9c72b58ea114e1001c230167ab0cfde4886d98504807f9c5aef892f0.osm.pbf
100644 blob b172d176d59f7c0351279133841e8b815a75b836 42121456   0.1.0/examples/loaders/files/b38dc00f9c72b58ea114e1001c230167ab0cfde4886d98504807f9c5aef892f0.osm.pbf
100644 blob a828e0db5d6e3c4a3ac91f507d2531866a85d489 35307667   dev/examples/neighbourhoods/adjacency_neighbourhood/index.html
100644 blob 69bdfa89b5b46ff2e1dfbe731e3cb72763af146a 35302816   0.2.0/examples/neighbourhoods/adjacency_neighbourhood/index.html
100644 blob 59a8aa097c336725c93854a939eeea44e5852da0 35156992   0.1.1/examples/neighbourhoods/adjacency_neighbourhood/index.html
100644 blob 6b1bc708021f1473615c5d8b6ae2377e2e1d3139 35080130   0.1.0/examples/neighbourhoods/adjacency_neighbourhood/index.html
100644 blob 281d93b22fcc36c149c0738624a95e78eaf451b5 35031122   0.1.4/examples/neighbourhoods/adjacency_neighbourhood/index.html
100644 blob 00ecd9aa23d6b07b06f4e03f3f9cd2d366261d61 20655358   dev/examples/regionalizers/voronoi_regionalizer/index.html
100644 blob b9b116ef15487dc08fc62f199ef3b848e8bfa7c7 20648966   0.1.0/examples/regionizers/voronoi_regionizer/index.html
100644 blob 6ae78737bee7231eacb4fef03296610a4fa0ebff 20646653   0.1.4/examples/regionalizers/voronoi_regionalizer/index.html
100644 blob 7e2f42b4eaece8052ea69b5acf988bc690e0d16f 20625325   0.2.0/examples/regionalizers/voronoi_regionalizer/index.html
100644 blob fce0b4f20aaf4581bdb272d68ebcb05761bb6449 20620633   0.1.1/examples/regionizers/voronoi_regionizer/index.html
100644 blob 349417414697784b14e629302ebab42abbe0385b 14375801   dev/examples/loaders/files/example.zip
100644 blob 349417414697784b14e629302ebab42abbe0385b 14375801   0.2.0/examples/loaders/files/example.zip
100644 blob 349417414697784b14e629302ebab42abbe0385b 14375801   0.1.4/examples/loaders/files/example.zip
100644 blob 349417414697784b14e629302ebab42abbe0385b 14375801   0.1.1/examples/loaders/files/example.zip
100644 blob 349417414697784b14e629302ebab42abbe0385b 14375801   0.1.0/examples/loaders/files/example.zip
100644 blob 767177eb71ec5c0e4b38d52f632b5c3d276b401b 9056663    dev/examples/regionalizers/administrative_boundary_regionalizer/index.html
100644 blob 5c12f0a8687f544b10d70687807cfcb23c6c20cc 9030155    0.2.0/examples/regionalizers/administrative_boundary_regionalizer/index.html
100644 blob a98f5c2dfaa10165244a4f710b8d491d216460d9 9029158    0.1.4/examples/regionalizers/administrative_boundary_regionalizer/index.html
100644 blob 7da467c4263b247d4306ff3feb5623d3bcff9424 9023358    0.1.1/examples/regionizers/administrative_boundary_regionizer/index.html
100644 blob 5822bb176c7eca03e155c9d44fa56dd9ec5d2863 9023358    0.1.0/examples/regionizers/administrative_boundary_regionizer/index.html
100644 blob dc1b358dc26a31c2db709e40994a839a5ce3eb4a 7605285    dev/examples/embedders/files/ccfc4ec912ac803c97b939feba28e0a57de61e0e543e36e51c010f9f0a167e37.osm.pbf
100644 blob 5c30580800f40a854ef7bc1d5472ebabab4157a7 7569202    0.2.0/examples/embedders/files/ccfc4ec912ac803c97b939feba28e0a57de61e0e543e36e51c010f9f0a167e37.osm.pbf
100644 blob d3c4943eae12b31c78b7cc7575f9b14c97b92cf8 7550761    0.1.4/examples/embedders/files/ccfc4ec912ac803c97b939feba28e0a57de61e0e543e36e51c010f9f0a167e37.osm.pbf
100644 blob d37c1cb3ef2cd9c2b029cbdcf9ed333cd0e3e58b 7436267    dev/examples/embedders/hex2vec_embedder/index.html
100644 blob c3c3bda60a06c441cf0cb81bb9ab6fe5eec3f96f 7432105    0.1.1/examples/embedders/files/ccfc4ec912ac803c97b939feba28e0a57de61e0e543e36e51c010f9f0a167e37.osm.pbf
100644 blob a788fd0d681e690f8e061d1e034c7c2e6536b17e 7432105    0.1.0/examples/embedders/files/ccfc4ec912ac803c97b939feba28e0a57de61e0e543e36e51c010f9f0a167e37.osm.pbf
100644 blob 4b5c74b9605c44b1d02a0016bd2cfd71c09531bc 7410349    0.2.0/examples/embedders/hex2vec_embedder/index.html
100644 blob 4be4c45a539fa1f7263387ff9cd3f4eab77261ed 7408679    0.1.4/examples/embedders/hex2vec_embedder/index.html
100644 blob 74e9c439de9b93ef95e22088071905588e9177e0 7313386    0.1.0/examples/embedders/hex2vec_embedder/index.html
100644 blob a32ae71aa45b38a06bffcf8a9cef7ae04ec30ffe 7313230    0.1.1/examples/embedders/hex2vec_embedder/index.html
100644 blob 242e55f3da19a1afca53f9844d82ece80133df1b 6897928    dev/examples/loaders/osm_pbf_loader/index.html
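
To put numbers on a listing like the one above, here is a minimal sketch that post-processes `git ls-tree -r --long` output read from stdin. The function names are hypothetical and not part of git or this repository. The sizes are uncompressed blob sizes for a single commit's tree, so they only approximate what a fetch transfers. The first function counts each blob hash once, since identical files (example.zip has the same hash in every version directory) are stored only once. The second totals bytes per top-level directory, which should show roughly how much each published docs version adds.

```shell
# Both helpers read `git ls-tree -r --long` output from stdin.
# Line format: <mode> <type> <hash> <size>\t<path>

# Total bytes of all blobs, counting each distinct hash only once.
sum_unique_blob_bytes() {
  awk '$2 == "blob" && !seen[$3]++ { total += $4 }
       END { printf "%.0f\n", total }'
}

# Bytes per top-level directory (e.g. 0.1.0, 0.2.0, dev), largest first.
# Duplicated blobs are counted in every directory that contains them.
bytes_by_top_dir() {
  awk -F'\t' '{
        split($1, meta, " ")    # meta[2] = type, meta[4] = size
        split($2, parts, "/")   # parts[1] = top-level directory
        if (meta[2] == "blob") total[parts[1]] += meta[4]
      }
      END { for (d in total) printf "%.0f %s\n", total[d], d }' |
    sort -k1,1nr
}
```

With gh-pages checked out, this could be run as `git ls-tree -r --long HEAD | bytes_by_top_dir`.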

simonusher avatar Jul 17 '23 21:07 simonusher

Ideas for resolving this:

  1. Ignore it, and possibly mention it in the contribution guide so that people don't have to download a couple of gigabytes to make a contribution. Not ideal, as the repository will just keep growing at an alarming rate.
  2. Move the docs to another repo. This moves the issue away from the srai repository, but has the same problem as 1.
  3. Either get rid of the maps in their current form (interactive etc.), or somehow compress them and remove unnecessary files, if there are any. This will require a history rewrite if we want to reduce the size of already existing commits.

simonusher avatar Jul 17 '23 21:07 simonusher

A tool for cleaning git history: https://rtyley.github.io/bfg-repo-cleaner/
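
Before rewriting history with a size-based cleaner, it may help to preview which files a given threshold would catch. The sketch below filters `git ls-tree -r --long` output read from stdin. The function name `blobs_over_bytes` is hypothetical, and the threshold is an assumption to tune. As far as I know, BFG takes its size limit via `--strip-blobs-bigger-than`, but check the linked page for the exact usage. The preview only covers the tree of one commit, whereas a history rewrite affects blobs in all past commits.

```shell
# Print "<size> <path>" for every blob larger than the given byte threshold.
# Reads `git ls-tree -r --long` output from stdin.
# $1: threshold in bytes (e.g. 10000000 for roughly 10 MB)
blobs_over_bytes() {
  awk -F'\t' -v min="$1" '{
        split($1, meta, " ")    # meta[2] = type, meta[4] = size
        if (meta[2] == "blob" && meta[4] + 0 > min + 0)
          print meta[4], $2
      }'
}
```

For example, `git ls-tree -r --long HEAD | blobs_over_bytes 10000000` would list every file above roughly 10 MB in the checked-out tree.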

Calychas avatar Jul 19 '23 19:07 Calychas