Reduce the size of the bids examples repository and rewrite the history

Open Remi-Gau opened this issue 2 years ago • 5 comments

The bids-examples repo is now close to 100 MB: this is really not ideal for something meant to be lightweight.

Most of the weight comes from the stimuli examples.

$ du -h -d 0  */stimuli | sort -rh
23M    eeg_ds000117/stimuli
8,1M    eeg_ds003645s_hed/stimuli
8,1M    eeg_ds003645s_hed_longform/stimuli
8,1M    eeg_ds003645s_hed_inheritance/stimuli
2,0M    fnirs_automaticity/stimuli
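For anyone without `du` at hand, the same measurement can be sketched in Python. This is only an illustration: the function name `stimuli_sizes` is hypothetical, and it sums apparent file sizes, so the numbers can differ slightly from the disk usage that `du` reports.

```python
from pathlib import Path


def stimuli_sizes(repo_root):
    """Return (size_in_bytes, relative_path) for each <dataset>/stimuli directory, largest first."""
    root = Path(repo_root)
    sizes = []
    for stimuli_dir in root.glob("*/stimuli"):
        if not stimuli_dir.is_dir():
            continue
        # Sum the size of every regular file below the stimuli directory.
        total = sum(f.stat().st_size for f in stimuli_dir.rglob("*") if f.is_file())
        sizes.append((total, stimuli_dir.relative_to(root).as_posix()))
    return sorted(sizes, reverse=True)
```

Calling `stimuli_sizes(".")` from the root of a local clone returns the list sorted from largest to smallest, like the `sort -rh` above.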

Given that the content of the stimuli folder is almost completely unspecified in the BIDS specification, it seems excessive that these folders take up so much space.

As discussed during the last maintainers meeting, our first approach would be to zero the content of every stimuli file in there and then rewrite the history.

Pinging @VisLab to get the opinion of the HED team. Should we be concerned that not having examples of the actual stimuli images may reduce the usefulness of the examples when it comes to understanding the HED tags?

Remi-Gau avatar Jun 02 '23 15:06 Remi-Gau

Whatever decision we agree on, the history of the repo may need to be rewritten:

  • https://github.com/bids-standard/bids-examples/issues/119

Remi-Gau avatar Jun 02 '23 15:06 Remi-Gau

We can remove the stimuli directories. The dataset on OpenNeuro has them. Do you want me to do it?

VisLab avatar Jun 02 '23 16:06 VisLab

> We can remove the stimuli directories. The dataset on OpenNeuro has them. Do you want me to do it?

If you can do it that'd be much appreciated.

There is no need to remove them; you can simply truncate the data files they contain:

https://github.com/bids-standard/bids-examples/blob/master/CONTRIBUTING.md#how-to-truncate-data-files-to-0kb
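For illustration only, here is a minimal Python sketch of what truncating could look like. It is not the procedure from CONTRIBUTING.md: the function name `truncate_stimuli` and its `dry_run` flag are hypothetical, and it assumes the files to empty all live under a `<dataset>/stimuli` directory.

```python
from pathlib import Path


def truncate_stimuli(repo_root, dry_run=True):
    """Empty every non-empty regular file under <dataset>/stimuli; return the paths concerned."""
    root = Path(repo_root)
    touched = []
    for stimuli_dir in root.glob("*/stimuli"):
        for path in stimuli_dir.rglob("*"):
            # Skip directories and symlinks; only regular files are emptied.
            if path.is_symlink() or not path.is_file():
                continue
            if path.stat().st_size == 0:
                continue  # already empty, nothing to do
            if not dry_run:
                # Opening in "wb" mode truncates the file to 0 bytes and keeps its name.
                with open(path, "wb"):
                    pass
            touched.append(path.relative_to(root).as_posix())
    return sorted(touched)
```

With the default `dry_run=True` the function only lists the files it would empty, which makes it easy to review the list before running it with `dry_run=False`. Truncating the files only shrinks the current checkout; the old blobs stay in the git history until it is rewritten.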

Remi-Gau avatar Jun 02 '23 16:06 Remi-Gau

Amended the title of this issue to reflect the steps left to do

Remi-Gau avatar Jul 14 '23 12:07 Remi-Gau

FWIW, the recently-ish added git-replace command could help after the history rewrite to establish "continuity" of history. See e.g. https://andrewlock.net/reducing-the-size-of-a-git-repository-with-git-replace/ .

yarikoptic avatar Oct 19 '23 17:10 yarikoptic