containers
Prototypical workflow #1
Originally "presented" in training materials issue: https://github.com/ReproNim/module-dataprocessing/issues/26#issuecomment-488298754
Here I would like to have it as a checklist (`[x] (r)` means "waiting for the release(s)").
- [x] (r) `datalad create -c text2git analysis-for-the-pi; cd analysis-for-the-pi`
  (`text2git` has an outstanding issue, https://github.com/datalad/datalad/issues/3361, which might redefine it, but otherwise this is possible)
- [x] `datalad create -d . data/dicoms && cp ALL_DICOMS data/dicoms/`
- [x] `datalad install -d . https://github.com/ReproNim/containers/`
- [ ] work out a heuristic for heudiconv under `code/heudiconv-heuristic.py`
- [x] (r) `datalad create -d . -c bids data/bids`
  (`-c bids` is coming with the 0.12 release of datalad and datalad-neuroimaging some time soonish, so this is partially done)
- [ ] `datalad containers-run -n containers/heudiconv -f code/heudiconv-heuristic.py -o data/bids --files data/dicoms` (TODO - container: https://github.com/ReproNim/containers/issues/2)
- [x] Deface! Apparently there is no "official" bids-app yet, but there are a number of defacers available, thus TODO - streamline (bids-app, container etc.)
- Carry out analys(es). For each one, ATM a subdataset should first be pre-created. Some (e.g., `fmriprep`) might benefit from custom `-c` configs on what should go under git/annex.
  - [x] `datalad create -d . -c text2git data/mriqc`
  - [x] (r) `datalad containers-run --explicit -n containers/bids-mriqc -i data/bids -o data/mriqc '{inputs}' '{outputs}' ...` (TODO - test! TODO - needs the 0.3.2 release of datalad-container for proper `'{inputs}'`, so that the container file does not leak in there)
  - [x] `datalad create -d . -c text2git data/simple_workflow`
  - [ ] `datalad containers-run -n containers/simple_workflow -i data/bids -o data/simple_workflow ... '{inputs}' ... '{outputs}'` (TODO - container: https://github.com/ReproNim/containers/issues/2)
- [ ] when all is good, look into upload to wherever (`datalad create-sibling*`, `datalad publish`) ;) TODO: full invocation example
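
The dataset-setup steps of the checklist can be strung together roughly as follows. This is only a dry-run sketch: the `run` helper, the `setup_workflow` function, and the `DRY_RUN` variable are hypothetical additions (not part of datalad) that print each command instead of executing it unless `DRY_RUN=0` is set. `ALL_DICOMS` remains the placeholder from the checklist, and `-c bids` assumes the release mentioned there. The heudiconv, analysis, and upload steps are left out because their containers and arguments are still marked TODO.

```shell
#!/bin/sh
# Dry-run sketch of the dataset-setup steps from the checklist.
# `run`, `setup_workflow` and DRY_RUN are hypothetical helpers, not datalad features.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Always print the command; execute it only when DRY_RUN=0.
    printf '%s\n' "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

setup_workflow() {
    # Superdataset first, then the raw-data, containers and result subdatasets.
    run datalad create -c text2git analysis-for-the-pi &&
    run cd analysis-for-the-pi &&
    run datalad create -d . data/dicoms &&
    run cp ALL_DICOMS data/dicoms/ &&   # ALL_DICOMS is a placeholder from the checklist
    run datalad install -d . https://github.com/ReproNim/containers/ &&
    run datalad create -d . -c bids data/bids &&
    run datalad create -d . -c text2git data/mriqc &&
    run datalad create -d . -c text2git data/simple_workflow
}

setup_workflow
```

With the default `DRY_RUN=1` this only lists the commands in order, which makes it easy to review the plan before running it for real with `DRY_RUN=0`.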
Notes:
- This could be argued to step slightly away from the YODA principle of derived datasets containing all the information needed to reproduce themselves, because there is only a single `containers/` subdataset at the super-dataset level, and derived datasets do not contain it. For the purpose of this workflow I am considering the top-level super-dataset as the "reproducibility target". Having access to it will provide all the information needed to reproduce any particular subdataset.
- In principle the aforementioned shortcoming could easily be resolved by installing the `containers/` dataset into each result subdataset, but then it would also require installation of the original data "neighbor" dataset within. That could be a reckless clone or benefit from CoW on a filesystem such as BTRFS. But for the initial presentation/use-case I think it should be good enough.
- From the aforementioned example it seems to be very common to run a container which saves output to a new sub-dataset (if that one doesn't exist yet). I wonder if that could somehow be assisted by `datalad-container` (TODO - issue).
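
Until something like that exists, the pattern from the last note (pre-create the output subdataset only if it is missing, then run the container into it) could be wrapped in a small shell function. A minimal sketch, assuming an installed dataset can be recognized by its `.datalad` directory. `run_into_subdataset`, `run`, and `DRY_RUN` are hypothetical names, not datalad-container features, and any arguments after the output path are passed through untouched.

```shell
#!/bin/sh
# Sketch: create the output subdataset on demand, then run the container.
# All helper names here are hypothetical; commands are only printed unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run() {
    printf '%s\n' "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Usage: run_into_subdataset CONTAINER INPUT OUTPUT [ARGS...]
run_into_subdataset() {
    container=$1
    input=$2
    output=$3
    shift 3
    # Assumption: an existing dataset has a .datalad directory at its root.
    if [ ! -d "$output/.datalad" ]; then
        run datalad create -d . -c text2git "$output" || return 1
    fi
    run datalad containers-run -n "$container" -i "$input" -o "$output" "$@"
}

# Example call mirroring the mriqc step of the checklist (trailing arguments elided there).
run_into_subdataset containers/bids-mriqc data/bids data/mriqc '{inputs}' '{outputs}'
```

Whether `text2git` is the right configuration for every result dataset is a per-analysis decision, so a real helper would probably take the `-c` value as a parameter as well.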
FTR: That's pretty much what datalad-hirni is for and our approach is similar (but YODA compliant ;-) ). I have a poster and a software demo at OHBM - so working on proper documentation ATM. Give me a little bit more time, then I can link to a reasonable description.
> - [ ] Deface! Apparently there is no "official" bids-app yet, but there are a number of defacers available, thus TODO - streamline (bids-app, container etc.)
This would be highly appreciated! According to http://bids-apps.neuroimaging.io/apps/, PeerHerholz/BIDSonym is now an "official" BIDS app. Would you consider adding it to the mix?
Checked - since it is an official BIDS app, it was added to the mix. I guess now it is a matter of trying it out again (I remember filing a number of issues) and seeing if all is good.