datalad "extractors" for input/output arguments for run

This came up for the HCP .spec files use case: inputs/outputs might already be specified in the .spec file, which is then given to wb_command for the actual transformation. The question is how we could provide adapters/extractors to inform datalad run/rerun about inputs/outputs.

edit: sample .spec files are in the HCP dataset https://github.com/datalad-datasets/human-connectome-project-openaccess, e.g.

(git)smaug:/mnt/datasets/datalad/crawl/hcp-openaccess[master]
$> find HCP1200/ -iname *spec
HCP1200/100206/MNINonLinear/100206.164k_fs_LR.wb.spec
HCP1200/100206/MNINonLinear/100206.MSMAll.164k_fs_LR.wb.spec
HCP1200/100206/MNINonLinear/Native/100206.native.wb.spec
HCP1200/100206/MNINonLinear/fsaverage_LR32k/100206.32k_fs_LR.wb.spec
HCP1200/100206/MNINonLinear/fsaverage_LR32k/100206.MSMAll.32k_fs_LR.wb.spec

Mar 14 '22 18:03 yarikoptic

@mih suggested on the call to just provide some kind of "outside" runner which would comprehend the .spec file and craft a datalad run invocation providing all extracted inputs/outputs. Note: current master (aimed for 0.16.x) has the ability to use placeholders in input/output specs, which might help.
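A minimal sketch of what such an "outside" runner could look like. It rests on assumptions that are not confirmed in this thread: that a .spec file is XML listing its files as the text of `<DataFile>` elements, and that those paths are relative to the directory holding the .spec file. The sample content, the file names in it, and the `process_spec.sh` command are all made up for illustration. Only `datalad run -i/-o ... -- COMMAND` is taken from the existing CLI.

```python
import posixpath
import shlex
import xml.etree.ElementTree as ET


def spec_inputs(spec_xml: str, base: str = "") -> list[str]:
    """Collect paths from <DataFile> elements (ASSUMED .spec layout).

    `base` is the directory of the .spec file; paths are assumed to be
    relative to it.
    """
    root = ET.fromstring(spec_xml)
    paths = []
    for el in root.iter("DataFile"):
        text = (el.text or "").strip()
        if text:
            paths.append(posixpath.normpath(posixpath.join(base, text)))
    return paths


def craft_run(cmd: list[str], inputs: list[str], outputs: list[str]) -> list[str]:
    """Build a `datalad run` argv with one -i/-o option per path."""
    argv = ["datalad", "run"]
    for path in inputs:
        argv += ["-i", path]
    for path in outputs:
        argv += ["-o", path]
    # "--" separates datalad's own options from the wrapped command
    return argv + ["--"] + cmd


# Made-up sample content, NOT copied from a real HCP .spec file
SAMPLE = """<CaretSpecFile Version="1.0">
  <DataFile DataFileType="SURFACE">a.L.surf.gii</DataFile>
  <DataFile DataFileType="SURFACE">a.R.surf.gii</DataFile>
</CaretSpecFile>"""

if __name__ == "__main__":
    argv = craft_run(
        ["process_spec.sh", "sub/sample.wb.spec"],  # hypothetical command
        spec_inputs(SAMPLE, base="sub"),
        ["sub/out.nii"],  # outputs would have to come from elsewhere
    )
    print(shlex.join(argv))
```

The runner stays outside of datalad: it only prints (or could execute) the crafted command, so nothing in run/rerun itself would need to know about the .spec format.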

Mar 15 '22 14:03 yarikoptic

Another use case came up in the context of https://github.com/con/opfvta-replication-2023 where we would like to get a specific subset of files. Currently it is a manual "datalad get" call with a long list of files hard-coded in the Makefile.

To make it datalad run compatible, we would need either an excruciatingly long list of -i options, or the ability to point to a file with the list of filenames, or a script which would print them. The latter is exactly what this wishlist item targets, so it would cover that use case.

Aug 04 '23 15:08 yarikoptic

Also think about filenames with newlines -- we might want to make it a 0-delimited (NUL-separated) list by default.
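To illustrate why 0-delimiting matters, here is a rough sketch of how a wrapper could consume the output of a "lister" script. The function names and the lister itself are hypothetical, and none of this is an existing datalad interface. It only shows that splitting on NUL keeps a filename with an embedded newline intact, where line-based splitting would break it in two.

```python
import os
import subprocess
import sys


def split_nul(data: bytes) -> list[str]:
    """Split NUL-delimited bytes into paths, ignoring empty chunks
    (e.g. the one produced by a trailing NUL)."""
    return [os.fsdecode(chunk) for chunk in data.split(b"\0") if chunk]


def paths_from_lister(lister_argv: list[str]) -> list[str]:
    """Run a (hypothetical) lister command and parse its NUL-delimited stdout."""
    proc = subprocess.run(lister_argv, check=True, stdout=subprocess.PIPE)
    return split_nul(proc.stdout)


def input_args(paths: list[str]) -> list[str]:
    """Turn paths into repeated -i options for `datalad run`."""
    args = []
    for path in paths:
        args += ["-i", path]
    return args


if __name__ == "__main__":
    # Stand-in lister: prints two names, the second containing a newline
    lister = [
        sys.executable, "-c",
        "import sys; sys.stdout.buffer.write(b'a.txt\\0b\\nc.txt\\0')",
    ]
    print(input_args(paths_from_lister(lister)))
```

Any tool that can emit NUL-separated names (for example `find ... -print0`) could play the role of the lister here.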

Aug 04 '23 15:08 yarikoptic
