
Allow CuBIDS to concatenate subjects from separate directories so it can run on data stored in S3

Open audreymhoughton opened this issue 3 years ago • 6 comments

Use case example: pull one subject's directory into a temp directory, run cubids-validate and cubids-group on that subject, repeat for every subject in an S3 bucket, then concatenate the results into a single group or validation file.

Let me know if I need to provide more clarity here.

audreymhoughton avatar Nov 30 '21 21:11 audreymhoughton

Putting this on our roadmap (would be a nice feature for future use)

scovitz avatar Feb 15 '23 00:02 scovitz

@scovitz yes, please! I currently can't run cubids-group on any of my datasets, because they are on s3. I can only run cubids-validate.

audreymhoughton avatar Feb 15 '23 18:02 audreymhoughton

I think we could make a feature for cubids-add-nifti-info where the NIfTI files get downloaded from S3 temporarily and their info is added to the sidecars. Then you'd only need the sidecars, plus empty placeholder files for the NIfTIs, locally to run cubids-group.

mattcieslak avatar Feb 15 '23 18:02 mattcieslak

For some of our datasets, even having the sidecars locally is too much.

audreymhoughton avatar Feb 15 '23 18:02 audreymhoughton

In that case, cubids probably couldn't fit all that metadata into memory at once. I think the best approach would be to split up huge datasets and do them in batches. I think this would be possible, and definitely easier than writing an S3 version of everything.

mattcieslak avatar Feb 15 '23 19:02 mattcieslak
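
For illustration, the batching idea above could look roughly like the sketch below. It is not part of CuBIDS: `batch_subjects` is a hypothetical helper, the subject IDs are made up, and how each batch gets pulled from S3 and run through the CuBIDS tools is left out entirely.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batch_subjects(subject_ids: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `batch_size` subject IDs.

    Hypothetical helper for illustration only; not a CuBIDS function.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(subject_ids)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Made-up subject IDs; in practice these would come from listing the bucket.
subjects = [f"sub-{i:02d}" for i in range(1, 8)]
batches = list(batch_subjects(subjects, 3))
```

Each batch could then be downloaded into a temp directory, processed, and cleaned up before the next one starts, so only one batch of metadata is held locally at a time.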

Yeah, I think the key is to be able to concatenate multiple group files together. Then the method of running doesn't really matter as long as there's a cubids-group-merge function or something.

audreymhoughton avatar Feb 15 '23 19:02 audreymhoughton
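
A rough sketch of what such a merge step might do, assuming the per-batch outputs are tab-separated files that share the same header row. `merge_group_tsvs` is a hypothetical name, not an existing CuBIDS command, and this is plain row concatenation: it does not reconcile any group numbering that was assigned independently within each batch, which a real merge would likely need to handle.

```python
import csv
from pathlib import Path
from typing import Iterable


def merge_group_tsvs(tsv_paths: Iterable[Path], out_path: Path) -> int:
    """Concatenate TSV files that share one header; return data rows written.

    Hypothetical helper for illustration only; not a CuBIDS function.
    """
    header = None
    rows_written = 0
    with open(out_path, "w", newline="") as out_f:
        writer = csv.writer(out_f, delimiter="\t", lineterminator="\n")
        for path in tsv_paths:
            with open(path, newline="") as in_f:
                reader = csv.reader(in_f, delimiter="\t")
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    # First non-empty file defines the expected columns.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Refusing mismatched headers is a deliberate choice here: silently aligning different columns could hide the fact that two batches were produced with different settings.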