datalad-neuroimaging support DICOMs tarballs

Open yarikoptic opened this issue 6 years ago • 1 comment

It is quite common to have DICOMs (e.g. for a single sequence) packed into a .tar or .zip archive. I wondered whether we could/should make it possible to extract/aggregate metadata from those. I could see it done:

- within the DICOM extractor
- in a dedicated dicom-tarballs extractor
- via a generic "extractor helper" (e.g. called "balls"), which could then be used to prepare (extract) data for other extractors to munch on
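
To make the third option more concrete, here is a minimal sketch of what such a helper might look like. It is not an existing datalad API: the name `unpacked` and the temporary-directory approach are assumptions made purely for illustration, and it relies only on the Python standard library.

```python
import contextlib
import shutil
import tempfile
from pathlib import Path


@contextlib.contextmanager
def unpacked(archive_path):
    """Hypothetical "balls" helper: unpack a .tar/.zip archive into a
    temporary directory and yield the extracted regular files.

    The temporary directory is removed again when the context exits.
    """
    tmpdir = Path(tempfile.mkdtemp(prefix="balls-"))
    try:
        # shutil selects the unpacker from the file extension
        # (.tar, .tar.gz, .zip, ...)
        shutil.unpack_archive(str(archive_path), str(tmpdir))
        yield sorted(p for p in tmpdir.rglob("*") if p.is_file())
    finally:
        shutil.rmtree(tmpdir, ignore_errors=True)
```

Another extractor could then be run inside `with unpacked("a/b/bu.tar") as files:` and operate on plain files without knowing they came from an archive.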

The question is how to "represent" that metadata:

- the tarball could be considered a "subdataset" of a kind, and thus we could extract/keep its metadata similarly to how we deal with subdatasets
- files within the tarball could be considered a "continuation" of the file path, e.g. a file bu.dcm within a/b/bu.tar could get the path a/b/bu.tar/bu.dcm, or one with a more explicitly defined boundary such as a/b/bu.tar//bu.dcm or a/b/bu.tar#bu.dcm, or even a/b/bu.tar#path=bu.dcm, to be in line with how we reference files in tarballs within our special remote
- we could extract/keep only the fields that are common and identical across all files in the tarball, and thus associate them with the tarball itself
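
The last two options could be sketched roughly as follows. All function names are hypothetical, the `#path=` form is just one of the boundary styles listed above, and plain dicts stand in for per-file DICOM metadata; none of this reflects how any existing extractor works.

```python
def member_ref(archive, member):
    """Compose a reference to a file inside an archive (the '#path=' style)."""
    return f"{archive}#path={member}"


def split_member_ref(ref):
    """Split a reference back into (archive, member).

    Returns (ref, None) if the reference does not point into an archive.
    Assumes that '#path=' never occurs within the archive path itself.
    """
    archive, sep, member = ref.partition("#path=")
    return (archive, member) if sep else (ref, None)


def common_fields(per_file_metadata):
    """Keep only the fields that are present in every record with an
    identical value, i.e. what could be attached to the tarball itself."""
    records = list(per_file_metadata)
    if not records:
        return {}
    first, rest = records[0], records[1:]
    return {
        key: value
        for key, value in first.items()
        if all(key in r and r[key] == value for r in rest)
    }


# Illustrative, made-up metadata for two files of one sequence
meta = [
    {"PatientID": "p1", "SeriesNumber": 3, "InstanceNumber": 1},
    {"PatientID": "p1", "SeriesNumber": 3, "InstanceNumber": 2},
]
print(member_ref("a/b/bu.tar", "bu.dcm"))  # a/b/bu.tar#path=bu.dcm
print(common_fields(meta))  # {'PatientID': 'p1', 'SeriesNumber': 3}
```

Per-file fields such as `InstanceNumber` drop out, which is the trade-off of that option: the tarball gets a compact record, but file-level detail is lost unless it is stored separately under references like the one above.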

May 31 '18 15:05 yarikoptic

Hm. In https://github.com/psychoinformatics-de/datalad-hirni (which, once ironed out, should yield a generalized form to become part of datalad-neuroimaging) we simply make a subdataset from the tarball, which in turn is added via add-archive-content. So you can throw away the DICOMs (and/or the tarball) after metadata extraction but still have metadata on the actual DICOMs. ATM I don't see why it would be useful to invent an additional way to reference an archive's content beyond what add-archive-content does. Do you have a use case that somehow benefits from not annexing the extracted files?

May 31 '18 17:05 bpoldrack
