
Audit of "intermediate files" prior to HCP-D/A processing

Open mharms opened this issue 7 years ago • 2 comments

Prior to running the Pipelines on the HCP-D/A data, it would be good to do an "audit" of the full set of pipeline outputs, with the goal of moving away from the "packaging" of files that we want to "keep". That is, as we move to the data living on the cloud, we should plan on simply saving all the files in specified output directories. If those directories (e.g., the entire MNINonLinear folder) contain some files that we really don't want included from certain pipelines, those files should probably be deleted as part of the Pipeline script itself.

Relatedly, since we haven't completely dismissed the possibility of saving at least some "intermediates" in the cloud, it would be beneficial to review the intermediates with an eye toward what could be eliminated to reduce storage needs considerably, while keeping any intermediates that might be particularly hard to regenerate, or which might be particularly useful for debugging purposes.
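As a starting point for such a review, a per-directory tally of file counts and bytes would show where the storage actually goes. The following is only a minimal sketch, not part of the Pipelines: the function names `audit_outputs` and `largest_first` are hypothetical, and it assumes a session's outputs sit under a single root directory with top-level folders such as MNINonLinear.

```python
import os
import sys
from collections import defaultdict
from pathlib import Path


def audit_outputs(root):
    """Return {top-level subdirectory: (file_count, total_bytes)} under root.

    Hypothetical helper for illustration; files directly in root are
    grouped under ".".
    """
    root = Path(root)
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = Path(dirpath).relative_to(root)
        top = rel.parts[0] if rel.parts else "."
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # skip symlinks so linked files are not counted twice
            totals[top][0] += 1
            totals[top][1] += path.stat().st_size
    return {key: tuple(value) for key, value in totals.items()}


def largest_first(summary):
    """Sort the summary so the biggest storage consumers come first."""
    return sorted(summary.items(), key=lambda item: item[1][1], reverse=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python audit_outputs.py /path/to/session/outputs
    for folder, (count, size) in largest_first(audit_outputs(sys.argv[1])):
        print(f"{folder}\t{count} files\t{size / 1e9:.2f} GB")
```

Sorting by size puts the most expensive directories at the top, which is where deciding what to eliminate and what is hard to regenerate matters most.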

mharms, Feb 22 '18 14:02

I agree. Keith's list on HCP Users is an obvious starting point, which I think I suggested previously.

glasserm, Feb 23 '18 00:02

As part of this audit, we should also make sure that any moving or creating of files that was previously happening during the "packaging" process occurs as part of the pipeline scripts themselves.

mharms, Feb 23 '18 22:02