ADIOS2

A new bp tool: select & filter variables

Open • ax3l opened this issue 3 years ago • 0 comments

As discussed in our meetings, we would like to propose a standalone, parallel BP tool that reads everything in a .bp file/directory block by block and can select/filter variables & attributes, rename and copy them, etc.

  • This is a post-processing tool for data curation
  • In practice, such operations will be done with far fewer parallel processes than were used in the original write.
  • It would be good to be able to use arbitrarily few (or many) ranks for this, costing only copy time without negatively influencing the data distribution on physical targets.
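
To illustrate the last point, here is a minimal sketch of how the written blocks could be divided among an arbitrary number of ranks. It is plain Python and does not call the ADIOS2 API; the function name `blocks_for_rank` and the contiguous-chunk scheme are assumptions made for illustration, not part of any existing tool.

```python
def blocks_for_rank(n_blocks: int, rank: int, n_ranks: int) -> range:
    """Return the contiguous range of block indices one rank would copy.

    Hypothetical helper: blocks are split as evenly as possible, and the
    first (n_blocks % n_ranks) ranks take one extra block each.
    """
    if n_ranks <= 0 or not 0 <= rank < n_ranks:
        raise ValueError("rank must satisfy 0 <= rank < n_ranks")
    base, extra = divmod(n_blocks, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


if __name__ == "__main__":
    # 10 blocks written by the simulation, copied with only 4 ranks.
    for r in range(4):
        print(r, list(blocks_for_rank(10, r, 4)))
```

With this scheme, fewer ranks only means each rank copies more blocks; ranks beyond the number of blocks simply receive an empty range.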

Why is this feature important?

This is needed for data curation & archiving of large simulation data.

What is the potential impact of this feature in the community?

  • Saving 100s of TBytes in PFS <-> Tape swapping workflows.
  • Only archiving the data that is worth archiving.

Is your feature request related to a problem? Please describe.

This is an RFE.

Describe the solution you'd like and potential required effort

A new tool that can do block-wise copies, parallelizes easily, and can thin out data by step and by variable/attribute name.
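
The selection part could look roughly like the sketch below. It only shows the filtering logic on plain name lists and step counts; the helper names `select_names` and `select_steps`, the glob-style patterns, and the example variable names are assumptions for illustration, and no ADIOS2 calls are made.

```python
from fnmatch import fnmatchcase


def select_names(names, include=("*",), exclude=()):
    """Keep names matching any include pattern and no exclude pattern.

    Hypothetical helper using shell-style globs; note that '*' also
    matches the '/' separator used in hierarchical variable names.
    """
    return [
        n for n in names
        if any(fnmatchcase(n, p) for p in include)
        and not any(fnmatchcase(n, p) for p in exclude)
    ]


def select_steps(n_steps, stride=1, start=0):
    """Thin out steps: keep every `stride`-th step beginning at `start`."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return list(range(start, n_steps, stride))


if __name__ == "__main__":
    # Example variable names are made up for this sketch.
    names = ["E/x", "E/y", "B/x", "particles/electrons/position/x"]
    print(select_names(names, include=("E/*", "B/*"), exclude=("*/y",)))
    print(select_steps(10, stride=4))
```

The same name filter would apply to attributes, and the resulting name and step lists would then drive the block-wise copy.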

Describe alternatives you've considered and potential required effort

Users could write this themselves in Python, but the result will most of the time be a serial script.

Additional context

Discussed in the WarpX/ADIOS ECP meetings.

ax3l, Mar 25 '21 19:03