PerformDataWrite: more features requested
Now that an application can build a step over a long period of time, there are new usability issues. In WarpX, back-transformed diagnostics (BTD) build a single step over the entire run, and other WarpX workflows have similar issues:
- if the application crashes, the half-built step is not accessible, but BTD wants access to it
- a restarted run should be able to continue from a half-written step.
A useful solution would be either to dump metadata as well, so that a sophisticated user could process the already-written blocks of variables, or to introduce a second-level concept of Step (for reading purposes) that spans multiple Steps (at write time).
I've mulled this over a bit and would propose an external solution. Building up a timestep over a long period of time (while using PerformDataWrite to output some of the data), with a mechanism to recover it if the application died mid-run, would be hideously complex. The best mechanism I can think of would be to actually write the metadata, metametadata, index, etc. at the PerformDataWrite point, and then, at the next PerformDataWrite(), go back and read the prior metadata, add the new metadata to it, reset the metadata file pointers to where the timestep started, and rewrite the metadata, metametadata, index, etc. But this puts the entire onus on the writer engine, and if the job ran out of time while this was happening, you'd almost certainly still get a corrupt timestep.

So I'd change how we're looking at it a bit. What we really have is a situation where what is convenient for the writer to treat as a timestep (transaction?) and what the reader wants to treat as a timestep are different. I'd propose keeping the writer timestep concept the same: it's an atomic thing that is only fully realized (with metadata, etc.) in EndStep(). But if we have a situation where a reader wants to treat multiple writer steps as if they were a single step, we create a tool like adios_reorganize, but one capable of combining timesteps. It would read one or more steps, take note of which variables were written in each, combine them (using rules to manage conflicts, etc.), and create an output file with larger "atomic" steps, possibly including a final step assembled from an incomplete set of pieces. This tool could either operate at the external ADIOS API level or, if we really wanted to pay the price of creating an engine/format-specific tool, it could directly read, say, BP5 metadata and rewrite it while leaving the data in place and unmodified.
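To make the combining tool's core concrete, here is a minimal pure-Python sketch. It does not use the real ADIOS2 API; writer steps are modeled as dicts mapping variable names to data blocks, and the function name `combine_steps`, the `group_size` parameter, and the last-writer-wins conflict rule are all my assumptions for illustration.

```python
def combine_steps(writer_steps, group_size):
    """Merge consecutive writer steps into larger reader-facing steps.

    Conflict rule (an assumption): if the same variable appears in several
    writer steps of a group, the last occurrence wins.
    """
    combined = []
    for start in range(0, len(writer_steps), group_size):
        merged = {}
        for step in writer_steps[start:start + group_size]:
            merged.update(step)  # later writer steps override earlier ones
        combined.append(merged)
    return combined

# Three writer steps become two reader steps; the trailing one is "partial",
# assembled from an incomplete group, as described above.
steps = [{"E": 1, "B": 2}, {"E": 3}, {"rho": 4}]
print(combine_steps(steps, 2))  # [{'E': 3, 'B': 2}, {'rho': 4}]
```

The real tool would of course operate on variable metadata and block info rather than in-memory dicts, but the grouping/merging structure would be the same.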
This approach has the advantage that we don't modify the concept of timestep in either the reader or the writer.
I came to similar conclusions. Either we could just dump metadata for the timestep being constructed somewhere and use an external tool to merge the metadata pieces and construct a proper step later (reading and merging during writing is a terrible idea), or we could separate the step concepts for writing and reading.
A long time ago, I introduced two modes to BeginStep(), including an update mode to indicate a continuation of the previous step. It never got accepted by the team, even though there is a Mode parameter to BeginStep(). If we stored the actual step value in the index records, we could merge metadata for identical steps at read time. I did not think it through properly back then; only later did I realize that this would require read-ahead in the index, to check whether more metadata needs to be read and processed. But at least this could work both with files and in streaming. I imagined CurrentStep() would not always be increasing (just monotonically non-decreasing), and a reader could simply check whether the new step has the same value. But I concede that it messes up what a step is as a concept.
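The read-ahead requirement can be illustrated with a small sketch (again not the real ADIOS2 API; I'm modeling index records as `(step_number, metadata)` pairs where a repeated step number marks a continuation of the same step). The reader only learns that a step is complete when the *next* record carries a different step value, or the stream ends:

```python
def merge_by_step(index_records):
    """Yield (step, merged_metadata) pairs, merging records with equal steps."""
    merged, current = {}, None
    for step, meta in index_records:
        if current is not None and step != current:
            yield current, merged  # the next record proved the step ended
            merged = {}
        current = step
        merged.update(meta)
    if current is not None:
        yield current, merged      # end of stream closes the last step

records = [(0, {"E": 1}), (0, {"B": 2}), (1, {"E": 3})]
print(list(merge_by_step(records)))
# [(0, {'E': 1, 'B': 2}), (1, {'E': 3})]
```

Note that closing step 0 required seeing the step-1 record: that look-ahead is exactly what a streaming reader would have to do against the index.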
Another option, at the API level, could be a flag/mode in EndStep(), like the explicit flag in Fortran/Python print commands that says "to be continued". If we then saved that flag in the record index, we could manage merging steps without read-ahead. I don't know whether this would change anything on the read side at the API level.