openPMD-api
Writing a field of custom type
Hi,
I am not sure what a good pattern is for the following case.
I have a large C-style array of some bitwise-copyable struct. The particular struct used depends on build parameters and comes from a 3rd-party library, so in general I don't know how the type is defined; I only know its size and that it is bitwise-copyable. I would like to write this field using storeChunk. I do not care about interpretation, only that the data is exactly the same after reading it back. Should I reinterpret it as an array of N * sizeof(MyStruct) chars and write it like that, or is there something better?
This comes from checkpointing internal RNG states (which may come from a number of implementations) in PIConGPU.
@ax3l do you know if work on encoding POD structs in ADIOS2 is still ongoing? Otherwise, I see two unsatisfying approaches for now:
- Do exactly what you described. Drawback: This turns the normally self-contained openPMD datasets partially into non-portable byte-by-byte serializations. Not a huge problem in your situation, not really desirable either.
- Completely reorganize the data. By writing a field of struct type, you are essentially writing an array-of-struct. Turn that into a struct of array. Has a development and performance overhead.
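For illustration, the array-of-struct to struct-of-array reorganization from the second approach could look roughly like the sketch below. `Sample`, `SampleColumns`, `toColumns` and `toRecords` are hypothetical names that are not part of openPMD-api, and the sketch assumes every struct member is known and accessible; each resulting column could then be written as its own dataset of a plain type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical struct with known, accessible members (an assumption:
// this is exactly what an opaque third-party type may not offer).
struct Sample
{
    double value;
    int counter;
};

// Struct-of-array layout: one contiguous column per member.
struct SampleColumns
{
    std::vector<double> value;
    std::vector<int> counter;
};

// Split an array of n structs into per-member columns.
SampleColumns toColumns(Sample const *data, std::size_t n)
{
    SampleColumns columns;
    columns.value.reserve(n);
    columns.counter.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        columns.value.push_back(data[i].value);
        columns.counter.push_back(data[i].counter);
    }
    return columns;
}

// Rebuild the array of structs from the columns after reading back.
// Assumes both columns have the same length.
std::vector<Sample> toRecords(SampleColumns const &columns)
{
    std::vector<Sample> records(columns.value.size());
    for (std::size_t i = 0; i < records.size(); ++i)
    {
        records[i].value = columns.value[i];
        records[i].counter = columns.counter[i];
    }
    return records;
}
```

The extra copy in each direction is the performance overhead mentioned above; the per-member code is the development overhead.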
I believe approach 2 requires knowing all struct members and being able to access them individually. That is not always possible if the struct comes from 3rd-party code, as in the RNG state case that triggered my question.
Thanks for the question. Yes, struct support is definitely not available in ADIOS and would need its own APIs, similar to MPI, on how to define user-specific types. You would also need to know what's in your struct then.
I would say go with approach 1: you could use fixed-size char arrays (Datatype::VEC_CHAR) in openPMD-api for that. A char is guaranteed to be exactly one byte in size and is usually the natural type for such blobs.
On second thought, Datatype::VEC_* might not be implemented for all backends. In that case, using CHAR and scaling the array size to N * sizeof(MyStruct), as you suggested, is the way to go.
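A minimal sketch of the byte-blob approach, using only the standard library: `RngState` is a hypothetical stand-in for the opaque third-party struct, and `toBytes` / `fromBytes` are illustrative helpers, not openPMD-api functions. The flat buffer of N * sizeof(T) chars is what would be handed to storeChunk for a dataset of char type; that call is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an opaque, bitwise-copyable RNG state.
struct RngState
{
    unsigned int s[5];
    unsigned int d;
};

// Copy n structs into a flat buffer of n * sizeof(T) chars.
template <typename T>
std::vector<char> toBytes(T const *data, std::size_t n)
{
    static_assert(
        std::is_trivially_copyable<T>::value,
        "byte-wise serialization needs a trivially copyable type");
    std::vector<char> bytes(n * sizeof(T));
    if (n > 0)
        std::memcpy(bytes.data(), data, bytes.size());
    return bytes;
}

// Rebuild the structs from a byte buffer after reading it back.
// Assumes bytes.size() is a multiple of sizeof(T).
template <typename T>
std::vector<T> fromBytes(std::vector<char> const &bytes)
{
    static_assert(
        std::is_trivially_copyable<T>::value,
        "byte-wise deserialization needs a trivially copyable type");
    std::vector<T> out(bytes.size() / sizeof(T));
    if (!out.empty())
        std::memcpy(out.data(), bytes.data(), out.size() * sizeof(T));
    return out;
}
```

The copy is not strictly needed: accessing an object's storage through a `char` pointer is permitted in C++, so the original array can also be viewed in place via `reinterpret_cast<char const *>(data)`. Either way, the stored bytes are only meaningful to a build with the same struct layout and endianness, which is the portability drawback noted earlier in the thread.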
I think we likely also used char or int8 types of some sort in the old checkpoint-restart implementation for the RNG state in PIConGPU for this... Would have to look it up :)