openPMD-api

Option To Mask Invalid Regions with Zeros

Open ax3l opened this issue 4 years ago • 8 comments

It would be great to have an option, either in ADIOS2 or here, to mask the parts of a selection read (offset + extent) that lie outside the available data index space (openPMD: availableChunks()) with a constant value, e.g., zero.

This would simplify implementations like: https://github.com/openPMD/openPMD-viewer/pull/332 for the mesh-refinement extension https://github.com/openPMD/openPMD-standard/pull/252

cc @guj @franzpoeschel

ax3l avatar Feb 17 '22 21:02 ax3l

We should ask the ADIOS2 team how ADIOS2 treats undefined regions when reading. If those regions are just skipped, this could theoretically be solved by zero-initializing the read buffer.

I would try not to roll our own solution for this as it would mean reimplementing many nontrivial things that are already done by ADIOS2:

  • matching read requests with available chunks
  • strided reading from those chunks into a contiguous buffer (I think there might be some fancy read parameters in ADIOS2 for this which we could use though)
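To illustrate the first bullet, here is a minimal sketch of matching a read selection against available chunks. It is not how ADIOS2 or openPMD-api implement it; `Box`, `intersect`, and `matchChunks` are hypothetical names chosen for this example, and chunks are modeled only as offset + extent per dimension.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a chunk or a selection: offset + extent per dimension.
struct Box
{
    std::vector<std::uint64_t> offset;
    std::vector<std::uint64_t> extent;
};

// Writes the overlap of a and b into `out` and returns true,
// or returns false if the two boxes do not overlap.
bool intersect(Box const &a, Box const &b, Box &out)
{
    assert(a.offset.size() == a.extent.size());
    assert(b.offset.size() == b.extent.size());
    assert(a.offset.size() == b.offset.size());

    Box result;
    for (std::size_t d = 0; d < a.offset.size(); ++d)
    {
        std::uint64_t const lo = std::max(a.offset[d], b.offset[d]);
        std::uint64_t const hi =
            std::min(a.offset[d] + a.extent[d], b.offset[d] + b.extent[d]);
        if (hi <= lo)
            return false; // empty overlap in this dimension
        result.offset.push_back(lo);
        result.extent.push_back(hi - lo);
    }
    out = result;
    return true;
}

// Collects the parts of `selection` that are backed by written data.
// Everything in `selection` not covered by a returned box is "undefined".
std::vector<Box> matchChunks(Box const &selection, std::vector<Box> const &available)
{
    std::vector<Box> matched;
    for (auto const &chunk : available)
    {
        Box overlap;
        if (intersect(selection, chunk, overlap))
            matched.push_back(overlap);
    }
    return matched;
}
```

Even this simplified version ignores overlapping chunks and the strided copy into a contiguous buffer, which is part of why reimplementing it outside ADIOS2 is unattractive.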

franzpoeschel avatar Feb 24 '22 15:02 franzpoeschel

Discussed today: ADIOS2 does not write to undefined index regions of the buffer during a read.

We allocate the memory (or a user passes it to us), so we can set it to a value like NaN or zero before reading (this should be configurable): https://openpmd-api.readthedocs.io/en/0.14.4/_static/doxyhtml/classopen_p_m_d_1_1_record_component.html#ac31282d2109a693aa48e21a6f76fcb8f
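The idea can be sketched as follows, assuming a reader that only touches cells for which data exists. The reader here is a mock (`mockRead`), not ADIOS2 or openPMD-api; `WrittenChunk` and `readWithFill` are likewise hypothetical names, and the example is 1D for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical 1D chunk of written data: where it starts and what it holds.
struct WrittenChunk
{
    std::size_t offset;
    std::vector<float> data;
};

// Mock read: `buffer` represents the selection
// [selOffset, selOffset + buffer.size()). Only cells covered by a chunk
// are overwritten; all other cells are left untouched.
void mockRead(
    std::vector<float> &buffer,
    std::size_t selOffset,
    std::vector<WrittenChunk> const &chunks)
{
    std::size_t const selEnd = selOffset + buffer.size();
    for (auto const &chunk : chunks)
    {
        std::size_t const lo = std::max(selOffset, chunk.offset);
        std::size_t const hi = std::min(selEnd, chunk.offset + chunk.data.size());
        for (std::size_t i = lo; i < hi; ++i)
        {
            assert(i - selOffset < buffer.size());
            buffer[i - selOffset] = chunk.data[i - chunk.offset];
        }
    }
}

// Prefill the buffer with a configurable value, then let the reader
// overwrite the cells that actually have data.
std::vector<float> readWithFill(
    std::size_t selOffset,
    std::size_t selExtent,
    std::vector<WrittenChunk> const &chunks,
    float fillValue)
{
    std::vector<float> buffer(selExtent, fillValue);
    mockRead(buffer, selOffset, chunks);
    return buffer;
}
```

Calling `readWithFill` with `0.0f` or `std::numeric_limits<float>::quiet_NaN()` as `fillValue` then gives a buffer in which every cell outside the written region holds that value.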

ax3l avatar Mar 07 '22 18:03 ax3l

@lucafedeli88 tried the direct ADIOS2 Python NumPy bindings today (2.7.1 and also 2.8?): selecting the whole region of a refinement level there shows undefined (scrambled, non-zero) values outside the written region.

This makes me wonder whether ADIOS2's read routines really fill unwritten index areas with zero, or whether that is just an issue with the NumPy bindings of ADIOS2, @pnorbert.

ax3l avatar Jul 27 '22 21:07 ax3l

Isn't this expected behavior? ADIOS2 does not fill unwritten index areas with zero; it ignores them entirely. So, for instance, if you use the std::shared_ptr<T> loadChunk(…) overload, the memory will get allocated, but no one will ever write to it, so you get random nonsense at read time.

franzpoeschel avatar Jul 28 '22 08:07 franzpoeschel

Also, I'm hesitant to initialize the buffer with zero in that line, as it's a costly operation that most users won't need. If you want a buffer that holds data wherever there is data and zero everywhere else, I'd say that's a rather application-specific requirement and relatively simple to emulate manually in two lines:

std::shared_ptr<float[]> chunk{new float[10]{0}};
E_x.loadChunk<float>(std::static_pointer_cast<float>(chunk), {0}, {10});
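Note that `new float[10]{0}` yields all zeros only because elements without an explicit initializer are value-initialized; the same syntax with a NaN initializer would set only the first element. For a non-zero fill value, a small helper along these lines could be used. `makeFilledBuffer` is a hypothetical name and not part of openPMD-api:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>

// Hypothetical helper: allocates n elements owned by a std::shared_ptr<T>
// (with an array deleter) and sets every element to `value`.
template <typename T>
std::shared_ptr<T> makeFilledBuffer(std::size_t n, T value)
{
    assert(n > 0);
    std::shared_ptr<T> buffer(new T[n], std::default_delete<T[]>());
    std::fill_n(buffer.get(), n, value);
    return buffer;
}
```

The returned pointer has the same type that the `std::static_pointer_cast<float>(chunk)` call above produces, so a buffer from `makeFilledBuffer<float>(10, std::numeric_limits<float>::quiet_NaN())` could presumably be passed to `loadChunk` in its place.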

franzpoeschel avatar Jul 28 '22 08:07 franzpoeschel

Write: absolutely, there are no regions and they should not be filled.

Read: Undefined regions should maybe be explicitly zero or NaN instead of undefined behavior (UB) in the ADIOS2 Python bindings?

ax3l avatar Jul 28 '22 18:07 ax3l

Well, NumPy has functions to allocate arrays with initialization, like np.zeros, np.ones, and np.full, if someone wants to do that. As Franz explained, ADIOS2 does not touch memory cells that have no incoming data.

pnorbert avatar Jul 28 '22 19:07 pnorbert

Yes, but it's the [] operator that causes this in your bindings already. @lucafedeli88 can you post your example from yesterday here?

ax3l avatar Jul 28 '22 19:07 ax3l