ADIOS
ADIOS copied to clipboard
Numpy: Slicing is missing the stepping
Dear all,
I noticed that when accessing an adios variable with []
for slicing in numpy style the stepping functionality is missing.
For numpy arrays you can access every N
th entry in a dimension by doing array[::N]
.
However, if I do the same with an adios variable (which I want to do before I load it into RAM and thus before it is converted to a numpy array) it seems that the following is happening:
adios_var[start:stop:step]
# seems to become the same as
adios_var[start:stop // step]
The usage of a third (step
) variable is numpy-like but doesn't follow the numpy behaviour.
Currently, neither documentation nor error message indicates a misuse.
cc'ing @PrometheusPi, @ax3l
You really do not want to stride during reads anyway for performance. Read all into a temporary var and stride in RAM :)
Minimal "working" example, e.g. with the file from ax3l's issue #69
import adios as ad
f = ad.file("data_compressed.bp")
f["xy"][0, :6:2] == f["xy"][0, :3]
gives [True, True, True].
I think this should be considered a bug. The proper way of handling this should be raising a ValueError
or a NotImplementedError
(if it's possible that adios will support stepping at some time - which could be useful if you do not think about performance but the array does not fit into RAM - which was the case for Marco).
The current implementation masks the problem, since the shape of the returned array is as expected, so in the worst case a user just doesn't notice that he got the wrong data!
A fix in python (the wrapper is not written in python, is it? I will post some pseudocode anyway^^) would be:
def __getitem__(self, key):
if isinstance(key, slice) and slice.step is not None:
raise NotImplementedError
return super().__getitem__(key)
@n01r @ax3l @wmles we are currently supporting in ADIOS2 the Python low-level and high-level APIs. You can see then in action using Jupyter Notebooks running on MyBinder (see ReadMe): https://github.com/ornladios/ADIOS2-Jupyter
A few things about high-level APIs:
- We don't have slicing/stride numpy syntax, high-level APIs are write/read based and are selection/random access (
np_array = read("var", start, count, step_start, step_count)
) and/or step-by-step based (usingfor step in stream
). As in the spirit of Python we heavily overload these 2 functions, but read always returns a numpy array. For examples see: https://github.com/ornladios/ADIOS2/blob/master/testing/adios2/bindings/python/TestBPWriteTypesHighLevelAPI.py - We are maintaining these APIs and potentially adding more features. e.g. compression: https://github.com/ornladios/ADIOS2/blob/master/testing/adios2/bindings/python/TestBPZfpHighLevelAPI.py
Let me know if this is something of interest, ADIOS2 readers should be able to handle ADIOS1.x generated files. Thanks!