support for structured data
It is convenient to have data packed into structures. For example, if a calculation requires a large number of pieces of information, it is preferable to have the following (I realize this is a bit of a contrived example):
    def func(sarray):
        for i in range(sarray.size):
            x = sarray['a'][i] + sarray['b'][i] + ... + sarray['z'][i]
            # do something with x
as opposed to
    def func(a, b, c, d, ..., z):
        for i in range(a.size):
            x = a[i] + b[i] + ... + z[i]
            # do something with x
This could be solved by accepting structured arrays as input:
    sarray = zeros(n, dtype=[('a','f8'),('b','f8'),....('z','f8')])
    res = func(sarray)
(edited for bugs)
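To make the proposal concrete, here is a minimal runnable sketch of the structured-array version using only three fields instead of 26. It is plain NumPy with no HOPE decorator, since HOPE does not accept structured arrays; the field values and the `total` accumulator are made up for illustration.

```python
import numpy as np

def func(sarray):
    # Walk the rows and combine the fields, mirroring the loop in the post.
    total = 0.0
    for i in range(sarray.size):
        x = sarray['a'][i] + sarray['b'][i] + sarray['c'][i]
        total += x  # stand-in for "do something with x"
    return total

n = 4
sarray = np.zeros(n, dtype=[('a', 'f8'), ('b', 'f8'), ('c', 'f8')])
sarray['a'] = 1.0
sarray['b'] = 2.0
sarray['c'] = 3.0

res = func(sarray)
```

The function signature stays the same no matter how many fields the calculation needs.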
Thanks for the input, @esheldon. In this particular example the number of parameters to pass is of course reduced, but on the other hand the equation becomes more difficult to read. Anyway, I do see cases where this could be convenient.
However, introducing structured arrays is a bit tricky:
- HOPE doesn't support string literals, which would be required to access the columns.
- NumPy's structured arrays allow the user to define arrays with a different data type per column, something that is not possible in pure C.
I'm personally not a big fan of structured arrays (I don't like the syntax sarray["a"]; I prefer the pandas approach, sarray.a). Anyway, let me think about this; maybe there is a good solution.
J
(sorry the formatting didn't go through in the email)
Structured arrays map directly to an array of C structures with the same data types. The array can be created with or without alignment of the structure:
    import numpy

    dt = [('ra','f8'),('dec','f8'),('index','i4')]
    # maps to packed C structures, no alignment
    a = numpy.zeros(n, dtype=dt)
    # maps to normal, unpacked C structures
    dtype = numpy.dtype(dt, align=True)
    a = numpy.zeros(n, dtype=dtype)
For the packed version you would need to make sure the struct in C is also packed, but for the aligned version it is a direct map. For simplicity, you could accept only arrays created with align=True.
In C, the Python expression sarray['a'][35] maps to sarray[35].a.
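The difference between the two layouts can be inspected from Python. This is a small sketch using the same `dt` as above; the variable names are only for illustration, and the exact aligned size depends on the platform's alignment rules (on common 64-bit systems it is typically 24 bytes against 20 for the packed layout).

```python
import numpy as np

dt = [('ra', 'f8'), ('dec', 'f8'), ('index', 'i4')]

packed = np.dtype(dt)               # no padding between or after fields
aligned = np.dtype(dt, align=True)  # padded the way a C compiler would pad a struct

packed_size = packed.itemsize       # 8 + 8 + 4 bytes
aligned_size = aligned.itemsize     # may include trailing padding

# Byte offset of each field inside one aligned record.
offsets = {name: aligned.fields[name][1] for name in aligned.names}

print(packed_size, aligned_size, offsets, aligned.isalignedstruct)
```

A check on `isalignedstruct` would be one way to enforce the align=True requirement on input arrays.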
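The mapping can be checked without writing any C by using ctypes as a stand-in for the C side. In this sketch the `Rec` class is a hypothetical mirror of the struct a C function would declare, and the array contents are made up; it assumes the array was created with align=True so that both layouts agree.

```python
import ctypes
import numpy as np

class Rec(ctypes.Structure):
    # Hypothetical C-side struct with default (unpacked) alignment.
    _fields_ = [('ra', ctypes.c_double),
                ('dec', ctypes.c_double),
                ('index', ctypes.c_int32)]

dt = np.dtype([('ra', 'f8'), ('dec', 'f8'), ('index', 'i4')], align=True)
a = np.zeros(50, dtype=dt)
a['ra'] = np.arange(50) * 0.5
a['index'] = np.arange(50)

# Reinterpret the array's memory as a pointer to C structs (no copy).
recs = a.ctypes.data_as(ctypes.POINTER(Rec))

same_size = ctypes.sizeof(Rec) == dt.itemsize
ra_py = a['ra'][35]   # Python-side access: field first, then row
ra_c = recs[35].ra    # C-style access: row first, then member
```

Both expressions read the same eight bytes, so a code generator would only need to swap the order of the field and the index.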
notation:
Structured arrays are built into NumPy, so they are in a sense fundamental. Codes like pyfits and fitsio return structured arrays (although pyfits wraps them).
Also, the sarray.a notation conflicts with Python attributes. For example, you can't have a field called "size" because that name is already used for the size of the array.
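NumPy's own `recarray` offers attribute-style access, so it can illustrate the conflict. This is a small sketch with made-up field names and values, not a statement about how HOPE or pandas would resolve the same clash.

```python
import numpy as np

sarray = np.zeros(3, dtype=[('size', 'f8'), ('a', 'f8')])
sarray['size'] = [1.5, 2.5, 3.5]

# A recarray view exposes fields as attributes where no name clash exists.
rec = sarray.view(np.recarray)

attr_value = rec.size       # the ndarray attribute: number of elements
field_value = rec['size']   # the field named "size" needs item access
other_field = rec.a         # no clash, so attribute access reaches the field
```

Item access with a string always reaches the field, which is why it remains the unambiguous notation.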