Fitsio works (only) with unaligned data structures
On systems like Sparc and (partly) ARM it is required that data pointers are aligned, and also on other architectures unaligned data may lead to a lower performance. Numpy however seems to create non-aligned data structures which cause the unit tests to fail (segfault) under Sparc64 and armhf platforms. Aligning the data in the tests does not work yet, since this creates an unknown column type "V" for table data.
I think this might be straightforward, but tedious to implement. It basically comes down to ignoring some of the fields ("V") and repacking the data before creating the table and writing.
Note writing aligned data already works fine. For writing, references to the individual fields in the structured array are packed into a list, and in the C code I use the following to get a pointer to the data for a given row
data = PyArray_GETPTR1(array, row);
this is fully aware of the alignment, strides, etc.
To make the reading also work I could restructure the C code to work in the same way. I think this would have minimal performance impact, but would be more general, and would support reading into aligned data as well.
Also note, that arr.dtype.names reports the names of interest, but the descriptor has the extra fields we would ignore
>>> dtype=numpy.dtype([('ra','f8'),
('index','i4'),
('dec','f8'),
('s','S7')],align=True)
>>> xa=numpy.zeros(3, dtype=dtype)
# note this looks different from unaligned, it shows
# the dtype as a dict
>>> xa
array([(0.0, 0, 0.0, ''), (0.0, 0, 0.0, ''), (0.0, 0, 0.0, '')],
dtype={'names':['ra','index','dec','s'], 'formats':['<f8','<i4','<f8','S7'], 'offsets':[0,8,16,24], 'itemsize':32, 'aligned':True})
# this is what we want
>>> xa.dtype.names
('ra', 'index', 'dec', 's')
# this contains stuff for padding, which we don't want
>>> xa.dtype.descr
[('ra', '<f8'),
('index', '<i4'),
('', '|V4'),
('dec', '<f8'),
('s', '|S7'),
('', '|V1')]
Just wanted to leave a sort of "me too" - on Ubuntu we now can't build python-fitsio for armhf, because our builders have kernels which raise SIGBUS on unaligned access and now the testsuite fails because of this.
Oho, I forgot to attach the bt (git master, numpy 1.11.1 rc1, cfitsio 3.390) in case this is a different issue to the one you are discussing.
root@ethical-kitten:~/fitsio/build/lib.linux-armv7l-2.7# gdb --args python2.7 -m unittest discover -v
[…]
(gdb) run
Starting program: /usr/bin/python2.7 -m unittest discover -v
Cannot parse expression `.L954 4@r4'.
warning: Probes-based dynamic linker interface failed.
Reverting to original interface.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
testAsciiTableWriteRead (fitsio.test.TestReadWrite) ... ok
testBz2Read (fitsio.test.TestReadWrite) ...
Program received signal SIGBUS, Bus error.
0xf6855f64 in ffi8fi8 (input=0x5f9307, ntodo=1, scale=1, zero=0, output=0xfffe5c10, status=0xfffecdf4) at putcolj.c:1872
1872 putcolj.c: No such file or directory.
(gdb) bt
#0 0xf6855f64 in ffi8fi8 (input=0x5f9307, ntodo=1, scale=1, zero=0, output=0xfffe5c10, status=0xfffecdf4) at putcolj.c:1872
#1 0xf6854f1c in ffpcljj (fptr=0x372320, colnum=8, firstrow=1, firstelem=1, nelem=1, array=0x5f9307, status=0xfffecdf4) at putcolj.c:1428
#2 0xf6845c24 in ffpcl (fptr=0x372320, datatype=81, colnum=8, firstrow=firstrow@entry=1, firstelem=firstelem@entry=1, nelem=1, array=0x5f9307, status=status@entry=0xfffecdf4) at putcol.c:701
#3 0xf69568a4 in PyFITSObject_write_columns (self=<optimized out>, args=<optimized out>, kwds=<optimized out>) at fitsio/fitsio_pywrap.c:1956
#4 0x00068d98 in PyEval_EvalFrameEx ()
#5 0x00062c50 in PyEval_EvalCodeEx ()
#6 0x00068af8 in PyEval_EvalFrameEx ()
#7 0x00062c50 in PyEval_EvalCodeEx ()
#8 0x0006864c in PyEval_EvalFrameEx ()
#9 0x00062c50 in PyEval_EvalCodeEx ()
#10 0x00068af8 in PyEval_EvalFrameEx ()
#11 0x00062c50 in PyEval_EvalCodeEx ()
#12 0x000797dc in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
@olebole I need to ask you some questions about this.
I'm confused how numpy record arrays work at all on these systems if align=False.
On one of these systems, is it possible that the data will be created aligned by default, or does aligned always need to be specified?
If it must be specified, do you know of a generic way to determine if aligned data is needed? If so, then this can be set internally in fitsio.
I'm working on this in a branch called aligned
In the case of aligned data, one cannot use the faster reading functions.
This results in a significant slowdown, about a factor of two in the case where one is reading all rows and all columns.
So I do think we want to special case this.
If you look at the aligned branch, I've added some features. Please see below.
If you construct the FITS object with align=True (or use fitsio.read(..., align=True)) then it tries to read into objects with alignment. This is not the default because it slows down reading by as much as a factor of two.
Also in the test you an turn on testing with alignment
fitsio.test.align=True
fitsio.test.test()
(If you know a better way to do that let me know)
I hope this will now pass your tests.
If we can determine that this alignment should be turned on by default on certain systems, we should add that feature, so you don't have to do align=True everywhere.
actually I figured out a simple way to make both tests with and without alignment run, you don't have to set that yourself now.
Sorry that I didn't reply (and test) for a long time.
I now uploaded a version containing your changes from the aligned branch (combined here), but they didn't work (fail on ARM/32 bit).
Where do we stand on this one?
With the latest release 1.1.3, I still get bus errors in the test.
Do you have any ideas for moving forward?
I am afraid that I have no idea. I could try to generate another stacktrace, if that helps.