fitsio icon indicating copy to clipboard operation
fitsio copied to clipboard

Problems reading variable length arrays

Open dneise opened this issue 8 years ago • 9 comments

I have problems reading variable length arrays with fitsio 0.9.9.1 the platform:

Python 3.5.1 |Anaconda 2.5.0
numpy 1.10.4
fitsio 0.9.9.1

fitsio.test.test() runs nicely:

testNonStandardKeyValue (fitsio.test.TestWarnings) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.004s

OK
testAsciiTableWriteRead (fitsio.test.TestReadWrite) ... ok
testBz2Read (fitsio.test.TestReadWrite) ... ok
testChecksum (fitsio.test.TestReadWrite) ... ok
testExtVer (fitsio.test.TestReadWrite) ... ok
testGZIPTileCompressedWriteRead (fitsio.test.TestReadWrite) ... ok
testGZWriteRead (fitsio.test.TestReadWrite) ... ok
testHCompressTileCompressedWriteRead (fitsio.test.TestReadWrite) ... ok
testImageSlice (fitsio.test.TestReadWrite) ... ok
testImageWriteRead (fitsio.test.TestReadWrite) ... ok
testImageWriteReadFromDims (fitsio.test.TestReadWrite) ... ok
testImageWriteReadFromDimsChunks (fitsio.test.TestReadWrite) ... ok
testLowerUpper (fitsio.test.TestReadWrite) ... ok
testMoveByName (fitsio.test.TestReadWrite) ... ok
testPLIOTileCompressedWriteRead (fitsio.test.TestReadWrite) ... ok
testRiceTileCompressedWriteRead (fitsio.test.TestReadWrite) ... ok
testSlice (fitsio.test.TestReadWrite) ... ok
testTableAppend (fitsio.test.TestReadWrite) ... ok
testTableInsertColumn (fitsio.test.TestReadWrite) ... ok
testTableIter (fitsio.test.TestReadWrite) ... ok
testTableSubsets (fitsio.test.TestReadWrite) ... ok
testTableWriteDictOfArrays (fitsio.test.TestReadWrite) ... ok
testTableWriteDictOfArraysScratch (fitsio.test.TestReadWrite) ... ok
testTableWriteDictOfArraysVar (fitsio.test.TestReadWrite) ... ok
testTableWriteListOfArrays (fitsio.test.TestReadWrite) ... ok
testTableWriteListOfArraysScratch (fitsio.test.TestReadWrite) ... ok
testTableWriteListOfArraysVar (fitsio.test.TestReadWrite) ... ok
testTableWriteRead (fitsio.test.TestReadWrite) ... ok
testVariableLengthColumns (fitsio.test.TestReadWrite) ... ok

----------------------------------------------------------------------
Ran 28 tests in 2.300s

OK

The file is not actually FITS but rather FITS compliant as far as I understood. The format is explained here: http://arxiv.org/pdf/1506.06045.pdf

As far as I understood our developers, it should be compatible to std fits readers, just the compressed part would of course be returned as compressed data, which I have to decompress myself. Fair enough... so I tried to read a single row from the 2nd HDU, which containes the compressed events.

import fitsio
from fitsio import FITS
f = fitsio.FITS("/fact/raw/2016/08/09/20160809_149.fits.fz")
f[2][0]

I got:

ValueError                                Traceback (most recent call last)
/home/guest/neise/fooii.py in <module>()
      3 from fitsio import FITS
      4 f = fitsio.FITS("/fact/raw/2016/08/09/20160809_149.fits.fz")
----> 5 f[2][0]

/home/guest/neise/anaconda3/lib/python3.5/site-packages/fitsio/fitslib.py in __getitem__(self, arg)
   2660                 # will also get here if slice is entered but this
   2661                 # is an ascii table
-> 2662                 array = self.read(rows=res)
   2663         else:
   2664             return TableColumnSubset(self, res)

/home/guest/neise/anaconda3/lib/python3.5/site-packages/fitsio/fitslib.py in read(self, **keys)
   1789             if 'rows' in keys:
   1790                 del keys['rows']
-> 1791             data = self.read_rows(rows, **keys)
   1792         else:
   1793             data = self._read_all(**keys)

/home/guest/neise/anaconda3/lib/python3.5/site-packages/fitsio/fitslib.py in read_rows(self, rows, **keys)
   1924 
   1925         rows = self._extract_rows(rows)
-> 1926         dtype, offsets, isvar = self.get_rec_dtype(**keys)
   1927 
   1928         w,=numpy.where(isvar == True)

/home/guest/neise/anaconda3/lib/python3.5/site-packages/fitsio/fitslib.py in get_rec_dtype(self, **keys)
   2134             descr.append(dt)
   2135             isvararray[i] = isvar
-> 2136         dtype=numpy.dtype(descr)
   2137 
   2138         offsets = numpy.zeros(len(colnums),dtype='i8')

ValueError: invalid shape in fixed-type tuple: dimension smaller then zero.

I browsed the code a bit and found this line: https://github.com/esheldon/fitsio/blob/master/fitsio/fitslib.py#L2207

In case of my file, max_size is -1 for all columns, i.e. no max_size was specified (apparently that's ok?) So I commented out line 2207 and it worked...

So I was wondering if this line might be a bug ...

dneise avatar Aug 10 '16 11:08 dneise

yes, that looks like a bug. Can you please post your file so I can debug this?

esheldon avatar Aug 10 '16 11:08 esheldon

Gladly, but the typical file is a few GB ... the smallest I found quickly ~200MB.

I'll try to create a file containing just 2 events or so ... might take a while.

Thanks for the quick response.

dneise avatar Aug 10 '16 12:08 dneise

OK, I'll go ahead and fix that line and push to master and see if it works for you. But it would be good for me to have an example at some point

esheldon avatar Aug 10 '16 12:08 esheldon

pushed to master

esheldon avatar Aug 10 '16 12:08 esheldon

Hello again, I finally got around finding a tiny file containing only 5 events.

fact_example.zip

(I zipped it, so I was able to upload it here)

dneise avatar Aug 19 '16 08:08 dneise

When trying to read the file with fitsio.FITS I ran into a problem, which I think is home-made by those members of the FACT collaboration, who designed this kind of fits extension, but I would like to hear your opinion.

I'm not sure how to explain this point very concisely, so please bear with me .. I rather use a lot of words in order to prevent misunderstandings.


The interesting extension is the 2nd bintable:

In [3]: fitsio.FITS("fact_example.fits.fz")[2]
Out[3]: 

  file: fact_example.fits.fz
  extension: 2
  type: BINARY_TBL
  extname: Events
  rows: 5
  column info:
    EventNum            u1  varray[-1]
    TriggerNum          u1  varray[-1]
    TriggerType         u1  varray[-1]
    NumBoards           u1  varray[-1]
    UnixTimeUTC         u1  varray[-1]
    BoardTime           u1  varray[-1]
    StartCellData       u1  varray[-1]
    StartCellTimeMarker
                        u1  varray[-1]
    Data                u1  varray[-1]
    TimeMarker          u1  varray[-1]

Now, I knew (using other fact software to read these files) that the EventNum and TriggerNum columns both contain simple counters from 1 to 5 in this case. I also know from the documentation:

In our implementation we also allow for the storage of fixed-length data in the heap.

that these simple integers were also stored in the heap section of this bintable, albeit in this special blocked form, which is explained below Fig5 of that paper. So I expected to retrieve:

  • 1long (8byte) size
  • 1char ordering: 'C' or 'R'
  • 1byte num_proc
  • num_proc shorts proc[]: explaining how the content was encoded
    • in the paper it says bytes not shorts, but by studying our own C++ implementation, I found it is in fact shorts
    • 0: no encoding
    • 1: special fact smoothing
    • 2: special fact 16bit huffman encoding
  • and then the actual payload

Here is the maybe interesting part of the header of that bintable:

In [4]: fitsio.FITS("fact_example.fits.fz")[2].read_header()
Out[4]: 

XTENSION= 'BINTABLE'           / binary table extension
BITPIX  =                    8 / 8-bit bytes
NAXIS   =                    2 / 2-dimensional binary table
NAXIS1  =                  160 / width of table in bytes
NAXIS2  =                    5 / number of rows in table
PCOUNT  =              1134155 / size of special data area
GCOUNT  =                    1 / one data group (required keyword)
TFIELDS =                   10 / number of fields in each row
EXTNAME = 'Events'             / name of extension table
ZTABLE  =                    T / Table is compressed
ZNAXIS1 =               867382 / Width of uncompressed rows
ZNAXIS2 =                    5 / Number of uncompressed rows
ZPCOUNT =                    0
ZHEAPPTR=                  960
ZTILELEN=                    1 / Number of rows per tile
THEAP   =                  800
ZSHRINK =                    1 / Catalog shrink factor
TFORM1  = '1QB'                / format of EventNum [var. Length]
TTYPE1  = 'EventNum'           / FAD board event counter
ZFORM1  = '1J'                 / format of EventNum [4-byte INT]
ZCTYP1  = 'FACT'               / Compression type: FACT
TFORM2  = '1QB'                / format of TriggerNum [var. Length]
TTYPE2  = 'TriggerNum'         / FTM board trigger counter
ZFORM2  = '1J'                 / format of TriggerNum [4-byte INT]
ZCTYP2  = 'FACT'               / Compression type: FACT

So the uncompressed EventNum is a 4byte integer and there is no point in compressing one integer, so I expected num_proc == 1 and proc[0] == 0. So I expected to get 8 + 1 + 1 + 2 + 4 = 16 bytes when reading the 0-th EventNum row form the 2nd bintable.

some of my expectations were met ... some not

In [8]: event_num = f[2]["EventNum"][0]
/home/dneise/anaconda3__new/lib/python3.5/site-packages/fitsio/fitslib.py:2222: FITSRuntimeWarning: Column 'EventNum': No maximum size: '1QB'. Column '%s': No maximum size: '%s'. %s
  warnings.warn(mess, FITSRuntimeWarning)
ZHEAPPTR

The warning clearly comes from the fact, that the table header does not comply to the one of the forms for variable length arrays in the fits standard document (2008 version) rPt(e_max) or rQt(e_max), since the (e_max) part is missing. However:

In [9]: event_num
Out[9]: array([array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=uint8)], dtype=object)

In [10]: event_num[0]
Out[10]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=uint8)

In [11]: event_num[0].shape
Out[11]: (16,)

fitsio is still able to cope with this broken header and return the 16 bytes I expected ... however ... they are all zeros. The same was true for other colums:

In [16]: f[2]["TriggerNum"][0][0]
Out[16]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=uint8)
In [17]: f[2]["TriggerType"][0][0]
Out[17]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=uint8)
In [18]: f[2]["NumBoards"][0][0]
Out[18]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=uint8)
In [19]: f[2]["UnixTimeUTC"][0][0]
Out[19]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=uint8)
In [20]: f[2]["StartCellData"][0][0]
Out[20]: array([196, 192,  98, ...,   0, 223,   0], dtype=uint8)

But at some point data started coming in. I started scratching my head and then used a hexdump of the file, to see why I get all zeros at the beginning.

here is the interesting part:

00158640  73 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |s               |
00158650  45 4e 44 20 20 20 20 20  20 20 20 20 20 20 20 20  |END             |
00158660  20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |                |
*
00158880  00 00 00 00 00 00 00 10  00 00 00 00 00 00 00 10  |................|
00158890  00 00 00 00 00 00 00 10  00 00 00 00 00 00 00 20  |............... |
001588a0  00 00 00 00 00 00 00 0e  00 00 00 00 00 00 00 30  |...............0|
001588b0  00 00 00 00 00 00 00 10  00 00 00 00 00 00 00 3e  |...............>|
[...]
00158b60  00 00 00 00 00 00 0b 4c  00 00 00 00 00 0d e7 d1  |.......L........|
00158b70  00 00 00 00 00 00 01 4c  00 00 00 00 00 0d f3 1d  |.......L........|
00158b80  00 00 00 00 00 03 59 42  00 00 00 00 00 0d f4 69  |......YB.......i|
00158b90  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00158c40  54 49 4c 45 01 00 00 00  1f 6f 03 00 00 00 00 00  |TILE.....o......|
00158c50  10 00 00 00 00 00 00 00  43 01 00 00 01 00 00 00  |........C.......|
*
00158c70  0e 00 00 00 00 00 00 00  43 01 00 00 01 00 10 00  |........C.......|
00158c80  00 00 00 00 00 00 43 01  00 00 28 00 00 00 14 00  |......C...(.....|
00158c90  00 00 00 00 00 00 43 01  00 00 50 8a b4 57 3f 74  |......C...P..W?t|
00158ca0  03 00 ac 00 00 00 00 00  00 00 43 01 00 00 c4 c0  |..........C.....|

At address 0x00158880 we see the first row, first column of the main data table, i.e. EventNum[0]. Left 8 byte is length, right 8 bytes is jump address to heap. So length is 0x10=16 (yeah as expected), and jump address is also 16. One might have naively expected the first address to be zero, but when looking into the FACT paper, one can find that they invented their own addition. A 16byte so called "TILE" header, which is some kind of private addition. By starting 16bytes behind the actual heap, we jump over this funny "TILE" header.

But where does the heap start again? The official documentation says:

# heap_start = main_data_table_start + THEAP
main_data_table_start = 0x00158880
THEAP = 800  # see table header snippet above
heap_start = main_data_table_start + THEAP
hex(heap_start) # --> '0x158ba0'

But in the hexdump we see

00158b90  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*

The asterisk means, zeros until the next printed address, which is

00158c40  54 49 4c 45 01 00 00 00  1f 6f 03 00 00 00 00 00  |TILE.....o......|

Ah... and there is the funny "TILE" header ... which is according to the FACT paper, the start of the heap. So something is wrong with the value of THEAP, I thought. The real value for THEAP would be:

0x00158c40 - main_data_table_start  # --> 960

And going back into the table header, to check for the value of THEAP again, I found a keyword, whose value is in fact 960. It was the (for me unknown an undocumented) keyword ZHEAPPTR.

I checked some of the papers cited in the FACT paper and found A Tiled-Table Convention for Compressing FITS Binary Tables, which defines ZHEAPPTR

ZHEAPPTR(optional keyword). The value field of this keyword shall contain an integer that gives the byte offset from the start of the main data table to the beginning of the compressed heap. Usage of this keyword is described in Section 5.2

comparing it to the definition of THEAP in the FITS std (2008 version):

THEAP keyword. The value field of this keyword shall contain an integer providing the separation, in bytes, between the start of the main data table and the start of [...] the heap.

That's the same isn't it?


Summing it all up:

I believe the designers of the FACT extension made a mistake. THEAP and ZHEAPPTR should have the same value, I see no reason to use ZHEAPPTR at all.

To test it, I made a copy of the file and in that copy overwrote the values of THEAP with values from ZHEAPPTR

cp fact_example.fits.fz fact_repaired.fits.fz
In [1]: import fitsio
In [2]: f = fitsio.FITS("fact_repaired.fits.fz", "rw")
In [3]: f[2].write_key("THEAP", f[2].read_header()["ZHEAPPTR"])
In [4]: f[2].write_key("THEAP", f[1].read_header()["ZHEAPPTR"])
In [5]: f.close()
In [6]: f = fitsio.FITS("fact_repaired.fits.fz")

In [7]: f[2]
Out[7]: 

  file: fact_repaired.fits.fz
  extension: 2
  type: BINARY_TBL
  extname: Events
  rows: 5
  column info:
    EventNum            u1  varray[16]
    TriggerNum          u1  varray[16]
    TriggerType         u1  varray[14]
    NumBoards           u1  varray[16]
    UnixTimeUTC         u1  varray[20]
    BoardTime           u1  varray[172]
    StartCellData       u1  varray[2892]
    StartCellTimeMarker
                        u1  varray[332]
    Data                u1  varray[226100]
    TimeMarker          u1  varray[0]

In [8]: f[2]["EventNum"][:]
Out[8]: 
array([[16,  0,  0,  0,  0,  0,  0,  0, 67,  1,  0,  0,  1,  0,  0,  0],
       [16,  0,  0,  0,  0,  0,  0,  0, 67,  1,  0,  0,  2,  0,  0,  0],
       [16,  0,  0,  0,  0,  0,  0,  0, 67,  1,  0,  0,  3,  0,  0,  0],
       [16,  0,  0,  0,  0,  0,  0,  0, 67,  1,  0,  0,  4,  0,  0,  0],
       [16,  0,  0,  0,  0,  0,  0,  0, 67,  1,  0,  0,  5,  0,  0,  0]], dtype=uint8)

And there it was ... I got the funny block structure, which I expected and in the last 4 bytes of each block I can clearly see the counter. The decoding of that stuff into a more user friendly output, I can now do in python (might be slow, but thats okay).

Now my question: Assuming I have no chance to convince my fellow FACT people, that they made a mistake there (by the way, do you agree?) do you see a chance of patching your version of cfitsio, such that is simply searches for 'ZHEAPPTR' and 'THEAP' and if both exist, 'ZHEAPPTR' overrides the value of 'THEAP'?

dneise avatar Aug 19 '16 09:08 dneise

Another interesting observation, I just made... As pointed out above, the header of the bintable extension 2 does not specify the maximum length of the variable length arrays. (one can check that low-tec by simply looking into the file with less or so).

However when printing the header with fitsio (using the "repaired" file, where I manually overwrote the value of THEAP with the value of ZHEAPPTR) I get this

In [1]: import fitsio

In [2]: f = fitsio.FITS("fact_repaired.fits.fz")

In [3]: f[2].read_header()
Out[3]: 
[...]
TFORM1  = '1QB(16) '           / format of EventNum [var. Length]
TTYPE1  = 'EventNum'           / FAD board event counter
TUNIT1  = 'uint32'             / unit of EventNum
ZFORM1  = '1J'                 / format of EventNum [4-byte INT]
ZCTYP1  = 'FACT'               / Compression type: FACT
TFORM2  = '1QB(16) '           / format of TriggerNum [var. Length]
TTYPE2  = 'TriggerNum'         / FTM board trigger counter
TUNIT2  = 'uint32'             / unit of TriggerNum
ZFORM2  = '1J'                 / format of TriggerNum [4-byte INT]
ZCTYP2  = 'FACT'               / Compression type: FACT
TFORM3  = '1QB(14) '           / format of TriggerType [var. Length]
TTYPE3  = 'TriggerType'        / FTM board trigger type
TUNIT3  = 'uint16'             / unit of TriggerType
ZFORM3  = '1I'                 / format of TriggerType [2-byte INT]
ZCTYP3  = 'FACT'               / Compression type: FACT
TFORM4  = '1QB(16) '           / format of NumBoards [var. Length]
TTYPE4  = 'NumBoards'          / Number of connected boards
TUNIT4  = 'uint32'             / unit of NumBoards
ZFORM4  = '1J'                 / format of NumBoards [4-byte INT]
ZCTYP4  = 'FACT'               / Compression type: FACT
TFORM5  = '1QB(20) '           / format of UnixTimeUTC [var. Length]
TTYPE5  = 'UnixTimeUTC'        / Unix time seconds and microseconds
TUNIT5  = 'uint32'             / unit of UnixTimeUTC
ZFORM5  = '2J'                 / format of UnixTimeUTC [4-byte INT]
ZCTYP5  = 'FACT'               / Compression type: FACT
TFORM6  = '1QB(172)'           / format of BoardTime [var. Length]
TTYPE6  = 'BoardTime'          / Board internal time counter
TUNIT6  = 'uint32'             / unit of BoardTime
ZFORM6  = '40J'                / format of BoardTime [4-byte INT]
ZCTYP6  = 'FACT'               / Compression type: FACT
TFORM7  = '1QB(2892)'          / format of StartCellData [var. Length]
TTYPE7  = 'StartCellData'      / DRS4 start cell of readout
TUNIT7  = 'uint16'             / unit of StartCellData
ZFORM7  = '1440I'              / format of StartCellData [2-byte INT]
ZCTYP7  = 'FACT'               / Compression type: FACT
TFORM8  = '1QB(332)'           / format of StartCellTimeMarker [var. Length]
TTYPE8  = 'StartCellTimeMarker' / DRS4 start cell of readout time marker
TUNIT8  = 'uint16'             / unit of StartCellTimeMarker
ZFORM8  = '160I'               / format of StartCellTimeMarker [2-byte INT]
ZCTYP8  = 'FACT'               / Compression type: FACT
TFORM9  = '1QB(226100)'        / format of Data [var. Length]
TTYPE9  = 'Data'               / Digitized data
TUNIT9  = 'int16'              / unit of Data
ZFORM9  = '432000I'            / format of Data [2-byte INT]
ZCTYP9  = 'FACT'               / Compression type: FACT
TFORM10 = '1QB(0)  '           / format of TimeMarker [var. Length]
TTYPE10 = 'TimeMarker'         / Digitized time marker - if available
TUNIT10 = 'int16'              / unit of TimeMarker
ZFORM10 = '0I'                 / format of TimeMarker [2-byte INT]
ZCTYP10 = 'FACT'               / Compression type: FACT
[...]

So somebody is clearly infering the maximum size from the main_data_table. As far as I see, it was correctly done. Is this part of std-cfitsio or is it a feature introduced in python-fitsio, so that nice rectangular ndarrays can be returned?

dneise avatar Aug 19 '16 11:08 dneise

Ah, I should mention ... I already tried to do this patch I proposed further up. I forked your repo and played around a bit.

https://github.com/dneise/fitsio

its 4 commits ahead of your master at the moment, but the only interesting commit is this one: https://github.com/dneise/fitsio/commit/fec4f70bf57e1bcb00c16c6ccd8691bde992cb71

dneise avatar Aug 19 '16 11:08 dneise

It kind of works, in that I can read files also, when I did not manually overwrite THEAP, but this fancy max_size inference does not work.

dneise avatar Aug 19 '16 11:08 dneise