
Output data types

Open SoFiA-Admin opened this issue 8 years ago • 1 comments

As discussed in issue #111, SoFiA’s output file data types are rather messy at the moment. As per our discussion, the following improvements should be implemented:

  • Copies and simple modifications of the original data (including cubelets and filtered cubes) should have the same data type as the input. In the case of filtered cubes one could consider using 32-bit float as a minimum requirement.
  • Stacked data products (including moment maps) should be 2 times the size of the input type to accommodate potential dynamic range requirements. Again we could define 32-bit float as the minimum requirement to avoid having to write out 16-bit or 32-bit integer types.
  • Masks should be of the smallest possible integer type that can contain the actual number of detections in each source finding run. This number would have to be determined dynamically in each run. Limits: 8-bit → 255; 16-bit → 65535; 32-bit → ≈ 4.29 × 10^9.
  • When 16-bit or 32-bit integer masks are written, appropriate BZERO and BSCALE keywords need to be introduced to account for the fact that SoFiA’s source IDs are unsigned, while the larger integer types of FITS are intrinsically signed.
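The mask rules in the last two bullets could be sketched roughly as follows. This is only an illustration in NumPy, not SoFiA's actual code: the function names smallest_mask_dtype, fits_keywords and to_signed_storage are hypothetical, and the BZERO values follow the usual FITS convention of offsetting unsigned integers by half their range (32768 for 16-bit, 2147483648 for 32-bit) with BSCALE = 1.

```python
import numpy as np


def smallest_mask_dtype(n_sources):
    """Smallest unsigned integer dtype that can hold source IDs 0..n_sources.

    Hypothetical helper; n_sources would be determined in each run.
    """
    for dtype in (np.uint8, np.uint16, np.uint32):
        if n_sources <= np.iinfo(dtype).max:
            return np.dtype(dtype)
    raise ValueError("more sources than a 32-bit mask can label")


def fits_keywords(dtype):
    """Return (BITPIX, BSCALE, BZERO) for storing an unsigned dtype in FITS."""
    bits = np.dtype(dtype).itemsize * 8
    if bits == 8:
        # 8-bit FITS integers are already unsigned, so no offset is needed.
        return 8, 1, 0
    # 16-bit and 32-bit FITS integers are signed: shift by half the range.
    return bits, 1, 2 ** (bits - 1)


def to_signed_storage(mask):
    """Convert an unsigned mask to the signed values written to the file.

    A reader recovers the IDs as physical = BZERO + BSCALE * stored.
    """
    bitpix, _bscale, bzero = fits_keywords(mask.dtype)
    if bzero == 0:
        return mask
    signed = np.dtype(f"int{bitpix}")
    # Subtract in int64 so the intermediate result cannot overflow.
    return (mask.astype(np.int64) - bzero).astype(signed)


# Example: 300 detections no longer fit into 8 bits, so 16 bits are chosen.
mask_dtype = smallest_mask_dtype(300)
mask = np.array([0, 1, 300], dtype=mask_dtype)
stored = to_signed_storage(mask)
```

Here an ID of 0 is stored as -32768 and recovered by adding BZERO = 32768, so the full unsigned 16-bit range up to 65535 remains usable.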

SoFiA-Admin, Aug 28 '17 03:08

The default data type in numpy when one sets up an array is float64. The type can be set using the dtype parameter, e.g. nancube = np.full((nchannel, naxis2, naxis1), np.nan, dtype='float32') fills an array with single-precision floating-point NaNs. The problem is that the dtype is fixed when the array is initialised; changing it afterwards means converting to a new array, e.g. with astype.

  • BITPIX=-64 corresponds to dtype='float64'
  • BITPIX=-32 corresponds to dtype='float32'
  • BITPIX=32 corresponds to dtype='int32'
  • BITPIX=16 corresponds to dtype='int16'
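A small sketch of how the correspondence above might be used in practice, assuming only NumPy. The lookup table DTYPE_TO_BITPIX and the helper bitpix_for are hypothetical names introduced for illustration. The example also shows that an existing array can be brought to another type with astype, which returns a converted copy rather than changing the array in place.

```python
import numpy as np

# The four correspondences listed above, keyed by NumPy dtype name.
DTYPE_TO_BITPIX = {
    "float64": -64,
    "float32": -32,
    "int32": 32,
    "int16": 16,
}


def bitpix_for(array):
    """Return the BITPIX value matching an array's dtype (hypothetical helper).

    dtype.name ignores byte order, so big-endian data read from a FITS
    file maps to the same entry as native-endian arrays.
    """
    try:
        return DTYPE_TO_BITPIX[array.dtype.name]
    except KeyError:
        raise ValueError(f"no BITPIX listed for dtype {array.dtype}") from None


# Setting the type at creation time, as in the np.full example.
nancube = np.full((2, 3, 4), np.nan, dtype="float32")

# Without a dtype argument NumPy defaults to float64 ...
cube64 = np.zeros((2, 3, 4))
# ... and astype produces a float32 copy, leaving cube64 unchanged.
cube32 = cube64.astype(np.float32)
```

Because astype allocates a new array, converting a large cube this way temporarily needs memory for both copies.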

vdhulst, Oct 07 '18 09:10