fabio icon indicating copy to clipboard operation
fabio copied to clipboard

Byteswap in all cases where bpp>1 ...

Open kif opened this issue 6 years ago • 4 comments

Spotted in edfimage at least, by Peter Boesecke. Probably affects only "complex128" with swapped endianness.

kif avatar Apr 02 '19 13:04 kif

I do not understand why the bpp must be checked at all here. How to apply byte swapping for items with a specific bpp (including 1 byte) should be implemented in the byte swapping function. If bpp is 1 it should be clear that the result is equal to the input.

boesecke avatar Jun 12 '19 12:06 boesecke

I propose the following simplification:

    byte_order = self.header[capsHeader['BYTEORDER']]
    if ('Low' in byte_order):
         little_endian=True
    else:
         little_endian=False
    
    if ( little_endian==numpy.little_endian ):
         self._data_swap_needed = False
    else:
         self._data_swap_needed = True

Replacing:

    byte_order = self.header[capsHeader['BYTEORDER']]
    if ('Low' in byte_order and numpy.little_endian) or \
       ('High' in byte_order and not numpy.little_endian):
        self._data_swap_needed = False
    if ('High' in byte_order and numpy.little_endian) or \
       ('Low' in byte_order and not numpy.little_endian):
        if bpp in [2, 4, 8]:
            self._data_swap_needed = True
        else:
            self._data_swap_needed = False

boesecke avatar Jun 12 '19 15:06 boesecke

Rather the simplifying the code, can you just remove it altogether?

See: https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html

He suggests one should try to remove "swap" lines of code. So rather than fixing "self._data_swap_needed" can numpy do this when reading the bytes by specifying their flavour? Just set self.bytecode to include the source endianness as a little "<" or big ">" as first character. e.g:

t4 = numpy.array( (1,0,0,0), numpy.uint8 ) numpy.frombuffer( t4, dtype='>u4') array([16777216], dtype=uint32) numpy.frombuffer( t4, dtype='<u4') array([1], dtype=uint32)

There is no need for a special case on the 1 bpp:

numpy.frombuffer( t4, dtype='<u1') array([1, 0, 0, 0], dtype=uint8) numpy.frombuffer( t4, dtype='>u1') array([1, 0, 0, 0], dtype=uint8)

jonwright avatar Jun 12 '19 16:06 jonwright

On Wed, 12 Jun 2019 09:16:23 -0700 Jon Wright [email protected] wrote:

Rather the simplifying the code, can you just remove it altogether?

See: https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html

He suggests one should try to remove "swap" lines of code. So rather than fixing "self._data_swap_needed" can numpy do this when reading the bytes by specifying their flavour? Just set self.bytecode to include the source endianness as a little "<" or big ">" as first character. e.g:

Indeed this would be simpler...

I suggest to open a separated issue/PR as this pattern affects (almost) all file-formats and upgrading them in a consistent way.

There could be a static function in fabioimage providing the "sexed_dtype"

@statimethod
def get_sexed_dtype(dtype, endianness):
   if endianness=="little":
       stype = "<"
   elif endianness=="big":
       stype = ">"
   else:
       stype = "" 
   return stype+numpy.dtype(dtype).char 

which could be used in the 20 classes where this pattern is found ?

kif avatar Jun 13 '19 08:06 kif