opm-common Python tests (Parser?) fail on big-endian systems

On the unofficial port of the hppa architecture, the python tests fail because our C++ code assume little-endianness:

======================================================================
1: ERROR: test_arrays (tests.test_eclfile.TestEclFile)
1: ----------------------------------------------------------------------
1: Traceback (most recent call last):
1:   File "/<<PKGBUILDDIR>>/obj-hppa-linux-gnu/python/tests/test_eclfile.py", line 30, in test_arrays
1:     file2uf = EclFile(test_path("data/SPE9.INIT"), preload=False)
1: RuntimeError: [./src/opm/io/eclipse/EclUtil.cpp:296] Error reading binary header. Expected 16 bytes of header data, found 268435456
1:

It makes me a bit nervous that this seems only to be tested via python.

Apr 29 '22 09:04 blattms

One could try to determine endianess at runtime and act accordingly.

inline bool is_system_little_endian()
{
    const int v { 0x01 };
    const void * addr = static_cast<const void *>(&v);
    const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
    return (*least_significant_address == 0x01);
}

Apr 29 '22 09:04 blattms

This is caused by the way we read/write unformatted ("binary") ECLIPSE-style output files (e.g., the .INIT and .UNRST files) and is unrelated to the parser code. The unformatted output files are supposed to be big endian (network byte order) and our I/O code for these files, at least right now, assumes that it's running on a little endian system.

Fixing this part of the code to be agnostic to host system byte order is a laudable goal, but I'm not going to prioritise it over other work. In C++20 the job gets a little easier because the standard library gets built-in support for detecting endianness. Your detection function is undefined behaviour so I'd prefer not to use that.

Apr 29 '22 10:04 bska

The unformatted output files are supposed to be big endian (network byte order)

But that is perfect and helps fixing. OPM should just use methods available for conversion to/from network order instead of __builtin_swap32 (and hence assuming an endianness):

NAME
       htonl, htons, ntohl, ntohs - convert values between host and network byte order

SYNOPSIS
       #include <arpa/inet.h>

       uint32_t htonl(uint32_t hostlong);
       uint16_t htons(uint16_t hostshort);
       uint32_t ntohl(uint32_t netlong);
       uint16_t ntohs(uint16_t netshort);

DESCRIPTION
       The htonl() function converts the unsigned integer hostlong from host byte order to network byte order.

       The htons() function converts the unsigned short integer hostshort from host byte order to network byte order.

       The ntohl() function converts the unsigned integer netlong from network byte order to host byte order.

       The ntohs() function converts the unsigned short integer netshort from network byte order to host byte order.

       On  the  i386 the host byte order is Least Significant Byte first, whereas the network byte order, as used on the In‐
       ternet, is Most Significant Byte first.

32bit versions are also available on windows.

Feb 24 '23 12:02 blattms

Seems to do job, see this test:

#include<iostream>
#include <arpa/inet.h>
int main()
{
    int i = 32;
    int j = __builtin_bswap32(i);
    std::cout<<"__builtin_bswap32(32) = "<< j << std::endl;
    j = __builtin_bswap32(j);
    std::cout<<"__builtin_bswap32(__builtin_bswap32(32)) = "<< j << std::endl;
    j  = htonl(i);
    std::cout<<"htonl(32) = "<< j << std::endl;
    j  = ntohl(j);
    std::cout<<"ntohl(htonl(32)) = "<< j << std::endl;
}

output:

$ ./swap 
__builtin_bswap32(32) = 536870912
__builtin_bswap32(__builtin_bswap32(32)) = 32
htonl(32) = 536870912
ntohl(htonl(32)) = 32

Edited: There was a typo in the sample (htonl instead of ntohl at the last call). Fixed.

Feb 24 '23 12:02 blattms

htonX()

I take it you see that there is no floating-point support in that list? For what it's worth, we have considered multiple options here.

Feb 24 '23 12:02 bska