yarp icon indicating copy to clipboard operation
yarp copied to clipboard

Big Endian support needs work and testing

Open drdanz opened this issue 7 years ago • 6 comments

At the moment, the endianness is handled by the yarp::os::NetInt32 and similar classes. These classes are typedef to int on little endian and proper classes that store the value as little endian and perform a swap of the bytes every time they are used.

  • This behaviour is obviously not optimal for big endian machines, just in order to perform a simple sum, 3 swaps of the bytes are necessary. The correct way to do this would be to store data on the machine format, and to swap only when necessary (i.e. when sending/receiving the data on/from the network).

  • At the moment we don't have any testing machine with a big endian architecture, this means that this part might be broken, and we don't have any idea. In fact some time ago I tried it and YARP wasn't even building

  • This forces users to use a custom type instead of int32_t and similar, available in c++11, and to do weird stuff to handle casts and similar (several things will work when building on little endian, since these classes are just typedef, but will fail to build on big endian)

This part should be either be re-written (swapping bytes in write/read calls) or dropped, since it affects a lot the readability of the code and the size of stuff to maintain.

drdanz avatar Dec 07 '17 17:12 drdanz

Even worse imho, the yarp::sig::Vector does not handle endianess at all. As states in the documentation, it assumes to work on machines with the same endianess http://www.yarp.it/classyarp_1_1sig_1_1VectorBase.html#details

To be tested, documentation may not be up to date.

barbalberto avatar Dec 12 '17 09:12 barbalberto

Actually, I was remembering something wrong... I had a better look at this part, and it seems to be used as it should be in some places... for example:

  • appendInt() and expectInt() use NetInt32 internally, but the user sees only integers.
  • ImageNetworkHeader uses NetInt32 because it is transferred using appendBlock() and expectBlock()

Probably the re-write is not necessary, but we should ensure that this is used properly everywhere, though. Also I'll leave it as Breaking Change because it might be necessary to break something in some points.

Note: even though this probably makes a lot of sense in term of performances on little endian architectures (i.e. all the machines we use at the moment), if I'm not wrong (i didn't find related documentation), the YARP binary protocol writes all integer as little endian while for convention, most of the network protocols use big endian.

Moreover one might expect to be able to use NetInt32 for reading an integer from some other protocol on the network, but this will not work because, NetInt32 is an integer stored as little endian... The sender probably used htons() and similar to send integers on the network (that will convert the number to big endian), but NetInt32 expects little endian instead. The Net part here is is quite confusing. Perhaps a lot less confusion would be caused if we had something like LittleEndianInt32 and BigEndianInt32. I mean, if there is a reason for preferring little endian on the network, I don't see a reason for not using it in YARP "private" protocols, but we cannot completely forget that there are standards defining the opposite (see https://stackoverflow.com/questions/13514614), having types to handle that would probably help in some cases.

BTW, C++20 will have an endian type trait (http://en.cppreference.com/w/cpp/types/endian) so I expect types like NetInt32 support in the language.

drdanz avatar Dec 13 '17 12:12 drdanz

Files generated by yarpidl_rosmsg use NetInt32 and similar internally, but then they are appended using connection.appendInt((int)seq) This is definitely wrong, they should not be Net types, and they should not be casted right before writing them on the network.

drdanz avatar Mar 11 '18 18:03 drdanz

Vector does not support endianness. Apparently also Image does not handle endianness. In the same way, PointCloud class will not support big endian.

This is getting out of hand. If we really want to support big endian systems we need to get one to make interoperability tests. CC @lornat75

drdanz avatar Apr 04 '18 15:04 drdanz

This issue does not seem critical at the moment, let's keep it in mind.

lornat75 avatar Apr 05 '18 07:04 lornat75

I managed to enable a build on a big endian architecture (s390x) on travis, but, as expected, it is failing, see for example https://travis-ci.org/github/robotology/yarp/jobs/721865131

drdanz avatar Aug 28 '20 11:08 drdanz