RFCs icon indicating copy to clipboard operation
RFCs copied to clipboard

[RFC] Option for endianness in streams

Open PMunch opened this issue 8 years ago • 21 comments

We already have an issue for streams missing any concept of endianness. During a discussion on IRC we came to an agreement that it would be nice to have a way to specify endianness. I propose having an enum of the values eBig, eLittle, eNative and having an optional parameter static[StreamEndianness] for each of the read/write/peek procedures which defaults to eNative to avoid deprecating the old code.

Other options would include having a non-static parameter, multiple procedures post-fixed with LE/BE, and dropping the eNative value while deprecating all stream operations.

The relevant discussion on IRC can be found here.

PMunch avatar Jan 18 '18 14:01 PMunch

IMO, introducing eNative is bad, as well as leaving current careless behavior by default.

yglukhov avatar Jan 18 '18 14:01 yglukhov

I kinda agree that it's a bad idea, but what would be the default? On systems that did not follow the default it would be super annoying having to write something all the time..

PMunch avatar Jan 18 '18 15:01 PMunch

Well it doesn't make a lot of sense to read from a stream in Native endianness. So both eBig and eLittle are better than that, since they are well defined. Given that eLittle is more trendy nowadays, I would go with eLittle. However, this would brake existing nim code on PowerPC, but should we care about it?

yglukhov avatar Jan 18 '18 15:01 yglukhov

Hmm, maybe not. I guess eLittle would be fine as a default. Better make a good design decision and break some code before v1.0 than being stuck with a bad one because it would break some PowerPC code now.

PMunch avatar Jan 18 '18 15:01 PMunch

PPC is pushing PPC64LE as the way forward now that they've seen the error in their big-endian ways.

As for eNative, I think the right solution is something like:

type Endian = enum
  eLittle
  eBig

const eNative = eLittle # or eBig depending on platform

Then you can still use eNative when you are using a stream for temporary data that will only be read from the current machine (or for IPC protocols which also don't need to care), but you don't need an extra case when writing a case endian statement.

RedBeard0531 avatar Jan 18 '18 15:01 RedBeard0531

There already is an Endianness enum + cpuEndian const in system that can be used, no need to introduce another enum.

GULPF avatar Jan 18 '18 15:01 GULPF

@RedBeard0531, i'd rather make a proc getCurrentEndianness() or smth. Endianness may be not available in compile time, at least on JS. And i've heard of ARM endianness magic...

yglukhov avatar Jan 18 '18 15:01 yglukhov

@GULPF yeah that was in case we wanted the eNative as part of the enum, we should stick with the one in system.

@yglukhov, like cpuEndian?

PMunch avatar Jan 18 '18 15:01 PMunch

Well, i'm not sure. cpuEndian seems to be a constant, though I don't know how magic it is. That looks useless in JS.

yglukhov avatar Jan 18 '18 15:01 yglukhov

Endianness is always known at compile time.

JavaScript doesn't have any endianness - it's not possible to create uint8 pointer to uint32 array (ok, there is Int32Array, but are there any browsers supporting both Int32Array and big endian?). WebAssembly is always little-endian.

zielmicha avatar Jan 18 '18 15:01 zielmicha

Well it doesn't make a lot of sense to read from a stream in Native endianness. So both eBig and eLittle are better than that, since they are well defined.

On the contrary, it does make lots of sense if you do the swapBytes() operation before the stream operation. Which is how it is supposed to be used...

Araq avatar Jan 18 '18 15:01 Araq

@zielmicha, I think nodejs might be such an environment. I really don't care that much. Just a tiny detail.

@Araq, question is: whats the most common use-case? How hard is it to customize away from the common use-case. What can nim ultimately do to be idiot-proof without sacrificing the performance? IMO the most common use case for streams is reading and writing with streams being intended to be portable across different machines. E.g. If nim code can write it and read it on the same machine, it should be the same on any other machine. Given that real machines are little, there's no performance cost for the most common use-case. So nim can be idiot-proof for no cost (common use-case). And if so, then why not?

If you agree with my most common use case, then what you're implying is that almost any stream operation should be accompanied with swap. Now try imagine a regular python/js programmer coming to Nim, who will do that. Actually, I don't do that either.

yglukhov avatar Jan 18 '18 15:01 yglukhov

Another option is to set endianness mode of the stream in question, since its very unlikely to use a single stream with mixed endianness. However that definitely sounds like runtime overhead (.

yglukhov avatar Jan 18 '18 16:01 yglukhov

Let's fix this for the next release.

dom96 avatar Jan 18 '18 16:01 dom96

@yglukhov oh that does sound a lot simpler than having to supply it to every call.. If there are two types StreamBE and StreamLE (or some other mechanism), then it should be possible to make it resolve at compile time.

PMunch avatar Jan 18 '18 16:01 PMunch

@PMunch. I don't think thats possible as it would require a polymorphic StreamLE and StreamBE subclasses in between Stream and its impls.

yglukhov avatar Jan 18 '18 17:01 yglukhov

There is nothing to fix here, you tell the stream to write 2 bytes, it writes 2 bytes. Why should it do what you didn't ask for? How can I do writeBuffer(addr complexObj, sizeof complexObj) in your "better" endian aware API?

Araq avatar Jan 18 '18 23:01 Araq

So nim can be idiot-proof for no cost (common use-case). And if so, then why not?

You cannot make this low level API idiot proof.

Araq avatar Jan 18 '18 23:01 Araq

How can I do writeBuffer(addr complexObj, sizeof complexObj)

endian awareness should be done not in writeBuffer, but in write[T] for T = int/uint/float* and in readT and peekT where T is int/uint/float*.

yglukhov avatar Jan 19 '18 15:01 yglukhov

This RFC is stale because it has been open for 1095 days with no activity. Contribute a fix or comment on the issue, or it will be closed in 30 days.

github-actions[bot] avatar Jul 14 '23 02:07 github-actions[bot]

There is nothing to fix here, you tell the stream to write 2 bytes, it writes 2 bytes. Why should it do what you didn't ask for? How can I do writeBuffer(addr complexObj, sizeof complexObj) in your "better" endian aware API?

By looking at the definition of ComplexObj and write each field with the correct endianness? Not sure what the issue here would be. Doesn't it already turn into a series of calls to write[T]? In that case it would just work if the endianness was a property of the stream.

PMunch avatar Jul 14 '23 05:07 PMunch