WideSymbol misses #asByteArray
Bug description
WideSymbol does not implement #asByteArray, so sending it ends in subclassResponsibility.
To Reproduce
wideSymbol := (String with: 16rF600 asCharacter) asSymbol.
wideSymbol asByteArray
Well, #asByteArray is dangerous and a bad idea as a general thing on strings/symbols. Going from characters to bytes is called encoding, and there are different ways to do this, UTF-8 being the most general one today. #utf8Encoded is the easiest selector. Decoding is needed for the way back, as in #utf8Decoded.
Just inspect
'élève - 10€' asByteArray.
The result is in a totally unknown encoding!
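For comparison, a minimal Playground sketch of the explicit route described above. #utf8Encoded and #utf8Decoded are the selectors named in this thread; #encodeWith: is assumed to be available from the Zinc character-encoding package in the image.

```smalltalk
| text bytes |
text := 'élève - 10€'.

"Encode explicitly: characters -> bytes, using UTF-8."
bytes := text utf8Encoded.
bytes size.                  "15 bytes for 11 characters: é and è take 2 bytes each, € takes 3"

"Decode with the same encoding to get the original string back."
bytes utf8Decoded = text.    "true"

"A different encoding yields different bytes for the same characters.
 Assumption: #encodeWith: is present (Zinc character encoding)."
text encodeWith: #utf16.
```

The point is that the byte sequence only has a meaning together with the name of the encoding that produced it.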
We really should have good hygiene in this area. Any project is of course free to decide how to encode/decode its values, but there is no general answer.
It makes no sense to transform a string to a byte array (whether wide or not) without an encoding. We should forbid that, as it is the source of many ugly bugs.
Yes, I think we should even propose to remove asByteArray from ByteString.
It's wrong and confusing.
+1, please educate the user, and me, because I'm pretty sure that I would make all possible mistakes!
I propose to do two steps:
- fix the subclassResponsibility, yes, knowing that this is not good. But we do it for WideString already, and there are users
- open another issue tracker entry to get rid of asByteArray from ByteString and WideString. This then has to check where we use #asByteArray, so depending on that, it might be a bigger non-backward-compatible change that needs multiple steps
- fix the subclassResponsibility, yes, knowing that this is not good. But we do it for WideString already, and there are users
It is impossible that there are users, because it is defined as subclassResponsibility :). So if there are users, they would get an exception! :D
I think we should skip step 1 altogether. There is no satisfying definition for this other than "self shouldNotImplement".
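As a rough sketch of what that would look like (hypothetical, written in the usual Class >> selector notation; not an agreed-upon fix):

```smalltalk
"Hypothetical definition, only to illustrate the suggestion above."
WideSymbol >> asByteArray
	"There is no byte representation of a symbol without choosing an encoding."
	^ self shouldNotImplement
```

Callers that need bytes would then pick an encoding themselves, for example by sending #utf8Encoded to the symbol from the reproduction steps.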