foundation
String with Encoding
Expose a String type with a slightly different name that doesn't assume one specific encoding.
data EString encoding = EString (UArray Word8)
where encoding is one of several specific, disjoint types such as UTF8 or UTF16.
class Encoding e
...
data UTF8
instance Encoding UTF8
data UTF16
instance Encoding UTF16
Operations would look like:
break :: Encoding e => (Char -> Bool) -> EString e -> (EString e, EString e)
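To make the proposal concrete, here is a rough sketch of how such a break could be written. Everything beyond the EString, Encoding and break names is an assumption for illustration: a plain [Word8] list stands in for UArray Word8, decodeNext is a hypothetical class method, and the instances are ASCII7 and a simplified UTF16 (little-endian, no surrogate pairs) rather than full decoders.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Bits (shiftL, (.|.))
import Data.Char (chr)
import Data.Proxy (Proxy (..))
import Data.Word (Word8)
import Prelude hiding (break)

-- Phantom-tagged string; [Word8] stands in for foundation's UArray Word8.
newtype EString encoding = EString [Word8]
  deriving (Eq, Show)

-- Hypothetical class method: decode one character from the front of the
-- buffer and report how many bytes it occupied.
class Encoding e where
  decodeNext :: Proxy e -> [Word8] -> Maybe (Char, Int)

data ASCII7
data UTF16

instance Encoding ASCII7 where
  decodeNext _ (b : _) | b < 0x80 = Just (chr (fromIntegral b), 1)
  decodeNext _ _ = Nothing

-- Assumption: little-endian code units, BMP only (no surrogate pairs).
instance Encoding UTF16 where
  decodeNext _ (lo : hi : _) =
    Just (chr ((fromIntegral hi `shiftL` 8) .|. fromIntegral lo), 2)
  decodeNext _ _ = Nothing

-- Split at the first character satisfying the predicate. The split point is
-- a byte offset, so it always falls on a character boundary of the encoding.
break :: forall e. Encoding e
      => (Char -> Bool) -> EString e -> (EString e, EString e)
break p (EString bytes) = go 0 bytes
  where
    go n rest = case decodeNext (Proxy :: Proxy e) rest of
      Just (c, w) | not (p c) -> go (n + w) (drop w rest)
      -- Predicate matched, input exhausted, or undecodable data: stop here.
      _ -> let (a, b) = splitAt n bytes in (EString a, EString b)
```

With this, break (== ':') applied to an EString UTF16 steps two bytes at a time, while the same call on an EString ASCII7 steps one byte at a time; the encoding tag alone selects the decoder.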
This seems like a lot of complexity just for people using broken encodings...
Foundation already provides an Encoding class with an associated type, Unit encoding:
class Encoding encoding where
    -- | the unit element used for the encoding.
    -- i.e. Word8 for ASCII7 or UTF8, Word16 for UTF16...
    --
    type Unit encoding
    ...
What about using an algebraic type to make use of this associated type? Something like:
data EString encoding where
    EString :: Encoding encoding => UArray (Unit encoding) -> EString encoding
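For reference, a self-contained sketch of that idea, under some assumptions: the Encoding class below is a cut-down stand-in that models only the associated Unit type (it is not foundation's actual class), a list stands in for UArray, and lengthUnits is a hypothetical helper added just to show the type in use.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Word (Word16, Word8)

-- Cut-down stand-in for the Encoding class: only the associated Unit type.
class Encoding encoding where
  type Unit encoding

data UTF8
data UTF16

instance Encoding UTF8 where
  type Unit UTF8 = Word8

instance Encoding UTF16 where
  type Unit UTF16 = Word16

-- The element type of the buffer is computed from the encoding tag.
-- A list stands in for UArray (Unit encoding).
data EString encoding where
  EString :: Encoding encoding => [Unit encoding] -> EString encoding

-- Hypothetical helper: number of code units (not characters).
lengthUnits :: EString encoding -> Int
lengthUnits (EString units) = length units
```

Here EString [0x68, 0x69] :: EString UTF8 holds Word8 units, while the same literal at EString UTF16 holds Word16 units, so the storage type follows the tag.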
I don't think Unit encoding is very useful to the outside world; the IO system will very likely interact in terms of UArray Word8.
I think the only real use would be to have typed strings when dealing with the foreign world. For example, being able to say:
foreign :: Ptr (EString UTF16) -> ...
That could be useful in the future. Also, this way you can cheaply tag any buffer that has a specific encoding other than UTF8, without transforming the buffer (and still be able to do some textual operations, since we can make it Sequential). I don't think I'll use this very much overall, but we would have a flexible system where this can be done cheaply API-wise (i.e. by exposing one type).
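A small sketch of what "cheap tagging" could mean in practice, assuming the phantom-typed newtype from the first proposal. The names tagUTF16, rawBytes and typedPtr are hypothetical, and [Word8] again stands in for UArray Word8; the only library calls used are coerce from Data.Coerce and castPtr from Foreign.Ptr.

```haskell
import Data.Coerce (coerce)
import Data.Word (Word8)
import Foreign.Ptr (Ptr, castPtr)

data UTF16

-- Phantom-tagged buffer; [Word8] stands in for UArray Word8.
newtype EString encoding = EString [Word8]

-- Tag an existing buffer. A newtype has no runtime representation of its
-- own, so the bytes are neither copied nor transformed.
tagUTF16 :: [Word8] -> EString UTF16
tagUTF16 = EString

-- Drop the tag again, e.g. to hand the raw bytes to the IO layer.
rawBytes :: EString encoding -> [Word8]
rawBytes = coerce

-- View an untyped foreign buffer pointer as a typed one. Only the pointer's
-- type changes; the address and the memory behind it are untouched.
typedPtr :: Ptr Word8 -> Ptr (EString UTF16)
typedPtr = castPtr
```

The tag exists only at the type level, which is why exposing the single EString type costs so little on the API side.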