Add a `utfbom` encoding that handles UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE

Open tahonermann opened this issue 7 years ago • 0 comments

text_view currently defines utf8bom, utf16bom, and utf32bom encodings, each of which detects a BOM and dispatches to the appropriate non-BOM encoding to consume the remaining input. Each of these covers only one UTF family, though. A single utfbom encoding would be useful for consuming files that contain a BOM and may be formatted as any of UTF-8, UTF-16, or UTF-32.

There is a question of what to do if the input lacks a BOM. The options are to fail or to fall back to an assumed encoding. A policy class could be used to give the programmer control over this; e.g., fail, fall back to UTF-8, etc.
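
As a rough illustration of the idea (not text_view code), the detection and the missing-BOM policy could be sketched as below. All names here (`utf_kind`, `bom_result`, `detect_bom`, `fail_policy`, `assume_policy`, `select_encoding`) are hypothetical and are not part of text_view's API; the sketch works on a raw byte buffer and only reports which encoding to dispatch to and how many BOM bytes to skip.

```cpp
#include <cstddef>
#include <cstring>
#include <optional>
#include <stdexcept>

// Hypothetical names for illustration only; none of this is text_view API.
enum class utf_kind { utf8, utf16be, utf16le, utf32be, utf32le };

struct bom_result {
    utf_kind kind;        // encoding to dispatch to
    std::size_t bom_size; // number of leading BOM bytes to skip
};

// Returns the encoding indicated by a leading BOM, or nullopt if none is found.
inline std::optional<bom_result> detect_bom(const unsigned char* p, std::size_t n) {
    static const unsigned char u32be[] = {0x00, 0x00, 0xFE, 0xFF};
    static const unsigned char u32le[] = {0xFF, 0xFE, 0x00, 0x00};
    static const unsigned char u8[]    = {0xEF, 0xBB, 0xBF};
    static const unsigned char u16be[] = {0xFE, 0xFF};
    static const unsigned char u16le[] = {0xFF, 0xFE};

    auto starts_with = [&](const unsigned char* bom, std::size_t len) {
        return n >= len && std::memcmp(p, bom, len) == 0;
    };

    // The UTF-32LE BOM begins with the UTF-16LE BOM, so test the 4-byte
    // forms first. Note that FF FE 00 00 is inherently ambiguous: it could
    // also be a UTF-16LE BOM followed by U+0000. This sketch prefers UTF-32LE.
    if (starts_with(u32be, 4)) return bom_result{utf_kind::utf32be, 4};
    if (starts_with(u32le, 4)) return bom_result{utf_kind::utf32le, 4};
    if (starts_with(u8, 3))    return bom_result{utf_kind::utf8, 3};
    if (starts_with(u16be, 2)) return bom_result{utf_kind::utf16be, 2};
    if (starts_with(u16le, 2)) return bom_result{utf_kind::utf16le, 2};
    return std::nullopt;
}

// Policy: treat a missing BOM as an error.
struct fail_policy {
    static bom_result no_bom() {
        throw std::runtime_error("input does not start with a BOM");
    }
};

// Policy: assume a fixed encoding when no BOM is present.
template <utf_kind Fallback>
struct assume_policy {
    static bom_result no_bom() { return bom_result{Fallback, 0}; }
};

// Selects the encoding from the BOM, deferring to the policy if there is none.
template <class NoBomPolicy>
bom_result select_encoding(const unsigned char* p, std::size_t n) {
    if (auto r = detect_bom(p, n)) return *r;
    return NoBomPolicy::no_bom();
}
```

With this shape, `select_encoding<fail_policy>(data, size)` throws on BOM-less input, while `select_encoding<assume_policy<utf_kind::utf8>>(data, size)` falls back to UTF-8 with nothing to skip. Whether the policy should be a template parameter of the encoding itself is left open here.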

tahonermann, May 21 '18 13:05