
support rendering of errors if a file is only partially encoded in UTF-8

Open fogti opened this issue 4 years ago • 3 comments

I want to support the use case where a file contains both UTF-8 encoded commands and binary data. If an error occurs, it only matters that the text selected via labels is valid UTF-8; the whole file does not need to be. Since line numbers don't make much sense in that scenario, it would be good to print raw byte positions instead. An optional integration with the bstr crate might be useful (the bstr::BStr struct has a nice Debug implementation). If the text selected via labels contains invalid UTF-8, it should be printed as hex or something similar; at the very least, rendering shouldn't fail.
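To make the request concrete, here is a minimal sketch of the fallback behaviour, using only the standard library. `render_label` and its output format are hypothetical and not part of the crate's API: the labelled bytes are printed as text when they decode as UTF-8 and as hex otherwise, always prefixed with the raw byte span.

```rust
use std::ops::Range;

/// Renders the bytes selected by a label: as text when they are valid UTF-8,
/// otherwise as space-separated hex, so rendering never fails.
/// Hypothetical helper for illustration; not an existing API.
fn render_label(source: &[u8], range: Range<usize>) -> String {
    // Clamp the range so an out-of-bounds label cannot panic.
    let end = range.end.min(source.len());
    let start = range.start.min(end);
    let bytes = &source[start..end];

    match std::str::from_utf8(bytes) {
        // Valid UTF-8: show the text itself next to the byte span.
        Ok(text) => format!("bytes {}..{}: {}", start, end, text),
        // Invalid UTF-8: fall back to a hex dump of the labelled bytes.
        Err(_) => {
            let hex: Vec<String> = bytes.iter().map(|b| format!("{:02x}", b)).collect();
            format!("bytes {}..{}: <hex> {}", start, end, hex.join(" "))
        }
    }
}
```

For the input `b"load \xff\xfe data"`, a label over `0..4` would be shown as text, while a label over `5..7` would fall back to hex.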

fogti avatar Jun 17 '20 18:06 fogti

I wonder how much this is related to #187? 🤔

brendanzab avatar Jun 17 '20 21:06 brendanzab

It might be helpful to give a mock-up of what you want the output to look like as well! Might be easier for me to visualise that way.

brendanzab avatar Jun 17 '20 23:06 brendanzab

I think there are multiple possibilities:

  • (a) only support printing if the snippet contains nothing but UTF-8 (probably easier, but less useful)
  • (b) escape non-UTF-8 bytes (more difficult, because calculating the positions of arrows and such gets harder)
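A rough sketch of what option (b) could involve, assuming invalid bytes are escaped as `\xNN` and that one `char` occupies one column (which ignores wide and combining characters). `escape_with_columns` is a hypothetical helper: it returns the escaped text together with a table mapping each source byte offset to its column in that text, which is the information needed to place arrows.

```rust
/// Escapes invalid UTF-8 bytes as `\xNN` and records, for every source byte
/// offset, the column (counted in chars) where it starts in the escaped text.
/// Hypothetical helper for illustration; not an existing API.
fn escape_with_columns(bytes: &[u8]) -> (String, Vec<usize>) {
    let mut out = String::new();
    let mut columns = Vec::with_capacity(bytes.len() + 1);
    let mut col = 0;
    let mut rest = bytes;

    while !rest.is_empty() {
        // Split `rest` into a valid UTF-8 prefix and a run of invalid bytes.
        let (valid, bad_len) = match std::str::from_utf8(rest) {
            Ok(text) => (text, 0),
            Err(err) => {
                let valid_len = err.valid_up_to();
                // The prefix up to `valid_up_to` is guaranteed to be valid.
                let text = std::str::from_utf8(&rest[..valid_len]).unwrap();
                // `error_len` is `None` for a truncated sequence at the end.
                let bad = err.error_len().unwrap_or(rest.len() - valid_len);
                (text, bad)
            }
        };
        for ch in valid.chars() {
            // Every byte of a multi-byte char maps to that char's column.
            columns.extend(std::iter::repeat(col).take(ch.len_utf8()));
            out.push(ch);
            col += 1;
        }
        for &b in &rest[valid.len()..valid.len() + bad_len] {
            columns.push(col);
            out.push_str(&format!("\\x{:02X}", b));
            col += 4; // `\xNN` occupies four columns
        }
        rest = &rest[valid.len() + bad_len..];
    }
    columns.push(col); // column of end-of-input, for exclusive range ends
    (out, columns)
}
```

With this table, a label over source bytes `start..end` would be underlined from column `columns[start]` to `columns[end]` in the escaped line.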

I currently don't really know what the output should look like, but it should at least provide the following information:

  • file name
  • line number / column if applicable (e.g. when there is no non-UTF-8 data before the labels)
  • raw byte position / span (always, if the given slice contains non-UTF-8 data)
  • error messages
  • the corresponding snippet including arrows/label pointers...
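As an illustration of the location part of this list, the following sketch reports the byte span unconditionally and a line/column pair only when everything before the label decodes as UTF-8. The `Location` type and `locate` function are assumptions made for this example, not existing API.

```rust
use std::ops::Range;

/// Where a label points: the byte span is always available, line and column
/// only when all bytes before the label are valid UTF-8.
/// Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
struct Location {
    bytes: Range<usize>,
    /// 1-based line and 1-based column (counted in chars), if applicable.
    line_col: Option<(usize, usize)>,
}

fn locate(source: &[u8], range: Range<usize>) -> Location {
    let start = range.start.min(source.len());
    // Line/column are only meaningful if the prefix decodes as UTF-8.
    let line_col = std::str::from_utf8(&source[..start]).ok().map(|prefix| {
        let line = prefix.matches('\n').count() + 1;
        // Byte offset just past the last newline before the label.
        let line_start = prefix.rfind('\n').map_or(0, |i| i + 1);
        let column = prefix[line_start..].chars().count() + 1;
        (line, column)
    });
    Location { bytes: range, line_col }
}
```

A renderer could then print `file:line:column` when `line_col` is present and fall back to `file @ bytes start..end` otherwise.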

fogti avatar Jun 18 '20 12:06 fogti