copyright-header
Unable to add copyright to file with UTF-8 characters
I've got a couple of Rust source files (syntax added in #41) that contain the non-ASCII ellipsis character, `…`. copyright-header doesn't seem to like that:
SKIP src/logging.rs; invalid byte sequence in US-ASCII
SKIP src/update.rs; detected existing license
SKIP src/main.rs; invalid byte sequence in US-ASCII
It's not a huge deal for us because it's only two files, but I'm sure others will hit this at some point. I can't quite tell from the error where in the process the file is being opened as US-ASCII.
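For anyone trying to reproduce this outside the tool, here is a minimal sketch of how that message can arise in plain Ruby. It assumes the tool is running where Ruby's default external encoding is US-ASCII (for example with `LANG=C`), which I haven't confirmed is the cause here. The sample line is made up.

```ruby
# The UTF-8 bytes of "…" (E2 80 A6), tagged as US-ASCII. This mimics what
# File.read returns when Ruby's default external encoding is US-ASCII
# (an assumption about the environment, not something the tool confirms).
line = "// loading\u2026".dup.force_encoding(Encoding::US_ASCII)

# The bytes above 0x7F are not valid US-ASCII.
puts line.valid_encoding?

# Regexp matching on such a string raises ArgumentError; capture its message.
message =
  begin
    line =~ /loading/
    nil
  rescue ArgumentError => e
    e.message
  end
puts message

# Re-tagging the same bytes as UTF-8 makes the string valid again.
fixed = line.dup.force_encoding(Encoding::UTF_8)
puts fixed.valid_encoding?
```

If that is what's happening, the read itself succeeds and the failure only shows up at the first string operation that validates the bytes, which would explain why the error doesn't point at the place the file is opened.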
I think the problem is here, where perhaps the syntax file needs to be able to define the expected encoding of the source files.
It may be safe to hardcode UTF-8, since ASCII is a subset of UTF-8 and most, if not all, programming languages accept UTF-8 source files now.
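A rough sketch of what hardcoding UTF-8 on the read side could look like. `read_source_utf8` is a hypothetical helper, not something that exists in copyright-header, and the file name and contents are just sample data.

```ruby
require "tmpdir"

# Hypothetical helper: read a source file as UTF-8 regardless of the
# process locale, instead of relying on Encoding.default_external.
def read_source_utf8(path)
  File.read(path, encoding: Encoding::UTF_8)
end

source = "// loading\u2026\n"
text = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "logging.rs")
  File.binwrite(path, source) # write the raw bytes, no transcoding
  text = read_source_utf8(path)
end

puts text.encoding
puts text.valid_encoding?
```

Pure-ASCII files read this way are unaffected, which is the "ASCII is a subset" argument in practice.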
@colindean thanks for reporting the issue! I think your fix sounds reasonable. Our backlog has grown ever so large. We'll gladly accept any PRs, if you're willing to contribute.
The problem isn't where I linked. That's where the license template is read. However, if the license template itself contains non-ASCII UTF-8 characters, reading it might fail in the same way.
I think it's here, where the source file itself appears to be read.
Problems could also occur when writing here.
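To cover both the read and the write, something along these lines might work. It's a sketch only: `prepend_header` is a hypothetical name, and the real code paths in copyright-header may be structured differently. It uses the `"r:UTF-8"` / `"w:UTF-8"` mode strings so neither side depends on the locale.

```ruby
require "tmpdir"

# Hypothetical helper: prepend a license header to a source file, reading
# and writing with an explicit UTF-8 external encoding.
def prepend_header(path, header)
  body = File.open(path, "r:UTF-8") { |f| f.read }
  File.open(path, "w:UTF-8") { |f| f.write(header + body) }
end

header = "// Copyright (c) Example Corp\n" # sample header text
source = "fn main() { println!(\"loading\u2026\"); }\n"
written = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "main.rs")
  File.binwrite(path, source)
  prepend_header(path, header)
  written = File.binread(path) # raw bytes as they ended up on disk
end

# The bytes on disk should be the header followed by the untouched source.
puts written == (header + source).b
```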
I think the first step here is to write some tests around these areas…
(yeah, I'm gonna work toward this at some point unless someone else gets to it first)
https://ruby-doc.org/core-2.5.0/IO.html#method-c-new