copyright-header

Unable to add copyright to file with UTF-8 characters

Open colindean opened this issue 7 years ago • 5 comments

I've got a couple of Rust source files (added syntax in #41) that contain the non-ASCII ellipsis character (…). copyright-header doesn't seem to like that:

SKIP src/logging.rs; invalid byte sequence in US-ASCII
SKIP src/update.rs; detected existing license
SKIP src/main.rs; invalid byte sequence in US-ASCII

It's not a huge deal for us because it's only two files, but I'm sure others will hit this at some point. I can't quite tell from the error where in the process the file is being opened as US-ASCII.
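
For anyone trying to reproduce this outside the tool, here is a minimal sketch of how the message can arise in plain Ruby. It assumes the file contents end up tagged as US-ASCII (which is what Ruby does by default when the locale is C/POSIX) and that the tool then runs a regex over them. The `/Copyright/` pattern is only a stand-in, not the tool's actual check.

```ruby
# Simulate the result of reading a UTF-8 file while the default external
# encoding is US-ASCII: the bytes are UTF-8, but the string is tagged US-ASCII.
line = "// loading\u2026\n".dup.force_encoding(Encoding::US_ASCII)

# The three bytes of the ellipsis are not valid US-ASCII.
puts line.valid_encoding?

# Regex matching against a string with a broken encoding raises ArgumentError.
error_message =
  begin
    line =~ /Copyright/ # stand-in pattern, an assumption for illustration
    nil
  rescue ArgumentError => e
    e.message
  end

puts error_message
```

If that is what is happening, running the tool with a UTF-8 locale set in the environment might work around it, though I haven't verified that against copyright-header itself.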

colindean avatar Apr 23 '18 20:04 colindean

I think the problem is here, where perhaps the syntax file needs to be able to define the expected encoding of the source files.

It may be safe to hardcode UTF-8, since ASCII is a subset of UTF-8 and most, if not all, programming languages now expect UTF-8 source files.
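
A rough sketch of what hardcoding could look like on the read side, using the `encoding:` option of `File.read`. The helper name `read_source_utf8` and the temp file are made up for illustration. This is not the tool's current code.

```ruby
require "tempfile"

# Hypothetical helper, not part of copyright-header: always decode source
# files as UTF-8 instead of relying on the locale-derived default.
def read_source_utf8(path)
  File.read(path, encoding: "UTF-8")
end

# Write a small sample file containing the ellipsis character as raw bytes.
sample = Tempfile.new(["logging", ".rs"])
sample.binmode
sample.write("// loading\u2026\n".b)
sample.close

contents = read_source_utf8(sample.path)
puts contents.encoding
puts contents.valid_encoding?

sample.unlink
```

Pure-ASCII files read the same way as before, so this should only change behavior for files that currently get skipped.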

colindean avatar Apr 23 '18 20:04 colindean

@colindean thanks for reporting the issue! I think your fix sounds reasonable. Our backlog has grown ever so large. We'll gladly accept any PRs, if you're willing to contribute.

osterman avatar Apr 27 '18 21:04 osterman

The problem isn't where I linked. That's where the license template is read. However, if the license template contains UTF-8 characters, it might still mess up.

I think it's here, where the source file itself appears to be read.

Problems could also occur when writing here.

I think the first step here is to write some tests around these areas…
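
As a starting point, a test could look roughly like the sketch below. It forces the default external encoding to US-ASCII to mimic a C locale, prepends a header to a file containing an ellipsis, and checks the bytes on disk. `add_header` is a hypothetical stand-in for the tool's read and write steps, not its real API, and the header text is invented.

```ruby
require "tmpdir"

# Hypothetical stand-in for the tool's "prepend a header" step, with the
# encoding given explicitly on both the read and the write.
def add_header(path, header)
  body = File.read(path, encoding: "UTF-8")
  File.write(path, header + body, encoding: "UTF-8")
end

def round_trip_ok?
  previous = Encoding.default_external
  Encoding.default_external = Encoding::US_ASCII # mimic a C/POSIX locale
  Dir.mktmpdir do |dir|
    path = File.join(dir, "main.rs")
    original = "// working\u2026\nfn main() {}\n"
    header = "// Copyright (c) Example\n" # invented header text
    File.binwrite(path, original)
    add_header(path, header)
    # Compare raw bytes so the check does not depend on any default encoding.
    File.binread(path) == (header + original).b
  end
ensure
  Encoding.default_external = previous
end

puts round_trip_ok?
```

The same shape would work inside whatever test framework the project already uses.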

colindean avatar May 02 '18 04:05 colindean

(yeah, I'm gonna work toward this at some point unless someone else gets to it first)

colindean avatar May 02 '18 04:05 colindean

https://ruby-doc.org/core-2.5.0/IO.html#method-c-new

colindean avatar May 02 '18 04:05 colindean