.textlintrc should be encoded as UTF-8

Open azu opened this issue 8 years ago • 19 comments

Currently, textlint throws an error when the .textlintrc file is encoded as Shift-JIS. We should check that .textlintrc is UTF-8.

Proposal

If .textlintrc is not encoded as UTF-8, show an error message in the console.

Modules

  • utf-8-validate
    • It is a native module
    • Previously, it was introduced in #397, but it was reverted in #509
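
For comparison with a native module, a dependency-free check can be sketched with the TextDecoder built into Node.js, whose `fatal` option makes decoding throw on malformed input. This is only an illustration and not something proposed in this thread: `isValidUtf8` is a hypothetical helper name, and the `fatal` option assumes a Node.js build with ICU enabled.

```javascript
// Hypothetical helper (not part of textlint): true when `bytes` is well-formed UTF-8.
function isValidUtf8(bytes) {
  try {
    // With `fatal: true`, decode() throws instead of inserting U+FFFD.
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (error) {
    return false;
  }
}

// "漢字" encoded as UTF-8.
const utf8Bytes = Buffer.from('漢字', 'utf8');
// "漢字" encoded as Shift-JIS (0x8ABF 0x8E9A), written out as raw bytes.
const sjisBytes = Buffer.from([0x8a, 0xbf, 0x8e, 0x9a]);

const utf8Result = isValidUtf8(utf8Bytes);
const sjisResult = isValidUtf8(sjisBytes);
console.log(utf8Result, sjisResult);
```

The trade-off in this sketch is that it decodes the whole buffer just to validate it, which should be acceptable for a small config file.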

Edit: Re-opened by #509

azu avatar May 25 '17 08:05 azu

Alternative: https://github.com/hcodes/isutf8

azu avatar Oct 12 '17 10:10 azu

https://github.com/shinnn/is-file-utf8

azu avatar Oct 14 '17 17:10 azu

I think utf-8-validate would be better. I tried it on Node.js v4.8.7, and it works as expected.

> node -e "console.log(require('utf-8-validate')(require('fs').readFileSync('utf8.txt')))"
true
> node -e "console.log(require('utf-8-validate')(require('fs').readFileSync('sjis.txt')))"
false

Leko avatar Dec 15 '17 14:12 Leko

Currently, textlint throws an error when the .textlintrc file is encoded as Shift-JIS.

@azu How do I reproduce it?

I tried to reproduce it on my blog (https://github.com/Leko/WEB-EGG).

  1. Clone this repo
  2. Run npm i
  3. Convert .textlintrc to Shift-JIS
  4. Run npm run lint:md

It does not throw an error, but max-kanji-continuous-len.allow does not work. If .textlintrc is encoded as UTF-8, it works fine.

My .textlintrc is here: https://github.com/Leko/WEB-EGG/blob/master/.textlintrc#L17
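
One plausible reading of this result (an assumption, not confirmed in this thread): the ASCII structure of the JSON survives being read as UTF-8, so parsing succeeds, while the Shift-JIS bytes of Japanese values turn into U+FFFD replacement characters that no longer match the intended text. A minimal sketch with a made-up config, not the actual .textlintrc linked above:

```javascript
// Shift-JIS bytes for the string "漢字" (0x8ABF 0x8E9A).
const sjisKanji = Buffer.from([0x8a, 0xbf, 0x8e, 0x9a]);

// A made-up .textlintrc-like JSON file saved as Shift-JIS. The ASCII parts are
// byte-identical to UTF-8; only the Japanese value differs.
const fileBytes = Buffer.concat([
  Buffer.from('{"allow":["', 'ascii'),
  sjisKanji,
  Buffer.from('"]}', 'ascii'),
]);

// Decoding as UTF-8 does not throw; invalid bytes become U+FFFD instead.
const config = JSON.parse(fileBytes.toString('utf8'));

const parsedWithoutError = Array.isArray(config.allow);
const allowMatches = config.allow[0] === '漢字';
const hasReplacementChar = config.allow[0].includes('\uFFFD');

console.log({ parsedWithoutError, allowMatches, hasReplacementChar });
```

In this sketch the option value is silently corrupted rather than rejected, which would be consistent with an `allow` list that loads without error but never matches.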

Leko avatar Dec 15 '17 15:12 Leko

This issue is based on a bug report on Twitter.

  • https://twitter.com/okinaka3/status/867298831263715328
  • https://twitter.com/azu_re/status/867662138831130624

textlint used rc-loader in the past, but I've replaced it with rc-config-loader #39 https://github.com/textlint/textlint/pull/262

Maybe the thrown error was caused by rc-loader?

azu avatar Dec 15 '17 15:12 azu

Hmm... It doesn't throw an error even in [email protected].

screen shot 2017-12-16 at 0 37 09

Can we close this issue?

Leko avatar Dec 15 '17 15:12 Leko

https://github.com/textlint-ja/textlint-rule-preset-JTF-style/blob/master/src/index.js The JTF preset uses multibyte strings in its keys. Is that the reason?

It does not throw an error, but max-kanji-continuous-len.allow does not work.

It would be better to assert this. Failing silently confuses users, because it is not the behavior they expect.

azu avatar Dec 15 '17 15:12 azu

I love loud errors rather than silent errors.

azu avatar Dec 15 '17 15:12 azu

max-kanji-continuous-len.allow does not work.

In this case, if we assert that .textlintrc is encoded as UTF-8, Markdown files should be UTF-8 too. textlint can support only text files encoded as UTF-8. Is that OK?

BTW, rc-loader throws an error with https://github.com/textlint-ja/textlint-rule-preset-JTF-style/blob/master/src/index.js .

2017/12/16 1:16: I tested with this JSON. src/index.js is JavaScript, so I converted it to JSON. https://gist.github.com/Leko/c937dbc2d9cdf451fc3595531e99b2da

screen shot 2017-12-16 at 0 48 25

Leko avatar Dec 15 '17 16:12 Leko

I love loud errors rather than silent errors.

I think so too.

Leko avatar Dec 15 '17 16:12 Leko

textlint can support only text files encoded as UTF-8.

Yes. It looks good to implement the assertion in config-loader.

https://github.com/textlint/textlint/blob/41cae972db08731b797592b24dcd845c0138bec4/packages/textlint/src/config/config-loader.ts#L43

azu avatar Dec 15 '17 16:12 azu

I got it. I'll try to fix it :)

Leko avatar Dec 16 '17 01:12 Leko

micnic/uv (Ultrafast UTF-8 data validation) looks good. It requires Node.js 6+.

But we can drop Node.js 4 support in the near future: #443

azu avatar May 20 '18 06:05 azu

The next Node.js release will introduce a UTF-8 checker: https://github.com/nodejs/node/pull/45947

azu avatar Dec 29 '22 03:12 azu

buffer.isUtf8(input) requires Node.js v18.14.0.

textlint 14 treats Node.js 18.14 as the minimum version.

  • #1200
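
With `buffer.isUtf8` available, the assertion discussed above could be sketched roughly as follows. This is a minimal illustration, not textlint's actual config-loader code: `readConfigAsUtf8` and the error message wording are hypothetical, and the demo writes throwaway files to a temporary directory.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { isUtf8 } = require('node:buffer'); // available since Node.js v18.14.0

// Hypothetical helper: read a config file and fail loudly if it is not UTF-8.
function readConfigAsUtf8(filePath) {
  const bytes = fs.readFileSync(filePath);
  if (!isUtf8(bytes)) {
    throw new Error(`${filePath} should be encoded as UTF-8`);
  }
  return bytes.toString('utf8');
}

// Demo: one valid UTF-8 file and one file holding Shift-JIS bytes for "漢字".
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'textlintrc-demo-'));
const utf8File = path.join(dir, 'utf8.textlintrc');
const sjisFile = path.join(dir, 'sjis.textlintrc');
fs.writeFileSync(utf8File, '{"allow":["漢字"]}', 'utf8');
fs.writeFileSync(sjisFile, Buffer.from([0x8a, 0xbf, 0x8e, 0x9a]));

const utf8Content = readConfigAsUtf8(utf8File);

let sjisErrorMessage = null;
try {
  readConfigAsUtf8(sjisFile);
} catch (error) {
  sjisErrorMessage = error.message;
}
console.log(utf8Content, sjisErrorMessage);
```

Checking the raw bytes before decoding is what makes the failure loud: once the buffer has been turned into a string, invalid bytes have already been replaced and the problem is harder to detect.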

azu avatar Jan 30 '24 15:01 azu