
Add charset support

Open treyhunner opened this issue 10 years ago • 2 comments

I made this Node module to detect file character sets: https://www.npmjs.com/package/detect-charset

It's imperfect: every file without a byte order mark (BOM) is assumed to be latin1/utf-8. I think the only general way to improve on that would be to return "unknown" for utf-8 and other non-BOM files that contain non-ASCII characters.
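For illustration, here is a minimal sketch of the BOM-based approach described above, including the suggested "unknown" fallback. It is not the detect-charset API: the names `detectBomCharset` and `guessCharset` are hypothetical, and the BOM table is an assumption covering only the standard UTF-8, UTF-16, and UTF-32 signatures.

```javascript
// Hypothetical sketch, not the detect-charset module's actual code.
// Longer signatures come first because the UTF-32LE BOM begins with
// the same two bytes as the UTF-16LE BOM.
const BOMS = [
  { charset: 'utf-32le', bytes: [0xff, 0xfe, 0x00, 0x00] },
  { charset: 'utf-32be', bytes: [0x00, 0x00, 0xfe, 0xff] },
  { charset: 'utf-8', bytes: [0xef, 0xbb, 0xbf] },
  { charset: 'utf-16le', bytes: [0xff, 0xfe] },
  { charset: 'utf-16be', bytes: [0xfe, 0xff] },
];

// Returns the charset named by a leading BOM, or null if there is none.
function detectBomCharset(buffer) {
  for (const { charset, bytes } of BOMS) {
    const matches =
      buffer.length >= bytes.length &&
      bytes.every((byte, i) => buffer[i] === byte);
    if (matches) return charset;
  }
  return null;
}

// Falls back to 'unknown' when a BOM-less file contains non-ASCII bytes.
// ASCII-only content decodes identically as latin1 or utf-8, so 'utf-8'
// is a safe answer in that case.
function guessCharset(buffer) {
  const fromBom = detectBomCharset(buffer);
  if (fromBom !== null) return fromBom;
  const hasNonAscii = buffer.some((byte) => byte > 0x7f);
  return hasNonAscii ? 'unknown' : 'utf-8';
}
```

A caller could read a file with `fs.readFileSync(path)` and pass the resulting Buffer to `guessCharset`. Note that a UTF-16LE file whose first character is NUL would be reported as utf-32le here, which is a known ambiguity of BOM sniffing.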

Let me know if the module requires any improvements/changes to work with brave-mouse. Pull requests welcome.

treyhunner, Feb 01 '15 06:02

I have experimented a bit with ICU’s charset detector (using node-icu-charset-detector), which seems to be the most accurate and most battle-tested charset detector out there. However, I’d like to avoid requiring users to brew install icu4c. I’m currently trying to compile ICU using node-gyp, which would probably be my preferred solution.

I have marked the test cases that fail using detect-charset but should work fine using ICU’s detector, if you’d like to take a look at them.

sonicdoe, Feb 01 '15 16:02

In case anyone’s interested in this, I have released detect-character-encoding, which compiles ICU using node-gyp.

sonicdoe, Mar 15 '15 11:03