change-case

Unicode support

Open stepanselyuk opened this issue 5 years ago • 8 comments

Hello, how can I add Unicode support to the regexps? I tried passing modified options:

{
    splitRegexp: [
        /([\p{Ll}\p{Lo}\p{N}])([\p{Lu}\p{Lt}])/gu, // one lower case char or digit followed by one upper case char
        /([\p{Lu}\p{Lt}])([\p{Lu}\p{Lt}][\p{Ll}\p{Lo}])/gu // one upper case char followed by one upper case char and then by one lower case char
    ],
    stripRegexp: /\p{C}+/giu, // Unicode "Other" category (basically I don't want to remove anything)
    delimiter: ' '
}

But some transformations only partially work, and some do not work at all (e.g. CONSTANT_CASE does the case conversion but does not replace spaces with underscores). What is the best way to add Unicode support to the 4.x version?

stepanselyuk avatar Jul 06 '20 17:07 stepanselyuk

My bad, I need to strip at least the space characters. This works:

const options = {
    splitRegexp: [
        /(\p{Ll}|\p{Lo}|\p{N})(\p{Lu}|\p{Lt})/gu, // one lower case char or digit followed by one upper case char
        /(\p{Lu}|\p{Lt})((?:\p{Lu}|\p{Lt})(?:\p{Ll}|\p{Lo}))/gu // one upper case char followed by one upper case char and then by one lower case char
    ],
    //stripRegexp: /[^\p{L}\p{N}]+/gui, // remove all symbols which are not in \p{L} and \p{N} classes
    stripRegexp: /\p{Zs}+/gui, // remove space-symbols
};
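
To see why stripping the space characters matters, here is a minimal sketch of a split-then-strip pipeline. `toConstantCase` is a hypothetical helper written for this illustration; it is an assumption about how such options could be applied, not the library's actual code. It needs an engine that supports Unicode property escapes (`\p{...}` with the `u` flag).

```javascript
// Hypothetical helper, for illustration only: it mimics a split/strip
// pipeline and is NOT the change-case implementation.
const options = {
  splitRegexp: [
    /(\p{Ll}|\p{Lo}|\p{N})(\p{Lu}|\p{Lt})/gu,
    /(\p{Lu}|\p{Lt})((?:\p{Lu}|\p{Lt})(?:\p{Ll}|\p{Lo}))/gu,
  ],
  stripRegexp: /\p{Zs}+/gu, // space separators become word boundaries
};

function toConstantCase(input, { splitRegexp, stripRegexp }) {
  let marked = input;
  // Mark case-change boundaries with a NUL character between the groups.
  for (const re of splitRegexp) {
    marked = marked.replace(re, "$1\0$2");
  }
  // Every stripped run becomes a boundary as well.
  marked = marked.replace(stripRegexp, "\0");
  return marked
    .split("\0")
    .filter((word) => word.length > 0)
    .map((word) => word.toUpperCase())
    .join("_");
}

// With \p{Zs} stripped, the space turns into a boundary:
const fixed = toConstantCase("приветМир привет", options);
// fixed === "ПРИВЕТ_МИР_ПРИВЕТ"

// With \p{C} (the first attempt), the space is never matched, so it survives:
const firstAttempt = { ...options, stripRegexp: /\p{C}+/gu };
const broken = toConstantCase("приветМир привет", firstAttempt);
// broken === "ПРИВЕТ_МИР ПРИВЕТ"
```

The space character belongs to the `Zs` (space separator) category, not to `C` (other), which is why the first set of options changed the case but left the space in place.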

stepanselyuk avatar Jul 06 '20 18:07 stepanselyuk

According to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode, browsers have had full Unicode support for a while. So maybe, @blakeembrey, you could add a section to the docs.

stepanselyuk avatar Jul 06 '20 20:07 stepanselyuk

Nice catch, thanks! I can make a new major release with unicode support and make a note of this in the README for browser support.

blakeembrey avatar Jul 14 '20 02:07 blakeembrey

By the way, you probably want to look at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#Browser_compatibility instead. It shows that \p (Unicode property escapes) has always had much worse support and only recently got added to node.js.
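
Because support for `\p` varies between engines, code that must also run on older runtimes can check for it at run time. The following is a minimal sketch, not something taken from the library: `supportsUnicodePropertyEscapes` is a hypothetical helper name, and the ASCII-only fallback pattern is an assumption chosen for illustration.

```javascript
// Hypothetical helper: returns true when the engine accepts \p{...}.
function supportsUnicodePropertyEscapes() {
  try {
    // Built from a string so that engines without support throw here
    // at run time instead of failing to parse the whole file.
    return new RegExp("\\p{Lu}", "u").test("Ä");
  } catch (err) {
    return false;
  }
}

// Pick a Unicode-aware pattern when possible, else an ASCII-only one.
const splitRegexp = supportsUnicodePropertyEscapes()
  ? new RegExp("([\\p{Ll}\\p{N}])(\\p{Lu})", "gu")
  : /([a-z0-9])([A-Z])/g;

const spaced = "fooBar".replace(splitRegexp, "$1 $2");
// spaced === "foo Bar" with either pattern
```

A regex literal such as `/\p{Lu}/u` would be a syntax error on an engine without support, so the check has to go through the `RegExp` constructor.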

blakeembrey avatar Jul 14 '20 02:07 blakeembrey

@blakeembrey thanks, so can I close this issue?

stepanselyuk avatar Jul 16 '20 10:07 stepanselyuk

Feel free to leave it open until I release a new version 😄 Or submit a PR with the changes!

blakeembrey avatar Oct 12 '20 04:10 blakeembrey

@blakeembrey do you mean that I should add these details to the README, or that I should change the defaults in packages/no-case/src/index.ts:11? I can create a PR for sure, I just need to understand how you see it.

stepanselyuk avatar Oct 12 '20 09:10 stepanselyuk

Let's change the defaults! It used to be that way using XRegExp but that increased the bundle size unreasonably, so using the native solution for the next major version seems reasonable.

blakeembrey avatar Oct 26 '20 00:10 blakeembrey

This has been fixed with [email protected]. Please update and let me know if you still see any issues.

blakeembrey avatar Sep 30 '23 02:09 blakeembrey