it-tools fix(text-to-unicode): handle non-BMP + more conversion options


Fixes https://github.com/CorentinTh/it-tools/issues/1081
Fixes https://github.com/CorentinTh/it-tools/issues/1175
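For readers unfamiliar with the non-BMP problem named in the title, here is a minimal sketch of the general idea, not the code from this PR. The function names are hypothetical. It assumes a `U+XXXX` output notation and contrasts iterating by code point with iterating by UTF-16 code unit, which splits characters above U+FFFF into two surrogates.

```typescript
// Illustrative helpers only; the names are hypothetical and not taken from this PR.

/** Format each code point of `text` as U+XXXX (4 to 6 hex digits). */
function textToCodePoints(text: string): string[] {
  // Array.from iterates a string by code point, so a surrogate pair stays one element.
  return Array.from(text, (ch) => {
    const codePoint = ch.codePointAt(0) as number;
    return 'U+' + codePoint.toString(16).toUpperCase().padStart(4, '0');
  });
}

/** Parse U+XXXX tokens back into text. Values above U+10FFFF make fromCodePoint throw. */
function codePointsToText(notation: string): string {
  const tokens = notation.match(/U\+[0-9A-Fa-f]{4,6}/g) || [];
  return tokens
    .map((token) => String.fromCodePoint(parseInt(token.slice(2), 16)))
    .join('');
}

/** For contrast: UTF-16 code units, which split non-BMP characters in two. */
function textToCodeUnits(text: string): string[] {
  return text.split('').map((unit) =>
    'U+' + unit.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0'));
}
```

With these helpers, a character such as U+1F600 comes out as a single `U+1F600` token from `textToCodePoints`, but as the surrogate pair `U+D83D U+DE00` from `textToCodeUnits`.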

May 14 '24 13:05 lionel-rowe

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated (UTC)
it-tools	✅ Ready (Inspect)	Visit Preview	Aug 9, 2024 0:23am

May 14 '24 13:05 vercel[bot]

Quality Gate passed

Issues
7 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

May 15 '24 03:05 sonarqubecloud[bot]

Hi @lionel-rowe, great job! It could be interesting to add a transcript including "character names" (e.g., using https://www.npmjs.com/package/@unicode/unicode-15.1.0). This would fix https://github.com/CorentinTh/it-tools/issues/544

May 15 '24 16:05 sharevb

could be interesting to add a transcript including "character names" (ie, using https://www.npmjs.com/package/@unicode/unicode-15.1.0) ? This would fix #544

I have a standalone tool I currently use that provides similar functionality. The JSON file mapping chars/ranges to their names is around 2MB, which doesn't seem like a reasonable amount to pull in unconditionally here. An async solution loading only the relevant Unicode blocks with dynamic import() (perhaps with ASCII range loaded synchronously by default) could work, but it'd take a little work to make async-friendly.
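A rough sketch of the async approach described above, under stated assumptions: the class and function names are hypothetical, the 0x80 block size is chosen only so that the first block coincides with ASCII, and the loader is injected so that a real app could back it with a dynamic `import()` of per-block data files (file layout not specified here). The ASCII table is truncated to two entries for illustration.

```typescript
// Hypothetical design sketch; none of these names come from it-tools or node-unicode.
type NameTable = Map<number, string>;
type BlockLoader = (blockStart: number) => Promise<NameTable>;

// Assumed chunk size: 0x80 makes block 0 exactly the ASCII range.
const BLOCK_SIZE = 0x80;

// Bundled synchronously; truncated to two entries for the example.
const ASCII_NAMES: NameTable = new Map<number, string>([
  [0x41, 'LATIN CAPITAL LETTER A'],
  [0x61, 'LATIN SMALL LETTER A'],
]);

/** First code point of the block that contains `codePoint`. */
function blockStartOf(codePoint: number): number {
  return codePoint - (codePoint % BLOCK_SIZE);
}

class LazyNameLookup {
  // Caches the promise, so concurrent lookups in one block share a single load.
  private readonly cache = new Map<number, Promise<NameTable>>();
  private readonly loadBlock: BlockLoader;

  constructor(loadBlock: BlockLoader) {
    this.loadBlock = loadBlock;
  }

  /** Synchronous path, available for ASCII only. */
  asciiNameOf(codePoint: number): string | undefined {
    return ASCII_NAMES.get(codePoint);
  }

  /** Async path: loads the containing block on first use, then reuses it. */
  async nameOf(codePoint: number): Promise<string | undefined> {
    const start = blockStartOf(codePoint);
    if (start === 0) return ASCII_NAMES.get(codePoint);
    let pending = this.cache.get(start);
    if (!pending) {
      pending = this.loadBlock(start);
      this.cache.set(start, pending);
    }
    return (await pending).get(codePoint);
  }
}
```

Injecting the loader keeps the lookup testable without any data files. The extra work mentioned above would mostly be in making the calling UI code await `nameOf`.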

Edit: Actually, it looks like the RLE+gzip+base64-encoded version of the name data in the node-unicode package you linked to is "only" ~194 KB, which is much more reasonable, especially if only loaded conditionally. It could be reduced by a further ~25% if loaded as a raw binary file instead of base64.

But I'm not convinced it's within the scope of this tool: it's more of a Unicode "explainer" than a Unicode "converter". Output should probably be tabular, maybe with a link to an external site like compart (as in my standalone tool). The reverse direction (if implemented) obviously wouldn't take that tabular data as input; it'd be a search function, preferably with fuzzy matching. In any case, it's definitely not a simple bidirectional converter like the text-to-unicode tool.
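On the ~25% figure: base64 encodes every 3 raw bytes as 4 characters, so the raw payload is 3/4 the size of its base64 form. A small sketch of that arithmetic follows, with hypothetical helper names and with the ~194 KB figure taken as 194,000 bytes purely for illustration.

```typescript
/** Length of the padded base64 text that encodes `rawBytes` bytes. */
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

/** Approximate raw size, in bytes, held by `base64Chars` base64 characters (ignores padding). */
function rawLengthFromBase64(base64Chars: number): number {
  return Math.floor((base64Chars * 3) / 4);
}

// Assumption: "~194 KB" is treated as 194,000 bytes for the sake of the example.
const base64Size = 194000;
const rawSize = rawLengthFromBase64(base64Size);
const saving = 1 - rawSize / base64Size;

console.log(`raw: ${rawSize} bytes, saving: ${(saving * 100).toFixed(1)}%`);
```

This only covers the transfer encoding; the RLE and gzip layers are unaffected either way.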

May 17 '24 00:05 lionel-rowe

Hi @lionel-rowe, yes, you're right: this is not bidirectional, and the reverse would be a lookup.

May 19 '24 16:05 sharevb

Hi @lionel-rowe, I've implemented text-to-unicode-names in #1183.

Jul 07 '24 12:07 sharevb

Quality Gate passed

Issues
7 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

Aug 09 '24 00:08 sonarqubecloud[bot]
