fix(text-to-unicode): handle non-BMP + more conversion options

Open lionel-rowe opened this issue 1 year ago • 7 comments
Fixes https://github.com/CorentinTh/it-tools/issues/1081
Fixes https://github.com/CorentinTh/it-tools/issues/1175
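For readers unfamiliar with the non-BMP problem named in the title: JavaScript strings are sequences of UTF-16 code units, so a character above U+FFFF occupies two units (a surrogate pair) and breaks any converter that works with `charCodeAt`. The following is a minimal sketch of a code-point-aware round trip, not the code from this PR. The function names `textToCodePoints` and `codePointsToText` and the `U+XXXX` output format are assumptions made for illustration.

```typescript
// Hypothetical helpers for illustration; names and output format are assumptions.

// Convert text to "U+XXXX" notation, one entry per code point.
function textToCodePoints(text: string): string[] {
  // Array.from uses the string iterator, which yields whole code points
  // (a surrogate pair is returned as a single two-unit string).
  return Array.from(text).map((ch) => {
    const cp = ch.codePointAt(0) as number;
    return 'U+' + cp.toString(16).toUpperCase().padStart(4, '0');
  });
}

// Convert whitespace-separated "U+XXXX" tokens back to text.
function codePointsToText(notation: string): string {
  return notation
    .split(/\s+/)
    .filter((tok) => tok.length > 0)
    .map((tok) => String.fromCodePoint(parseInt(tok.replace(/^U\+/i, ''), 16)))
    .join('');
}
```

With this approach, `'a\u{1F600}'` converts to two entries (`U+0061` and `U+1F600`) rather than three, and the reverse direction rebuilds the surrogate pair via `String.fromCodePoint`.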

lionel-rowe avatar May 14 '24 13:05 lionel-rowe

Vercel deployment: it-tools, ✅ Ready, updated Aug 9, 2024 0:23am (UTC)

vercel[bot] avatar May 14 '24 13:05 vercel[bot]

Hi @lionel-rowe, great job. Could it be interesting to add a transcript including "character names" (i.e., using https://www.npmjs.com/package/@unicode/unicode-15.1.0)? This would fix https://github.com/CorentinTh/it-tools/issues/544

sharevb avatar May 15 '24 16:05 sharevb

> could be interesting to add a transcript including "character names" (ie, using https://www.npmjs.com/package/@unicode/unicode-15.1.0) ? This would fix #544

I have a standalone tool I currently use that provides similar functionality. The JSON file mapping chars/ranges to their names is around 2MB, which doesn't seem like a reasonable amount to pull in unconditionally here. An async solution loading only the relevant Unicode blocks with dynamic import() (perhaps with ASCII range loaded synchronously by default) could work, but it'd take a little work to make async-friendly.
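A rough sketch of what that async approach could look like: ASCII names are available synchronously, and every other block is fetched on first use and cached. Everything here is hypothetical. `createNameLookup`, the `BlockLoader` type, and the fixed 256-code-point block size are assumptions, and in a real build `loadBlock` would presumably wrap a dynamic `import()` of a per-block data file.

```typescript
// Hypothetical sketch; the names, types, and block size are assumptions.
type NameTable = Map<number, string>;
type BlockLoader = (blockStart: number) => Promise<NameTable>;

// Tiny sample of the ASCII range, bundled synchronously.
const asciiNames: NameTable = new Map<number, string>([
  [0x41, 'LATIN CAPITAL LETTER A'],
  [0x61, 'LATIN SMALL LETTER A'],
]);

function createNameLookup(loadBlock: BlockLoader, blockSize: number = 0x100) {
  // Cache the promise (not the result) so concurrent lookups share one load.
  const cache = new Map<number, Promise<NameTable>>();

  return async function nameOf(codePoint: number): Promise<string | undefined> {
    // ASCII never triggers a load.
    if (codePoint < 0x80) return asciiNames.get(codePoint);

    const blockStart = Math.floor(codePoint / blockSize) * blockSize;
    let pending = cache.get(blockStart);
    if (pending === undefined) {
      pending = loadBlock(blockStart);
      cache.set(blockStart, pending);
    }
    const table = await pending;
    return table.get(codePoint);
  };
}
```

Caching the promise means two lookups in the same block issue only one load, even if the second lookup starts before the first has resolved.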

Edit: Actually it looks like the RLE+gzip+base64-encoded version of the name data in the node-unicode package you linked to is "only" ~194 KB, which is much more reasonable, especially if only loaded conditionally. And it could be reduced by a further ~25% if loaded as a raw binary file instead of base64. But I'm not convinced it's within the scope of this tool; it's more of a Unicode "explainer" than a Unicode "converter". Output should probably be tabular, maybe with a link to an external site like compart (like in my standalone tool). And the reverse direction (if implemented) obviously wouldn't take that tabular data as input; it'd be a search function, preferably with fuzzy matching. In any case, it's definitely not a simple bidirectional converter like the text-to-unicode tool.
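The ~25% figure follows from how base64 works: every 3 raw bytes become 4 ASCII characters, so the raw payload is 3/4 the size of its base64 form. A quick check of the arithmetic, taking the ~194 KB figure above as the base64 size (the helper names are made up for this example, and any further transport-level compression is ignored):

```typescript
// Base64 encodes each group of 3 raw bytes as 4 characters (padding included).
function base64Length(rawBytes: number): number {
  return Math.ceil(rawBytes / 3) * 4;
}

// Inverse estimate: raw byte count for a given base64 length, ignoring padding.
function rawLengthFromBase64(base64Chars: number): number {
  return Math.floor(base64Chars / 4) * 3;
}

const base64Size = 194 * 1024; // ~194 KB, the figure quoted above
const rawSize = rawLengthFromBase64(base64Size); // 148992 bytes
const saving = 1 - rawSize / base64Size; // fraction saved by shipping raw binary
```

For 194 KB of base64 this gives about 145.5 KB of raw binary, a saving of one quarter.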

lionel-rowe avatar May 17 '24 00:05 lionel-rowe

Hi @lionel-rowe, yes, you're right: this is not bidirectional, and the reverse direction will be a lookup.

sharevb avatar May 19 '24 16:05 sharevb

Hi @lionel-rowe, I implemented text-to-unicode-names in #1183.

sharevb avatar Jul 07 '24 12:07 sharevb