Add length validators based on the `Intl.Segmenter` API

Open remonke opened this issue 1 year ago • 2 comments

The minLength and maxLength actions use String.prototype.length, which counts UTF-16 code units rather than the characters a human perceives (grapheme clusters). That makes it unreliable for user-visible length checks, especially with emojis. For example, most emojis have a length of 2 (like 🙃), but some have a length of 7 (like 🧑🏻‍💻).
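To make the difference concrete, here is a minimal sketch comparing `String.prototype.length` with a grapheme count from `Intl.Segmenter`. The helper name `countGraphemes` is made up for illustration, and the emojis are written as code point escapes so the exact sequences are unambiguous. It assumes a runtime and TypeScript lib setting (ES2022 or later) that include `Intl.Segmenter`.

```typescript
// Illustrative helper, not part of valibot.
// Assumes Intl.Segmenter is available (modern browsers, recent Node.js).
const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });

function countGraphemes(input: string): number {
  // Each segment yielded at 'grapheme' granularity is one user-perceived character.
  return Array.from(segmenter.segment(input)).length;
}

const upsideDownFace = '\u{1F643}'; // 🙃
// 🧑🏻‍💻 = person + skin tone modifier + zero-width joiner + laptop
const technologist = '\u{1F9D1}\u{1F3FB}\u{200D}\u{1F4BB}';

console.log(upsideDownFace.length, countGraphemes(upsideDownFace)); // 2 1
console.log(technologist.length, countGraphemes(technologist)); // 7 1
```

Both emojis count as a single grapheme, while their `length` depends on how many code units the underlying sequence needs.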

I suggest adding new actions like minGraphemeCount or maxGraphemeCount, which would use the Intl.Segmenter API instead of String.prototype.length. This would be particularly useful when dealing with user-generated content. As of April 2024, the API is supported in all major browsers.

The code would look something like this:

```ts
import * as v from 'valibot';

const PostSchema = v.object({
  title: v.pipe(v.string(), v.maxGraphemeCount(300, /* optional language parameter */ 'en')),
});
```

remonke, Aug 15 '24 14:08

Thank you for your contribution. We have already discussed this in PR https://github.com/fabian-hiller/valibot/pull/666#issuecomment-2227393849. Feel free to create a PR: copy the source code of length, notLength, minLength and maxLength, and implement graphemes, notGraphemes, minGraphemes and maxGraphemes based on it.
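As a rough sketch of the checks such actions would need, the following standalone helpers test a minimum and maximum grapheme count. This is not valibot's source code: the names `countGraphemesUpTo`, `hasMinGraphemes` and `hasMaxGraphemes` are hypothetical, and stopping the count early is a design choice assumed here to avoid segmenting very long inputs in full.

```typescript
// Hypothetical helpers for illustration; not valibot's API or implementation.
const graphemeSegmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// Counts graphemes in `input`, but stops as soon as `cap` is reached.
function countGraphemesUpTo(input: string, cap: number): number {
  if (cap <= 0) return 0;
  let count = 0;
  for (const _segment of graphemeSegmenter.segment(input)) {
    count++;
    if (count >= cap) break;
  }
  return count;
}

// True if `input` has at least `min` graphemes.
function hasMinGraphemes(input: string, min: number): boolean {
  return countGraphemesUpTo(input, min) >= min;
}

// True if `input` has at most `max` graphemes.
function hasMaxGraphemes(input: string, max: number): boolean {
  // Counting up to max + 1 is enough to detect an overflow.
  return countGraphemesUpTo(input, max + 1) <= max;
}

// 🧑🏻‍💻 is 7 UTF-16 code units but a single grapheme.
console.log(hasMaxGraphemes('\u{1F9D1}\u{1F3FB}\u{200D}\u{1F4BB}', 1)); // true
console.log(hasMinGraphemes('ab', 3)); // false
```

Inside a schema library, these predicates would be wrapped in the same structure the existing length-based actions use, with only the counting step swapped out.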

fabian-hiller, Aug 15 '24 15:08

Fixes: #853

2-NOW, Sep 29 '24 06:09

This has been implemented and will be available in the next release.

fabian-hiller, Oct 11 '24 21:10