
[knowledge] Enable non-ASCII characters in knowledge

Open jolestar opened this issue 11 months ago • 1 comments

Relates to

https://github.com/elizaOS/eliza/issues/2376

Risks

Low

Background

What does this PR do?

  • The issue is fixed by updating the sanitization regex to use Unicode property escapes:
  • Changed from: /[^a-zA-Z0-9\s-_./:?=&]/g
  • To: /[^\p{L}\p{N}\s-_./:?=&]/gu
  • Added the 'u' flag, which enables Unicode mode and is required for \p{...} property escapes
  • Used \p{L} to match any kind of letter from any language
  • Used \p{N} to match any kind of numeric character in any script
  • The fix allows the function to properly handle multilingual content while maintaining the original sanitization goals.
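
As a rough illustration of the difference between the two patterns, here is a minimal TypeScript sketch. The function and constant names (`sanitizeAscii`, `sanitizeUnicode`, `ASCII_ONLY`, `UNICODE_AWARE`) are hypothetical and are not taken from the eliza codebase. One assumption to flag: the hyphen is written as `\-` here, because in Unicode mode a class escape such as `\s` cannot be the endpoint of a range, so `\s-_` would be rejected as an invalid character class.

```typescript
// Hypothetical names for illustration only; not the project's actual API.
// The character sets mirror the patterns quoted in the PR description,
// with the hyphen escaped so the class is valid under the 'u' flag.
const ASCII_ONLY = new RegExp("[^a-zA-Z0-9\\s\\-_./:?=&]", "g");
const UNICODE_AWARE = new RegExp("[^\\p{L}\\p{N}\\s\\-_./:?=&]", "gu");

// Old behavior: every character outside ASCII letters and digits,
// whitespace, and the listed punctuation is removed.
function sanitizeAscii(text: string): string {
  return text.replace(ASCII_ONLY, "");
}

// New behavior: letters (\p{L}) and numbers (\p{N}) from any script are kept;
// other symbols are still removed.
function sanitizeUnicode(text: string): string {
  return text.replace(UNICODE_AWARE, "");
}

console.log(JSON.stringify(sanitizeAscii("你好 world!")));   // the CJK letters are dropped
console.log(JSON.stringify(sanitizeUnicode("你好 world!"))); // the CJK letters are kept
```

In both variants the exclamation mark is removed, since it is not in the allowed set. Only the treatment of non-ASCII letters and digits changes.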

What kind of change is this?

Bug fixes (non-breaking change which fixes an issue)

Documentation changes needed?

My changes do not require a change to the project documentation.

Testing

Where should a reviewer start?

Detailed testing steps

Add a unit test.

jolestar, Jan 16 '25 11:01

Please do it via an env flag.

wtfsayo, Jan 16 '25 12:01