[knowledge] Allow non-ASCII characters in knowledge
Relates to
https://github.com/elizaOS/eliza/issues/2376
Risks
Low
Background
What does this PR do?
- The issue was fixed by updating the regex pattern to use Unicode property escapes:
- Changed from: /[^a-zA-Z0-9\s\-_./:?=&]/g
- To: /[^\p{L}\p{N}\s\-_./:?=&]/gu
- Added the 'u' flag, which the \p{...} property escapes require
- Used \p{L} to match any kind of letter from any language
- Used \p{N} to match any kind of numeric character
- The fix lets the function keep letters and digits from any script while still stripping every other character outside the URL-safe set (- _ . / : ? = &).
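A minimal sketch of the before/after behavior, for reviewers who want to see the difference on a concrete string. Only the two character classes come from this PR; the helper name `stripWith`, the constant names, and the sample text are illustrative assumptions. The hyphen is written escaped (`\-`) because a bare `\s-_` inside a character class is a syntax error once the `u` flag is set, and the patterns are built with the `RegExp` constructor so the sketch compiles regardless of the TypeScript target.

```typescript
// Hypothetical names for illustration; only the character classes are from the PR.
// Old pattern: keeps ASCII letters/digits, whitespace, and - _ . / : ? = &
const asciiOnly = new RegExp("[^a-zA-Z0-9\\s\\-_./:?=&]", "g");

// New pattern: \p{L} = any letter, \p{N} = any numeric character.
// The "u" flag is required for \p{...} to be recognized.
const unicodeAware = new RegExp("[^\\p{L}\\p{N}\\s\\-_./:?=&]", "gu");

// Remove every character that the given pattern matches.
function stripWith(pattern: RegExp, text: string): string {
    return text.replace(pattern, "");
}

const sample = "你好, world! Привет 123";

// The old pattern drops the Chinese and Cyrillic letters entirely.
console.log(JSON.stringify(stripWith(asciiOnly, sample)));
// → " world  123"

// The new pattern keeps them and still removes "," and "!".
console.log(JSON.stringify(stripWith(unicodeAware, sample)));
// → "你好 world Привет 123"
```

Note that both patterns leave whitespace untouched, so the old pattern produces a doubled space where the Cyrillic word used to be.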
What kind of change is this?
Bug fixes (non-breaking change which fixes an issue)
Documentation changes needed?
My changes do not require a change to the project documentation.
Testing
Where should a reviewer start?
Detailed testing steps
Add a unit test.
please do it via env flag
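One possible shape for the env-flag approach requested above, with assertions that could seed the unit test. This is a sketch under assumptions: the flag name `KNOWLEDGE_ALLOW_UNICODE`, the `Env` type, and the `sanitizerFor` and `sanitize` helpers are hypothetical and not part of the project. The environment is passed in as a plain object so the logic can be tested without touching real process state.

```typescript
// Hypothetical: KNOWLEDGE_ALLOW_UNICODE is an illustrative flag name,
// not a setting defined by the project.
type Env = { [key: string]: string | undefined };

// Pick the sanitizing pattern based on the flag.
// Default (flag unset) keeps the original ASCII-only behavior.
function sanitizerFor(env: Env): RegExp {
    const allowUnicode = env["KNOWLEDGE_ALLOW_UNICODE"] === "true";
    return allowUnicode
        ? new RegExp("[^\\p{L}\\p{N}\\s\\-_./:?=&]", "gu")
        : new RegExp("[^a-zA-Z0-9\\s\\-_./:?=&]", "g");
}

// Strip every character the selected pattern rejects.
function sanitize(text: string, env: Env): string {
    return text.replace(sanitizerFor(env), "");
}

// Flag on: Japanese text survives sanitization.
console.log(JSON.stringify(sanitize("日本語 docs", { KNOWLEDGE_ALLOW_UNICODE: "true" })));
// → "日本語 docs"

// Flag off or unset: previous behavior, non-ASCII letters are removed.
console.log(JSON.stringify(sanitize("日本語 docs", {})));
// → " docs"
```

In a Node runtime the caller would pass `process.env` as the second argument. Defaulting to the ASCII-only pattern keeps existing deployments unchanged unless they opt in.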