.NET: `\p{LC}` doesn't work, `.` and `\w` don't properly support Unicode
All identified problems (most have been addressed in Pomsky 0.10):
- [x] .NET doesn't support code points (in hexadecimal notation) outside the BMP – must be converted to two UTF-16 surrogates
  - [x] make it work in string literals (e.g. '𐌰')
  - [x] make it work for hexadecimal code points above U+FFFF (e.g. U+10330) instead of producing an error
  - [ ] #89
- [x] `\pL` as shorthand for `\p{L}` doesn't work
- [x] `\p{LC}` doesn't work
  - [ ] polyfill?
- [x] scripts and boolean properties don't work at all
- [x] needs investigation to see if all blocks are supported
- [x] check if block names are correctly normalized: underscores must be removed, but dashes preserved
- [x] `\v` and `\h` aren't supported
  - [ ] #88
- [x] need to check if backreferences like `\80` are too high (doc)
- [x] any further bugs may surface during fuzzing
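
For illustration, here is a minimal sketch of two of the conversions listed above: splitting a code point outside the BMP into two UTF-16 surrogate escapes, and normalizing a block name by removing underscores while keeping dashes. The function names `dotnet_escape_char` and `normalize_block_name` are hypothetical and are not taken from Pomsky's code base; the real implementation may differ.

```rust
/// Hypothetical helper (not Pomsky's actual code): renders a character as
/// `\uXXXX` escapes of its UTF-16 code units. A code point above U+FFFF
/// yields two escapes, a high surrogate followed by a low surrogate.
fn dotnet_escape_char(c: char) -> String {
    // Two u16 slots are enough for any Unicode scalar value.
    let mut buf = [0u16; 2];
    c.encode_utf16(&mut buf)
        .iter()
        .map(|unit| format!("\\u{:04X}", unit))
        .collect()
}

/// Hypothetical helper: normalizes a Unicode block name as described in the
/// checklist, i.e. underscores are removed and dashes are preserved.
fn normalize_block_name(name: &str) -> String {
    name.replace('_', "")
}
```

With these definitions, `dotnet_escape_char('𐌰')` (U+10330) returns `\uD800\uDF30`, and `normalize_block_name("Latin-1_Supplement")` returns `Latin-1Supplement`.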
To Reproduce
The regex-test crate ~~should be~~ was expanded to run .NET tests and run in CI (currently only on Ubuntu).
Expected behavior
The .NET flavor works reliably; using unsupported features produces an error.