Null character breaks regex patterns

Open · tobil4sk opened this issue 3 years ago · 1 comment

When a regex pattern containing a null character is created directly with the EReg constructor, it does not match properly on Eval, Neko, C++, Lua, and Hashlink. PHP throws an error on the first match call because of the null character (Null byte in regex).

final containingNull = new EReg("abc\x00def", "");

trace(containingNull.match("abc")); // true, should be false
trace(containingNull.match("abc\x00def")); // true
trace(containingNull.match("abc\x00fed")); // true, should be false

On all other targets it works as expected.

On the other hand, the same pattern written with the regex literal syntax works fine on (almost) all targets:

final containingNull = ~/abc\x00def/;

trace(containingNull.match("abc")); // false
trace(containingNull.match("abc\x00def")); // true (apart from on hashlink)
trace(containingNull.match("abc\x00fed")); // false

Targets affected:

  • [x] Neko - https://github.com/HaxeFoundation/neko/pull/249
  • [ ] Lua
  • [ ] C++
  • [ ] Hashlink
  • [ ] Eval
  • [x] ~PHP~ due to a PHP bug

tobil4sk · Feb 15 '22 15:02

The PHP issue is a PHP bug, so I think it's fine to leave it. There is already a way to avoid the bug: escape the null character, or use Haxe's regex literal syntax, which escapes it automatically.
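
For a pattern that is only available as a runtime string, the escaping workaround might look roughly like the sketch below. `escapeNull` is a hypothetical helper, not part of the Haxe standard library; it assumes the target's regex engine understands the `\x00` escape (the literal syntax relies on the same thing) and that the target's strings can hold a null character in the first place.

```haxe
class Main {
	// Hypothetical helper (not in the standard library): replaces every raw
	// null character with the four-character escape sequence \x00, so that
	// the regex engine never receives an actual null byte in the pattern.
	// Assumption: the null is not already preceded by a backslash.
	static function escapeNull(pattern:String):String {
		return StringTools.replace(pattern, "\x00", "\\x00");
	}

	static function main() {
		// Pattern built at runtime, containing a raw null character.
		final raw = "abc\x00def";
		final containingNull = new EReg(escapeNull(raw), "");

		// These should behave like the ~/abc\x00def/ literal form.
		trace(containingNull.match("abc"));
		trace(containingNull.match("abc\x00def"));
		trace(containingNull.match("abc\x00fed"));
	}
}
```

This only sidesteps the problem on the caller's side; the constructor itself still mishandles raw null characters on the affected targets.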

On Hashlink it is a little more complicated, however. According to https://haxe.org/manual/std-String-encoding.html, Hashlink does not support null bytes, so I'm not sure whether it makes sense to fix this there. A fix would also potentially require changing the Hashlink API so that the length of the pattern is passed to the constructor.

tobil4sk · Feb 27 '22 09:02