hyperscan
MULTILINE doesn't match CRLF
I was testing hyperscan and chimera to match some text. With this regex:
^hello$
and this text (WITH CRLF):
test
hello
testing
no matches are found. This only happens when the text uses CRLF line endings.
The pattern is compiled with the HS_FLAG_MULTILINE flag.
Here is a fully reproducible example:
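#include <cstring>   // strlen
#include <iostream>
#include "hs.h"
// Match callback: prints the end offset of each match; returning 0 continues the scan.
int matchHandler(unsigned int id, unsigned long long from, unsigned long long to, unsigned int flags, void* context)
{
    std::cout << "Matched to " << to << "\n";
    return 0;
}
int main()
{
    hs_database* database = nullptr;
    hs_compile_error* compileError = nullptr;
    if (hs_compile("^hello$", HS_FLAG_MULTILINE, HS_MODE_BLOCK, nullptr, &database, &compileError) != HS_SUCCESS)
    {
        std::cerr << "Compile failed: " << compileError->message << "\n";
        hs_free_compile_error(compileError);
        return 1;
    }
    hs_scratch* scratch = nullptr;
    hs_alloc_scratch(database, &scratch);
    const char* data = "test\r\nhello\r\ntesting"; // Works when \r is switched to \n
    hs_scan(database, data, strlen(data), 0, scratch, matchHandler, nullptr);
    // Release Hyperscan resources before exiting.
    hs_free_scratch(scratch);
    hs_free_database(database);
}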
This also happens on Chimera.
I would greatly appreciate a fix for either of the two.
I doubt this is an issue with Hyperscan.
What's your testing environment? I think \r\n is regarded as a newline only on Windows systems.
I'm using Windows.
I have this problem too, on CentOS 7. I hope it gets addressed as soon as possible.
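A possible workaround in the meantime, sketched under two assumptions: that only \n is treated as a line terminator for ^ and $ (which is consistent with the behaviour reported above), and that match offsets into the original buffer are not needed. The idea is to collapse CRLF to LF before scanning. normalizeNewlines is a hypothetical helper written for this thread, not part of Hyperscan or Chimera.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not a Hyperscan API): returns a copy of `input`
// with every "\r\n" pair collapsed to "\n". A lone '\r' is left untouched.
std::string normalizeNewlines(std::string_view input)
{
    std::string out;
    out.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        // Drop a '\r' only when it is immediately followed by '\n';
        // the '\n' itself is copied on the next iteration.
        if (input[i] == '\r' && i + 1 < input.size() && input[i + 1] == '\n')
            continue;
        out.push_back(input[i]);
    }
    return out;
}
```

The normalized string can then be passed to hs_scan in place of data (using its data() and size()). Reported offsets will refer to the normalized copy, not the original text. If offsets have to line up with the original buffer, changing the pattern to ^hello\r?$ should also work, since the optional \r absorbs the carriage return in front of $, although the match end then includes that \r.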