BUG: `NOT` logical operation fails when `data` length is lower than the `pattern` length.
Hi team,
I wanted to express the following (i.e. match only when the given pattern is not found) using the logical combination operator NOT:
NOT (/foo/)
However, there appears to be a problem when the data to be scanned is shorter than the pattern to match against: the scan exits early because of this check:
https://github.com/intel/hyperscan/blob/64a995bf445d86b74eb0f375624ffc85682eadfe/src/runtime.c#L346
Removing that check fixes the problem. However, I'm unsure what else that check is supposed to guard against (apart from the obvious).
I created a POC to reproduce this issue. PTAL in case I'm doing something wrong.
#include "hs.h"
#include <cstdio>
#include <string>
// Called by Hyperscan for every reported match; returning 0 continues the scan.
int match_handler(unsigned int id,
                  unsigned long long from,
                  unsigned long long to,
                  unsigned int flags,
                  void *context) {
    (void)context; // unused in this POC
    printf("[match_handler] id: %u from: %llu to: %llu flags: %u\n", id, from, to, flags);
    return 0;
}
int main() {
    hs_error_t err;
    // content to be scanned (size 3)
    const std::string content("foo");
    // patterns: a quiet sub-expression (id 101) and its negation (id 102)
    const char *expr[] = {"bars", "!101"};
    unsigned flags[] = {HS_FLAG_QUIET, HS_FLAG_COMBINATION};
    unsigned ids[] = {101, 102};
    printf("[*] execute_hyperscan_test()\n");
    // build a database
    hs_database_t *db = nullptr;
    hs_compile_error_t *compile_err = nullptr;
    err = hs_compile_multi(expr, flags, ids, 2, HS_MODE_NOSTREAM,
                           nullptr, &db, &compile_err);
    if (err != HS_SUCCESS) {
        printf("Failed to compile: %s\n",
               compile_err ? compile_err->message : "unknown error");
        hs_free_compile_error(compile_err);
        return 1;
    }
    // alloc some scratch
    hs_scratch_t *scratch = nullptr;
    err = hs_alloc_scratch(db, &scratch);
    if (err != HS_SUCCESS || scratch == nullptr) {
        printf("Failed to allocate scratch\n");
        hs_free_database(db);
        return 1;
    }
    // scan the content in block mode
    err = hs_scan(db, content.c_str(), static_cast<unsigned int>(content.size()),
                  0, scratch, match_handler, nullptr);
    if (err != HS_SUCCESS) {
        printf("hs_scan failed: %d\n", err);
        hs_free_scratch(scratch);
        hs_free_database(db);
        return 1;
    }
    // release resources
    err = hs_free_scratch(scratch);
    if (err != HS_SUCCESS) {
        printf("Failed to free scratch\n");
        hs_free_database(db);
        return 1;
    }
    hs_free_database(db);
    return 0;
}
Even though the data is shorter than the pattern, I'd expect a match to be returned regardless of the pattern size in this case.
Let me know if you require further details/clarifications.
For scenarios without the logical combination flag, the condition `if (rose->minWidth > length)` is useful: if the input data is shorter than the minimum width of the compiled expressions, no expression can match.
For a logical combination like the one in your case, this condition does not seem suitable, especially when NOT is used. We'll take some time to look into whether this can be improved.
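To make the interaction concrete, here is a minimal toy model of the behaviour discussed above. It is not Hyperscan code: `ToyDatabase`, `compile_toy` and `not_combination_fires` are hypothetical names, the sub-expression is treated as a plain literal, and the sketch assumes that a NOT combination should be reported once at end of data when its sub-expression did not match.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a compiled database holding one quiet literal
// sub-expression (e.g. "bars") and one combination that negates it.
struct ToyDatabase {
    std::string literal;    // the sub-expression, treated as a plain literal
    std::size_t min_width;  // shortest input on which the literal can match
};

// Assumption: for a literal, the minimum width is simply its length.
ToyDatabase compile_toy(const std::string &literal) {
    return ToyDatabase{literal, literal.size()};
}

// Returns true when the NOT combination would be reported at end of data.
// guard_covers_combination models the early exit on short input: when it is
// true, the scan returns before the combination is ever evaluated.
bool not_combination_fires(const ToyDatabase &db, const std::string &data,
                           bool guard_covers_combination) {
    if (guard_covers_combination && data.size() < db.min_width) {
        return false;  // early exit: nothing is evaluated or reported
    }
    const bool sub_matched = data.find(db.literal) != std::string::npos;
    return !sub_matched;  // NOT is true exactly when the literal is absent
}
```

With the literal `"bars"` and the input `"foo"`, the model reports nothing while the guard also covers the combination, and reports the NOT combination once the guard is skipped for it. That mirrors what the POC observes, under the assumptions stated above.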