BUG: `NOT` logical operation fails when `data` length is lower than the `pattern` length.

Open Shaddy opened this issue 4 years ago • 1 comments

Hi team,

I want to express "does not match the given pattern" using the logical combination operator NOT:

NOT (/foo/)

However, there appears to be a problem when the data to be scanned is shorter than the pattern to match against: the scan exits early because of this check:

https://github.com/intel/hyperscan/blob/64a995bf445d86b74eb0f375624ffc85682eadfe/src/runtime.c#L346

Removing that check fixes the problem. However, I'm unsure what else the check is supposed to guard against (apart from the obvious).

I created a POC to reproduce this issue. PTAL in case I'm doing something wrong.

#include "hs.h"
#include <cstdio>   // printf
#include <string>   // std::string

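// Called by hs_scan for every reported match; returning 0 continues the scan.
int match_handler(unsigned int id,
				  unsigned long long from,
				  unsigned long long to,
				  unsigned int flags,
				  void *context) {
    // %llu matches the unsigned long long offsets; %d here is undefined behaviour.
    printf("[match_handler] id: %u from: %llu to: %llu flags: %u\n", id, from, to, flags);
    return 0;
}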

int main() {
    hs_error_t err;

	// content to be scanned (size 3)
    const std::string content("foo");

	// patterns 
    const char *expr[] = {"bars", "!101"};

	unsigned flags[] = {HS_FLAG_QUIET, HS_FLAG_COMBINATION};
	unsigned ids[] = {101, 102};

	printf("[*] execute_hyperscan_test()");

	// build a database
	hs_database_t *db = nullptr;
	hs_compile_error_t *compile_err = nullptr;

	err = hs_compile_multi(expr, flags, ids, 2, HS_MODE_NOSTREAM,
									  nullptr, &db, &compile_err);

	if (err != HS_SUCCESS) {
		printf("Failed to compile: %s\n", compile_err->message);
		hs_free_compile_error(compile_err);
		return 1;
	}

	// alloc some scratch
	hs_scratch_t *scratch = nullptr;
	err = hs_alloc_scratch(db, &scratch);

	if (err != HS_SUCCESS || scratch == nullptr) {
		printf("Failed to allocate scratch\n");
		hs_free_database(db);
		return 1;
	}

	err = hs_scan(db, content.c_str(), content.size(), 0, scratch, match_handler, nullptr);


	if (err != HS_SUCCESS) {
		printf("hyperscan scan failed: %d\n", err);
		return 1;
	}

	err = hs_free_scratch(scratch);

	if (err != HS_SUCCESS) {
		printf("failed freeing scratch\n");
		return 1;
	}

	hs_free_database(db);
}

Even though the data is shorter than the pattern, I'd expect the NOT combination to report a match here, regardless of the pattern size.
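To make the expectation concrete, here is a toy model of the behaviour in plain C++. It is not Hyperscan code: `ToyResult`, `toy_scan`, and the `apply_min_width_guard` switch are made up for illustration, and the early exit is only an assumed stand-in for the check linked above.

```cpp
#include <cstddef>
#include <string>

// Toy model only: one literal sub-pattern (id 101 in the POC) and the
// combination "!101". None of these names exist in Hyperscan.
struct ToyResult {
    bool sub_matched;     // did the literal occur in the data?
    bool combo_reported;  // was the NOT combination reported as a match?
};

ToyResult toy_scan(const std::string &data, const std::string &literal,
                   bool apply_min_width_guard) {
    // Assumed stand-in for the early exit: data shorter than the pattern's
    // minimum width returns before any combination is evaluated.
    if (apply_min_width_guard && literal.size() > data.size()) {
        return {false, false};
    }
    const bool sub = data.find(literal) != std::string::npos;
    // NOT holds exactly when the literal is absent from the data.
    return {sub, !sub};
}
```

With the guard on, `toy_scan("foo", "bars", true)` reports nothing, which mirrors what the POC shows. With the guard off, `toy_scan("foo", "bars", false)` reports the combination, which is the result I'd expect.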

Let me know if you require further details/clarifications.

Shaddy, Nov 08 '21 17:11

Without the logical combination flag, the condition `if (rose->minWidth > length)` is useful: if the input data is shorter than the minimum width of the compiled expressions, no expression can match.
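The reasoning behind that guard can be checked with a small standalone sketch for the single-literal case. The helper names below are hypothetical and `min_width` merely plays the role of `rose->minWidth`; this is an illustration under those assumptions, not the library's implementation.

```cpp
#include <cstddef>
#include <string>

// Toy stand-in for the guard: skip the scan when the input is shorter
// than the minimum width any match could have.
bool guard_skips_scan(std::size_t min_width, std::size_t length) {
    return min_width > length;
}

// Reference matcher for one literal, used only to check the guard.
bool literal_matches(const std::string &data, const std::string &literal) {
    return data.find(literal) != std::string::npos;
}

// True when the guard is safe for this input: either it does not skip,
// or the literal indeed has no match in the data.
bool guard_is_safe(const std::string &data, const std::string &literal) {
    return !guard_skips_scan(literal.size(), data.size()) ||
           !literal_matches(data, literal);
}
```

For a plain literal, `guard_is_safe` holds for every input, since a substring cannot be longer than the string that contains it. The shortcut only loses information once a result depends on the absence of a match.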

With the logical combination flag, as in your case, this condition does not seem suitable, especially when NOT is used. We'll take some time to see whether we can improve this.

hongyang7, May 31 '22 16:05