SEGV hs_scan with HS_FLAG_SOM_LEFTMOST
Hello there,
I was experimenting with scratch buffer allocation and noticed that one specific pattern would cause a crash. The code I used is as follows; I have two similar patterns and only allocate the scratch buffer for pattern1, not pattern2:
#include <hs/hs_compile.h>
#include <stdio.h>
#include <string.h>
#include <hs/hs.h>

static int eventHandler(unsigned int id, unsigned long long from,
                        unsigned long long to, unsigned int flags, void *ctx) {
    printf("Match for pattern \"%s\" at offset %llu\n", (char *)ctx, to);
    return 0;
}

int main(int argc, char *argv[]) {
    const char *pattern1 = "^(ftps?|https?).*?\\?+$";
    const char *pattern2 = "^(ftps?|https?)://.*$";

    hs_compile_error_t *compile_err;
    hs_database_t *database1;
    hs_compile(pattern1, HS_FLAG_SOM_LEFTMOST, HS_MODE_BLOCK, NULL, &database1,
               &compile_err);
    hs_database_t *database2;
    hs_compile(pattern2, HS_FLAG_SOM_LEFTMOST, HS_MODE_BLOCK, NULL, &database2,
               &compile_err);

    const char *inputData = "https://foo.bar/";

    /* The scratch is sized for database1 only; database2 is deliberately
     * skipped to reproduce the crash. */
    hs_scratch_t *scratch = NULL;
    hs_alloc_scratch(database1, &scratch);

    hs_scan(database1, inputData, strlen(inputData), 0, scratch, eventHandler,
            (void *)pattern1);
    /* This second scan uses a scratch that was never allocated for
     * database2. */
    hs_scan(database2, inputData, strlen(inputData), 0, scratch, eventHandler,
            (void *)pattern2);
    return 0;
}
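For reference, the reproducer omits error checking for brevity. A minimal sketch of the checks one would normally add, using only the hs_error_t return values and hs_free_compile_error() from the public API (not part of the original reproducer):

/* Error-checking sketch: every Hyperscan call reports failure through
 * its hs_error_t return value. */
if (hs_compile(pattern1, HS_FLAG_SOM_LEFTMOST, HS_MODE_BLOCK, NULL,
               &database1, &compile_err) != HS_SUCCESS) {
    fprintf(stderr, "compile failed: %s\n", compile_err->message);
    hs_free_compile_error(compile_err);
    return 1;
}
if (hs_scan(database2, inputData, strlen(inputData), 0, scratch,
            eventHandler, (void *)pattern2) != HS_SUCCESS) {
    /* This is the error path one would expect the mismatched scratch
     * to take, rather than crashing. */
    fprintf(stderr, "scan failed\n");
}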
Expected results: the second hs_scan call should return an error, since the scratch buffer was not properly allocated for that pattern.
Actual results: the second hs_scan call crashes with a SEGV. Here is the stack trace:
#0 0x00007ffff7c3b5cd in memset (__len=32, __ch=0, __dest=0x0)
at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:71
#1 fatbit_clear (bits=0x0) at ./src/util/fatbit.h:60
#2 recordAnchoredLiteralMatch (end=8, anch_id=0, scratch=0x7fffcc00f240, t=<optimized out>)
at ./src/rose/program_runtime.c:101
#3 roseRunProgram (t=<optimized out>, scratch=<optimized out>, programOffset=<optimized out>, som=0,
end=<optimized out>, prog_flags=<optimized out>) at ./src/rose/program_runtime.c:2278
#4 0x00007ffff7c22fd0 in roseAnchoredCallback (start=start@entry=0, end=<optimized out>,
id=id@entry=1888, ctx=ctx@entry=0x7fffcc00f240) at ./src/rose/match.c:215
#5 0x00007ffff7bd5a04 in doComplexReport (cached_accept_id=<synthetic pointer>,
cached_accept_state=<synthetic pointer>, eod=0 '\000', loc=8, s=21, m=<optimized out>,
ctxt=0x7fffcc00f240, cb=0x7ffff7c22f70 <roseAnchoredCallback>) at ./src/nfa/mcclellan.c:74
#6 mcclellanExec8_i (mode=CALLBACK_OUTPUT, c_final=0x0, single=0 '\000', ctxt=0x7fffcc00f240,
cb=0x7ffff7c22f70 <roseAnchoredCallback>, offAdj=0, len=8, buf=0x7fffd4014580 "https://foo.bar/",
state=<synthetic pointer>, m=<optimized out>) at ./src/nfa/mcclellan.c:502
#7 nfaExecMcClellan8_Bi (single=0 '\000', context=0x7fffcc00f240,
length=8, buffer=0x7fffd4014580 "https://foo.bar/",
offset=0, n=<optimized out>) at ./src/nfa/mcclellan.c:922
#8 nfaExecMcClellan8_B (n=<optimized out>, offset=0, buffer=0x7fffd4014580 "https://foo.bar/", length=8,
cb=0x7ffff7c22f70 <roseAnchoredCallback>, context=0x7fffcc00f240) at ./src/nfa/mcclellan.c:945
#9 0x00007ffff7c0dbe1 in runAnchoredTableBlock (scratch=0x7fffcc00f240, atable=<optimized out>,
t=0x555557cfde00) at ./src/rose/block.c:63
#10 roseBlockAnchored (scratch=0x7fffcc00f240, t=0x555557cfde00) at ./src/rose/block.c:212
#11 roseBlockExec (t=<optimized out>, scratch=<optimized out>) at ./src/rose/block.c:395
#12 0x00007ffff7b44f4f in rawBlockExec (scratch=0x7fffcc00f240, rose=0x555557cfde00)
at ./src/runtime.c:188
#13 hs_scan (db=<optimized out>, data=<optimized out>, length=15, flags=<optimized out>,
scratch=0x7fffcc00f240, onEvent=<optimized out>, userCtx=<optimized out>) at ./src/runtime.c:419
Note: there is no crash if I remove the HS_FLAG_SOM_LEFTMOST flag.
Workaround: reallocating the scratch buffer with the second pattern solves the issue.
It sounds to me that some sanity checks are missing inside hs_scan.
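For reference, the workaround amounts to growing the same scratch with the second database before the second scan; a minimal sketch using the documented hs_alloc_scratch() growth behavior:

/* Workaround sketch: grow the existing scratch so that it is also valid
 * for database2 before scanning with it. */
if (hs_alloc_scratch(database2, &scratch) != HS_SUCCESS) {
    fprintf(stderr, "could not grow scratch for database2\n");
    return 1;
}
hs_scan(database2, inputData, strlen(inputData), 0, scratch, eventHandler,
        (void *)pattern2);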
Thanks for reporting this. This is indeed a long-standing issue. We'll consider adding a suitable quick check for scratch validity later.
Please refer to the latest develop branch. Commit id: 7c65cfce4e4a5c153ff995113c37570a95675417
This document recommends allocating one scratch space per scanning context. Suricata relies on this performance technique, and the commit for this change has caused this Suricata issue. I investigated and discovered that the validity check comparing the scratch and database CRC values was detecting a mismatch, since the scratch area is per Suricata thread context while a per-database scratch area is now expected.
Here's the performance tip from the documentation link:

Tip: A scratch space can be allocated so that it can be used with any one of a number of databases. Each concurrent scan operation (such as a thread) needs its own scratch space.

The hs_alloc_scratch() function can accept an existing scratch space and “grow” it to support scanning with another pattern database. This means that instead of allocating one scratch space for every database used by an application, one can call hs_alloc_scratch() with a pointer to the same hs_scratch_t and it will be sized appropriately for use with any of the given databases.
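In other words, the documented pattern is one scratch per scanning thread, grown across all the databases it will be used with; an illustrative sketch (database1 and database2 stand in for any set of compiled databases):

/* One scratch per scanning thread, grown to cover every database that
 * thread will scan with. */
hs_scratch_t *scratch = NULL;
hs_alloc_scratch(database1, &scratch);  /* sized for database1 */
hs_alloc_scratch(database2, &scratch);  /* grown to also cover database2 */
/* scratch can now be passed to hs_scan() with either database. */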
@jlucovsky Yes, you're right. We already noticed this issue in the Snort integration and in Rspamd. A quick rollback patch for this particular issue will be available soon. Sorry for any inconvenience.
Thanks for addressing the issue so quickly!
@jlucovsky Please refer to the latest master. Fix id: https://github.com/intel/hyperscan/commit/0bf86a7c15823ee80061317a997842211558bfa4 Any questions, let us know.
Thanks. I've verified proper operation on the Suricata LTS and current (master) releases.
Thanks!
@hongyang7 Will there be a new release with the update? Many of our users only use "released" software. Could the change be included in 5.4.2?
@jlucovsky Please check for latest update.