zstd
zstd copied to clipboard
ZSTD_decompressStream checks its state parameter is not NULL
preventing segmentation faults seen with compressed perf recordings
perf record --aio -z -- true && perf annotate
Stacktrace from Debian, nearly identical in RHEL.
Program received signal SIGSEGV, Segmentation fault.
bZSTD_decompressStream (zds=0x0, output=output@entry=0x7fffffffc060, input=input@entry=0x7fffffffc040) at decompress/zstd_decompress.c:1657
1657 decompress/zstd_decompress.c: No such file or directory.
(gdb) bt
#0 ZSTD_decompressStream (zds=0x0, output=output@entry=0x7fffffffc060, input=input@entry=0x7fffffffc040) at decompress/zstd_decompress.c:1657
#1 0x00005555558dd2dc in zstd_decompress_stream (data=data@entry=0x555555de4898, src=src@entry=0x7ffff7ffb248, src_size=src_size@entry=389, dst=0x7ffff665a028,
dst_size=dst_size@entry=528384) at util/zstd.c:100
#2 0x000055555583ea8d in perf_session__process_compressed_event (session=0x555555dddc10, event=0x7ffff7ffb240, file_offset=<optimized out>) at util/session.c:74
#3 0x00005555558409c1 in perf_session__process_user_event (session=session@entry=0x555555dddc10, event=event@entry=0x7ffff7ffb240, file_offset=file_offset@entry=576)
at util/session.c:1664
#4 0x000055555584242b in perf_session__process_event (file_offset=576, event=0x7ffff7ffb240, session=0x555555dddc10) at util/session.c:1797
#5 0x0000555555843afa in reader__process_events (prog=0x7fffffffc720, session=0x555555dddc10, rd=<synthetic pointer>) at util/session.c:2251
#6 __perf_session__process_events (session=0x555555dddc10) at util/session.c:2309
#7 perf_session__process_events (session=session@entry=0x555555dddc10) at util/session.c:2342
#8 0x000055555574991d in __cmd_annotate (ann=0x7fffffffc880) at builtin-annotate.c:402
#9 cmd_annotate (argc=<optimized out>, argv=0x7fffffffdc40) at builtin-annotate.c:647
#10 0x00005555557ddb43 in run_builtin (p=p@entry=0x555555bb5168 <commands+360>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdc40) at perf.c:313
#11 0x0000555555747c08 in handle_internal_command (argv=0x7fffffffdc40, argc=1) at perf.c:365
#12 run_argv (argv=<synthetic pointer>, argcp=<synthetic pointer>) at perf.c:409
#13 main (argc=1, argv=0x7fffffffdc40) at perf.c:539
Likely perf is to be blamed in the first place, but as there are already checks for other issues it seems to be useful to cater for that.
Hi @GitMensch!
Thank you for your pull request and welcome to our community.
Action Required
In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.
Process
In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.
Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.
If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
CLA signed
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
I would change the title, because it is misleading.
What you mean is that ZSTD_decompressStream() should check that the state pointer is not NULL.
Indeed, such a scenario is a major breach of contract, it points to a serious error on the user side.
So far, this scenario has been classified as a "breach of contract", meaning we just document the (obvious) convention that ZSTD_DStream* must be valid, but if it's not, it's UB (Undefined Behavior). For perspective, same thing happens if the pointer is not NULL but completely bogus (dangling).
We could nonetheless debate about extending the contract to make ZSTD_decompressStream() able to detect a NULL state and cleanly return an error in this case. For this case, the error should be clearly documented (I suspect no corresponding error code exist yet, and stage_wrong is a too rough approximation of the problem, that can be misleading for debugging).
What I'm concerned though is about consistency: if we do that for this symbol, we would have to do that for every other symbol which manipulates a state as head parameter, for consistency sake. And there are really a lot of them.