Fixed a crash in the operation of 128-bit variables in the sse instruction set of the x86/x64 architecture
Summary:
When using llvm9, in the IntrinsicCleanerPass function, handle the bug that the usub_sat, uadd_sat, ssub_sat and sadd_sat instructions crash.
Code change description: 1.Fixed the crash when the two parameter types did not match when calling the CreateICmpSLT function and the crash when the element types did not match when calling the CreateSelect function. 2.Added the createConstantVector function to create a ConstantVector. The bit width and number of elements created depend on the input pConstantVector, and the mode parameter is used to limit the value of the element.
Testcase:
#include <stdio.h>
#include <stdlib.h>
#include <immintrin.h>
void adds8x16_sat() {
__m128i round_8x16b,y_0_8x16b;
y_0_8x16b = _mm_adds_epi16(round_8x16b, y_0_8x16b); //bug1
}
void subs8x16_sat() {
__m128i round_8x16b,y_0_8x16b;
y_0_8x16b = _mm_subs_epi16(round_8x16b, y_0_8x16b); //bug2
}
void uadd8x16_sat() {
__m128i round_8x16b,y_0_8x16b;
y_0_8x16b = _mm_adds_epu8(round_8x16b, y_0_8x16b); //bug3
}
void usubs8x16_sat() {
__m128i round_8x16b,y_0_8x16b;
y_0_8x16b = _mm_subs_epu8(round_8x16b, y_0_8x16b); //bug4
}
int main(int argc, char* argv[]) {
adds8x16_sat();
subs8x16_sat();
add8x16_sat();
sub8x16_sat();
return 0;
}
Checklist:
- [x] The PR addresses a single issue. If it can be divided into multiple independent PRs, please do so.
- [x] The PR is divided into a logical sequence of commits OR a single commit is sufficient.
- [x] There are no unnecessary commits (e.g. commits fixing issues in a previous commit in the same PR).
- [x] Each commit has a meaningful message documenting what it does.
- [x] All messages added to the codebase, all comments, as well as commit messages are spellchecked.
- [x] The code is commented OR not applicable/necessary.
- [x] The patch is formatted via clang-format OR not applicable (if explicitly overridden leave unchecked and explain).
- [x] There are test cases for the code you added or modified OR no such test cases are required.
@Laplace1814 I just saw you opened and closed this PR a couple of times (#1454, #1455, #1546). You don't need to do this, just commit changes to your branch: https://github.com/Laplace1814/klee/tree/my-pr
It will be automatically updated here.
Later, you can merge those changes into a few commits. If you need help with this, let us know.
@MartinNowack Since I have no experience in submitting code to github before this, the operation is a little strange, thank you for your assistance.
@Laplace1814 Thanks again for the patch! You should also run the regression suite before submitting. As you can see in the CI, Intrinsics/Saturating.ll fails.
@ccadar Okay, I will try to run the regression suite again before submitting.
@ccadar It may take several days to write test cases, and multiple codes may be modified to ensure that the actual running results meet expectations.
@Laplace1814 I understand, it's a bit strange at the beginning - but you'll get used to it. And thanks a lot for your effort!
For writing test cases, start with one and push it here when it's working or you've got questions.
Use the test case from above as an example (https://github.com/klee/klee/blob/master/test/Intrinsics/abs-overflow.ll) The main test case magic happens in the first lines as part of the comment.
; REQUIRES: geq-llvm-12.0
; RUN: rm -rf %t.klee-out
; RUN: %klee --output-dir=%t.klee-out --optimize=false %s 2> %t.stderr.log
; RUN: FileCheck %s < %t.stderr.log
You can probably just copy that code snipped and add to each test file.
Next, check with FileCheck (https://www.llvm.org/docs/CommandGuide/FileCheck.html) if it generates the correct output.
That should get you started on this topic.
@MartinNowack Okay, I will learn how to write test cases, thank you for your help.
@MartinNowack I have run the regression test suite and re-modified the code. I have confirmed that this problem is caused by the SSE2 instruction set. Since it is not necessary to support SSE2 instructions at present and this may be a major change, only one is added break statement to prevent klee from crashing.
@Laplace1814 Great, now reduce the test case to its minimum: (I use godbolt for example: https://godbolt.org/z/3nTYqY6av)
So, this is your original source code (reduce it as much as possible - compiling single functions only are fine):
#include <immintrin.h>
__m128i adds8x16_sat(__m128i round_8x16b, __m128i y_0_8x16b) {
return _mm_adds_epi16(round_8x16b, y_0_8x16b); //bug1
}
And becomes (shortened version):
; Function Attrs: nounwind readnone uwtable
define dso_local <2 x i64> @adds8x16_sat(<2 x i64>, <2 x i64>) local_unnamed_addr #0 {
%3 = bitcast <2 x i64> %0 to <8 x i16>
%4 = bitcast <2 x i64> %1 to <8 x i16>
%5 = tail call <8 x i16> @llvm.sadd.sat.v8i16(<8 x i16> %3, <8 x i16> %4) #2
%6 = bitcast <8 x i16> %5 to <2 x i64>
ret <2 x i64> %6
}
; Function Attrs: nounwind readnone speculatable
declare <8 x i16> @llvm.sadd.sat.v8i16(<8 x i16>, <8 x i16>) #1
That's easier to read: @llvm.sadd.sat.v8i16(<8 x i16> %3, <8 x i16> %4) #2 can be easily spotted.
You want to check with FileCheck that the instruction is not there anymore in the assembly.ll.
Add back the feature to KLEE that does the replacement and with that the test should pass. After that, add all the other test cases as well.
@MartinNowack Okay, I will narrow down the test cases later.
@MartinNowack Test cases have been narrowed.
@Laplace1814 Excellent. Now, add back your implementation to check that KLEE doesn't crash even with the intrinsic enabled.
@MartinNowack Re-adding the implementation? Do I need to resubmit the pr?
@Laplace1814 No. Here is your original implementation https://github.com/klee/klee/pull/1457/commits/3246ca9e7a48fad844e6a38832bdcbd09e0fec1a
But with all these changes, it's not there anymore. So you need to bring it back. At the end, the goal of this PR is to handle the issue.
Have a look at: https://github.com/klee/klee/pull/1457/files This is what this PR actually compromises. So, you need to make sure some functionality is recovered.
@MartinNowack I have added the original code
@Laplace1814 Excellent, now let's get that code in shape:
Please replace unsigned mode with an enum class mode and define it.
(Here is a good example - it's a C++11 feature : https://www.learncpp.com/cpp-tutorial/enum-classes/)
@MartinNowack Ok, I'll revise it later, since I'm in Asian time, there will be a time difference in our communication, but it's nice to learn more here.
@MartinNowack I have changed to use the enum type of c++11.
@MartinNowack I have modified the code using switch and assert statements.
@Laplace1814 what is the status of this PR? It would be great if you could address @MartinNowack 's comments (and then squash the commits and rebase) so that we can get to merge it.
@Laplace1814 would you still like to finish this PR? It would be great to have your fixes, but if you don't have time anymore, we'll need to close this PR...
Let's close this inactive PR. I added a note at https://github.com/klee/klee/issues/678 about the missing support for these intrinsics.