jasmin
jasmin copied to clipboard
checksafety - integration in libjade issues
In this link, you can find the results of a libjade workflow run that runs all the examples through the -checksafety
analysis. I explain the issues I found next, and then we can split these into different issues if appropriate.
Within the attached artifact, libjade-logs-src.tar.gz, there are some *.error
files corresponding to executions that returned error:
libjade-logs-src$ find . -name "*.error" | wc -l
53
libjade-logs-src$ grep -nr "Violation" | wc -l
14
libjade-logs-src$ grep -nr "124" | wc -l
9
libjade-logs-src$ grep -nr "Assertion failed" | wc -l
30
In the current status of Libjade-main / Jasmin-main, all implementations fail to check:
- 9 are timeouts (set to 30 minutes for this first try, which can be increased later);
- 30 are
exception File "src/safety/safetyAbsExpr.ml", line 1055
; - 14 are possible safety violations.
Regarding the implementations that reach the timeout: 7 out of 9, include ChaCha/Salsa reference code. This might indicate that there could a pattern in the code (rounds loop maybe?) that doesn't allow the safety checker to conclude relatively quickly. Any suggestions on how to handle these/thoughts if they can be improved?
crypto_stream/salsa20/salsa20/amd64/ref/.ci/checksafety.error
crypto_stream/salsa20/salsa2012/amd64/ref/.ci/checksafety.error
crypto_stream/chacha/chacha20-ietf/amd64/ref/.ci/checksafety.error
crypto_stream/chacha/chacha12/amd64/ref/.ci/checksafety.error
crypto_stream/chacha/chacha20/amd64/ref/.ci/checksafety.error
crypto_stream/xsalsa20/amd64/ref/.ci/checksafety.error
crypto_secretbox/xsalsa20poly1305/amd64/ref/.ci/checksafety.error
crypto_hash/sha512/amd64/ref/.ci/checksafety.error
crypto_xof/shake256/amd64/spec/.ci/checksafety.error
Regarding src/safety/safetyAbsExpr.ml, line 1055
: is this error related to something that can be changed in the code?
Regarding possible safety violations: I believe most, if not all, might be related to the following isolated example. For instance, the following program fails to check:
inline fn __keccak_init_ref() -> stack u64[25]
{
stack u64[25] state;
reg u64 i t;
_,_,_,_,_, t = #set0();
i = 0;
while (i < 25)
{ state[(int)i] = t;
i += 1;
}
return state;
}
inline fn __theta_sum_ref(stack u64[25] a) -> reg u64[5]
{
inline int x y;
reg u64[5] c;
// C[x] = A[x, 0]
for x=0 to 5
{ c[x] = a[x + 0]; }
// C[x] ^= A[x,1] ^ A[x,2] ^ A[x,3] ^ A[x,4]
for y=1 to 5
{ for x=0 to 5
{ c[x] ^= a[x + y*5]; }
}
return c;
}
export fn test(reg u64 rp)
{
inline int i;
stack u64[25] state;
reg u64[5] c;
state = __keccak_init_ref();
c = __theta_sum_ref(state);
for i=0 to 4
{ [rp + i*8] = c[i]; }
}
And if we update __keccak_init_ref
to the following code, the safety checker goes through:
inline fn __keccak_init_ref() -> stack u64[25]
{
stack u64[25] state;
inline int i;
reg u64 t;
_,_,_,_,_, t = #set0();
for i=0 to 25
{ state[i] = t; }
return state;
}
I'm unsure if this is related to this issue.
I'm unsure if this is related to this https://github.com/jasmin-lang/jasmin/issues/67. I think it’s not related. Here the problem is about proving that an array is properly initialized; there it’s about relations between loop indices to prove that an access is in bounds.
Here are some details about sha512.
Given enough time (a bit more than 30mn on my laptop) the analysis terminates and rises two alarms, both at line 291 of sha512.jinc
. The first one is about sblocks
being initialized, the second about oblocks + t
being in bounds.
Are these alarms expected? Let’s look at them in turn.
The sblocks
array is initialized in function __lastblocks_ref
. I encourage the readers to give a look to the function and to try to convince themselves that indeed, it properly initialized said array. First bytes 0 to inlen are set, in increasing order; then from inlen to j (for some j), again in increasing order; then from j to j+8, in decreasing order; finally from j+8 to 256. Even assuming that the general problem of initialization of arrays in while loops can be solved this function remains full of tricks.
In order to prove that oblocks + t
is in bounds, we have to understand that the loop does not iterate too much. (Sorry, I can’t find a compelling argument that this is hard.)
What can we do? If quick and automated checking of program safety is a goal, we can work in two complementary directions: improve the safety checker, or rewrite the programs. For the initialization issue, it might be enough to zero-initialize the whole array at the beginning of the function with a for loop. I’d be curious to know whether this has a noticeable impact in practice.
The second issue can also be easily workaround by using a regular loop counter.
With these two small modifications to the jasmin code, safety checking goes through in about 20mn.
Hi @vbgl, thanks for the corresponding PR. I just pushed the same modifications for sha256. CI should be green for that example during the afternoon.
I also increased the timeout to 90m, allowing to check crypto_xof/shake256/amd64/spec/
.
Regarding improvements to the safety checker: do you think it could be reasonable to support annotations that would allow the safety checker to handle array initializations using while
loops? Maybe a 'special' while loop? What concerns me is code size, especially for future ARM implementations.
Any idea what is causing exception File "src/safety/safetyAbsExpr.ml", line 1055
?
Another question, by running jasminc --help
:
-checksafety : automatically check for safety
-safetyparam parameter for automatic safety verification:
format: "f_1>param_1|f_2>param_2|..." where each param_i is of the form:
pt_1,...,pt_n;len_1,...,len_k
pt_1,...,pt_n: input pointers of f_i
len_1,...,len_k: input lengths of f_i
-safetyconfig [filename]: use filename (JSON) as configuration file for the safety checker
-safetymakeconfigdoc [dir]: make the safety checker configuration docs in [dir]
Is there extended documentation/usage examples of -safetyparam
, -safetyconfig
, and -safetymakeconfigdoc
? I would like to understand if I could take advantage of these for some examples (or all of them) and support them in libjade infrastruture;
Currently, I'm running all the examples with just -checksafety
in this branch.
The assertion failure is related to array copies (which should not be too hard to fix). See #308.
There is some explanation about -safetyparam
in the wiki.
Of course you should use the -safetyparam
option:
time …/jasminc -checksafety -noinsertarraycopy -safetyparam "jade_xof_shake256_amd64_spec>;" shake256_amd64_spec.jazz
terminates with “no safety violation” in about 6mn on my laptop (ok, the inferred precondition is a bit optimistic, but still a non-trivial result).
With -safetyparam "jade_xof_shake256_amd64_spec>output,input;output_length,input_length"
, the result is more precise (and as fast).
PR #312 should fix the assertion failures.
With
-safetyparam "jade_xof_shake256_amd64_spec>output,input;output_length,input_length"
, the result is more precise (and as fast).
Do you know if it is possible to specify the information from the cmd line argument jade_xof_shake256_amd64_spec>output,input;output_length,input_length
within a JSON file to be used with -safetyconfig
? This would simplify the integration: my current idea would be to produce template JSON files and preprocess them with the contents of */include/api.h (for fixed-length arguments).
PR #312 should fix the assertion failures.
Thanks!! I updated the CI machine and I will give it a try
Do you know if it is possible to specify the information from the cmd line argument […] within a JSON file to be used with
-safetyconfig
?
It is not possible: -safetyconfig
is for something else.
But feel free to suggest the interface of your liking.
If this is the case, I can integrate it as it currently is (using -safetyparam
option and corresponding syntax), and then I will update the Makefiles/scripts when necessary. Meanwhile, could you point me to some (extended) documentation of the options related to -safetyconfig
?
I will write some notes next about possible improvements.
I found the input_range
field in the JSON documentation (produced with -safetymakeconfigdoc
) hence the last question. It is documented as follows:
"input_range": {
"default": {},
"description": {
"descr": "Assumed ranges for inputs.",
"example": {
"var1": { "min": "0", "max": "100" },
"var2": { "min": "200", "max": "300" }
}
}
}
My current understanding is that this field can define the ranges of some input variables (for instance, the minimum and maximum length of an array -- I still don't know if it is possible to scope some variable's interval to a given function: in a small example that I wrote, definitions like "var1": { "min": "0", "max": "100" },
seem to be global and variables must have different names across different functions).
If my understanding is correct, it would also be nice to have something like input_pointers
which would allow specifying the properties on -safetyparam
.
Even better, it would be to separate the declaration of properties of a given program and the configuration itself:
-
-safetyconfig
would do what it currently does minus theinput_range
field; -
-safety_someothername_
would allow specifying properties about the program on a function-level basis.
I think this would have the advantage that we could (more quickly) search (automatically, maybe) for configurations that perform faster. Additionally, suppose we need to use input_range
as it currently is. In that case, it might be more challenging to avoid using default configurations from the past (for instance, config files would need to be generated and then patched for each CI run).
To the best of my knowledge, the -safetyconfig
option is not documented beyond what is output by -safetymakeconfigdoc
.
PR #312 should fix the assertion failures.
That was not enough… PR #317 should help.
Hi, I merged the safety branch from libjade into the main branch (there's still some work to do regarding parameter generation for the -safetyparam
option, based on the contents of api.h
files).
Meanwhile, I logged in the machine that is running this job and found:
"libjade/src/crypto_kem/kyber/common/amd64/avx2/shuffle.jinc", line 23 (8-22):
warning: vector suffix simplified from 8u32 to 256
Default checker parameters.
Analyzing function jade_kem_kyber_kyber768_amd64_avx2_keypair
Fatal error: exception File "src/safety/safetyAbsExpr.ml", line 764, characters 11-17: Assertion failed
Command exited with non-zero status 2
13242.83user 4.84system 3:40:49elapsed 99%CPU (0avgtext+0avgdata 272128maxresident)k
0inputs+408outputs (0major+2455143minor)pagefaults 0swaps
Which seems to correspond to this assert false.
Is this something that can be fixed on the compiler or the implementation?
Is this an ArrayInit
?
I don't understand your question. The exception does not seem to mention what in the source code caused this.
You may want to always set OCAMLRUNPARAM=b
in your CI machines; this will provide much clearer exception stack traces.
Is it the case that you’ve let a machine compute during more than three hours and a half to prove a theorem that you know is false?
You may want to always set
OCAMLRUNPARAM=b
in your CI machines; this will provide much clearer exception stack traces.
Thanks, will likely do that early next week.
Is it the case that you’ve let a machine compute during more than three hours and a half to prove a theorem that you know is false?
Did the previous report contribute to the next PR? https://github.com/jasmin-lang/jasmin/pull/343/ If that is the case, I consider the CPU time well spent.
I think that, for the moment, I will continue to include all libjade implementations in these runs as they might provide insightful feedback for improvements and consistently check that updates to the compiler or the libjade source code do not cause any changes in the expected behavior (even if that expected behavior is a "possible safety violation" when the corresponding execution ends).
Did the previous report contribute to the next PR? #343
Not at all. Your previous report is worthless.
I think we know that most implementations of kyber are unsafe (unaligned memory accesses).
Not at all. Your previous report is worthless.
Thanks for the feedback. Do you expect that https://github.com/jasmin-lang/jasmin/pull/343/ will eliminate the previously reported error?