leon
leon copied to clipboard
Bug in proofs
Pretty serious bug, or did I miss something?
import leon.collection._
import leon.lang._
import scala.Any
object Scheduler {
def a(i: BigInt):BigInt = {
i - 1;
} ensuring(res => b(i) && res == 2)
def b(i: BigInt): Boolean = {
val c = a(i);
(i == 1) ==> (c == 0)
}.holds
}
(This code verifies on b3ad5fc57a7469bd8f3b3afbe18a56b54f9d8f3b, but shoudn't because a doesn't always return 2...)
Leon verification is only sound modulo termination.
If you run your example through the termination checker (--termination argument), it will give you an input to a or b that introduces non-termination.
Cheers, Nicolas
On 09. 12. 16 17:13, Baptiste Lepers wrote:
Pretty serious bug, or did I miss something?
import leon.collection._ import leon.lang._ import scala.Any
object Scheduler { def a(i:BigInt):BigInt = { i- 1; } ensuring(res=> b(i)&& res== 2)
def b(i:BigInt): Boolean = { val c = a(i); (i== 1)==> (c== 0) }.holds
}
(This code verifies on b3ad5fc https://github.com/epfl-lara/leon/commit/b3ad5fc57a7469bd8f3b3afbe18a56b54f9d8f3b, but shoudn't because a doesn't always return 2...)
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/epfl-lara/leon/issues/279, or mute the thread https://github.com/notifications/unsubscribe-auth/ADKzhG5yPfrcvdoETeKQjET44vCebWEpks5rGX4-gaJpZM4LJGKZ.
Strangely enough the Isabelle solver thinks a
and b
are terminating. Leon claims that b(0)
doesn't terminate but Isabelle evaluates that to false
. (I'm guessing because i == 1
evaluates to false
, so the conclusion is never actually executed.)
Ok... Would it be possible to give an error in that case? It is a bit strange to see that this code passes verification :)
Well, detecting non-termination and/or proving termination can be quite expensive (typically much more so than verification). The extra wait time would probably be a significant pain in verification efforts.
Maybe it would be useful to run a dumbed-down and faster version of the termination checker to try to detect potential non-termination, but in practice you would probably get lots of spurious false positives or miss real cases of non-termination.
We would need a set of "concrete" benchmarks where verification was hurt because of non-termination to determine what checks we should be running.
@lars: this is pretty weird. What kind of operational semantics does Isabelle use for this case? Does the example pass verification too?
Cheers, Nicolas
On 09. 12. 16 17:29, Baptiste Lepers wrote:
Ok... Would it be possible to give an error in that case? It is a bit strange to see that this code passes verification :)
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/epfl-lara/leon/issues/279#issuecomment-266057103, or mute the thread https://github.com/notifications/unsubscribe-auth/ADKzhEHMpRCkTXppFS571BIqYKpwhUr0ks5rGYHtgaJpZM4LJGKZ.
It turns out my assessment was wrong. Here's what Isabelle sees:
fun a :: "int ⇒ int" and b :: "int ⇒ bool" where
"a i = i - 1" |
"b i = (i = 1 ⟶ a i = 0)"
Hence, the function definitions themselves are definitely terminating, because there is no call to b
in a
(the Isabelle backend will – by default – not include the pre- and postconditions in the function definition). If the functions are annotated with @isabelle.fullBody
, then Isabelle sees the call to b
and correctly refuses to define the function. I'm unsure what to make of all this.
In essence, the question is: How do you deal with pre- and postconditions introducing complications in the call graph?
By the way, the Isabelle backend is supposed to always give sound results even without --termination
switched on. So there is definitely an issue here somewhere.
I think that @BLepers is not specifically talking about the Isabelle back end. Termination checker should be on by default and turning it off explicitly should be allowed but known to cause soundness issues if the functions are not terminating. @BLepers , let us know what termination checker reports and if it is too weak in some cases.
The termination checker says that a does not finish:
[ Info ] ║ apply ✓ Terminates (Non-recursive)
[ Info ] ║ apply ✓ Terminates (Non-recursive)
[ Info ] ║ a ✗ Non-terminating for call: a(0)
[ Info ] ║ b ✗ Calls non-terminating functions a
So I guess it makes sense that the verification phase is not sound for that example (even though it should probably fail by default as Viktor said because errors like that can easily be made).
What I find a bit confusing in that example is that a() never "really" calls b(). I.e., if you run the applicative code without the "ensuring", there is no infinite loop. (Basically I just tried to separate a bit the proof logic - b() - and the applicative code - a()).
The guarantee Leon is giving you when it says "valid" is that if you run the code, every time you encounter a contract, it won't evaluate to false
. In this case, Leon is telling you the contract of a() (namely b()) will never evaluate to false
(which is true since it won't terminate). However, you didn't guarantee that the contract would evaluate to true
, which is typically what you would want when you say something is verified.
Note that with termination (and a few other checks that guarantee the program won't crash), never evaluating to false
is equivalent to always evaluating to true
.