
Benchmarks depending on undefined functions in DeviceDriversLinux64

Open mchalupa opened this issue 8 years ago • 27 comments

Hi,

in those benchmarks there are calls to functions (or, in one case, a dereference of an external variable) that are not defined, and then a decision is made based on the return value (I linked the exact lines):

https://github.com/sosy-lab/sv-benchmarks/blob/master/c/ldv-linux-4.2-rc1/linux-4.2-rc1.tar.xz-32_7a-drivers--net--usb--cdc_subset.ko-entry_point_true-unreach-call.cil.out.c#L6096

https://github.com/sosy-lab/sv-benchmarks/blob/master/c/ldv-commit-tester/m0_true-unreach-call_drivers-usb-gadget-g_printer-ko--106_1a--2b9ec6c.c#L3827

https://github.com/sosy-lab/sv-benchmarks/blob/master/c/ldv-linux-3.4-simple/32_1_cilled_true-unreach-call_ok_nondet_linux-3.4-32_1-drivers--staging--speakup--speakup_ltlk.ko-ldv_main0_sequence_infinite_withcheck_stateful.cil.out.c#L588

Could somebody from the LDV team check it? I have more 'candidates' on my list that I haven't checked yet, so maybe I'll add some more later.

Thanks, Marek

mchalupa avatar Nov 21 '16 19:11 mchalupa

It is my understanding that extern functions should just be treated as returning a nondeterministic value. If their precise semantics is not important for the safety of the benchmark, we should just leave them as they are IMHO.

PhilippWendler avatar Nov 22 '16 05:11 PhilippWendler

If their precise semantics is not important for the safety of the benchmark

That is the problem - when we treat the undefined functions as returning a non-deterministic value (which is what we do), we get different results than when we just ignore them. That is why I investigated these benchmarks and filed this issue.

mchalupa avatar Nov 22 '16 20:11 mchalupa

What do you mean by "just ignore them"? What is the difference to assuming a nondet return value?

PhilippWendler avatar Nov 23 '16 10:11 PhilippWendler

Ignore was not the right word - we substitute some concrete value (like 0). Since the benchmark should not depend on the return value of an undefined function, the substituted value should not make any difference, but it turns out that in some cases it does.

mchalupa avatar Nov 23 '16 11:11 mchalupa

the benchmark should not depend on the returned value of undefined function

that is not true for LDV benchmarks

mutilin avatar Nov 23 '16 12:11 mutilin

@mutilin Why can't we use __VERIFIER_nondet* in place of those functions? We have been fighting with various undefined behaviours in numerous benchmarks, so we should really not have any such undefined functions left.

tautschnig avatar Nov 26 '16 20:11 tautschnig

Replacing lots of functions in benchmarks would be a considerable effort when preparing benchmarks. @mutilin and his colleagues would like to have verifiers that they can apply to Linux kernel source as easily and with as few modifications as possible (cf. Evgeny's mails on the list). And I agree that a verifier that cannot handle undefined functions in a graceful manner does not seem really helpful for verification projects in the "real world".

So why can't a verifier simply accept these undefined functions as returning nondet? IMHO all the discussions about undefined behavior are irrelevant for this question.

PhilippWendler avatar Nov 26 '16 20:11 PhilippWendler

I agree that verifiers should be able to handle such functions, but we should state it somewhere (or is it stated somewhere and I missed it?). However, in some benchmarks there are also things like:

void *kmalloc(...)
{
     return __kmalloc(...);
}

where __kmalloc() is undefined and the code uses kmalloc for memory allocation. I suppose that in these cases the verifier should assume that kmalloc allocates memory, and not that kmalloc can return anything. Or should we suppose that __kmalloc() returns a nondeterministic pointer? (@mutilin ?)

mchalupa avatar Nov 26 '16 20:11 mchalupa

It might be good to at least capture the assumption "invoking a function where no definition is provided yields a non-deterministic value, but does not cause any other side effects." Furthermore I'm worried about all cases where a pointer is returned, or a struct containing pointers, and those pointers are later dereferenced in some form. The reason I'm wondering is that witness checkers might have a very different idea on what is permitted and what isn't.

tautschnig avatar Nov 27 '16 00:11 tautschnig

Documenting this is always good.

For witness validation, I would not expect many problems. These are unreach-call benchmarks, so verifiers should ignore all memsafety problems and just continue (and for example, CPAchecker does so).

Although now that I am thinking about it, there is a general problem with witness validation in CPAchecker and nondet pointers, but this could also occur with the official __VERIFIER_nondet_pointer functions. Nondet pointers in CPAchecker cannot point to stack variables that never have their address taken.

PhilippWendler avatar Nov 27 '16 08:11 PhilippWendler

For the definition of external functions we are thinking about introducing a new function void* ram_alloc(void) which will have the semantics of allocating so-called Recursively Allocated Memory (RAM). Please have a look at the proposal and give feedback comments at https://docs.google.com/document/d/1T2j3asG6wl1FM28JXGTmrri8nbnDMdPUVtCB0ZMBK4Q/edit#

mutilin avatar Nov 28 '16 15:11 mutilin

https://github.com/sosy-lab/sv-benchmarks/issues/267#issuecomment-263425130

This might actually be the way to fully address #230: if there are benchmarks that dereference invalid pointers because an undefined function is being used, then a fix to those benchmarks might be required. Otherwise it's likely good enough to keep them as they are.

I can say in advance that LDV benchmarks are memory unsafe

mutilin avatar Nov 29 '16 12:11 mutilin

We can teach our C backend to replace all calls to undefined functions whose base return types are simple, e.g. integers, with calls to the well-known SV-COMP functions. So this will be done automatically. Other users should do this themselves. Existing benchmarks should be fixed. Perhaps you will extend the SV-COMP rules to avoid this, but that should be decided by the whole community rather than by a few involved people.

Besides, we constantly add models for all undefined functions that actually allocate some memory, so this won't be an issue in new benchmarks, while existing benchmarks should be fixed.

The only question is about undefined functions whose base return types aren't simple, e.g. pointers. A clear semantics for the corresponding special SV-COMP functions should be established. When this is done, we will happily prepare all new benchmarks appropriately and automatically, while all existing benchmarks should be fixed manually.

eunovm avatar Dec 19 '16 08:12 eunovm

Assuming undefined functions return a nondeterministic value of the right type and don't modify anything is one reasonable interpretation, but not the only one and not my preferred solution. In my view it is too unsafe --- what if the unknown function really does modify some global variable for example --- then your verifier might tell you your program is safe when it isn't. You could argue that an equally "graceful" approach would be to assume that the function modifies everything if no further information is provided. I would prefer that the user make the assumption explicit, whatever it is. (Or there could be a command line option to state a blanket assumption --- but it wouldn't be the default.)

Some tools would use a contract on the function declaration to express this assumption (\assigns nothing;); others might require a simple stub, which is what I thought Dirk said should be done for SV-COMP. Maybe there should be an exception for some category or another if that is not practical.

In any case, the policy should just be clearly stated somewhere so we don't have to argue about each benchmark on a case by case basis.

sfsiegel avatar Dec 20 '16 17:12 sfsiegel

I agree with you that it would be better to make this explicit, both in the SV-COMP rules and by providing some option to verification tools through BenchExec wrappers. The situation is the same with the recently discussed inline assembly code.

Some verification tools do assume this by default, but just print warnings in their logs. Logs aren't a good place for this, since normal users don't usually examine them, and they are likewise hard for tools to process. Perhaps it would be better to reflect such assumptions explicitly and formally in witnesses, since witnesses serve as a kind of guarantee that verification was correct and sound. But some semantics, e.g. the interpretation of undefined functions, isn't clear to everybody, of course, so it isn't absolutely clear what "correct" and "sound" mean. Indeed, verification tools always state that verification is correct and sound only if some assumptions are satisfied. These assumptions may include a particular version of standards/compilers, a particular year of the SV-COMP rules, and maybe some more options, like inline assembler was ignored, undefined functions were assumed so and so, etc.

eunovm avatar Dec 23 '16 12:12 eunovm

Another corner case is calls through unknown function pointers. Although this is just one special case of the common interpretation of unknown pointers, it usually matters more than operations on other types of pointers. In particular, some verification tools provide a variety of options and related heuristics for how to deal with them.

Also, I think that it would be extremely useful both for users and developers if verification tools specified explicitly and fairly all the assumptions they made, at the corresponding places in the source code. For instance, if there is an edge in a witness that corresponds to a call through an unknown function pointer, there should also be some explanation of why the tool, say, didn't traverse it. Nobody expects miracles; you just need to make it clearer that there aren't any.

I perfectly understand those verification tool developers who prefer to ask to forbid undefined functions, to remove inline assembly code, etc. This fits the true verification world very well, since you won't even need to reason about them. But it will also close the doors for users with real-world code, at least until they are able to use supercomputers for verification.

eunovm avatar Dec 23 '16 12:12 eunovm

I don't see how it "closes the door" to ask a real-world developer to write a simple 1-line stub of a function. It's not that hard. But whatever --- there is a reasonable case for any decision on this issue, there just needs to be a decision --- a decision for SV-COMP, not just for this example.

sfsiegel avatar Dec 23 '16 13:12 sfsiegel

What is the summary of the long discussion? The interpretation of the previous years is no side effects by definition and a non-det return value. Philipp wants to interpret those functions as having no side effects and return a non-det value. Michael wants to replace it by __VERIFIER_nondet_int or similar. Steve wants stub functions.

Is anybody against Steve's proposal? Is it easy to write a script that inserts stubs for all external functions?

dbeyer avatar Dec 23 '16 14:12 dbeyer

It won't be easy because there is an unknown list of thousands of such functions when we are preparing verification tasks on the fly. So they should be processed automatically somehow.

Now I think that it would be much better not to provide any stubs or replacements with __VERIFIER_nondet*(), because actually this just helps verification tools but doesn't serve the original goals of the verifiers (the people) who want to know under what circumstances verification tools prove that code is correct or find bugs in it.

Let's assume that there is a call to an undefined function int func(void). If we automatically replace its call, or model it, by means of __VERIFIER_nondet_int(), then we will lose the information that this was originally an undefined user function! Then we will be unable to say to the user: hey, we interpreted your undefined function in such a manner, so don't be surprised that we didn't find the known bug that you expected to find.

eunovm avatar Dec 23 '16 14:12 eunovm

My last reply was for Steve's last comment. It is hard for verifiers (people) to provide stubs for all unknown functions manually (and this doesn't make much sense). This can be done automatically quite easily, both for new and existing verification tasks. But I don't think that this is a perfect solution from the verifiers' point of view, since they need real evidence and to know which assumptions were made. There are nevertheless several ways to provide those assumptions. For me it would be best to incorporate them into witnesses.

eunovm avatar Dec 23 '16 14:12 eunovm

Another option is to provide the list of assumptions globally, but this won't cover some tricky cases, e.g. when there is some particular context-dependent reasoning about a call through an unknown function pointer (so verifiers will need to understand that themselves).

eunovm avatar Dec 23 '16 14:12 eunovm

In a real-world use case, the person who wrote the stub would be the same person who is running the verification tool. They don't need to be told that a missing function was stubbed out, because they were the one who stubbed it out. They are responsible for making their assumptions explicit. It is the user's job to explicitly state their assumptions --- not the verification tool's job to make these assumptions --- because the user is the only one who knows what assumptions are appropriate. The verification tool has no way of knowing whether it is reasonable to assume that some undefined function modifies nothing.

When I use Frama-C, for example, I have to explicitly insert contracts for all functions that are called, even if those contracts are as simple as "\assigns nothing" (the equivalent to what we are talking about).

Well, that's my opinion but I will go by whatever decision is made.

sfsiegel avatar Dec 23 '16 14:12 sfsiegel

I read above that there are "thousands of such functions". Wow --- that is fascinating. Where do these functions come from? Has someone actually gone through all of them and checked that they don't modify the state? If not, what use is the verification process? It could say some property holds when it doesn't because this assumption was wrong.

sfsiegel avatar Dec 23 '16 14:12 sfsiegel

As for the calls through unknown function pointers...

Is it correct that the only way such a function pointer could exist is because it was returned by __VERIFIER_nondet_something()?

The only good solution I can think of is to add another special kind of assumption statement that means "assume the function pointed to modifies nothing". Or, add another function like __VERIFIER_pure_function_ptr() which returns some new function pointer that points to a function assumed to modify nothing.

However, this would be a change for the future and for the immediate situation I can't think of a better solution than to assume all such function pointers refer to functions that modify nothing.

sfsiegel avatar Dec 23 '16 15:12 sfsiegel

What is the summary of the long discussion? The interpretation of the previous years is no side effects by definition and a non-det return value. Philipp wants to interpret those functions as having no side effects and return a non-det value. Michael wants to replace it by __VERIFIER_nondet_int or similar. Steve wants stub functions.

Assuming that an undefined function returns a nondet value is not enough, because we cannot assume that such a function returns a nondet pointer. We will still have undefined behaviour when it is dereferenced. See our proposal for this problem: https://docs.google.com/document/d/1T2j3asG6wl1FM28JXGTmrri8nbnDMdPUVtCB0ZMBK4Q/edit#

mutilin avatar Dec 23 '16 22:12 mutilin

In a real-world use case, the person who wrote the stub would be the same person who is running the verification tool. They don't need to be told that a missing function was stubbed out, because they were the one who stubbed it out. They are responsible for making their assumptions explicit. It is the user's job to explicitly state their assumptions --- not the verification tool's job to make these assumptions --- because the user is the only one who knows what assumptions are appropriate. The verification tool has no way of knowing whether it is reasonable to assume that some undefined function modifies nothing.

When I use Frama-C, for example, I have to explicitly insert contracts for all functions that are called, even if those contracts are as simple as "\assigns nothing" (the equivalent to what we are talking about).

Well, that's my opinion but I will go by whatever decision is made.

As far as I remember, you aren't the first to refer to such examples, and Dirk answered that the difference of software verification is that it is fully automatic rather than interactive. We feel this difference every day, when we can get verification results even though many things are undefined. This surprises some people very much. Reasonable assumptions do make sense. Without them we would likely have to use other tools.

What is interesting is that we can add models step by step, e.g. to get rid of false alarms. So if one wants to get better results, one needs to do some more work. Above I argued that it would also be very useful to explicitly specify the assumptions made by verification tools. First, this will help one to understand faster which models are actually missing. Second, this will help to make clear that verification isn't complete and sound. For instance, calls through function pointers can be almost totally ignored. With that information, people wouldn't be surprised that, while doing nothing, they are able to completely verify their code. But they can still add models and reach complete and sound verification, at least at the level of their models.

I read above that there are "thousands of such functions". Wow --- that is fascinating. Where do these functions come from? Has someone actually gone through all of them and checked that they don't modify the state? If not, what use is the verification process? It could say some property holds when it doesn't because this assumption was wrong.

We are treating almost all Linux kernel modules one by one and leave almost all of their environment undefined. Unfortunately I don't have exact numbers right now, but by order of magnitude there are tens of thousands of functions defined by the kernel and by other modules that can be called by any of them. Moreover, this list changes from one version of the kernel to another. Of course, no module calls all available functions, but nevertheless almost all functions are called by some module. Some functions are called by thousands of modules, but most are called by just a few.

As for the models, we have several hundred of them in total. But they aren't even always used - for some particular rules there may be just a couple of modeled functions. Recently we started investigating adding memory allocation function models for all rules, so soon each rule will come with at least several dozen modeled functions. But thousands of functions will most likely never be modeled, or even investigated as you suggested.

After all, it is absolutely amazing that we have already been able to find hundreds of bugs accepted by the Linux kernel developers. In addition, the false alarm rate is just about 80% on average, while we were able to find more than 60% of known fixed bugs on the dedicated set (the latter means that in general many issues, maybe 80% or even more, remain uncovered when tools and models aren't specially tuned). I think that nobody can currently give exact answers why this is possible; perhaps the source code under verification is mature enough, perhaps most of the undefined stuff doesn't have much influence, and so on.

As for the calls through unknown function pointers...

Is it correct that the only way such a function pointer could exist is because it was returned by __VERIFIER_nondet_something()?

No, because function pointers can also come from undefined functions that return pointers or modify their parameters accordingly.

The only good solution I can think of is to add another special kind of assumption statement that means "assume the function pointed to modifies nothing". Or, add another function like __VERIFIER_pure_function_ptr() which returns some new function pointer that points to a function assumed to modify nothing.

However, this would be a change for the future and for the immediate situation I can't think of a better solution than to assume all such function pointers refer to functions that modify nothing.

I think that this should be discussed within the document linked by @mutilin. There are also other pointer issues that we need to agree on. Without a doubt this can't be done very quickly, but we would like to have a decision as soon as possible, since we want to set up our infrastructure appropriately and, in particular, provide the next SV-COMP with more good verification tasks.

eunovm avatar Dec 26 '16 08:12 eunovm

Stephen (@sfsiegel), please see my additions for function pointers 1.a, 1.b on page 1

mutilin avatar Dec 26 '16 10:12 mutilin