analyzer icon indicating copy to clipboard operation
analyzer copied to clipboard

`compinfo` keys not updated incrementally

Open sim642 opened this issue 2 years ago • 0 comments

UpdateCil updates IDs of statements (sid) and variables (vid), but not compinfo keys (ckey). I already wondered this in #598 but now here's concrete evidence that not updating them causes issues for incremental updating as well.

Reproduction

Analyze the following program incrementally (before vs after patching): https://github.com/goblint/bench/tree/b79d0997d2bf3444f6296301e0bfc7bdebc8788c/cve-bench/CVE-2019-19537. In order to see the problem, https://github.com/goblint/analyzer/commit/5cfb7e69fe9a4f0ceff1735580200e0e80d41213 needs to be reverted, because it simply hides the problematic accesses. Nevertheless, similar structs and unions could be used in actual user code, which the access analysis shouldn't ignore.

Before

On initial analysis, there's the following memory location with four accesses:

[Warning][Race] Memory location (struct usb_interface).dev(119532124,usb_interface).mutex(1002340388,device).wait_lock(655119850,mutex).__annonCompField18(448810155,spinlock).rlock(344401721,__anonunion____missing_field_name_256509364).raw_lock(706607406,raw_spinlock).__annonCompField17(637443086,arch_spinlock).tickets(201271018,__anonunion____missing_field_name_45265226).head(313990837,__raw_tickets)(119532124) (race with conf. 80):
  read with mhp:{tid=[], {usb_register_dev, usb_devnode}} (conf. 80) (file.c:68:2-68:32)
  read with mhp:{tid=[], {usb_register_dev}} (conf. 80) (file.c:200:2-200:114)
  write with mhp:{tid=[], {usb_register_dev, usb_devnode}} (conf. 80) (file.c:68:2-68:32)
  write with mhp:{tid=[], {usb_register_dev}} (conf. 80) (file.c:200:2-200:114)

Here the offset fields have debug printout of the ckey and cname of the compinfo they belong to.

After

After patching, the result differs when analyzed in server mode vs without it.

In server mode

Surprisingly, the number of safe and total memory locations goes up, because the previous four accesses now are split between two slightly different memory locations:

[Success][Race] Memory location (struct usb_interface).dev(119532124,usb_interface).mutex(1002340388,device).wait_lock(655119850,mutex).__annonCompField18(448810155,spinlock).rlock(344401721,__anonunion____missing_field_name_256509364).raw_lock(706607406,raw_spinlock).__annonCompField17(637443086,arch_spinlock).tickets(757927913,__anonunion____missing_field_name_45265226).head(313990837,__raw_tickets)(119532124) (safe):
  read with [mhp:{tid=[], {usb_register_dev}}, lock:{minor_rwsem}] (conf. 80) (file.c:201:2-201:114)
  write with [mhp:{tid=[], {usb_register_dev}}, lock:{minor_rwsem}] (conf. 80) (file.c:201:2-201:114)

[Warning][Race] Memory location (struct usb_interface).dev(119532124,usb_interface).mutex(1002340388,device).wait_lock(655119850,mutex).__annonCompField18(448810155,spinlock).rlock(344401721,__anonunion____missing_field_name_256509364).raw_lock(706607406,raw_spinlock).__annonCompField17(637443086,arch_spinlock).tickets(201271018,__anonunion____missing_field_name_45265226).head(313990837,__raw_tickets)(119532124) (race with conf. 80):
  read with mhp:{tid=[], {usb_register_dev, usb_devnode}} (conf. 80) (file.c:68:2-68:32)
  write with mhp:{tid=[], {usb_register_dev, usb_devnode}} (conf. 80) (file.c:68:2-68:32)

The difference in the locations is that the second-to-last field offset tickets belongs to different structs: for reused accesses it's with key 201271018 and for new accesses it's 757927913. This ckey difference in one offset component is enough to consider them completely separate memory locations.

This result is unsound because the old racing accesses also race with the new accesses that have the extra lock.

Without server mode

Outside of server mode, all four accesses remain together under one memory location, as expected.

Explanation

The problem is that compinfo keys are assigned from nextCompinfoKey in CIL. In server mode, both before and after programs are parsed during a single Goblint execution, therefore the counter doesn't reset in between. Without server mode, the before and after programs are parsed during separate Goblint execution, thus both start with the same counter value.

Even outside of server mode it only works by coincidence that the number and order of defined structs/unions does not change, such that they get identical keys. I haven't tested this but I suspect if that were to change, this could also cause issues outside of server mode.

So clearly UpdateCil also needs to fix up compinfo keys.

Additional notes

The problematic struct-union-emalgamation in Linux kernel headers is this:

typedef struct arch_spinlock {
	union {
		__ticketpair_t head_tail;
		struct __raw_tickets {
			__ticket_t head, tail;
		} tickets;
	};
} arch_spinlock_t;

In CIL:

  • The keys are assigned incrementally here: https://github.com/goblint/cil/blob/da5e1941db81643ef0d2133960cd28a7de311131/src/cil.ml#L1498-L1499.
  • This is also documented in the interface: https://github.com/goblint/cil/blob/da5e1941db81643ef0d2133960cd28a7de311131/src/cil.mli#L365-L369.
  • Contradictorily, the ckey field documentation in the implementation says they are based on a name hash: https://github.com/goblint/cil/blob/da5e1941db81643ef0d2133960cd28a7de311131/src/cil.ml#L353-L357.
  • Turns out that's what Mergecil does later: https://github.com/goblint/cil/blob/da5e1941db81643ef0d2133960cd28a7de311131/src/mergecil.ml#L1510.

This hashing explains why all the ckeys in the above output are so large. But it also poses issues for the incremental updating, because unlike sids and vids they aren't assigned sequentially (at least in the merged files that get compared).

And it also creates further confusion, because in both cases the corresponding cname is identical: __anonunion____missing_field_name_45265226. Nevertheless, somehow they end up with different keys. I'm not sure why, but might be because the anonymous union with a missing field name initially gets a different name and during merging the names are somehow unified but the keys are not?

sim642 avatar Apr 06 '22 09:04 sim642