
`rustc_wrapper` is invoked twice during dynamic instrumentation

Open oinoom opened this issue 3 years ago • 5 comments

This causes the runtime to get finalized twice. The first finalization happens during the collection of instrumentation points, which drains the built-up set of `MirLoc`s prematurely. Building the pointer derivation graph then crashes with an out-of-bounds index access when it tries to process the generated lighttpd events.

Reproducing this issue requires #582, which is needed to get past #570.

oinoom avatar Aug 05 '22 20:08 oinoom

I was only able to reproduce this once after running it a bunch of times. After an `rm -rf instrument.target`, it works again, and I haven't been able to reproduce it since.

kkysen avatar Aug 09 '22 22:08 kkysen

Now I'm able to reproduce it. I think I just wasn't waiting long enough before.

The error, I think, occurs because lighttpd builds two crates, `rust` (a staticlib) and `server` (a bin). These are built in separate `c2rust-instrument-as-rustc-wrapper` invocations, and thus the metadata file is written twice, with the second write overwriting the first. We need to combine these writes somehow instead of overwriting them, keeping in mind that cargo may invoke the rustc wrapper in parallel. There are two ways to do this that I see:

  1. For each rustc wrapper call, open the metadata file in append mode, serialize to a `Vec<u8>`, and then do a single `write` (not `write_all`). If the `write` doesn't write all of its data, panic, as only single writes are atomic. This should be fine unless a signal interrupts the write (or are there other cases where the full write wouldn't go through?). This will leave consecutive `Metadata`s in the file, so when reading, we need to keep deserializing `Metadata`s until we reach the end of the file.

  2. For each rustc wrapper call, write the `Metadata` to a per-crate file like `${crate_name}.${metadata}` or `${metadata}/${crate_name}`. Then in the cargo wrapper, once cargo and the rustc wrappers have run, read all of those files in, merge them in memory, and then write the merged `Metadata` to the original `$metadata` file.
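
To make option 2 concrete, here is a rough sketch of the per-crate write and the merge step. Everything in it is an assumption made for illustration: the `Metadata` struct is a toy stand-in (a map from an ID to a location string) with a made-up line-based encoding, and `write_crate_metadata` and `merge_crate_metadata` are hypothetical names, not existing c2rust functions. It also assumes the per-crate files live in their own directory, separate from the merged output file.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::Path;

/// Toy stand-in for the real `Metadata` type (assumption): ID -> location string.
#[derive(Debug, Default, PartialEq)]
pub struct Metadata {
    pub locs: BTreeMap<u64, String>,
}

impl Metadata {
    /// Made-up encoding for this sketch: one `id<TAB>loc` pair per line.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = String::new();
        for (id, loc) in &self.locs {
            out.push_str(&format!("{}\t{}\n", id, loc));
        }
        out.into_bytes()
    }

    pub fn from_bytes(bytes: &[u8]) -> io::Result<Self> {
        let invalid = |msg: String| io::Error::new(io::ErrorKind::InvalidData, msg);
        let text = std::str::from_utf8(bytes).map_err(|e| invalid(e.to_string()))?;
        let mut locs = BTreeMap::new();
        for line in text.lines() {
            let (id, loc) = line
                .split_once('\t')
                .ok_or_else(|| invalid("missing tab separator".to_owned()))?;
            let id: u64 = id.parse().map_err(|_| invalid(format!("bad id: {}", id)))?;
            locs.insert(id, loc.to_owned());
        }
        Ok(Self { locs })
    }
}

/// rustc wrapper side: each invocation writes only its own crate's file,
/// so parallel invocations for different crates never touch the same file.
pub fn write_crate_metadata(dir: &Path, crate_name: &str, metadata: &Metadata) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    fs::write(dir.join(crate_name), metadata.to_bytes())
}

/// cargo wrapper side: runs after cargo has finished, reads every per-crate
/// file in `dir`, merges them in memory, and writes the result to `merged_path`.
/// `merged_path` must not be inside `dir`, or a rerun would read it as a part.
pub fn merge_crate_metadata(dir: &Path, merged_path: &Path) -> io::Result<Metadata> {
    let mut merged = Metadata::default();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_file() {
            let part = Metadata::from_bytes(&fs::read(&path)?)?;
            merged.locs.extend(part.locs);
        }
    }
    fs::write(merged_path, merged.to_bytes())?;
    Ok(merged)
}
```

The merge here assumes the per-crate ID spaces don't collide; if they can, the merge step would need to remap or namespace the IDs rather than simply extending the map.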

Which one do people think is preferable?

kkysen avatar Aug 10 '22 02:08 kkysen

Option 1 seems brittle. I wouldn't be surprised if some systems set arbitrary limits on how much data can be written in a single write call. Linux has such a limit, though it's high enough not to be a problem here (most likely):

> On Linux, `write()` (and similar system calls) will transfer at most 0x7ffff000 (2,147,479,552) bytes, returning the number of bytes actually transferred. (This is true on both 32-bit and 64-bit systems.)
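
For reference, this is roughly what option 1 would look like, which makes the short-write failure mode visible. It is only a sketch under assumptions: `append_record` and `read_records` are hypothetical names, and the 8-byte little-endian length prefix is a made-up framing so that consecutive records can be told apart when reading. Whether a single append-mode `write` is actually atomic with respect to other writers depends on the platform and filesystem; the sketch does not establish that.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::path::Path;

/// Append one record using a single `write` call (not `write_all`).
/// Panics on a short write, since a retry would no longer be a single write.
pub fn append_record(path: &Path, payload: &[u8]) -> io::Result<()> {
    // Hypothetical framing: 8-byte little-endian length, then the payload.
    let mut buf = Vec::with_capacity(8 + payload.len());
    buf.extend_from_slice(&(payload.len() as u64).to_le_bytes());
    buf.extend_from_slice(payload);

    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    let written = file.write(&buf)?;
    assert_eq!(written, buf.len(), "short write: record is incomplete on disk");
    Ok(())
}

/// Read consecutive records until the end of the file.
pub fn read_records(path: &Path) -> io::Result<Vec<Vec<u8>>> {
    let mut bytes = Vec::new();
    File::open(path)?.read_to_end(&mut bytes)?;

    let truncated = || io::Error::new(io::ErrorKind::InvalidData, "truncated record");
    let mut records = Vec::new();
    let mut rest = bytes.as_slice();
    while !rest.is_empty() {
        if rest.len() < 8 {
            return Err(truncated());
        }
        let mut len_bytes = [0u8; 8];
        len_bytes.copy_from_slice(&rest[..8]);
        let len = u64::from_le_bytes(len_bytes) as usize;
        let tail = &rest[8..];
        if tail.len() < len {
            return Err(truncated());
        }
        records.push(tail[..len].to_vec());
        rest = &tail[len..];
    }
    Ok(records)
}
```

Note that a panic after a short write leaves a partial record in the file, so any reader would also have to cope with a truncated entry in the middle of the file, not just at the end.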

spernsteiner avatar Aug 10 '22 18:08 spernsteiner

Do you know if there's a way to write all bytes or 0 bytes? I can split up writes into smaller chunks.

kkysen avatar Aug 10 '22 18:08 kkysen

I don't think there's a standard API for that.

spernsteiner avatar Aug 10 '22 19:08 spernsteiner