p4c icon indicating copy to clipboard operation
p4c copied to clipboard

Link against the Boehm-Demers-Weiser Garbage Collector using FetchContent.

Open fruffy opened this issue 1 year ago • 10 comments

To have more control over linking and setup of the Boehm-Demers-Weiser garbage collector P4C uses we add it as a submodule. For compatibility users can still choose to link with an external dependency. Several other changes were necessary to make this work:

  • We build and link the static libraries of the garbage collection library. This way we do not run into missing library problems. To be able to do so we had to disable some definitions in gc_cpp.cpp.
  • We enable threads and fork support by default to ensure parallelism support.
  • We do not link bdwgc's include directories globally. Instead we use the Abseil approach and mandate that every target links to library that pulls in the BDWGC headers.
  • Clean up the top-level CMakelists a little and move all the module includes to the top.

fruffy avatar Mar 19 '23 23:03 fruffy

Hi @fruffy! What is the motivation to turn libGC into a submodule? Do we need something from a specific new version? Even if so, is submodule really a good solution? Unless we want to track upstream closely I think we should stick either to using installed libgc, or if we can't we can use cmake's fetch content. A submodule adds extra overhead for checkout (and both submodule and fetch content add build overhead, but I guess that is less of a concern).

There are concerns that this would impact downstream compilers which now use a fixed version of libGC in their build/release environments (especially if p4c was to track upstream libCG closely, as opposed to tracking releases).

vlstill avatar Mar 20 '23 08:03 vlstill

Hi @fruffy! What is the motivation to turn libGC into a submodule? Do we need something from a specific new version? Even if so, is submodule really a good solution? Unless we want to track upstream closely I think we should stick either to using installed libgc, or if we can't we can use cmake's fetch content. A submodule adds extra overhead for checkout (and both submodule and fetch content add build overhead, but I guess that is less of a concern).

There are concerns that this would impact downstream compilers which now use a fixed version of libGC in their build/release environments (especially if p4c was to track upstream libCG closely, as opposed to tracking releases).

libGC interacts badly with the current version of the Z3 solver (for example, https://github.com/p4lang/p4c/pull/3927). A more recent version of Z3 (4.12.0) does not work at all. A newer version of the garbage collector and the appropiate configuration appears to fix these problems. The configuration is not possible with the system-installed libgc.

The new version also makes the garbage collection more stable. For example, the crashes when running gdb disappears and Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS becomes less frequent. Increased performance is another nice side-effect. I have to verify this but I see a 30% improvement in long-running test-case generation time for P4Testgen.

It seems prudent to include something as essential as the garbage collection directly as a submodule. It allows us to more easily configure the collector and fix bugs. Thankfully, there are not many source files and it compiles very quickly.

There is still a lot more testing to do, of course. I am not planning to merge this any time soon, but it is good to have this PR as scratch pad to weed out issues with P4Testgen and Z3 etc.

fruffy avatar Mar 20 '23 12:03 fruffy

I see. I've noticed the Z3 problem with libGC but did not investigate it further. It is good that it can be fixed on libGC side.

The improvements in stability & speed would be very nice :-).

It seems prudent to include something as essential as the garbage collection directly as a submodule. It allows us to more easily configure the collector and fix bugs. Thankfully, there are not many source files and it compiles very quickly.

If we are dependent on a particular (minimal) version that is not widely available and on custom configuration than I agree this would be simpler or even required.

I still wonder if we can:

  • use a release tag/branch for our submodule
  • use fetch-content instead to submodule to avoid having the sources checked out in the repo (but still have a fixed version from git, just using cmake instead of submodules).

vlstill avatar Mar 20 '23 14:03 vlstill

I see. I've noticed the Z3 problem with libGC but did not investigate it further. It is good that it can be fixed on libGC side.

The improvements in stability & speed would be very nice :-).

It seems prudent to include something as essential as the garbage collection directly as a submodule. It allows us to more easily configure the collector and fix bugs. Thankfully, there are not many source files and it compiles very quickly.

If we are dependent on a particular (minimal) version that is not widely available and on custom configuration than I agree this would be simpler or even required.

I still wonder if we can:

* use a release tag/branch for our submodule

* use [fetch-content](https://cmake.org/cmake/help/latest/module/FetchContent.html) instead to submodule to avoid having the sources checked out in the repo (but still have a fixed version from git, just using cmake instead of submodules).

FetchContent works great, thanks for pointing me to this. It took a little tweaking but now I got it integrated with the CMake build. The one downside is that this will not work with bazel etc, but those build scripts do not use GC anyway...

The only thing left to fix now is the MacOS build.

fruffy avatar Mar 21 '23 01:03 fruffy

libGC interacts badly with the current version of the Z3 solver (for example, https://github.com/p4lang/p4c/pull/3927). A more recent version of Z3 (4.12.0) does not work at all.

Oh, that is a quite serious issue for the future. Is there a bug report for this problem on z3 issue tracker? Could we bisect problematic commits?

davidbolvansky avatar Mar 21 '23 15:03 davidbolvansky

libGC interacts badly with the current version of the Z3 solver (for example, #3927). A more recent version of Z3 (4.12.0) does not work at all.

Oh, that is a quite serious issue for the future. Is there a bug report for this problem on z3 issue tracker? Could we bisect problematic commits?

I have not done a deep dive why Z3 is causing these issues but it could be related to https://github.com/Z3Prover/z3/issues/6572. glibc 2.34 removes pthread which could have some effect. This might get fixed.

fruffy avatar Mar 21 '23 16:03 fruffy

The MacOS build fails with a bizarre error. Quite difficult to debug.

Edit: It looks like the problem was that libbmv2 was not linked statically. I do not quite understand why that causes issues.

fruffy avatar Mar 21 '23 16:03 fruffy

I see. I've noticed the Z3 problem with libGC but did not investigate it further. It is good that it can be fixed on libGC side.

As far as I can see, there are double frees there. So the problem might be with Z3. Just exposed / hidden by GC.

asl avatar Feb 24 '24 05:02 asl

I see. I've noticed the Z3 problem with libGC but did not investigate it further. It is good that it can be fixed on libGC side.

As far as I can see, there are double frees there. So the problem might be with Z3. Just exposed / hidden by GC.

I also had that issue with linking P4C against other frameworks, for example gRPC. Unclear what causes it but bumping the GC version does seem to fix it.

fruffy avatar Feb 25 '24 17:02 fruffy

I also had that issue with linking P4C against other frameworks, for example gRPC. Unclear what causes it but bumping the GC version does seem to fix it.

On the other hand, it might be result of GC freeing pointer too early :) I am seeing that we are not forwarding aligned allocation functions via GC, it might cause some problems. We do have this patch downstream (otherwise one cannot use libgc with LLVM with C++17 enabled). I will submit PR shortly

asl avatar Feb 25 '24 18:02 asl

Looks like this is still getting hung up on the fedora and old Ubuntu issues, but hopefully doesn't interoduce any more hard-to-debug dependencies.

https://github.com/p4lang/p4c/pull/4623 introduced the Ubuntu 18.04 failure. Not sure why... for some reason this version of boost pulls in hash_combine for the cpp_int.

The Fedora failure should be fixed in #4644.

fruffy avatar May 08 '24 00:05 fruffy

What is the reason to use submodule instead of FetchContent?

vlstill avatar May 09 '24 13:05 vlstill

What is the reason to use submodule instead of FetchContent?

I see that is just an out-of-date top of the PR, could you please update it?

vlstill avatar May 09 '24 13:05 vlstill

@asl Planning to merge this soon. Lmk if you have any concerns here.

fruffy avatar May 09 '24 15:05 fruffy