CompCert icon indicating copy to clipboard operation
CompCert copied to clipboard

Any options available for function & data purging?

Open xxuejie opened this issue 1 year ago • 3 comments

Hi,

I noticed that CompCert does not yet support -ffunction-sections and -fdata-sections, which are typically used to purge unused functions / data in gcc / clang: https://gcc.gnu.org/onlinedocs/gnat_ugn/Compilation-options.html

So one question I have here, is there any option in CompCert where we can achieve similar behavior? I know one alternative is that we can manually eliminate functions that will not be called but for bigger libraries, this might be a large amount of work. So I was curious if there is a way we can achieve this via tweaking compiler flags.

Many thanks!

xxuejie avatar Aug 04 '23 00:08 xxuejie

There's a way to eliminate unreferenced functions at link-time, but it's not very convenient. In the C source, you can annotate function definitions with the section attribute, to place them explicitly in sub-sections of the .text section. For example:

__attribute__((section(".text.f"))) int f(int x)
{
  return x + 1;
}

__attribute__((section(".text.g"))) int g(int x)
{
  return x - 1;
}

int main()
{
  return f(3);
}

Then you can link with ccomp -Wl,--gc-sections and the GNU linker will remove the text sub-sections that are never referenced. In the example above, the .text.g section is removed, since g is not used.

It would definitely be more convenient to have compile-time flags -ffunction-sections and -fdata-sections that automatically put definitions in sub-sections of .text or .data, like GCC does. I don't know how hard it would be to add this to CompCert.

xavierleroy avatar Aug 04 '23 08:08 xavierleroy

Thanks for your reply! I do agree manually annotating each function with the attribute here works. However it is still editing the code, another path I can employ, is simply eliminating those code that is not called :P In our case, it is not so hard to locate all the functions that can be removed.

That being said, I was still hoping we could automate this process somehow, since we are still talking about changing a non-trivial third party library being used, it might not be the best idea to always have to maintain a fork where we either 1) add annotations to each function; or 2) delete a significant portion of all code. Perhaps it might be possible to build something like a preprocessor that automatically annotate each function? Not sure but I will look into this path.

And I totally understand that it might or might not be a non-trivial task to introduce those flags in CompCert.

xxuejie avatar Aug 04 '23 22:08 xxuejie

After some time digging into the internals of CompCert(well, my original plan is to write a pre-processor using libclang that adds those attributes automatically, but in the end decided that hacking CompCert is actually easier :P), I've managed to put together a patch that introduces -ffunction-sections and -fdata-sections to CompCert. This is still a quick hack that is missing many parts:

  • It only changes the backend for RISC-V, other targets would fail compilation now
  • I'm intentionally skipping many enum variants in data section handling for simplicity of the patch
  • There are other ways to do the same job, for example, one might want to build the correct data section in decl_atom from C2C module, instead of doing pattern matching only at assembly printing time. But changing code at parsing phase would require more knowledge into CompCert than I currently possess

Still, I do want to check here: is this a path that makes sense to proceed? Or do you have other concerns in mind?

xxuejie avatar Aug 14 '23 06:08 xxuejie