dg
llvm-to-source: possible to include function declarations in the slice?
Hi, thanks for this nice set of tools. I have a question regarding the function declarations in the llvm-to-source
output. It seems they are not included in the output. I was wondering if it's possible to generate compilable slices.
For example,
#include <assert.h>
#include <stdio.h>

int foo(int i) {
    return i * 2;
}


long int fact(int x)
{
    long int r = x;
    int ba = foo(x);
    while (--x >=2)
        r *= x;

    return r;
}

int main(void)
{
    int a, b, c = 7;

    while (scanf("%d", &a) > 0) {
        assert(a > 0);
        long int r = fact(a);
        printf("fact: %lu\n", r);
    }

    return 0;
}
The output of llvm-to-source is like the following:
10: {
11: long int r = x;
13: while (--x >=2)
14: r *= x;
16: return r;
17: }
20: {
23: while (scanf("%d", &a) > 0) {
24: assert(a > 0);
25: long int r = fact(a);
27: }
30: }
Ideally, I would like to run other analyses (not necessarily program analyses) on the slice. Because the function names are not in the slice, the flows are broken at the call sites. So, I was wondering (a) whether it's difficult to generate a compilable slice and (b) where I should look if I want to try something hackish. Thanks!
Hi,
(a) it's difficult to generate a compilable slice
Yes, it is quite difficult to generate a compilable slice in C, as far as I know. llvm-to-source is just a toy program for debugging and is not meant to produce syntactically correct programs. It just picks the lines from the original source code that can be mapped to some instruction in the sliced LLVM bitcode. However, adding function names and prototypes to the output of llvm-to-source should be rather easy.
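To make the line-picking idea concrete, here is a minimal sketch in C. It is not the code of llvm-to-source: the helper name emit_kept_lines and the keep[] flag array are assumptions made for illustration, standing in for whatever mapping from surviving instructions to source lines the real tool derives from debug information.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of dg): copy every source line whose flag in
 * keep[] is non-zero into `out`, formatted as "N: text\n" with a 1-based line
 * number. Returns the number of lines emitted.
 */
size_t emit_kept_lines(const char *const *lines, size_t nlines,
                       const unsigned char *keep,
                       char *out, size_t outsz)
{
    assert(out != NULL && outsz > 0);

    size_t used = 0;
    size_t emitted = 0;
    out[0] = '\0';

    for (size_t i = 0; i < nlines; ++i) {
        if (!keep[i])
            continue;               /* line has no surviving instruction */

        int n = snprintf(out + used, outsz - used, "%zu: %s\n",
                         i + 1, lines[i]);
        if (n < 0 || (size_t)n >= outsz - used) {
            out[used] = '\0';       /* drop the partial line and stop */
            break;
        }
        used += (size_t)n;
        ++emitted;
    }
    return emitted;
}
```

Under this model, adding function names would amount to also flagging the line that holds each kept function's header, which is presumably why it is described as the easy part. How the real tool would find that line is not covered here.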
(b) where I should look if I want to try something hackish.
Well, hard to say. In this project we map LLVM to C only in llvm-to-source, which is a short piece of code and a reasonable place to start. But if you just want to run other analyses on the sliced code and do not care about readability, you can use one of the LLVM-to-C decompilers. I know of three projects at the moment. One is our own https://github.com/staticafi/llvm2c, another is a revived version of LLVM's C backend, https://github.com/JuliaComputing/llvm-cbe, and the last one, not yet as mature, is https://github.com/trailofbits/rellic.
Thanks for the response! It's good to know. I have a follow-up question about the scenario with multiple source files. Suppose I generated the bitcode file from multiple C files using llvm-link. Could we slice that bitcode? If so, how should I specify the criterion? For example, can I extend the current line:variable criterion to include the file name? Thanks again.
For example, for the current criteria like line:variable, can I extend it and include the file name?
No, at the moment you cannot use a file name (you can provide a patch though ;). What you can do is insert a call to a dummy function before the line of interest and then slice with respect to this call.
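A minimal sketch of that workaround, reusing fact from the example above. The marker name slice_marker and the marker_hits counter are assumptions made for illustration; any function name that is unique across the linked files should serve the same purpose. The marker takes the variable of interest as an argument so that the call depends on it.

```c
#include <assert.h>

/* Hypothetical marker: it only provides a uniquely named call site.
   The counter gives it an observable effect for testing. */
int marker_hits = 0;

void slice_marker(long value)
{
    (void)value;    /* the value is passed only to create a dependence */
    ++marker_hits;
}

long fact(int x)
{
    assert(x > 0);

    long r = x;
    while (--x >= 2)
        r *= x;

    slice_marker(r);    /* inserted just before the line of interest */
    return r;
}
```

The marker's name would then be given as the slicing criterion in place of line:variable, so the file in which the call sits no longer needs to be named.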