
llvm-to-source: possible to include function declarations in the slice?

Open zyh1121 opened this issue 5 years ago • 3 comments

Hi, thanks for this nice set of tools. I have a question regarding the function declarations in the llvm-to-source output. It seems they are not included in the output. I was wondering if it's possible to generate compilable slices.

For example,

#include <assert.h>
#include <stdio.h>

int foo(int i) {
        return i * 2;
}


long int fact(int x)
{
        long int r = x;
        int ba = foo(x);
        while (--x >=2)
                r *= x;

        return r;
}

int main(void)
{
        int a, b, c = 7;

        while (scanf("%d", &a) > 0) {
                assert(a > 0);
                long int r = fact(a);
                printf("fact: %lu\n", r);
        }

        return 0;
}

The output of llvm-to-source looks like this:

10: {
11:     long int r = x;
13:     while (--x >=2)
14:             r *= x;
16:     return r;
17: }
20: {
23:     while (scanf("%d", &a) > 0) {
24:             assert(a > 0);
25:             long int r = fact(a);
27:     }
30: }

Ideally, I would like to run other analyses (not necessarily program analyses) on the slice. Because the function names are not in the slice, the flows are broken at the call sites. So I was wondering (a) whether it's difficult to generate a compilable slice and (b) where I should look if I want to try something hackish. Thanks!

zyh1121, Jun 05 '19 18:06

Hi,

(a) whether it's difficult to generate a compilable slice

Yes, as far as I know it is quite difficult to generate a compilable slice in C. llvm-to-source is just a toy program for debugging and is not meant to produce syntactically correct programs. It just picks the lines of the original source code that can be mapped to some instruction in the sliced LLVM bitcode. However, adding function names and prototypes to the output of llvm-to-source should be rather easy.
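To illustrate the line-picking step described above, here is a minimal sketch in C. It is not the actual llvm-to-source code: the function name `emit_kept_lines` is hypothetical, and the sketch assumes the set of surviving line numbers has already been collected (for instance from the debug locations of the instructions left in the slice) and sorted in ascending order.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of dg.
 * Copies every line of `src` whose 1-based number appears in `kept`
 * (assumed sorted ascending, no duplicates) into `out`, prefixed with
 * "N: " in the style of the output shown in this thread.
 * Returns the number of lines written. */
int emit_kept_lines(const char *src, const int *kept, int nkept,
                    char *out, size_t outsz)
{
    int lineno = 1, k = 0, emitted = 0;
    size_t used = 0;

    if (outsz == 0)
        return 0;
    out[0] = '\0';

    while (*src && k < nkept) {
        const char *eol = strchr(src, '\n');
        size_t len = eol ? (size_t)(eol - src) : strlen(src);

        if (lineno == kept[k]) {
            int n = snprintf(out + used, outsz - used, "%d: %.*s\n",
                             lineno, (int)len, src);
            if (n < 0 || (size_t)n >= outsz - used) {
                out[used] = '\0';   /* buffer full: drop the partial line */
                break;
            }
            used += (size_t)n;
            emitted++;
            k++;
        }
        src += len + (eol ? 1 : 0); /* advance to the next source line */
        lineno++;
    }
    return emitted;
}
```

Because only whole lines are copied, a function header such as `long int fact(int x)` disappears whenever no remaining instruction maps to its line, which matches the output in the question.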

(b) where I should look if I want to try something hackish.

Well, hard to say. In this project, LLVM is mapped back to C only in llvm-to-source, which is a short piece of code and a reasonable place to start. But if you just want to run other analyses on the sliced code and you do not care about readability, you can use one of the LLVM-to-C decompilers. I know of three such projects at the moment. One is our https://github.com/staticafi/llvm2c, another is a revived version of LLVM's C backend, https://github.com/JuliaComputing/llvm-cbe, and the last one, not yet as mature, is https://github.com/trailofbits/rellic.

mchalupa, Jun 06 '19 07:06

Thanks for the response! It's good to know. I have a follow-up question about the scenario with multiple source files. Suppose I generate the bitcode file from multiple C files using llvm-link. Can the linked bitcode be sliced? If so, how should I specify the criteria? For example, can I extend the current line:variable criteria to include the file name? Thanks again.

zyh1121, Jun 06 '19 17:06

For example, for the current criteria like line:variable, can I extend it and include the file name?

No, at the moment you cannot use a file name (you can provide a patch though ;). What you can do is insert a call to a dummy function before the line of interest and then slice w.r.t. this call.
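A minimal sketch of that workaround, applied to the `fact` function from the question. The marker name `slice_marker` and the renamed `fact_marked` are hypothetical, and passing `r` as an argument is an assumption about how to tie the criterion to a particular variable; the thread itself only says to insert a call.

```c
/* Hypothetical marker function; the name and signature are assumptions.
 * It has an empty body here so the example links and runs. In a real
 * setup it may be preferable to only declare it, so the compiler cannot
 * inline or remove the call before slicing. */
void slice_marker(long value)
{
    (void)value;
}

long fact_marked(int x)
{
    long r = x;

    while (--x >= 2)
        r *= x;

    slice_marker(r);    /* slicing criterion: this call site */
    return r;
}
```

The slicer is then told to slice with respect to calls to `slice_marker` instead of a line:variable pair. Since the function name is unique across all linked files, no file name is needed in the criterion.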

mchalupa, Jun 07 '19 06:06