SVF
Fail to handle the function pointers
Hi, there.
Recently, while analyzing the libjpeg project, I found that SVF fails to resolve the function pointers in the given bc. I am curious about the reason.
[image]
To reproduce, I compiled libjpeg-turbo with
CFLAGS="-flto -fuse-ld=gold -Wl,-plugin-opt=save-temps"
Here is the bc file: cjpeg.0.0.preopt.bc.zip
After applying the patch mentioned in #280, I used this command to generate the ICFG:
wpa --dump-callgraph --dump-icfg --dump-inst --fspta cjpeg.0.0.preopt.bc
Here is the final ICFG: cjpeg205.dot.zip
Is there any way to make SVF handle function pointers better?
Thanks for maintaining such a great tool!
May I know what the results are with Andersen's analysis (-ander)?
It is the same for function pointers: IndEdgeSolved is still 0.
[image]
Would you be able to look at the indirect call in the source code or maybe post the indirect calls of the bc code? I suspect those indirect calls might be relying on arguments of a function which is never called, thus those arguments do not have points-to values.
Both the average and the maximum points-to set sizes are extremely small, so I don't think you're getting the analysis you expect. I suspect that pointers you expect to be initialised are not initialised because this is library code. If libjpeg (I've never worked with it) expects allocations to occur outside its functions, or takes pointer arguments initialised by the user rather than inside the library (in your case, perhaps function pointers passed as arguments to be used as callbacks), those pointers are never initialised as far as the analysis can see.
For example, suppose the library contains a function foo that is never called within the library (or whose argument is never initialised within the library):
void foo(void (*fp)(void)) {
    // Do stuff relying on fp
}
The argument fp is never initialised, since there is no caller of foo that has initialised it.
Looking a bit at the bitcode you shared, there is a single call to malloc, so it seems likely that a problem like this is the case, unless you know of a specific example in the C code/bitcode where a pointer should point to some function pointer but is missed by SVF.
A solution, if this is the problem, would be to write a program that uses the libjpeg functions you're interested in, with the initialised values you're interested in, and to statically link it with libjpeg (not sure if SVF can analyse both a client and a library as separate bitcode at the same time, @yuleisui).
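To make the suggestion concrete, here is a minimal sketch of the idea in C. It reuses foo from the example above; on_event, calls_seen, and client_entry are hypothetical names introduced only for illustration, and in a real harness client_entry would be main, linked statically with the library so that both end up in one bitcode file.

```c
#include <stddef.h>

/* Library side: nothing inside the library ever calls foo, so an analysis
   of the library alone has no value for fp. */
void foo(void (*fp)(void)) {
    if (fp != NULL) {
        fp(); /* indirect call: resolvable only if some caller passes a function */
    }
}

/* Client side (hypothetical): counts how often the callback ran. */
static int calls_seen = 0;

static void on_event(void) {
    calls_seen++;
}

/* Hypothetical harness entry point; it gives fp a concrete target. */
int client_entry(void) {
    foo(on_event);
    return calls_seen;
}
```

With the client present, the analysis has a caller of foo from which a points-to value for fp can flow, so the indirect call inside foo has a candidate target.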
You can also take a look at #340 for a similar issue.
EDIT: beat to the punch :)
@mbarbar libjpeg has its own memory manager, which is essentially a wrapped malloc function. The library uses function pointers (stored in a big C structure) to call such allocation functions. Therefore, this library does allocate memory objects and initialize memory.
See also its malloc wrappers and the initialization of the memory manager, where it stores function pointers to memory allocation functions in a global struct.
I think it is a challenging scenario for pointer analysis because of the malloc wrappers and because these allocation functions are called through function pointers. Are there any possible workarounds? Thanks.
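The pattern described above can be sketched in a few lines of C. This is a simplified, hypothetical model, not libjpeg's real definitions: struct mem_mgr, wrapped_alloc, init_mem_mgr, and make_row are names made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical, simplified memory manager (not libjpeg's real struct). */
struct mem_mgr {
    void *(*alloc_small)(struct mem_mgr *self, size_t size);
    size_t bytes_handed_out;
};

/* malloc wrapper: the only place that calls malloc directly. */
static void *wrapped_alloc(struct mem_mgr *self, size_t size) {
    void *p = malloc(size);
    if (p != NULL) {
        self->bytes_handed_out += size;
    }
    return p;
}

/* Initialisation stores the wrapper's address in the struct. */
void init_mem_mgr(struct mem_mgr *mgr) {
    mgr->alloc_small = wrapped_alloc;
    mgr->bytes_handed_out = 0;
}

/* Every other allocation is an indirect call through the struct field. */
int *make_row(struct mem_mgr *mgr, size_t width) {
    return (int *)mgr->alloc_small(mgr, width * sizeof(int));
}
```

A pointer analysis has to resolve mgr->alloc_small to wrapped_alloc before it can see any heap object at all. Because the wrapper holds the only malloc call site, an allocation-site-based analysis may also model every object returned through alloc_small as a single abstract object.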
I see. SVF should be able to handle allocation wrappers; it just won't be as precise as you want it to be, especially with a pool allocator. If you know what you're analysing beforehand, you can mark calls to certain functions as allocators (i.e. equivalent to malloc and friends), though you may have to remove the function body so that it is only a declaration; I'm not sure. See https://github.com/SVF-tools/SVF/blob/master/lib/Util/ExtAPI.cpp and its include/ counterpart.
I checked your bitcode and it appears jinit_memory_mgr is not in it. I don't think LLVM needs to maintain names internally (not sure what it actually does), so I compiled it into an object file, and nm confirms this symbol doesn't exist.
nm cjpeg.0.0.preopt.o | grep mem
000000000000bab0 t get_memory_row
U jpeg_mem_dest
U memcpy
0000000000000004 C memdst
It is easier to compile with gllvm. In fact, you can compile libjpeg with crux-bitcode, which uses gllvm, very easily. I have done so (libjpeg-turbo-bc.zip), and this is what nm gives me:
nm libturbojpeg.so.o | grep mem
0000000000023210 t empty_mem_output_buffer
000000000005e3c0 t empty_mem_output_buffer.306
0000000000023750 t fill_mem_input_buffer
000000000005e660 t fill_mem_input_buffer.311
00000000000215e0 r fill_mem_input_buffer.mybuffer
0000000000022780 r fill_mem_input_buffer.mybuffer.314
0000000000023200 t init_mem_destination
000000000005e3b0 t init_mem_destination.305
0000000000023740 t init_mem_source
000000000005e650 t init_mem_source.310
000000000004d500 T jinit_memory_mgr
000000000004ef80 T jpeg_mem_available
0000000000023050 T jpeg_mem_dest
000000000005e170 T jpeg_mem_dest_tj
000000000004f030 T jpeg_mem_init
0000000000023610 T jpeg_mem_src
000000000005e520 T jpeg_mem_src_tj
000000000004f040 T jpeg_mem_term
U memcpy
U memset
000000000004ee60 t out_of_memory
00000000000232f0 t term_mem_destination
000000000005e4d0 t term_mem_destination.307
We've solved one problem, but unfortunately Andersen results still look unrealistic, though they are better. We have a larger maximum points-to set size and some indirect edges solved. With a pool allocator we expect at least a massive points-to set somewhere, though I am unfamiliar with the codebase.
*********Andersen Pointer Analysis Stats***************
################ (program : )###############
-------------------------------------------------------
AvgTopLvlPtsSize 0.417301
AvgPtsSetSize 0.141172
CopyGepTime 0.387
TotalTime 1.309
SCCMergeTime 0.259
SCCDetectTime 0.251
UpdateCGTime 0.002
CollapseTime 0.001
LoadStoreTime 0.035
PointsToBlkPtr 0
PointsToConstPtr 219
TotalPointers 119470
TotalObjects 5953
NumOfFieldExpand 0
CopyProcessed 1170
NumOfSFRs 0
GepProcessed 1292
StoreProcessed 5000
Pointers 119441
DYFieldPtrs 29
NullPointer 10707
DYFieldObjs 46
NodesInCycles 4950
MaxPtsSetSize 23
Iterations 4
IndCallSites 648
IndEdgeSolved 70
TotalPWCCycleNum 31
AddrProcessed 6112
NumOfSCCDetect 4
LoadProcessed 14327
TotalCycleNum 468
MemObjects 5907
MaxNodesInSCC 2623
#######################################################
Any thoughts, @yuleisui?
I found why the symbol was missing: I built the shared library, so some parts of the code were not included in the bc.
I recompiled the bc in static mode, and this time all the function symbols exist.
Here is the bc file: cjpeg-static.0.0.preopt.bc.zip
Unfortunately, when I tried to analyze the bc with SVF, it crashed (segmentation fault with core dump) with the following trace:
(gdb) bt
#0 0x000055555562be51 in SVF::ICFG::hasInterICFGEdge(SVF::ICFGNode*, SVF::ICFGNode*, SVF::ICFGEdge::ICFGEdgeK) ()
#1 0x000055555562c44d in SVF::ICFG::addCallEdge(SVF::ICFGNode*, SVF::ICFGNode*, llvm::Instruction const*) ()
#2 0x000055555562dff7 in SVF::ICFG::updateCallGraph(SVF::PTACallGraph*) ()
#3 0x00005555556495ee in SVF::PointerAnalysis::finalize() ()
#4 0x000055555566ce27 in SVF::AndersenBase::finalize() ()
#5 0x000055555566cfbe in SVF::Andersen::analyze() ()
#6 0x00005555555fe84f in SVF::WPAPass::runPointerAnalysis(SVF::SVFModule*, unsigned int) ()
#7 0x00005555555ff646 in SVF::WPAPass::runOnModule(SVF::SVFModule*) [clone .localalias.1060] ()
#8 0x00005555555c91f3 in main ()
We might need more effort here? Thanks!
It works on my side (no crashes with the options -dump-icfg -fspta cjpeg.0.0.preopt.bc). Could you please run it again with the latest version of SVF?
I also have no crash on the latest.
I find that it still crashes when using SVF in the newest Docker image.
Do you mean the image from DockerHub? That one is pretty old. I will need to update it.
No. I use the Dockerfile shown in the repo.
It should work then. You may wish to remove your previous built image and build it from scratch. Sometimes docker does not support incremental build.
What SVF option did you use?
Make sure you pull to the latest. My cloned repo was missing one or a few commits and it crashed.
I pulled it again today; its commit is 5b498f6ebc4109ba6248f, and it still crashes.
Which options? -fspta -dump-icfg?
Yes, the same as before. -ander crashes too.
Here is the latest Docker image from DockerHub: docker pull svftools/svf:latest
You can have a try and let us know.
Yes. The latest docker image is working now. Thanks!
Still, the problem mentioned by @mbarbar exists, and the ICFG is incomplete for further analysis since some target paths are missing from the graph.
For example, there is no result for the actual function called via get_pixel_rows.
Is there a workaround, or some parameter that enables a more conservative pointer analysis strategy, to ensure the results are sound?
Oops, another problem comes up.
In the newest version of SVF, using the bc and the command above, I found that SVF cannot correctly handle the debug info. Here is the output:
!dbg attachment points at wrong subprogram for function
!9073 = distinct !DISubprogram(name: "read_pbm_integer", scope: !589, file: !589, line: 102, type: !9074, scopeLine: 107, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition | DISPFlagOptimized, unit: !588, retainedNodes: !9076)
i32 (%struct.jpeg_compress_struct*, %struct._IO_FILE*, i32)* @read_pbm_integer
br i1 %6, label %7, label %11, !dbg !9101, !llvm.loop !9102
!9103 = !DILocation(line: 93, column: 5, scope: !9104)
!9104 = distinct !DILexicalBlock(scope: !9100, file: !589, line: 92, column: 18)
!9089 = distinct !DISubprogram(name: "pbm_getc", scope: !589, file: !589, line: 85, type: !9090, scopeLine: 88, flags: DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition | DISPFlagOptimized, unit: !588, retainedNodes: !9092)
warning: ignoring invalid debug info in /cjpeg-static.0.0.preopt.bc
The debug info of this bc has been verified with LLVM's opt:
opt -analyze -verify-debug-info ../cjpeg-static.0.0.preopt.bc
SVF never handles debug info itself; that is done by the LLVM library. It might be because you did not generate a proper bc with debug info.
As I mentioned, I used the LLVM tool opt to verify the correctness of the debug info of this bc. It does not raise any warning.
Also, I just tried to build this bc with clang, and a correct binary was produced.
Moreover, I attempted to reproduce this issue on other bc files, and it seems that the problem is related to the --disable-shared compilation option.
Any thoughts?
The debug information problem stems from compiling with an old Clang.
!0 = !{!"clang version 4.0.0 (tags/RELEASE_400/final)"}
Since you are using the Docker version of SVF, it is built with a newer LLVM. In fact, running llvm-dis on the bitcode on my machine (version 10) produces the same debug error.
I recompiled the program with clang 10 and extracted the bc again: cjpeg-static.bc.zip
Unfortunately, I found that the newest version of SVF crashes again with a segmentation fault similar to the previous one.
Here is the command to reproduce it:
wpa --dump-callgraph --dump-icfg --fspta cjpeg-static.bc
--ander crashes too.
Many thanks for the kind help.
You mentioned that it was working in the latest SVF Docker image:
> Yes. The latest docker image is working now. Thanks!
Why did it crash, and what is the crash info?
Using this bc file (cjpeg-static.0.0.preopt.bc.zip), SVF indeed works fine.
However, since it was created by an old version of LLVM (4.0), the debug info is not processed correctly.
Therefore, I regenerated the bc with LLVM 10, and the same error occurred again.
Here is the bug trace:
Program received signal SIGSEGV, Segmentation fault.
0x000055555562f361 in SVF::ICFG::hasInterICFGEdge(SVF::ICFGNode*, SVF::ICFGNode*, SVF::ICFGEdge::ICFGEdgeK) ()
(gdb) bt
#0 0x000055555562f361 in SVF::ICFG::hasInterICFGEdge(SVF::ICFGNode*, SVF::ICFGNode*, SVF::ICFGEdge::ICFGEdgeK) ()
#1 0x00005555556313ed in SVF::ICFG::addCallEdge(SVF::ICFGNode*, SVF::ICFGNode*, llvm::Instruction const*) ()
#2 0x0000555555633257 in SVF::ICFG::updateCallGraph(SVF::PTACallGraph*) ()
#3 0x000055555564c47e in SVF::PointerAnalysis::finalize() ()
#4 0x00005555556755ca in SVF::AndersenBase::finalize() ()
#5 0x000055555566fffa in SVF::AndersenBase::analyze() ()
#6 0x0000555555699a67 in SVF::FlowSensitive::initialize() ()
#7 0x000055555569cb2d in SVF::FlowSensitive::analyze() ()
#8 0x000055555560113b in SVF::WPAPass::runPointerAnalysis(SVF::SVFModule*, unsigned int) ()
#9 0x00005555556020b6 in SVF::WPAPass::runOnModule(SVF::SVFModule*) [clone .localalias.1083] ()
#10 0x00005555555ca2c3 in main ()
PS: I have adapted the patch from issue #280.
I have no problem running either of your bc files on the image. I suspect you were using the wrong wpa executable or an outdated version of SVF on your local machine. Can you try the Docker image?
I am using the newest image.
After debugging, I found that the crash comes from the patch mentioned in issue #280, which I applied in order to add the function pointer results to the ICFG. Without it, the original SVF does not crash, but the generated call graph is incomplete because of the function pointers.
I notice that this feature is not officially supported in SVF at the moment. I wonder whether there is a way to add it correctly in the new version of SVF.
OK, I see. You were trying to use ICFG's updateCallGraph method. I just reproduced the crash you had when using this obsolete method.
The error is caused by trying to connect indirect edges for the function entry block of an external function. I just submitted a patch to fix this (7565e654e403853ba19066fb5102b5cccf46934d). You can try again.
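As a rough illustration of the kind of guard that avoids this class of crash, here is a toy model in C. It does not use SVF's data structures and is not what the commit above actually changes; toy_function, toy_icfg, and add_indirect_call_edge are hypothetical names. The sketch assumes that an external function has no body and therefore no entry node to connect an edge to.

```c
#include <stddef.h>

/* Toy model, not SVF's data structures. */
struct toy_function {
    const char *name;
    int has_body;      /* 0 for an external declaration such as memcpy */
    int entry_node_id; /* meaningful only when has_body is non-zero */
};

#define MAX_EDGES 16

struct toy_icfg {
    int edge_src[MAX_EDGES];
    int edge_dst[MAX_EDGES];
    size_t edge_count;
};

/* Returns 1 if an edge was added, 0 if the callee was skipped. */
int add_indirect_call_edge(struct toy_icfg *g, int call_node_id,
                           const struct toy_function *callee) {
    if (callee == NULL || !callee->has_body) {
        return 0; /* external function: no entry block to connect to */
    }
    if (g->edge_count >= MAX_EDGES) {
        return 0; /* toy capacity limit */
    }
    g->edge_src[g->edge_count] = call_node_id;
    g->edge_dst[g->edge_count] = callee->entry_node_id;
    g->edge_count++;
    return 1;
}
```

Skipping declaration-only callees keeps the graph update from touching an entry node that does not exist, at the cost of leaving those call sites without an outgoing call edge.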
It seems the main functions are working properly now. Thanks!
However, there is one small issue left: as the bc files get larger, SVF does not terminate on its own (although we can stop it with Ctrl+C).
Here is a sample bc: target.bc
To reproduce, run:
./wpa --dump-icfg --fspta target.bc