False positive with sanitizing function in a loop
I've found a bug in the FlowDroid command line tool.
Consider the following code:
public class MainActivity extends AppCompatActivity {
public void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
String s = "";
for (int i = 0; i < 10; i++) { // If removed no false positive is reported
s = sanitize(source());
}
sink(s); // FlowDroid finds this leak
}
public String sanitize(String arg1) { // NOT defined as source or sink
if (Math.random() == 123) { // If removed no false positive is reported
}
return "NoSecret";
}
public String source() { // Defined as source
return "Secret";
}
public void sink(String param) { // Defined as sink
}
}
FlowDroid reports one leak for the sink in onCreate. This false positive only happens in the combination "calling the sanitize function within a loop and having the if-statement inside the function". If the if is removed or the call to sanitize is moved out of the loop, FlowDroid reports no leaks.
Even if we add another call s = sanitize(source()); between the loop and the sink-call, no false positive is reported.
Please provide your precise FlowDroid configuration. The taint analysis should normally not look at if statements at all, unless you enable implicit flows (which I would advise against, unless you really need it).
Implicit flows are not enabled, I call FlowDroid via
java -jar ./soot-infoflow-cmd-2.13.0-jar-with-dependencies.jar \
-a {path-to-apk} \
-s ./SourcesAndSinks.xml \
-o ./out.xml \
-d \
-p {path-to-android-platforms-folder}
and my SourcesAndSinks.xml looks like this:
<sinkSources>
<category id="NO_CATEGORY">
<method signature="{package-name}.MainActivity: java.lang.String source()>">
<return type="java.lang.String">
<accessPath isSource="true" isSink="false">
</accessPath>
</return>
</method>
<method signature="{package-name}.MainActivity: void sink(java.lang.String)>">
<param index="0" type="java.lang.String">
<accessPath isSource="false" isSink="true"/>
</param>
</method>
</category>
</sinkSources>
After tinkering around a little I found that the -d argument for merging all the dex files seems to be the culprit - if a remove -d from my call, FlowDroid doesn't find the leak anymore.
Update: After tinkering around a little more I figured that my current setup doesn't find any leaks at all withouth the --mergedexfiles (-d) option. So maybe thats not the culprit for this false positive as it just turns off all leaks.
The -d option for merging the DEX file just makes sure that all DEX files inside the APK are processed. Without this option, a a lot of code will just not be processed. That might by chance avoid the false positive, but is not a proper fix.
For the main issue, can you please check whether the problem still exists with the most recent FlowDroid version from the develop branch? I remember that we fixed some issue with the strong update rule killing a taint due to a may-alias.
I tried the latest version (this commit 3a45a5d ), it still reports one leak for the sink in onCreate.
Math.random() might prevent some optimizations during Jimple simplification and therefore change the results. However, this is just guessing as the issue is not reproducible on my side, neither with a class file nor an APK.
Indeed, the Math.random() seems to be the problem, when I remove it from the condition, no leaks are reported. However, since the whole if-statement is a no-op, shouldn't it be ignored anyway? Attaching my APK file (code with random).
Math.random() changes the internal state of the RNG and therefore, isn't exactly a no-op. With Math.random
https://github.com/secure-software-engineering/FlowDroid/blob/3a45a5d22656f20f4bb8602f13f7752252d0b536/soot-infoflow/src/soot/jimple/infoflow/codeOptimization/InterproceduralConstantValuePropagator.java#L461-L496
line 461 concluded that there is a side-effect, so it needs to keep the callee but still is able to fold the constant return value. The code assumed that it can add the constant assignment, fold it into the next use and then remove the constant assignment again. But in your case, the MOP over the for loop prevented that. In the end, it misoptimized the code to:
for (int i = 0; i < 10; i++) {
sanitize(source());
}
So Math.random is required to get into the else case and the for loop is required to let the constant propagation fail.
See #817 for the fix.
Thank you for the quick fix! I confirm that the original issue is fixed in the latest commit 2227456.
I'm trying to make sense of the fact that the misoptimized code removed the assignment to s, but s wasn't tainted before the loop and still somehow was reported as a leak. I think, there is still something going on with the loop. For instance, if I re-assign s before passing it to the sink as follows:
String s = source();
for (int i = 0; i < 10; i++) { // If removed no false positive is reported
s = sanitize(source());
}
sink(s); // FlowDroid finds this leak
FlowDroid still reports the leak, even though s has been assigned a new value in the loop. This is independent of whether I use Math.random() in the sanitize() method.
Is this expected? Does FlowDroid assume the loop might have 0 iterations?
I'm trying to make sense of the fact that the misoptimized code removed the assignment to s, but s wasn't tainted before the loop and still somehow was reported as a leak.
Maybe expressing the error in Java is a bit misleading.
Before misoptimization:
start: if $i0 >= 10 goto end
$r2 = source()
$r2 = sanitize($r2)
$i0 = $i0 + 1
goto start
end:
...
After misoptimization:
start: if $i0 >= 10 goto end
$r2 = source()
sanitize($r2)
$i0 = $i0 + 1
goto start
end:
...
Is this expected? Does FlowDroid assume the loop might have 0 iterations?
Yes, IFDS computes the join-over-all-paths solution.
Now it makes sense, thank you!
Yes, IFDS computes the join-over-all-paths solution.
I see, so this is a known imprecision issue (imprecision, given that there's no actual path around the for loop).
We are currently looking into an alternative data flow solution that is not based on IFDS and that could reason about the number of iterations of a loop, but that's still early work. We don't even have a full implementation yet.