WALA ModRef analysis erroneously indicates lambda expression writes to heap location

Description

I am running a ModRef analysis on a lambda expression. The lambda expression contains a reference to a collection but does not write to it. Nevertheless, the ModRef analysis indicates that a heap location is being modified.

Regression

After removing the collection reference from the lambda expression, the analysis no longer reports a heap modification, which is correct.

Details

This is the lambda expression under analysis:

Collection<Widget> unorderedWidgets = new HashSet<>();
ArrayList<Double> results = new ArrayList<>();
unorderedWidgets.stream().mapToDouble(Widget::getWeight).filter(w -> results.size() > 0).sum();

Notice that the lambda expression given to filter() does not modify results. However, consider the following code:

ModRef<InstanceKey> modRef = ModRef.make();
Map<CGNode, OrdinalSet<PointerKey>> mod = modRef.computeMod(engine.getCallGraph(), engine.getPointerAnalysis());
OrdinalSet<PointerKey> modSet = mod.get(target);

Here, target is:

Node: synthetic < Application, Ljava/lang/invoke/LambdaMetafactory, test$p$BasicTest$1(Ljava/util/ArrayList;)Ljava/util/function/DoublePredicate; > Context: CallStringContext: [ p.BasicTest.main([Ljava/lang/String;)V@33 com.ibm.wala.FakeRootClass.fakeRootMethod()V@5 ]

Printing the contents of the returned modSet:

for (PointerKey pointerKey : modSet) {
	System.out.println(pointerKey);
}

Prints:

[SITE_IN_NODE{synthetic < Application, Ljava/lang/invoke/LambdaMetafactory, test$p$BasicTest$1(Ljava/util/ArrayList;)Ljava/util/function/DoublePredicate; >:NEW <Primordial,Lwala/lambda$p$BasicTest$1>@0 in CallStringContext: [ p.BasicTest.main([Ljava/lang/String;)V@33 com.ibm.wala.FakeRootClass.fakeRootMethod()V@5 ]},com.ibm.wala.ipa.summaries.LambdaSummaryClass$1@1f052592]

However, the lambda expression passed to filter() does not seem to modify any heap locations, so I was expecting modSet to be empty here.

In contrast, suppose we remove the collection reference from the lambda expression passed to filter():

unorderedWidgets.stream().mapToDouble(Widget::getWeight).filter(w -> w > 0).sum();

In that case, target is:

Node: synthetic < Application, Ljava/lang/invoke/LambdaMetafactory, test$p$BasicTest$1()Ljava/util/function/DoublePredicate; > Context: CallStringContext: [ p.BasicTest.main([Ljava/lang/String;)V@32 com.ibm.wala.FakeRootClass.fakeRootMethod()V@5 ]

modSet is indeed empty, which is expected.

Notice that there's a slight difference between target in the two situations. In the former, the signature contains test$p$BasicTest$1(Ljava/util/ArrayList;)Ljava/util/function/DoublePredicate;, while in the latter the signature contains test$p$BasicTest$1()Ljava/util/function/DoublePredicate;. Thus, in the former case, the ArrayList is a parameter to the lambda expression, while in the latter case it is not. Is the analysis just super conservative here?
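For anyone trying to reproduce this, the two pipelines can be placed side by side in one self-contained class. This is only a sketch: Widget is not shown in this report, so the nested class below is a hypothetical stand-in that assumes nothing beyond a getWeight() method returning a double. Splitting the two cases into the methods capturing() and nonCapturing() is also my own arrangement; in the original test both pipelines sit in p.BasicTest.main, so the call-site offsets in the contexts above will differ.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;

class BasicTest {
    // Hypothetical stand-in for the Widget class, which the report does not show.
    static class Widget {
        private final double weight;

        Widget(double weight) {
            this.weight = weight;
        }

        double getWeight() {
            return weight;
        }
    }

    // Capturing case: the lambda reads the local variable `results` but never writes to it.
    static double capturing(Collection<Widget> unorderedWidgets) {
        ArrayList<Double> results = new ArrayList<>();
        return unorderedWidgets.stream()
                .mapToDouble(Widget::getWeight)
                .filter(w -> results.size() > 0)
                .sum();
    }

    // Non-capturing case: the lambda only looks at its own argument.
    static double nonCapturing(Collection<Widget> unorderedWidgets) {
        return unorderedWidgets.stream()
                .mapToDouble(Widget::getWeight)
                .filter(w -> w > 0)
                .sum();
    }

    public static void main(String[] args) {
        Collection<Widget> unorderedWidgets = new HashSet<>();
        unorderedWidgets.add(new Widget(1.5));
        unorderedWidgets.add(new Widget(2.5));
        System.out.println(capturing(unorderedWidgets));
        System.out.println(nonCapturing(unorderedWidgets));
    }
}
```

Neither lambda body contains a write, so any difference in the mod sets of the two synthetic LambdaMetafactory nodes should come from how the captured variable is handled rather than from the predicate logic itself.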

Jun 12 '17 18:06 khatchad

Do you have access to the source code of the size() method? I want to make sure that that method does not do something bad like cache the size in a field.

— Julian

Jun 12 '17 19:06 juliandolby

@juliandolby, yes:

public class ArrayList<E> extends AbstractList<E> /* ... */ {
    //...
    /**
     * Returns the number of elements in this list.
     *
     * @return the number of elements in this list
     */
    public int size() {
        return size;
    }
    //...
}

It doesn't look like it's doing any kind of caching.

Jun 12 '17 19:06 khatchad

I wonder if this has something to do with lambda expressions being represented as anonymous inner class instances.
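If that is the cause, the effect can be pictured with a hand-written equivalent of the two lambdas. The sketch below is an analogy only, not WALA's actual synthetic class: the names LambdaAsInnerClass, CapturingPredicate, capturingFactory and nonCapturingFactory are made up for illustration. It assumes that the object standing in for a capturing lambda stores the captured ArrayList in a field when it is created, which would be a heap write to the freshly allocated object even though the list itself is only read.

```java
import java.util.ArrayList;
import java.util.function.DoublePredicate;

class LambdaAsInnerClass {
    // Hypothetical stand-in for the object that the capturing lambda evaluates to.
    static final class CapturingPredicate implements DoublePredicate {
        private final ArrayList<Double> captured;

        CapturingPredicate(ArrayList<Double> captured) {
            // Field write on the newly allocated predicate object.
            // The captured list is stored here, not modified.
            this.captured = captured;
        }

        @Override
        public boolean test(double w) {
            // Read-only use of the captured list, as in `w -> results.size() > 0`.
            return captured.size() > 0;
        }
    }

    // Same shape as (Ljava/util/ArrayList;)Ljava/util/function/DoublePredicate;
    static DoublePredicate capturingFactory(ArrayList<Double> results) {
        return new CapturingPredicate(results);
    }

    // Same shape as ()Ljava/util/function/DoublePredicate; there is no field to initialize.
    static DoublePredicate nonCapturingFactory() {
        return new DoublePredicate() {
            @Override
            public boolean test(double w) {
                return w > 0;
            }
        };
    }

    public static void main(String[] args) {
        ArrayList<Double> results = new ArrayList<>();
        System.out.println(capturingFactory(results).test(1.0));
        System.out.println(nonCapturingFactory().test(1.0));
    }
}
```

Under this reading, the single entry in modSet (an instance field of the object allocated at NEW <Primordial,Lwala/lambda$p$BasicTest$1>@0 inside the factory node) would correspond to the assignment in the constructor rather than to any write to results, and the non-capturing factory has no such field to write. Whether WALA's lambda summary really models capture this way is an assumption that would need to be checked against LambdaSummaryClass.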

Nov 16 '18 23:11 khatchad