gumtree icon indicating copy to clipboard operation
gumtree copied to clipboard

[gen.javaparser] reduced depth with lambda expression

Open TigerHuang98 opened this issue 3 years ago • 9 comments

Hello,

Context

When parsing the following java source

import java.util.function.Function;

class C{
    static void fun(Function<String,String> func1,Function<String,String> func2){
        System.out.println(func1.apply("Hello")+func2.apply(" World"));
    }
    public static void main(String[] args){
        C.fun(s1->s1,s2->s2);
    }
}

I obtain the following Tree

CompilationUnit [0,288]
    ImportDeclaration [0,36]
        Name: Function [7,35]
            Name: function [7,26]
                Name: util [7,17]
                    Name: java [7,12]
    ClassOrInterfaceDeclaration [37,288]
        SimpleName: C [43,45]
        MethodDeclaration [50,206]
            Modifier: static [50,57]
            SimpleName: fun [62,66]
            Parameter [66,96]
                ClassOrInterfaceType [66,90]
                    SimpleName: Function [66,75]
                    ClassOrInterfaceType [75,82]
                        SimpleName: String [75,82]
                    ClassOrInterfaceType [82,89]
                        SimpleName: String [82,89]
                SimpleName: func1 [90,96]
            Parameter [96,126]
                ClassOrInterfaceType [96,120]
                    SimpleName: Function [96,105]
                    ClassOrInterfaceType [105,112]
                        SimpleName: String [105,112]
                    ClassOrInterfaceType [112,119]
                        SimpleName: String [112,119]
                SimpleName: func2 [120,126]
            VoidType [57,62]
            BlockStmt [126,206]
                ExpressionStmt [136,200]
                    MethodCallExpr [136,199]
                        FieldAccessExpr [136,147]
                            NameExpr [136,143]
                                SimpleName: System [136,143]
                            SimpleName: out [143,147]
                        SimpleName: println [147,155]
                        BinaryExpr [155,198]
                            MethodCallExpr [155,176]
                                NameExpr [155,161]
                                    SimpleName: func1 [155,161]
                                SimpleName: apply [161,167]
                                StringLiteralExpr: Hello [167,175]
                            MethodCallExpr [176,198]
                                NameExpr [176,182]
                                    SimpleName: func2 [176,182]
                                SimpleName: apply [182,188]
                                StringLiteralExpr:  World [188,197]
        MethodDeclaration [210,286]
            Modifier: public [210,217]
            Modifier: static [217,224]
            SimpleName: main [229,234]
            Parameter [234,248]
                ArrayType [234,243]
                    ClassOrInterfaceType [234,241]
                        SimpleName: String [234,241]
                SimpleName: args [243,248]
            VoidType [224,229]
            BlockStmt [248,286]
                ExpressionStmt [258,280]
                    MethodCallExpr [258,279]
                        NameExpr [258,260]
                            SimpleName: C [258,260]
                        SimpleName: fun [260,264]
                        LambdaExpr [264,271]
                            Parameter [264,267]
                            SimpleName: s1 [264,267]
                        ExpressionStmt [268,271]
                            NameExpr [268,271]
                                SimpleName: s1 [268,271]
                    LambdaExpr [271,278]
                        Parameter [271,274]
                        SimpleName: s2 [271,274]
                    ExpressionStmt [275,278]
                        NameExpr [275,278]
                            SimpleName: s2 [275,278]

It seems that LambdaExpr [264,271] and LambdaExpr [271,278] should have same depth.

Possible causes:

When processing node which is instance of UnknownType (thus node.range==null) (e.g. first child node of Parameter [264,267] or Parameter [271,274]): In class gen/javaparser/JavaParserVisitor method pushNode(Node n, String label), line88 throws NoSuchElementException, thus the node is not pushed to this.context. However when return to method visitPreOrder(Node node) line62 this.trees is still popped, causes all subsequent nodes to have reduced depth.

Thanks!

TigerHuang98 avatar Apr 06 '21 14:04 TigerHuang98

Created PR #238 for fixing the reduced depth issue. Not sure if the UnknownType node should also be pushed to this.context (omitted in PR)

FYI, a smaller code snippet that also exposes the same issue

class Foo{
    void foo(){
        bar(foo1->bar1,foo2->bar2);
    }
}

TigerHuang98 avatar Apr 06 '21 16:04 TigerHuang98

Hi and thanks for the bug report and PR!

However, is it normal that UnknownType has a null range or is it a bug from Javaparser?

Cheers.

jrfaller avatar Apr 07 '21 14:04 jrfaller

I am a novice user of Javaparser, but if I understood correctly, it is normal for UnknownType to have null range.

Class hierarchy of UnknownType: UnknownTypeTypeNodeNodeWithRange<N>

The Javadoc in Range states that

This range is inclusive of the characters at the "begin" and "end" positions. Note that if the given parameters are reversed (i.e. the end is earlier than begin, then the values are swapped.

Since there is no lexical representation for UnknownType, there is no way for us to construct a range for this node. (A range with Position begin equals to end would represent a single-character range.)

setRange(Range range) in Node also states that

null can be used to indicate that no range information is known, or that it is not of interest.

It seems that null range is the only case where we would get a NoSuchElementException from pushNode of JavaParserVisitor.

TigerHuang98 avatar Apr 08 '21 00:04 TigerHuang98

Thanks for the explanation, very useful!

I guess the next question is therefore : do we need this node in the tree or do we ignore it? If there is no associated range nor token, my best bet is to avoid putting it in the tree, no?

Cheers.

jrfaller avatar Apr 08 '21 13:04 jrfaller

I agree, a node with no lexical representation is not of interest for an AST differencing tool.

TigerHuang98 avatar Apr 08 '21 17:04 TigerHuang98

(Probably more appropriate to be discussed in a separate issue) I am wondering why we have the + 2; here. In SwingDiff the highlighted area seems to be one character larger than it should be

https://github.com/GumTreeDiff/gumtree/blob/7c51bb8c3c3c6f149614765b25845578c03f7765/gen.javaparser/src/main/java/com/github/gumtreediff/gen/javaparser/JavaParserVisitor.java#L86-L96

Thanks!

TigerHuang98 avatar Apr 08 '21 18:04 TigerHuang98

I agree, a node with no lexical representation is not of interest for an AST differencing tool.

Do you want to try a PR ?

Cheers!

jrfaller avatar Apr 09 '21 13:04 jrfaller

Do you want to try a PR ?

The PR #238 has already been created :)

TigerHuang98 avatar Apr 09 '21 13:04 TigerHuang98

OK thanks, I was not sure that this PR was ignoring the UnknownType nodes!

I will try it out.

jrfaller avatar Apr 09 '21 16:04 jrfaller