javaparser icon indicating copy to clipboard operation
javaparser copied to clipboard

LexicalPreservingPrinter does not update parameters to collections.

Open gddc opened this issue 3 years ago • 7 comments

Creating an issue here, but I also have a post up on StackOverflow [1] as I've noticed that some of the contributors here are also over there.

I am trying to replace parameters to collections. Example Input:

Set<Pair<String, String>> coll = new HashSet<>();
coll.add(new Pair("some", "string"));

Desired output:

Set<AbstractMap.ImmutableSimpleEntry<String, String>> coll = new HashSet<>();
coll.add(new AbstractMap.ImmutableSimpleEntry("some", "string"));

I've figured out how to identify the SimpleName I want, and I can make the change. In a debugger, the change is applied to both locations. When I print with the LexicalPreserving printer, only the second instance is changed. When I print with a PrettyPrinter, the change is applied as expected, but there are enough other changes to the source that it's undesirable.

[1] : https://stackoverflow.com/questions/69020193/how-can-i-change-the-parameter-type-to-a-collection-using-javaparser

gddc avatar Sep 02 '21 16:09 gddc

Hi @gddc It's probably a bug! The test case below shows the issue.

String code = "import javafx.util.Pair;\r\n" +
            "class Mine {\r\n" +
            "  public method () {\r\n" +
            "    Set<Pair<String, String>> pairs = new HashSet<>();\r\n" +
            "    pairs.add(new Pair(\"some\", \"string\"));\r\n" +
            "  }\r\n" +
            "}";
    CompilationUnit cu = StaticJavaParser.parse(code);
    LexicalPreservingPrinter.setup(cu);
    ClassOrInterfaceType cit = new ClassOrInterfaceType(new ClassOrInterfaceType("AbstractMap"),
            "SimpleImmutableEntry");
    cu.stream()
            .filter(n -> n instanceof ClassOrInterfaceType)
            .map(n -> (ClassOrInterfaceType) n)
            .filter(node -> node.getNameAsString().equals("Pair"))
            .forEach(type -> {
                type.getTypeArguments().ifPresent(args -> cit.setTypeArguments(args));
                type.getParentNode().get().replace(type, cit);
            });
    System.out.println(LexicalPreservingPrinter.print(cu));
    System.out.println(cu.toString());

jlerbsc avatar Sep 02 '21 19:09 jlerbsc

I've been poking at this, and I found a way to make it work. However, it feels wrong, so I'm hesitant to consider this addressed. The following does successfully work with the LexicalPreservingPrinter:

compilationUnit.stream()
        .filter(ClassOrInterfaceType.class::isInstance)
        .map(ClassOrInterfaceType.class::cast)
        .filter(node -> node.getNameAsString().equals("Pair"))
        .forEach(pair -> {
            ClassOrInterfaceType entry = StaticJavaParser
                    .parseClassOrInterfaceType("AbstractMap.SimpleImmutableEntry");
            pair.getTypeArguments().ifPresent(entry::setTypeArguments);
            Optional<ClassOrInterfaceType> parent = pair.findAncestor(ClassOrInterfaceType.class);
            Optional<VariableDeclarator> declarator = pair.findAncestor(VariableDeclarator.class);
            Optional<VariableDeclarationExpr> expr = pair.findAncestor(VariableDeclarationExpr.class);
            if (parent.isPresent() && declarator.isPresent() && expr.isPresent())  {
                for (Object key : expr.get().getDataKeys().toArray()) {
                    expr.get().removeData((DataKey<?>) key);
                }
                parent.get().setTypeArguments(entry);
                declarator.get().setType(parent.get());
            } else {
                pair.replace(entry);
            }
        });

gddc avatar Jan 05 '22 17:01 gddc

You don't have to manage data's node. I think it's a bug but i don't know this part of JavaParser. May be a workaround is to completely replace the VariableDeclarationExpr expression or the VariableDeclarator

jlerbsc avatar Jan 06 '22 13:01 jlerbsc

Yeah ... I agree this isn't ideal, but it's the only thing I've managed to make work. I've tried (unsuccessfully) to replace various other nodes up the tree. Unfortunately, I'm dealing with a pretty broad code base and attempting to reconstruct replacement VariableDeclarationExpressions dynamically becomes really complicated really fast. If there's some easy approach to that that I've overlooked I can definitely give it a shot.

gddc avatar Jan 06 '22 15:01 gddc

I did a bit of digging on this, since it was affecting a project of mine as well. Apologies if some of the following is incorrect, I'm not super familiar with the internals of the library.

It seems that due to some edge cases involving arrays, the type of a variable in a variable declaration (both when defining it locally in a function and when defining a field on a class) belongs not to the VariableDeclarationExpr or to the FieldDeclaration, but rather to each of the VariableDeclarators contained therein. This makes the workings of LexicalPreservingPrinter a bit wonky, because it associates the TokenTextElement for the type (i.e. Foo in Foo foo = create()) with the VariableDeclarationExpr, rather than a Type itself.

It seems like usually the modification of the TextElements is taken care of when a Node is mutated - but in this case, because the TextElements are stored on a different Node than the one that is mutated, it is not invalidated via the usual mechanisms.

It seems like there is some built in consideration for this - look at the com.github.javaparser.ast.body.VariableDeclarator#customInitialization method - when the Type property changes on one of the VariableDeclarators in a com.github.javaparser.ast.nodeTypes.NodeWithVariables, it pushes a notification up to the containing NodeWithVariables to make the change.

The problem is that this logic doesn't seem to cascade down / isn't an observer for the entire subtree associated with the Type - i.e. as long as you replace the entire type node (i.e. by doing VariableDeclarator.setType()), it'll invalidate properly, but if you were to mutate the Type node directly (i.e. by doing setName, or mutating one of the type parameters), the logic is not triggered.

The least annoying workaround seems to be to clone and mutate the Type node on the VariableDeclarator whenever a modification is needed.

EDIT: The clone approach doesn't really work. If you clone, modify, then set the type, it doesn't work, because the "data" is copied over (which includes the NodeText that needs to be invalidated) but the "observers" aren't (which includes the thing that would ever touch the NodeText).

rlewellen-salesforce avatar Jan 15 '22 04:01 rlewellen-salesforce

Yeah, I vaguely recall running into similar problems when trying to clone various parent-ish notes and replacing them. The only workaround I've found so far is what I posted. It gets more interesting if you have multiple pairs in the same declaration. Something like

Map<Pair<>, Pair<>> myMap = ... 

In this scenario, you have to post-process clobbering the data field. If you clobber the data field on the first replacement, then when you attempt to make the second replacement the classOrType can't find itself in the data field and throws an exception. Definitely feels a little awkward, but again, working is what I need right now.

gddc avatar Jan 24 '22 18:01 gddc

After spending a few hours trying to understand the problem, I advise you to create a new declaration. Indeed the problem you encounter is related to the way the lexical preserving printer assigns the tokens to the nodes and in addition to the presence of phantom nodes (virtual nodes). First, the lexical preserving printer tries to assign java tokens to AST nodes according to the position of the nodes. It traverses all the nodes from the root of the AST as well as all the children in a resursive way with the exception of the nodes which have no position and the 'virtual/phantom' nodes. During this processing, the java tokens are assigned to the node closest to the token and whose position includes that of the token. Unfortunately, this logic does not work in the case of a variable declaration because the tokens describing the type of the variable are positioned before the actual declaration of the variable in the AST. In fact they are assigned to the expression in which the variable is declared. I am describing all this because when you modify a type, the modification is propagated to the direct parent but not to the grand parent or the great grand parent... So the modification does not reach the node which carries the tokens corresponding to the type of the variable and the modification is not taken into account by the lexical preserving printer. In conclusion, it is extremely complicated to modify this mechanism and I strongly advise you to create a new declaration to replace the previous one.

jlerbsc avatar Feb 08 '23 17:02 jlerbsc