rewrite icon indicating copy to clipboard operation
rewrite copied to clipboard

Java Parser slow with 3 particular sources (12-20 sec / each) in `spring-data-neo4j` project

Open BoykoAlex opened this issue 2 years ago • 4 comments

There are 3 java sources under tests in spring-data-neo4j project that are extremely slow to parse with Rewrite Java 17 parser:

  • RepositoryIT
  • ReactiveRepositoryIT
  • IssuesIT

Git repo: https://github.com/spring-projects/spring-data-neo4j

Each takes 12-18 sec to parse while JDT is done in under 1 sec. Looks like all the time is spent in compiler.attribute(...) call of org.openrewrite.java.isolated.ReloadableJava17Parser.parseInputsToCompilerAst (). I think this is javac type attribution. I think this is very likely to be caused by the use of Lombok by the project and heavy usage of calls of the methods generated by Lombok in the 3 classes above.

BoykoAlex avatar Mar 22 '23 16:03 BoykoAlex

@BoykoAlex Thanks for reporting this one. Is this still reproducible? When I tried reproducing the LSTs for this project now, the parsing of all sources took around 30 seconds, so I am assuming that the problem is either not as bad, these sources have changed or my computer is much faster 😄

knutwannheden avatar Dec 01 '23 19:12 knutwannheden

This is not a high priority issue anymore for us. It was problematic when we had Rewrite parser working during Java source reconciling. We have switched back to JDT parser since it faster due to lazy type resolution.

I have measured times to parse sources from spring-data-neo4j project and attempted to use ParsingEventListener to see how long each source took.

There are two java source sets:

  • main consisted of 231 files took ~7 sec to parse
  • test consisted of 516 files took ~ 60 sec to parse (!!?)

There is a big chunk of time spent somewhere between start of parsing sources and the first time source is parsed (until the time ParsingEventListener starts doing anything)

For the main source set took 4 sec:

16:45:52.482 [ForkJoinPool.commonPool-worker-5] INFO  o.s.i.v.c.rewrite.java.ProjectParser - Start parsing number of Java files: 231
16:45:56.192 [ForkJoinPool.commonPool-worker-5] INFO  o.s.i.v.c.rewrite.java.ProjectParser - Parsing took 105 for src/main/java/org/springframework/data/neo4j/repository/Neo4jRepository.java

For the test source set it took 55 sec:

16:46:00.315 [ForkJoinPool.commonPool-worker-5] INFO  o.s.i.v.c.rewrite.java.ProjectParser - Start parsing number of Java files: 516
16:46:55.001 [ForkJoinPool.commonPool-worker-5] INFO  o.s.i.v.c.rewrite.java.ProjectParser - Parsing took 70 for src/test/java/org/springframework/data/neo4j/repository/config/EnableReactiveNeo4jRepositoriesTests.java

In both of these cases parser are brand new with separate classpaths. I can only guess where all this time goes. The code looks as follows:

			log.info("Start parsing number of Java files: " + ss.javaSources().size()); 
			long start = System.currentTimeMillis();
			List<SourceFile> javaSources = ListUtils.map(javaParser.parseInputs(
					() -> getInputs(ss.javaSources.stream()).iterator(), projectDir, parseContext).collect(Collectors.toList()), addProvenance(projectProvenance));
			log.info("Done parsing in " + (System.currentTimeMillis() - start)); 

BoykoAlex avatar Dec 01 '23 22:12 BoykoAlex

Here is where most of the time spent according to VisualVM:

Screenshot 2023-12-01 at 17 17 47

BoykoAlex avatar Dec 01 '23 22:12 BoykoAlex

I suspect that you attempt to re-use the parser for different java source sets of the project hence the time is lower...

BoykoAlex avatar Dec 01 '23 22:12 BoykoAlex