Missing matches when using JPlag 3
Hi!
I found that JPlag v.3 misses some of the matches that JPlag v.2.12.1 finds. For example:
JPlag 3 breaks the match at lines with isEmpty() vs. size() == 0 and (int) a.get(i + 1) vs. l.get(i + 1), and it seems to skip matches because of the presence or absence of braces. In my experiment, this behavior caused JPlag 3 to report 10% similarity for a pair of plagiarized programs, while JPlag 2 reported 70%. I used the Java 1.7 frontend in JPlag v.2.12.1.
Maybe the third version can be made a little less sensitive so that such matches can be found?
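To make the kinds of differences described above concrete, here is a hypothetical pair of methods (not the actual submissions; the class and method names are invented for illustration). Both compute the same result and differ only in isEmpty() vs. size() == 0, an explicit (int) cast, and braces:

```java
import java.util.ArrayList;
import java.util.List;

public class VariantPair {
    // Variant A: isEmpty(), no cast, braces around every body.
    static int countAscentsA(List<Integer> l) {
        if (l.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int i = 0; i < l.size() - 1; i++) {
            if (l.get(i) < l.get(i + 1)) {
                count++;
            }
        }
        return count;
    }

    // Variant B: size() == 0, explicit (int) casts, no braces.
    static int countAscentsB(ArrayList<Integer> a) {
        if (a.size() == 0)
            return 0;
        int count = 0;
        for (int i = 0; i < a.size() - 1; i++)
            if ((int) a.get(i) < (int) a.get(i + 1))
                count++;
        return count;
    }
}
```

A plagiarism detector would ideally treat these two methods as one continuous match.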
This might be caused by the difference between the Java 1.7 frontend of the legacy version and the Java frontend of JPlag v3. I will look into it. When running the legacy version with -l java19, does it still work?
Yes, it works. And I get the same missing-matches problem when running the legacy version with the java19 frontend.
Ah, then the issue most likely lies in the token extraction, meaning which parts of the language AST are transformed into JPlag tokens. The two Java frontends use different parsers and thus might produce different tokens. I will see what I can do!
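As a rough illustration of why the choice of tokens matters, here is a toy sketch. It is not JPlag's actual matching algorithm, and the token names and streams are hypothetical. It measures the longest contiguous run of identical tokens shared by two streams, once for an extractor that emits no tokens for braces and once for one that emits BLOCK_BEGIN / BLOCK_END:

```java
import java.util.List;

public class TokenRunDemo {
    // Hypothetical token streams when braces produce no tokens:
    // braced and unbraced code yield the same sequence.
    static final List<String> COARSE_A =
        List.of("IF", "RETURN", "VARDEF", "FOR", "IF", "ASSIGN", "RETURN");
    static final List<String> COARSE_B = COARSE_A;

    // Hypothetical streams when every brace pair produces block tokens:
    // the braced variant (A) gains tokens the unbraced one (B) lacks.
    static final List<String> FINE_A = List.of(
        "IF", "BLOCK_BEGIN", "RETURN", "BLOCK_END", "VARDEF", "FOR", "BLOCK_BEGIN",
        "IF", "BLOCK_BEGIN", "ASSIGN", "BLOCK_END", "BLOCK_END", "RETURN");
    static final List<String> FINE_B = COARSE_A;

    // Length of the longest contiguous run of equal tokens in a and b
    // (classic longest-common-substring dynamic programming).
    static int longestCommonRun(List<String> a, List<String> b) {
        int best = 0;
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                    best = Math.max(best, dp[i][j]);
                }
            }
        }
        return best;
    }
}
```

With these streams, the coarse extractor yields a shared run of 7 tokens, while the finer one yields at most 2. A matcher that discards runs below some minimum length would then report no match at all for the second pair, even though the code is functionally identical.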
I have added a frontend for Java 1.7 to JPlag v.3. Maybe it's worth opening a PR with these changes?
Currently, we do not want to go back to a multi-java-frontend system. However, this might change depending on the outcome of this issue.
Closed by #911.
