Missing matches when using JPlag 3

Open sloboegen opened this issue 4 years ago • 5 comments

Hi!

I found that JPlag v.3 misses some of the matches that JPlag v.2.12.1 finds. For example:

[Image: jplag-letter]

JPlag 3 breaks the match at lines with isEmpty() / size() == 0 and (int) a.get(i + 1) / l.get(i + 1), and seems to skip matches due to the presence or absence of braces. In my experiment, because of this behavior JPlag 3 returned 10% similarity for a pair of plagiarized programs, while JPlag 2 returned 70%. I used the frontend for Java17 in JPlag v.2.12.1.

Maybe the third version can be made a little less sensitive so that such matches can be found?

sloboegen avatar Apr 16 '22 18:04 sloboegen

This might be an issue caused by the difference between the Java 1.7 frontend of the legacy version and the Java frontend of JPlag v3. I will look into it. When running the legacy version with -l java19, does it still work?

tsaglam avatar Apr 17 '22 06:04 tsaglam

Yes, it works. But I get the same missing-matches problem when running the legacy version with the java19 frontend.

sloboegen avatar Apr 17 '22 07:04 sloboegen

Ah, then the issue most likely lies in the token transfer, meaning which parts of the language AST are transformed into JPlag tokens. The two different Java frontends use different parsers and thus might vary. I will see what I can do!

tsaglam avatar Apr 17 '22 09:04 tsaglam
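
To illustrate the point about token extraction, here is a minimal sketch of how two frontends could produce different match lengths for the same pair of programs. It is not JPlag code: the token names (IF, ASSIGN, RETURN, BLOCK_BEGIN, BLOCK_END), the class TokenMatchSketch, and the two "frontends" are hypothetical, and a simple longest-common-run search stands in for JPlag's actual comparison algorithm. The assumption is that one frontend emits tokens for braces and the other does not.

```java
import java.util.List;

public class TokenMatchSketch {

    // Length of the longest contiguous run of identical tokens shared by both sequences.
    public static int longestCommonRun(List<String> a, List<String> b) {
        int best = 0;
        // dp[i][j] = length of the common run ending at a[i-1] and b[j-1]
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                    best = Math.max(best, dp[i][j]);
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Program A: if (x) { y = 1; } return y;
        // Program B: if (x) y = 1; return y;

        // Hypothetical frontend 1: braces produce no tokens.
        List<String> coarseA = List.of("IF", "ASSIGN", "RETURN");
        List<String> coarseB = List.of("IF", "ASSIGN", "RETURN");

        // Hypothetical frontend 2: braces produce BLOCK_BEGIN / BLOCK_END tokens.
        List<String> fineA = List.of("IF", "BLOCK_BEGIN", "ASSIGN", "BLOCK_END", "RETURN");
        List<String> fineB = List.of("IF", "ASSIGN", "RETURN");

        System.out.println(longestCommonRun(coarseA, coarseB)); // prints 3
        System.out.println(longestCommonRun(fineA, fineB));     // prints 1
    }
}
```

If a comparison only counts runs above some minimum length, the second token stream would yield no match here even though the programs differ only in braces, which is the kind of sensitivity described in the original report.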

I have added a frontend for Java 1.7 to JPlag v.3. Maybe it's worth opening a PR with these changes?

sloboegen avatar Apr 18 '22 08:04 sloboegen

Currently, we do not want to go back to a multi-java-frontend system. However, this might change depending on the outcome of this issue.

tsaglam avatar Apr 19 '22 12:04 tsaglam

Closed by #911.

tsaglam avatar Feb 13 '23 14:02 tsaglam