Support top-level statements in C#
Bug Description
I've updated from v6.1.0 to v6.2.0, and the same set of files is now showing up as failed submissions for being "too small".
Is there a CLI option that controls this behaviour? It was working well previously, and I would like those files to be included.
JPlag Version
6.2.0
Operating System
No response
Java Version
No response
Which language module are you using?
@Kr0nox C#
Looking over some of the failed submissions, I'm now realising it's because they all use top-level statements (no Main class). The valid flagged similarities all use classes. Going back and comparing the 6.2.0 and 6.1.0 reports, I think they were failing back then too; the error reporting in v6.2.0 must have thrown me off.
I understand that JPlag supports up to C# v6, which doesn't support top-level statements, but is this something that could be fixed from within JPlag, or is this an ANTLR4 parser limitation?
Here's a test analysis using v6.2.0 on duplicate submissions that use top-level statements:
2025-08-25-10:35:36_945 [INFO] AntlrLoggerErrorListener - Summary of all errors:
2025-08-25-10:35:36_920 [ERROR] AntlrLoggerErrorListener - ANTLR error - in /Users/xxx/Desktop/jplag/1/toplevel.cs line 5:0 mismatched input 'string' expecting {<EOF>, 'abstract', 'async', 'class', 'delegate', 'enum', 'extern', 'interface', 'internal', 'namespace', 'new', 'override', 'partial', 'private', 'protected', 'public', 'readonly', 'ref', 'sealed', 'static', 'struct', 'unsafe', 'virtual', 'volatile', '['}
2025-08-25-10:35:36_920 [ERROR] AntlrLoggerErrorListener - ANTLR error - in /Users/xxx/Desktop/jplag/2/toplevel.cs line 5:0 mismatched input 'string' expecting {<EOF>, 'abstract', 'async', 'class', 'delegate', 'enum', 'extern', 'interface', 'internal', 'namespace', 'new', 'override', 'partial', 'private', 'protected', 'public', 'readonly', 'ref', 'sealed', 'static', 'struct', 'unsafe', 'virtual', 'volatile', '['}
2025-08-25-10:35:36_946 [INFO] Submission - Summary of all errors:
2025-08-25-10:35:36_929 [ERROR] Submission - Submission 2 contains 2 tokens, which is below the minimum match length 8!
2025-08-25-10:35:36_929 [ERROR] Submission - Submission 1 contains 2 tokens, which is below the minimum match length 8!
toplevel.cs
using static System.Convert;
using static SplashKitSDK.SplashKit;
// Variables
string examplevar1;
string examplevar2;
string name;
int duration;
Write("Hello world");
// ...
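For comparison, here is a minimal sketch of the same statements wrapped in an explicit class, which is the shape the parser's expected-token list above asks for ('class', 'namespace', and so on). This is an illustration only and has not been verified against JPlag. `Program` and `Main` are just the conventional names, and `Console.Write` is substituted for SplashKit's `Write` (an assumption made so the snippet compiles without the SplashKit SDK).

```csharp
using System;

// Hypothetical class-based equivalent of toplevel.cs.
// A top-level program is compiled into an entry-point method much like this one.
class Program
{
    static void Main()
    {
        // Variables (declared but unused here, as in the original excerpt)
        string examplevar1;
        string examplevar2;
        string name;
        int duration;

        // Stand-in for SplashKit's Write(...)
        Console.Write("Hello world");
        // ...
    }
}
```

Only the wrapper differs from the top-level version; the statements inside `Main` are unchanged.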
Hi again, b0ink, thank you for this remark!
For parsing C# with ANTLR, we use the grammar from the official ANTLR4 grammar repo on GitHub, which apparently still covers the same feature set as 8 years ago, judging by the last update to its README. We encounter similar problems with other programming languages as well.
It is possible to adapt the grammar manually, but then other parts of the pipeline would also need to be adjusted. Also, the idea was to rely on the community-trusted, proven grammar files from the official ANTLR repo. Since those tend to be severely outdated, however, we are currently looking into alternative parsing libraries, which we hope will keep up with language updates more closely.