dolos
feature request: ignoring specified template
This is more of a feature request and less of an issue. Let me preface it by saying dolos is really awesome. It is really difficult to find any open source plagiarism detection software, especially one which compiles a web-based report. I wanted to pitch a feature request for the dolos CLI: the ability to specify a predeclared code template which is ignored by the plagiarism checker. The use case for this is assignments in which a code template is provided and students only need to add the most important logical section of the assignment to it. The code template (which should be ignored) usually consists of trivial components like class and function definitions.
This is completely doable and I seem to remember that this was implemented at some point before a major rewrite. We will consider adding this again in the future.
However, you should currently be able to use the -m and -M options to ignore fingerprints occurring in a lot of files:
The -m option sets the maximum number of times a given fingerprint may appear before it is ignored. A code fragment that appears in many programs is probably legitimate sharing and not the result of plagiarism. With -m N any fingerprint appearing in more than N programs is filtered out. This option has precedence over the -M option, which is set to 0.9 by default.
The -M option sets the fraction of the files a fingerprint may appear in before it is ignored. With -M N any fingerprint appearing in more than a fraction N of the files is filtered out. Must be a value between 0 and 1. This option is ignored when comparing only two files, because each match appears in 100% of the files.
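To make the two options concrete, here is a minimal Python sketch of the filtering they describe. It is not Dolos's actual implementation: the function name `filter_common_fingerprints`, its parameters, and the representation of each file as a set of integer fingerprint hashes are assumptions made for illustration. Only the rules themselves (the count limit taking precedence, the 0.9 default, and the fraction being skipped for two files) come from the option descriptions above.

```python
from collections import defaultdict


def filter_common_fingerprints(files, max_count=None, max_fraction=0.9):
    """Drop fingerprints that occur in too many files.

    files: dict mapping a file name to the set of fingerprint hashes found in it
           (a hypothetical representation, chosen for this sketch).
    max_count: plays the role of -m; when given it takes precedence.
    max_fraction: plays the role of -M; a value between 0 and 1.
    """
    # Record which files each fingerprint occurs in.
    occurrences = defaultdict(set)
    for name, fingerprints in files.items():
        for fp in fingerprints:
            occurrences[fp].add(name)

    if max_count is not None:
        limit = max_count                  # absolute limit, like -m
    elif len(files) > 2:
        limit = max_fraction * len(files)  # relative limit, like -M
    else:
        limit = None                       # with two files, every match is in 100% of them

    def keep(fp):
        return limit is None or len(occurrences[fp]) <= limit

    return {name: {fp for fp in fps if keep(fp)} for name, fps in files.items()}


submissions = {
    "a.py": {1, 2, 3},
    "b.py": {1, 2, 4},
    "c.py": {1, 5},
    "d.py": {1, 6},
}
# Fingerprint 1 occurs in all four files (more than 0.9 * 4), so it is dropped.
print(filter_common_fingerprints(submissions))
# With an absolute limit of 1, fingerprint 2 (shared by two files) is dropped as well.
print(filter_common_fingerprints(submissions, max_count=1))
```

Note that such a filter only looks at how often a fingerprint occurs, not at where it came from, so it cannot tell template code apart from any other fragment that many submissions share.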
I would like to vote for this feature request. It would be awesome to have a direct way of ignoring teachers' code templates! (BTW, using -m and -M would also filter out e.g. shared code chunks which were not part of the template.)
Codio users are also requesting this
We have added experimental support for ignoring template code and frequent fingerprints in PR https://github.com/dodona-edu/dolos/pull/1524