Rien

Results 71 issues of Rien

The Dolos API to create a new report is currently very simple: the caller uploads a ZIP file and can optionally specify a report `name` and `programming_language`. Some use cases require...
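As a rough illustration of the request described above, the sketch below packs a few sources into an in-memory ZIP and collects the two optional fields. Only the `name` and `programming_language` field names come from the issue text. The helper `build_report_request`, the example file names, and the way the fields would be encoded on the wire are assumptions, and no request is actually sent.

```python
import io
import zipfile


def build_report_request(files, name=None, programming_language=None):
    """Return (form_fields, zip_bytes) for a report upload.

    Hypothetical helper: `files` maps a path inside the archive to its
    source text. Optional fields are left out when not given.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, source in files.items():
            archive.writestr(path, source)

    fields = {}
    if name is not None:
        fields["name"] = name
    if programming_language is not None:
        fields["programming_language"] = programming_language
    return fields, buffer.getvalue()


# Example submissions, invented for illustration.
fields, payload = build_report_request(
    {"alice.py": "print(1)\n", "bob.py": "print(2)\n"},
    name="Exercise 1",
    programming_language="python",
)
```

The returned fields and ZIP bytes could then be passed to any HTTP client as a multipart form upload, once the actual endpoint and parameter encoding are known.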

Dolos API

The tokenization performed by `tree-sitter` can be slow for large datasets: up to 50% of the running time, or even more, is spent on this step. Multiple improvements are possible:

- Try out similar...

web UI
algorithm
experiment

Running the `Plutokiller` dataset (1000+ files) takes 45 seconds:

- 25 seconds are spent tokenizing
- 20 seconds are spent calculating all pairs

During pair calculation, we create all $\Theta(n^2)$...
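To give a sense of scale for the quadratic pair count, here is a minimal Python sketch. It is not the Dolos implementation; the file names are invented, and the dataset size is rounded down to 1000 files.

```python
from itertools import combinations


def pair_count(n: int) -> int:
    """Number of unordered pairs among n files: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Hypothetical file names standing in for a 1000-file dataset.
files = [f"submission_{i}.py" for i in range(1000)]

# combinations() is lazy: pairs are produced one at a time
# instead of being materialised in memory all at once.
pairs = combinations(files, 2)
first = next(pairs)

total = pair_count(len(files))
print(total)  # 499500
```

With 1000 files there are already close to half a million pairs, so any per-pair cost in time or memory is multiplied accordingly.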

algorithm
experiment

This PR allows serializing a Dolos analysis to a DuckDB file instead of CSV files.

## Performance Benchmark

| Dataset | DuckDB write-out | DuckDB parse | CSV write-out | CSV...

experiment

This issue is primarily meant as a braindump/storm to collect ideas and remarks. Feel free to add your own!

### Problem

Most programming platforms allow students to submit more than...

Include these parsers:

- https://github.com/RubixDev/tree-sitter-asm/
- https://github.com/erihsu/tree-sitter-riscvasm

We also need a way to distinguish these languages from each other, since both are reported to use the `.s` extension for their files.

Dolos parsers

Dolos ignores comments since #1645. This is generally an improvement, but it sometimes causes submissions with little to no code to have a high similarity.

enhancement
algorithm

Currently, the Dolos server does not use all available memory. Increasing its memory limit a little would allow larger datasets to be analyzed.