Test suite to cover typical scenarios on supported datasets

Open nmanovic opened this issue 4 years ago • 0 comments

To improve testing of the library, we need a table of experiments covering the supported datasets.

Experiments:

  • Convert a public dataset to Datumaro format and back to the original format. Metrics: status and correctness of the conversion, time to download, read, write, and convert the dataset.
  • Take a public dataset and merge it with itself in different modes. Modes: remove duplicates, keep all annotations, merge similar annotations into one. Metrics: status and correctness of the operation, time to merge.
  • Extract a subset of a supported dataset, modify the subset, and merge it back. Metrics: status and correctness of the operation, time to extract the subset and to merge.
  • Take a public dataset, modify it, and navigate through its history. Metrics: status and correctness of the operation, time to move backward and forward through the history, estimated disk space required.
  • etc
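
As a rough illustration of how one row of such a table could be collected, here is a minimal sketch of a harness that runs named steps in order, times each one, and records status and correctness. Everything in it is an assumption made for illustration: `ExperimentResult`, `run_experiment`, and the step names are hypothetical, and the toy round trip uses JSON as a stand-in for a real dataset format. It does not call any Datumaro API.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ExperimentResult:
    """One row of the experiment table (hypothetical structure)."""
    name: str
    status: str = "passed"  # "passed", "failed" (wrong output) or "error" (exception)
    correct: bool = False
    timings: Dict[str, float] = field(default_factory=dict)  # seconds per step
    error: str = ""


def run_experiment(
    name: str,
    steps: List[Tuple[str, Callable[[Any], Any]]],
    check: Callable[[Any], bool],
) -> ExperimentResult:
    """Run steps in order, feeding each step the previous step's output.

    Each step is timed separately; `check` decides correctness of the
    final output. An exception in any step marks the run as "error".
    """
    result = ExperimentResult(name=name)
    state: Any = None
    try:
        for step_name, step in steps:
            start = time.perf_counter()
            state = step(state)
            result.timings[step_name] = time.perf_counter() - start
        result.correct = bool(check(state))
        if not result.correct:
            result.status = "failed"
    except Exception as exc:
        result.status = "error"
        result.error = f"{type(exc).__name__}: {exc}"
    return result


# Toy stand-in for a dataset: a list of annotated items.
ORIGINAL = [
    {"id": 1, "label": "cat"},
    {"id": 2, "label": "dog"},
]

# Round trip: read -> convert to an intermediate form -> convert back.
# json.dumps / json.loads only play the role of the two conversions here.
result = run_experiment(
    "round_trip_toy",
    steps=[
        ("read", lambda _: [dict(item) for item in ORIGINAL]),
        ("convert", lambda items: json.dumps(items)),
        ("convert_back", lambda text: json.loads(text)),
    ],
    check=lambda items: items == ORIGINAL,
)

print(result.name, result.status, result.correct, sorted(result.timings))
```

In a real suite, the lambdas would be replaced with the actual download, import, and export calls for each dataset format, and the results could be written out as the published table.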

It would be great to run the test suite once a week and publish the results with every release. There are several benefits:

  • Stress testing on real datasets and scenarios
  • Performance testing
  • Improved stability and correctness of the library

nmanovic, Jun 10 '21 02:06