rdf4j icon indicating copy to clipboard operation
rdf4j copied to clipboard

Remove obsolete testcases-sparql-1.1 testsuite

Open abrokenjester opened this issue 1 year ago • 5 comments

The testcases in the testsuites/sparql/src/main/resources/testcases-sparql-1.1 directory are no longer in use in RDF4J, and have therefore become obsolete and incorrect (see GH-4167).

This directory was originally intended as an extension of official W3C test cases with our own additions for as we discovered them, but it quickly become cumbersome, and the line between "officially approved" tests and our own edges cases (or corrections where we deviated from the standard) became hard to maintain. In addition, the manifest-based tests are tricky to write and hard to work with (for one thing it's almost impossible to run just a single test while debugging - you have to run the full suite every time).

In RDF4J we have switched our own testing framework for SPARQL over to the RepositorySPARQLComplianceTestSuite, which is not based on an manifest-based set of test data, but a JUnit 4 Suite of java Junit tests.

The obsolete datadir should be removed to make it clear that it is no longer maintained and should not be relied upon for compliance testing.

Note that the official W3C test suites for SPARQL 1.0 and 1.1 (in testcases-sparql-1.0-w3c and testcases-sparql-1.1-w3c) are still supported.

abrokenjester avatar Sep 16 '22 23:09 abrokenjester

We should take a look at what code coverage these tests used to have in comparison to our current tests and port over any tests that add code coverage to what is already covered by tests.

hmottestad avatar Sep 17 '22 07:09 hmottestad

We should take a look at what code coverage these tests used to have in comparison to our current tests and port over any tests that add code coverage to what is already covered by tests.

Good plan. The difficulty is of course that there's a shedload of tests in there, some (but not all) of which are already covered in the W3C suite anyway. So we'll need to do some weeding.

Edit it's actually not as bad as I expected - there is a little over 100 test cases in there. I had it in my head that it was thousands.

abrokenjester avatar Sep 18 '22 03:09 abrokenjester

Plan:

  • Run JaCoCo https://www.baeldung.com/jacoco and save the report for later
  • Add a test class that runs the old tests, rerun JaCoCo
  • Diff the reports to see what code paths are not already covered

hmottestad avatar Sep 18 '22 05:09 hmottestad

Certainly worth doing a code coverage analysis (possibly something we should think about adding as a matter of course?).

I wonder how much that will tell for this particular test suite though, which are more about edge case behaviour based on different shapes of the data and the query. I imagine most test cases looking at, say, the behaviour of the SUM aggregate in SPARQL will cover roughly the same lines of code but test very different behaviours based on the shape of the data.

abrokenjester avatar Sep 18 '22 20:09 abrokenjester

I started looking to this and made a PR. https://github.com/eclipse/rdf4j/pull/4211

Ive added a screenshot comparing coverage, and there is a small but decent difference.

hmottestad avatar Sep 19 '22 05:09 hmottestad