
Implement Feat/expectations

thorehusfeldt opened this pull request · 7 comments

This is my first attempt at implementing the expectations framework in BAPCtools.

Draft documentation is in doc/expectations.md, and the file bin/expectations.py has extensive documentation (including doctests).

This is a very early draft, and mainly an invitation for feedback.

Why care?

Correct semantics for required and permitted verdicts

Out of the box, an immediate improvement over the current situation is that it implements various verification conventions correctly, such as time_limit_exceeded and run_time_error.

Here’s an example of a submission (in test/problems/spanishinquisition, where all the examples for the expectations framework reside) that fails for two different reasons. It lives in wrong_answer, but gets TLE. The tool now correctly reports this by distinguishing between permitted verdicts (here, “every verdict must be AC or WA”) and required verdicts (here, “some verdict must be WA”). Both violations are reported.

[screenshot: tool output reporting both violations]

Of course, the normal behaviour is still supported. A submission in accepted that gets WA gets:

[screenshot: tool output for a submission in accepted that gets WA]

Note that the tool is more helpful in line 2: it tells us where to find the violation (here, because the submission resides in accepted) and which verdicts would have been permitted (here, ACCEPTED).
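For orientation, here is a rough sketch of what the standard conventions amount to when written out in the long form used below. The key names follow the permitted/required terminology above, but the exact expansion is defined in doc/expectations.md, so treat this as illustrative rather than authoritative:

accepted/:
  permitted: [AC]           # every testcase must be accepted
wrong_answer/:
  permitted: [AC, WA]       # no testcase may time out or crash
  required: [WA]            # at least one testcase must be WA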

You can specify your own conventions for what should reside in submissions/other or submissions/mixed; neither the framework nor the tool cares.

mixed/: # correct submissions that might run out of time 
  sample: accepted
  secret:
    permitted: [AC, TLE]

More fine-grained test data specification

Requirements can, by default, be specified for all of data/. But it is also possible to be more fine-grained. A contest that wants to ensure that the submissions in wrong_answer still get accepted on all the samples can just write

wrong_answer/:
  sample: accepted
  secret: wrong answer

This rule now applies to all submissions in wrong_answer.

The specification of “which testdata to match” is just a regex, so you can go crazy:

mixed/superstitious.py:            # for this submission
  secret/thirteen:                 # ... on testcases that match this pattern
    permitted: [WA]                # ... permit exactly the verdict WA, nothing else
  secret/(?!thirteen): accepted    # everywhere else: accept

This can be useful for pinpointing testcases that are aimed at “killing” particular submissions (or vice versa).

More generally, this allows specifying the behaviour of submissions on subdirectories of data, which is useful for expressing things like “the greedy submissions are expected to pass on the data in this directory, but not in that one”, or for problems with subtasks.
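For instance, a subtask-style setup could look like the following sketch. The submission name greedy.py and the group directories secret/group1 and secret/group2 are made-up placeholders:

mixed/greedy.py:              # hypothetical submission
  secret/group1: accepted     # must pass every testcase in this group
  secret/group2:
    permitted: [AC, WA, TLE]  # may fail or time out on this group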

More fine-grained submission specification

Matching of submissions is by prefix; the standard prefixes accepted, wrong_answer, etc. just happen to match the paths of the submissions that reside in the corresponding directories. As with superstitious.py above, you can specify a specific submission by giving its full name, or anything in between. If Thore wants to ensure that his submissions (say, th-greedy.py) in wrong_answer at least pass sample, he can add

wrong_answer/th:
  sample: accepted
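To summarize the matching rule, here are the three granularities side by side; the expectation bodies are just examples, and th-greedy.py is the name from above:

wrong_answer/:              # matches every submission in wrong_answer/
  sample: accepted
wrong_answer/th:            # matches submissions whose name starts with th
  sample: accepted
wrong_answer/th-greedy.py:  # matches exactly this submission
  sample: accepted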

thorehusfeldt, Sep 16 '23 11:09