
Answer validation: allow `.in`-less invalid testcases

Open thorehusfeldt opened this issue 4 months ago • 3 comments

I want to be able to have .in-less invalid .ans testcases. For example, for a problem whose output is in [0..100]:

invalid_answers:
  data:
    range_hi:
      ans: 101
    range_lo:
      ans: -1

Instead of

invalid_answers:
  data:
    range_hi:
      in: 1 1
      ans: 101
    range_lo:
      in: 1 1
      ans: -1

The .ctd and .viva Answer Validators already do exactly this, but our definition of AnswerValidator insists on the following invocation:

answer_validator testcase.in [flags] < testcase.ans

even if the answer_validator never opens testcase.in.

I think .in-free validation of invalid answers makes sense, is useful, and strictly increases problem quality because it allows me to state that “101 is wrong no matter what the input file is”. (I am not so concerned about saving a line of typing a redundant in: 1 1 in the generator. I’m concerned about the stronger semantics.)

To do this, we must give semantics to “what it means to run an AnswerValidator on a pseudo-testcase without .in”.

Solution 1

Add --input_oblivious to the specification of AnswerValidator. Validators given as .ctd or .viva files are always input-oblivious anyway; handwritten AnswerValidators can receive this flag, which means they promise not to open(args[1]). When bt validate iterates over its validators, it can look for this flag in the source code, much like --constraints.
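A minimal sketch of how the source-code lookup in Solution 1 could work, assuming a plain substring search is enough to detect the flag. The helper names `is_input_oblivious` and `validators_for` are hypothetical and not part of BAPCtools; only the flag name `--input_oblivious` and the special status of `.ctd` and `.viva` come from this issue.

```python
from pathlib import Path
from typing import Dict, List

# Flag name as proposed in this issue; not an existing BAPCtools flag.
INPUT_OBLIVIOUS_FLAG = "--input_oblivious"
# Validator formats that never read the input file (per the issue text).
ALWAYS_OBLIVIOUS_SUFFIXES = {".ctd", ".viva"}


def is_input_oblivious(filename: str, source: str) -> bool:
    """Hypothetical check: does this validator promise not to open the .in file?"""
    if Path(filename).suffix in ALWAYS_OBLIVIOUS_SUFFIXES:
        return True
    # Assumption: a handwritten validator opts in by mentioning the flag
    # somewhere in its source code.
    return INPUT_OBLIVIOUS_FLAG in source


def validators_for(validators: Dict[str, str], has_in_file: bool) -> List[str]:
    """Select the validators (filename -> source) that can run on a testcase.

    A regular testcase runs every validator; a pseudo-testcase without .in
    runs only the input-oblivious ones.
    """
    if has_in_file:
        return sorted(validators)
    return sorted(
        name for name, source in validators.items()
        if is_input_oblivious(name, source)
    )
```

With this shape, a pseudo-testcase such as `range_hi` above would simply be skipped by any validator that has not opted in, rather than failing because `testcase.in` is missing.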

Solution 2

Change the invocation of AnswerValidators, in a non-backwards-compatible way, to always be

answer_validator [--input testcase.in] [flags] < testcase.ans

Then the semantics of validating a pseudo-testcase is clear: if both .in and .ans exist, both are passed to the validator; otherwise only the one that exists is passed.
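From the validator's side, Solution 2 could look roughly like the sketch below. It assumes the validator is written in Python and uses `argparse`; the function names `parse_invocation` and `answer_is_valid` are made up for illustration, and the check itself is a toy for the [0..100] example, not a real BAPCtools validator.

```python
import argparse
from typing import List, Optional, Tuple


def parse_invocation(argv: List[str]) -> Tuple[Optional[str], List[str]]:
    """Split argv into the optional --input path and the remaining flags."""
    # allow_abbrev=False so that other flags are never mistaken for --input.
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--input", default=None)
    args, other_flags = parser.parse_known_args(argv)
    return args.input, other_flags


def answer_is_valid(ans_text: str, input_path: Optional[str] = None) -> bool:
    """Toy check: the answer is a single integer in [0..100].

    input_path is accepted but never opened, so the verdict is the same
    whether or not a .in file was supplied.
    """
    tokens = ans_text.split()
    if len(tokens) != 1:
        return False
    try:
        value = int(tokens[0])
    except ValueError:
        return False
    return 0 <= value <= 100
```

In an actual validator, `sys.argv[1:]` would be handed to `parse_invocation` and `sys.stdin.read()` to `answer_is_valid`. A validator that does depend on the input could reject the run when `--input` is absent.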

There are probably other ways of doing this. One of the difficulties is that the Testcase class is very much tied to in_file.

thorehusfeldt, Feb 06 '24 19:02