
Update canonical-schema.json with "timeout" property

Open martinfreedman opened this issue 4 years ago • 42 comments

Added a non-required timeout property (in milliseconds) to labeledTest.properties to support tests with timeouts, such as for the alphametics and palindrome-products exercises. This would support automatic generation/update of the tests rather than requiring each track to manually modify the relevant exercise's test file.
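A hypothetical sketch of what a canonical-data test case with the proposed property might look like (the uuid is a placeholder and the timeout value is illustrative, not from the actual spec; the expected largest palindrome product of two 4-digit factors is 99000099 = 9901 × 9999):

```json
{
  "uuid": "00000000-0000-0000-0000-000000000000",
  "description": "find the largest palindrome from quadruple digit factors",
  "property": "largest",
  "timeout": 5000,
  "input": { "min": 1000, "max": 9999 },
  "expected": { "value": 99000099, "factors": [[9901, 9999]] }
}
```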

martinfreedman avatar Nov 08 '21 16:11 martinfreedman

Do we expect a timeout? As in, do we want students to write code that times out?

SleeplessByte avatar Nov 08 '21 22:11 SleeplessByte

@SleeplessByte For test tooling that supports it, this makes quite clear that the test must complete within the timeout and that exceeding it is a test failure. The whole point is to make explicit in the problem specification what is expected in the exercise, as for the two exercises already mentioned. This is where it belongs. Without it, the problem specification is inaccurate and/or misleading. Whether or not a specific track maintainer uses this property, they will know it is part of the test constraints a solution must satisfy to pass.

martinfreedman avatar Nov 09 '21 08:11 martinfreedman

@SaschaMann example added to Readme

martinfreedman avatar Nov 09 '21 08:11 martinfreedman

I wonder if, instead of this being a property, it should be a scenario (see https://github.com/exercism/problem-specifications/blob/main/SCENARIOS.txt). That avoids any contention over how timeouts differ between tracks, but you'd still be able to mark a test case as being a performance-related test case (this is of course still contentious).

For the timeouts, I think the tests.toml file is a perfect place to define these timeouts.
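As a rough sketch of how that could look in a track's tests.toml (the uuid is a placeholder, and the timeout_ms key is a hypothetical track-specific extension, not an existing key in the tests.toml format):

```toml
# A track opting in to a performance-flagged test case and
# defining its own timeout for it (illustrative only).
[00000000-0000-0000-0000-000000000000]
description = "find the largest palindrome from quadruple digit factors"
timeout_ms = 5000
```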

ErikSchierboom avatar Nov 09 '21 10:11 ErikSchierboom

I am happy to change to @ErikSchierboom's suggestion using scenarios and tests.toml. In my view, the key point is that these are performance-constrained tests, and this needs to be captured at this level of the exercise specification.

Shall I create a new PR, link it back to this one, and then close this one?

martinfreedman avatar Nov 12 '21 16:11 martinfreedman

@martinfreedman normally we would try to get some consensus, but you, Erik, and I now agree, so that's a consensus of at least three. Perhaps you want to discuss first what the scenario would be?

I would like to see: scenario: timeout or inefficient or high-time-complexity and then we use error as usual. All of those would allow you to automate this in the test generator.
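In canonical data, scenarios are attached to individual test cases, so the result might look roughly like this (the uuid is a placeholder and the scenario name is one of the suggestions under discussion; the scenario would also need to be registered in SCENARIOS.txt):

```json
{
  "uuid": "00000000-0000-0000-0000-000000000000",
  "description": "puzzle with many letters completes in reasonable time",
  "scenarios": ["timeout"],
  "property": "solve"
}
```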

SleeplessByte avatar Nov 12 '21 16:11 SleeplessByte

I am happy to change to @ErikSchierboom's suggestion using scenarios and tests.toml. In my view, the key point is that these are performance-constrained tests, and this needs to be captured at this level of the exercise specification.

Great!

I would like to see: scenario: timeout or inefficient or high-time-complexity and then we use error as usual. All of those would allow you to automate this in the test generator.

Yeah, something like that would be nice. Another option for a scenario is performance. 🤷

ErikSchierboom avatar Nov 12 '21 17:11 ErikSchierboom

I think "performance" scenario best captures what is being tested, whereas "timeout" states how to do that and can be a function of different languages and test frameworks. "Inefficient" and "high-time-complexity" also don't quite get to the fundamental point, that this is a performance test.

martinfreedman avatar Nov 12 '21 22:11 martinfreedman

I would be happy to accept "performance".

SleeplessByte avatar Nov 14 '21 03:11 SleeplessByte

Works for me too!

ErikSchierboom avatar Nov 14 '21 11:11 ErikSchierboom

I don't like performance for this. performance suggests to me that it could also be about optimising the implementation in the language to make it faster without changing the algorithm, whereas the exercises you mention are primarily about algorithmic optimisations. The former would require significantly more track-specific changes, so I think the distinction is important.

(the change request is about the previous comments, not the name of the scenario)

SaschaMann avatar Nov 16 '21 11:11 SaschaMann

(the change request is about the previous comments, not the name of the scenario)

Eh, are you against or in favor of using performance as the scenario?

ErikSchierboom avatar Nov 16 '21 12:11 ErikSchierboom

Eh, are you against or in favor of using performance as the scenario?

Against it and wanted to explain why because the reason wasn't brought up yet, but if people disagree with that reason, it's fine by me, too.

Just wanted to clarify what the blocking change request is about.

SaschaMann avatar Nov 16 '21 12:11 SaschaMann

Do you have an alternative name perhaps?

ErikSchierboom avatar Nov 16 '21 12:11 ErikSchierboom

Maybe something like algorithmic-optimisation or algorithmic-optimisation-needed?

SaschaMann avatar Nov 18 '21 09:11 SaschaMann

I don't like performance for this. performance suggests to me that it could also be about optimising the implementation in the language to make it faster without changing the algorithm, whereas the exercises you mention are primarily about algorithmic optimisations. The former would require significantly more track-specific changes, so I think the distinction is important.

I disagree. Performance covers both, and that is why I like it. Plus, different languages might have different facilities. The exercises are both about combinatorics, and that is likely to be the main use case for any exercise with these performance tests. Having built-in fast or optimised combinatorics alone will not, or should not, help.

martinfreedman avatar Nov 18 '21 10:11 martinfreedman

Performance covers both and that is why I like it.

That's precisely why I think it's unsuitable. Both cases require different handling in the implementation, which is why the broad term is too general for a scenario.

SaschaMann avatar Nov 18 '21 13:11 SaschaMann

I do not follow this, since both scenarios (if there really are two; I'm not sure) require the same handling in a test: in the case of F# and C#, by using the xUnit timeout attribute while keeping the same test name.

martinfreedman avatar Nov 18 '21 13:11 martinfreedman

@exercism/reviewers Your thoughts on what the scenario should be named?

ErikSchierboom avatar Dec 09 '21 08:12 ErikSchierboom

I do not follow this, since both scenarios (if there really are two; I'm not sure) require the same handling in a test: in the case of F# and C#, by using the xUnit timeout attribute while keeping the same test name.

They might require the same handling in a test but not necessarily in the changes to the instructions. Using different scenarios adds that additional information so that human maintainers can make use of it, even if auto generators handle them identically.

SaschaMann avatar Dec 09 '21 12:12 SaschaMann

I am somewhat ambivalent about the name, as long as it gives me a clean way to flag tests that could be problematic, rather than turning off the runner altogether for an exercise. I favor any method that will also make it possible for me to add an explanation/warning for students. Something along the lines of:

"While you can use a naive or 'brute force' method to solve this exercise, large inputs may cause processor or memory issues. The following tests are designed with large inputs and may time out, fail, or cause problems if your code is inefficient."

or

"This exercise can be solved using recursion, but carefully check your code. Python does not have tail-call optimization, and larger inputs could cause problems. The following tests are designed with large inputs and may time out, fail, or cause other issues if your solution is inefficient."

FWIW, Pytest (what we're using more or less for the Python test runner) has a slow flag, intended to be used for "resource-heavy" tests. That's probably what I'd map any of these terms to once they're in place.

To me, performance implies hitting a certain speed or processor or memory benchmark. Then again, with our test runner timeouts, we are indeed implying that there is one.
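For reference, the pytest convention mentioned above is usually implemented as a custom marker rather than a built-in flag; a minimal sketch, assuming the marker name slow and the deselection workflow are the track's own convention:

```python
import pytest

# The marker would normally be registered in pytest.ini or pyproject.toml,
# e.g.:
#   [pytest]
#   markers =
#       slow: resource-heavy performance tests

@pytest.mark.slow
def test_largest_palindrome_with_large_inputs():
    # Placeholder body; a real test generator would emit the
    # exercise's actual assertion here.
    assert True
```

Resource-heavy tests marked this way can then be deselected at the command line with `pytest -m "not slow"`.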

BethanyG avatar Dec 09 '21 21:12 BethanyG

I quite like "slow"!

ErikSchierboom avatar Dec 10 '21 07:12 ErikSchierboom

What do other people think of "slow" as the scenario name?

ErikSchierboom avatar Jan 05 '22 14:01 ErikSchierboom

Given the point that @SaschaMann made, namely that "the exercises you mention are primarily about algorithmic optimisations", how about optimise?

martinfreedman avatar Jan 07 '22 13:01 martinfreedman

At the risk of continuing to bikeshed: I don't like slow for the use cases outlined in this PR. Not all test cases that might require timeout handling of some sort are inherently slow. There are cases where a naïve implementation might be slow yet a solution that implements a more efficient algorithm won't be. In that case, tagging the case as slow doesn't really make sense to me but that's what we'd be doing in the canonical data.

@martinfreedman you've also downvoted the suggestion, could you elaborate why?

SaschaMann avatar Jan 09 '22 01:01 SaschaMann

Seeing that @SaschaMann and @martinfreedman have downvoted "slow", what do people think about the suggested "optimise"?

ErikSchierboom avatar Jan 11 '22 07:01 ErikSchierboom

I like optimized, but it leaves a question unanswered: optimized for what? Time, memory, energy, readability?

kotp avatar Jan 11 '22 08:01 kotp

I like "efficient", which @BethanyG implies, better than "optimise" (and especially not "optimize"). Or maybe "TimeEfficiency", as that is the specific issue I am addressing and offering a solution for.

In addition, the expected big-O restriction could be added to a test name, e.g. "Large inputs require at least logarithmic performance" (or linear, log-linear, etc.).

martinfreedman avatar Jan 14 '22 09:01 martinfreedman

Maybe time-optimization or optimize-time?

ErikSchierboom avatar Jan 14 '22 10:01 ErikSchierboom

@ErikSchierboom you mean "time-optimise" or "optimise-time"? Most of us are not from the USA; why should we have US spelling imposed on us?

martinfreedman avatar Jan 14 '22 10:01 martinfreedman