problem-specifications Meta: adding new keys to the schema

I've changed the title to be a meta discussion, so that we don't have the same discussion in multiple issues:

optional flags #1492
immutability flags #1493
other types of tests #1225
multiline "type"
identity field #1473 (to avoid compromising descriptions #1525 to fit this purpose)

It seems there really is a lot of ask for new fields and there also seems to be a notion that we don't want to break the current generators.

Personally, I am currently parsing the comments for special behaviour and this means that my generators will all break when someone decides the comments should be a special format. I can derive no special meaning from the english in these comments. I don't think this is the right place to hold special cases, but it is for things not covered in the current scheme.

I know that you (@pgaspar @ErikSchierboom @coriolinus @sshine @SaschaMann and @NobbZ) are all involved with writing and maintaining the generators, or at least the discussions on GitHub:

I think it would be wise to discuss first if we'll add new keys, if at all, because so far each issue has already raised the "yeah okay but we don't really need it and it costs me work"-type comment
...and then which ones are candidates, so we can merge it at once and the burden on maintainers is much lower, as the raised point just mentioned is valid.

Original issue about multiline.

There is canonical data such as food-chain and twelve-days that correctly comment:

JSON doesn't allow for multi-line strings, so all verses are presented here as arrays of strings. It's up to the test generator to join the lines together with line breaks.

Okay, that's great! But in this case I think it would be great if we'd add a meta property (and keep a list of those, as there are more like these (such as a property preferably being a constant or at least immutable) so that we can automate generation of these exercises.

Apr 02 '19 13:04 SleeplessByte

I think it would be wise to discuss first if we'll add new keys, if at all, because so far each issue has already raised the "yeah okay but we don't really need it and it costs me work"-type comment

Perhaps to address the first point it would be useful to know which tracks have generators that would stop working when solely adding keys (without changing the existing ones). Usually adding non-mandatory fields to JSON is not considered a breaking change in semver, so I'd expect many/most parsers to handle this just fine (though that doesn't mean generators follow that versioning and can handle it!). Doing this could help estimating how much work it would cause.

Wouldn't break:

C#
Dart (according to @Stargator)
Erlang (according to @NobbZ on Slack)
F#
JavaScript
Julia (though the generator is in very early development, so it should probably not be counted here, just adding it for completeness)
Go
Pharo
Ruby
TypeScript

Would break:

Unsure (from #1411):

Bash
ColdFusion
Common Lisp
Factor
OCaml
Perl 6
Rust
Scala
Vimscript

(feel free to edit my comment to move tracks around in the list)

Apr 02 '19 17:04 SaschaMann

@SaschaMann Great suggestion!

Apr 02 '19 18:04 SleeplessByte

Pharo (Smalltalk) has a reasonably comprehensive (although slightly hacked) generator that wouldn’t break by addition.

I’ve been trying to normalise inconsistent specs where I’ve encountered my generated exercises weren’t great (and am curious how the current generators handled them - allergies and etl were recent ones).

I’d love some extra flags to use as hints.

I currently treat exceptions specially - I have to look for an expected map with a single key of ‘error’ and then generate an exception check instead of an equality one.

I didn’t manage to handle clock very well with its instance equality comparison (I badly generated an isEqual(dict1,dict2) type call against the model instead of: assertEquals(clock1, clock2).

I also name test cases by concatenating subsequent levels of case descriptions - which works ok, but some cases are very long and often repetitive. Fortunately Pharo doesn’t mind 100+ char method names ;)

Some of the input variable names are not ideal , and some conventions around what would be a Boolean sounding property are not great.

I’m 25 exercises in - spot checked ~20 more but the others might well reveal more useful flags.

_{Sent with GitHawk}

Apr 02 '19 23:04 macta

@sshine You've responded in #1225 that you prefer comments because of what @coriolinus , which seems to be "because it's ambitious and might break current generators". Does this mean you have a generator that breaks?

Apr 03 '19 12:04 SleeplessByte

@SleeplessByte: I should have been more clear:

Because property based test frameworks [...] have a programmatic interface.

Apr 12 '19 12:04 sshine

👍

@sshine So is that instead of adding new flags, or on top of new flags.

Apr 12 '19 14:04 SleeplessByte

I'm a little confused as to the status of this issue. @sshine is it correct to say that you are not in favor of adding new flags/keys to the schema?

Apr 24 '19 12:04 ErikSchierboom

I, personally, would be in favor of adding new flags/keys to the schema, as it seems the only way to support the kind of flexibility that we require at the track level (individual tracks want to include/exclude selectively).

While this would potentially break some generators, I am of the firm opinion that generators should be able to handle new keys by ignoring them, and if we're breaking because of that, it's worth fixing.

May 02 '19 21:05 kytrinyx

Thanks for chipping in with your thoughts @kytrinyx! If I look at the comments in this issue, it seems like most people don't mind adding new keys to the schema. I'd really like to move forward with this issue, as there are other issues that depend on its resolution.

@sshine you seem to be the person most hesitant to add new fields. Is that correct?

May 03 '19 06:05 ErikSchierboom

As people have had ample time to respond, the general consensus (including @kytrinyx) is that adding new fields is a good thing, when done judiciously.

As such, I believe this issue can be closed. If you feel otherwise, feel free to re-open.

May 07 '19 07:05 ErikSchierboom

There's a 2nd step in the first comment of the issue, so I think it should stay open:

...and then which ones are candidates, so we can merge it at once and the burden on maintainers is much lower, as the raised point just mentioned is valid.

May 07 '19 12:05 SaschaMann

There's a 2nd step in the first comment of the issue, so I think it should stay open:

Ah right! Thanks for catching that.

My preference would be to add the optional flag: https://github.com/exercism/problem-specifications/issues/1492#issuecomment-490058378.

May 07 '19 13:05 ErikSchierboom

Bump. Are there other keys that people want to add to the schema?

May 10 '19 05:05 ErikSchierboom

Is this flags for the whole exercise or individual tests?

For individual tests, an “exception” tag and “equality” tag would disambiguate (the latter - Clock is an example where I believe you want to compare the object and not it’s string output on some tests). Tests with exceptions gives a hint to catch vs equal.

May 10 '19 07:05 macta

tests with exceptions gives a hint to catch vs equal.

I'm not sure I understand this. Could you give an example?

May 10 '19 08:05 ErikSchierboom

Is this flags for the whole exercise or individual tests?

All the linked issues in the top post are about individual tests, but I don't see a reason why adding something to the exercise as a whole would be different.

Tests with exceptions gives a hint to catch vs equal.

Errors are already marked in a distinct way:

    "expected": {
        "error": "You should never bar a number"
    }

May 10 '19 09:05 SaschaMann

...and then which ones are candidates, so we can merge it at once and the burden on maintainers is much lower, as the raised point just mentioned is valid.

Not sure about the best practices for this case. Is it better to have one branch that all changes are pushed to that will then be merged? Multiple branches that will be merged at the same time to master?

May 10 '19 16:05 SaschaMann

Multiple branches that will be merged at the same time to master?

I think this is the preferred way to go.

May 11 '19 11:05 ErikSchierboom

@ErikSchierboom, @SleeplessByte: Sorry I didn't get back to you, personal matters came in the way.

I think you and @SaschaMann have picked an excellent candidate from the discussion and I look forward to seeing #1518 merged.

Reading back, it seems that the lack of my response was keeping this issue open. If I read @macta's comments, they seem to come with examples where the proposed key of #1518 would be compatible.

May 14 '19 07:05 sshine

Sorry I didn't get back to you, personal matters came in the way.

No worries!

I think you and @SaschaMann have picked an excellent candidate from the discussion and I look forward to seeing #1518 merged.

I've added you as a secondary reviewer of #1518. If you don't feel like it, feel free to ignore this.

May 14 '19 08:05 ErikSchierboom

Multiple branches that will be merged at the same time to master?

I think this is the preferred way to go.

This was my intention, or at least in close succession.

@ErikSchierboom, @SleeplessByte: Sorry I didn't get back to you, personal matters came in the way.

We all have lives! Don't worry about it. I just wasn't sure if you were in favour, against or something else :)

May 14 '19 11:05 SleeplessByte

To address the last suggested key in the first post (it doesn't have its own issue, if it gets too cluttered we can open one, tho):

Okay, that's great! But in this case I think it would be great if we'd add a meta property (and keep a list of those, as there are more like these (such as a property preferably being a constant or at least immutable) so that we can automate generation of these exercises.

I'm wondering if there are any exercises that contain both multiline strings and arrays of strings. If not, I don't think an extra property would be necessary, but I'm not opposed to it if it means that some generators require less adjustments for specific exercises.

May 15 '19 20:05 SaschaMann

I currently search comments in order to determine if something is a multiline string; because I then need to concatenate the values in the array using \n -- the same exercise also tests for single line strings. So there are not array + multiline string in one exercise, but I do need special handling.

If we don't want to add a property, that's fine :)

May 16 '19 09:05 SleeplessByte

I currently search comments in order to determine if something is a multiline string;

How many of those exercises are there?

May 16 '19 12:05 ErikSchierboom

And then there is beer-song, which doesn't actually have the comment, but should...

May 16 '19 17:05 SleeplessByte

tournament should also be on that list.

May 17 '19 14:05 cmccandless

Alright, that makes 8 exercises. Maybe an additional keys would be useful then. Any suggestions?

May 18 '19 10:05 ErikSchierboom

Tournament is already an issue currently as its comment is distinct from the others.

I don't have great suggestion for the key :p

May 18 '19 10:05 SleeplessByte

join-strings or, a bit more verbose, join-string-arrays with boolean value?

May 19 '19 13:05 SaschaMann

@SleeplessByte Do you have any suggestions?

May 20 '19 05:05 ErikSchierboom