Add multiple-units support in search length criteria
Support multiple units in search length criteria
The following formats are now possible:
-
7m27s -
1h2m3s -
7:27 -
1:2:3
- Did you run the tests at all here? Several of them are failing.
- This parsing is very lax. I found several examples of wonky inputs that are accepted by this suggested parser in 5 minutes:
-
0:4:1(parses to 0:04:01, arguably shouldn't be allowed due to no leading zeroes) -
0:3:(parses to 0:03:00, one part is entirely skipped) -
0:and0:0:(parse to zero) -
4s9m(parses to 0:09:04, part order is inverted) -
2m2m2m2m2m(parses to... 0:10:00)
-
Should we allow criteria with decimals? Like these:
-
6.5m -
6m15.5s -
6.5m25.3s
I think only the first two are valid, which solely have decimals in the last segment.
Probably but I wouldn't necessarily mind removing support for decimals entirely. They were never super intuitive to use with time which is generally base-60 or base-24.
Now it should be working fine.
These are valid:
-
7m27s -
1h2m3s -
1h2m3.5s -
7:27 -
1:2:3 -
1:02:03 -
6 -
6.5 -
6.5s -
6.5m
There are not valid:
-
7.5m27s -
7m27 -
7m7m7m -
7m70s -
5s6m -
0: -
:0 -
0:3: -
3:15.5
I would expect the test coverage to be expanded to cover the above cases, both positive and negative. And I still take issue with some of the accepted inputs (1:2:3?)
The code is also... not as I would write it. The regexes are a step forward but they're really complicated. Nested optional groups like @"^\d+(:\d+(:\d+)?)?$", or the way that the hms ordering enforcement is done, are really way too difficult than they need to be. This could be vastly improved using named capturing groups, for instance:
^(?<hours>\d+):(?<minutes>[0-5]\d):(?<seconds>[0-5]\d)$ // h:mm:ss format
^(?<minutes>[0-5]?\d):(?<seconds>[0-5]\d)$ // m:ss format
^((?<hours>\d+)h)?((?<minutes>[0-5]?\d)m)?((?<seconds>[0-5]?\d)s)?$ // XhXmXs format
The above examples are doing format enforcement and are simultaneously splitting parts out for better value extraction. I think they are far more understandable that what the current code is doing, but I suppose this is personal opinion.
I suppose formats like 1:2:3 are acceptable because they are just simplified forms. 0:02:03 looks better but users' input without the prefix 0 should be allowed too. That increases the tolerance of the user input.
The nest regex has been simplified to ^\d+(:\d+){1,2}$. Forgive me for not using the static format of regex because we accept inputs like 1h2m, 1m2s, and 1h2m3s so there are multiple regexes which should be implemented if we separate each format like this. I think that will make the code redundant.
Honestly I am not familiar with unit tests so I don't know how to add negative cases while the function tryUpdateLengthRange returns false.
there are multiple regexes which should be implemented if we separate each format like this
I think that the three I gave above are sufficient for any needs you may have. Maybe except for the fractional format, but as above, I say we drop it.
Honestly I am not familiar with unit tests so I don't know how to add negative cases while the function tryUpdateLengthRange returns false.
You'd write negative tests by checking that the bad queries result in the FilterCriteria having Length.HasFilter == false.
^((?<hours>\d+)h)?((?<minutes>[0-5]?\d)m)?((?<seconds>[0-5]?\d)s)?$ // XhXmXs format
65m, 90s or 80m20s should be valid too, while 2m80s is invalid.
One possible way I think is to extract every part first regardless of the range (for every case), and check once at the end, or we should check them in every case.
https://github.com/ppy/osu/pull/19275/files#diff-55ee49f7d5113ff54cf89f0567697cac68678297bf4fe046d0f3b1b8fb7658daR330-R332
3 regexes had been implemented instead of the previous splitting parsing method.
Additional, an extra test case 1h20s which is valid had been added