wca-live
Issues with inconsistency warning
It's pretty common for competitors to get a lucky single in events like 2x2, pyra and skewb. Submitting results for these events can be pretty annoying because of the pop-up that appears for inconsistent results. Maybe a fix could be to make the pop-up toggleable for these events?
I see how this can be annoying, but in that case I'd rather figure out a more accurate check. Do you have a round in mind that was particularly painful to enter? Currently the check triggers if the best is 4 times better than the worst; perhaps it would make sense to use a different ratio for those fast events?
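For reference, the rule described here can be sketched in a few lines of Python. This is only an illustration of "best is 4 times better than worst", not the actual wca-live code: the name `is_inconsistent`, the `max_ratio` parameter, the strict inequality, and the assumption that attempts are positive numbers with DNF/DNS already filtered out are all made up for the example.

```python
def is_inconsistent(attempts, max_ratio=4):
    """Return True if the worst attempt is more than max_ratio times the best.

    Hypothetical helper: assumes `attempts` holds only positive times
    (DNF/DNS and skipped attempts already removed).
    """
    if len(attempts) < 2:
        return False  # nothing to compare against
    best, worst = min(attempts), max(attempts)
    return worst > max_ratio * best


# Times in centiseconds: a 1.50 single next to a 6.10 worst trips the check,
# while the same single next to a 5.80 worst does not.
flagged = is_inconsistent([150, 420, 390, 610, 405])
not_flagged = is_inconsistent([150, 420, 390, 580, 405])
```

With a single fixed ratio like this, the fast events are the ones most likely to trip the warning on a legitimately lucky single.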
It might make sense to query the db and figure out average differences between single and avg for all events?
That's a great idea! I've just had a look at best and worst (I think it makes sense to consider the worst too, since a mistyped time may just as well end up far too high as too low).
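Code just for reference:

    import numpy as np
    import pandas as pd

    # `connection` is assumed to be an open database connection created elsewhere.
    # The scalar multi-argument MAX(...) works in SQLite; MySQL would need GREATEST(...).
    best_worst_event = pd.read_sql_query("""
        SELECT
            event_id,
            best,
            MAX(attempt1, attempt2, attempt3, attempt4, attempt5) AS worst
        FROM results
        WHERE best > 0 AND event_id NOT IN ('333mbf', '333mbo', 'mmagic', 'magic')
    """, connection)

    # 0.99 quantile of the worst-to-best ratio for each event, smallest first
    best_worst_event.groupby("event_id").apply(
        lambda results: np.quantile(results.worst / results.best, 0.99)
    ).sort_values()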
That's what I got:
Event | Quantile 0.99 of worst-to-best ratio |
---|---|
777 | 1.37689 |
666 | 1.53675 |
333fm | 1.58333 |
555bf | 1.61060 |
555 | 1.63111 |
444bf | 1.64473 |
minx | 1.69979 |
333bf | 1.99669 |
444 | 2.16669 |
333oh | 2.64037 |
333 | 2.67603 |
333ft | 2.68439 |
clock | 3.23144 |
sq1 | 3.67549 |
pyram | 4.54093 |
skewb | 5.23354 |
222 | 5.48407 |
Interpretation: 99% of the time the worst-to-best ratio is below about 1.38 for 7x7x7 and below about 5.48 for 2x2x2. So in theory, if we used these ratios as thresholds and all results were entered correctly, the warning would show up (incorrectly) for about 1 in 100 results.
Analyzing 2x2x2 further, results with `worst / best > 4` represent around 1.1%, so the warning should still appear for only about 1 in 100 valid results. This means that statistically it shouldn't really be annoying for 2x2x2.
On top of that, for most events we could actually use a stricter ratio.
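A per-event ratio could look roughly like the sketch below. The numbers are the 0.99 quantiles copied from the table above; the names `RATIO_Q99`, `DEFAULT_RATIO` and `exceeds_event_ratio` are hypothetical, and using the raw quantile directly as the threshold is just one possible choice.

```python
# 0.99 quantiles of worst / best per event, copied from the table above.
RATIO_Q99 = {
    "777": 1.37689, "666": 1.53675, "333fm": 1.58333, "555bf": 1.61060,
    "555": 1.63111, "444bf": 1.64473, "minx": 1.69979, "333bf": 1.99669,
    "444": 2.16669, "333oh": 2.64037, "333": 2.67603, "333ft": 2.68439,
    "clock": 3.23144, "sq1": 3.67549, "pyram": 4.54093, "skewb": 5.23354,
    "222": 5.48407,
}

# Assumed fallback for events missing from the table: the current fixed ratio.
DEFAULT_RATIO = 4


def exceeds_event_ratio(event_id, best, worst):
    """Hypothetical check: is worst / best above the event's 0.99 quantile?"""
    max_ratio = RATIO_Q99.get(event_id, DEFAULT_RATIO)
    return worst > max_ratio * best


# A 7x7x7 result with worst = 1.5 * best is flagged,
# while a 2x2x2 result with worst = 5 * best is not.
flagged_777 = exceeds_event_ratio("777", 20000, 30000)
flagged_222 = exceeds_event_ratio("222", 100, 500)
```

By construction, thresholds taken straight from the 0.99 quantile would still flag roughly 1 in 100 correct results per event.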
Perhaps we could use a different consistency check, keeping in mind that the mistakes usually involve omitting one digit when typing.
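One way such a typo-oriented check might work is sketched below: take the best attempt and see whether inserting a single digit would put it among the other attempts. This is only a rough heuristic, not something wca-live does; the function names are hypothetical, and it assumes attempts are given as the digits typed (for example 1234 for 12.34).

```python
def digit_insertions(value):
    """Yield every number obtained by inserting one digit into `value`."""
    digits = str(value)
    for pos in range(len(digits) + 1):
        for d in "0123456789":
            if pos == 0 and d == "0":
                continue  # a leading zero would not change the number
            yield int(digits[:pos] + d + digits[pos:])


def looks_like_dropped_digit(attempts):
    """Hypothetical heuristic: True if the best attempt would land within the
    range of the other attempts after inserting a single digit."""
    if len(attempts) < 2:
        return False
    ordered = sorted(attempts)
    best, others = ordered[0], ordered[1:]
    low, high = others[0], others[-1]
    return any(low <= candidate <= high for candidate in digit_insertions(best))


# 123 next to times around 12 seconds looks like a dropped digit (e.g. 1223),
# while a lucky 1.50 next to 4-6 second solves does not.
typo_suspect = looks_like_dropped_digit([123, 1187, 1240, 1302, 1275])
lucky_single = looks_like_dropped_digit([150, 420, 390, 610, 405])
```

A check along these lines would not depend on a ratio at all, though it can still misfire when the other attempts are spread over a wide range.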
Maybe an easier calculation would be 4x+10 or similar? That way you have more leeway for fast results without doing separate thresholds for every event.
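Read as "flag when worst > 4 * best + 10", that suggestion could be sketched as follows. The function name and parameters are hypothetical, and the comment above does not say which unit the 10 is in, so the example simply treats all values as seconds for illustration.

```python
def exceeds_linear_threshold(best, worst, factor=4, offset=10):
    """Hypothetical check: flag when worst > factor * best + offset.

    `offset` is in the same unit as the attempts; which unit the
    suggested 10 refers to is an open question.
    """
    return worst > factor * best + offset


# Assuming seconds: a 1.0 single with a 5.0 worst passes (threshold 14.0),
# while a 10.0 single with a 55.0 worst is flagged (threshold 50.0).
fast_event = exceeds_linear_threshold(1.0, 5.0)
slow_event = exceeds_linear_threshold(10.0, 55.0)
```

The constant term matters most for very short solves and becomes negligible for long ones, which is what gives fast events the extra leeway.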