gtfs-validator
Improve acceptance rules
Feature request
It would be beneficial to add new rules to the acceptance tests. Currently, two rules are tested:
- Does the validator that uses the new code contribution generate more new errors than allowed?
- Were there too many corrupted sources/datasets during validation?

Although these rules are fundamental to the acceptance tests, other aspects are not considered: disappearing errors or warnings, unusual info notices, etc.
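For reference, the gist of those two rules is roughly the following. This is a simplified sketch assuming per-rule dataset counts are already available; class, method, and threshold names are illustrative and not the actual comparator code.

```java
import java.util.Map;

// Simplified sketch of the two existing acceptance checks.
// Names and thresholds are illustrative, not the actual implementation.
public final class AcceptanceCheckSketch {

  /** Rule 1: does the contribution introduce new errors on more datasets than allowed? */
  static boolean tooManyNewErrors(
      Map<String, Integer> datasetsWithNewErrorsPerRule, int newErrorThreshold) {
    // A rule fails the check if it newly appears on more datasets than the threshold allows.
    return datasetsWithNewErrorsPerRule.values().stream()
        .anyMatch(count -> count > newErrorThreshold);
  }

  /** Rule 2: were too many sources/datasets corrupted during the validation run? */
  static boolean tooManyCorruptedSources(
      int corruptedSourceCount, int totalSourceCount, double maxCorruptedFraction) {
    return totalSourceCount > 0
        && (double) corruptedSourceCount / totalSourceCount > maxCorruptedFraction;
  }
}
```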
Is your feature request related to a problem? Please describe.
Specifically, in #1188 the acceptance tests passed even though the initial implementation prevented the validator from checking most of the validation rules. The generated reports did not contain any errors or warnings. The reports also contained the `unknown_file` info notice for required files such as `stop_times.txt`. Although the acceptance tests were not originally designed to cover this use case, it would be nice if they did.
Proposed solution

First, I think adding two new rules would be helpful:
- Are too many errors disappearing?
- Are too many warnings disappearing?
Another potential rule, since the presence of this notice is a clear indicator of a problem (although the check is very specific), would be:
- Do the reports contain the `unknown_file` notice for required files?

A rough sketch of these checks is given below.
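To make this concrete, here is what the extra checks could look like, assuming the comparator has access to per-notice counts from the reference and latest reports. All names here are hypothetical.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed additional checks. The notice-count maps are
// keyed by notice code (e.g. "unknown_file") and hold how many datasets reported it.
public final class ProposedAcceptanceChecks {

  /** Too many notices present in the reference report are gone from the latest report. */
  static boolean tooManyDisappearingNotices(
      Map<String, Integer> referenceCounts, Map<String, Integer> latestCounts, int threshold) {
    long disappeared = referenceCounts.keySet().stream()
        .filter(code -> !latestCounts.containsKey(code))
        .count();
    return disappeared > threshold;
  }

  /** The latest report flags a required file (e.g. stop_times.txt) as unknown_file. */
  static boolean requiredFileReportedAsUnknown(
      Set<String> filesWithUnknownFileNotice, Set<String> requiredFiles) {
    return filesWithUnknownFileNotice.stream().anyMatch(requiredFiles::contains);
  }
}
```

The same disappearing-notice check could be run twice, once over error notices and once over warning notices, each with its own threshold.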
Thanks for opening this! There is another issue about a next iteration on the acceptance tests: #1076. This one isn't a duplicate; the two flag different improvements to be made.
I was looking at the acceptance test implementation and I noticed that there is apparently code in the acceptance_test.yml to post a comment to the pull request with acceptance test status. I've never actually seen this in action before. Does it actually work?
Asking because I was taking an initial stab at improving the logic here and was thinking of refactoring some things, but I wanted to make sure I understood the existing behavior first.
Oh, you're right. It looks like it's working when PRs are made from this repo (see here and here), but not from forks.
Looking more closely at the run logs, it looks like the problem is related to the GitHub token. Here (from our repo) vs here (from a fork).
This is because the token is accessed via a secret. To the best of my knowledge, GitHub disables using secrets in workflows originating from forks. @isabelle-dr How valuable do you think the comment is? Could we refactor the acceptance tests without it?
Well, we can definitely improve the acceptance test without fixing the comment issue.
That said, I was looking at this method in the comparator and I think it could definitely be improved a bit. I was also thinking of combining `acceptance_report.json` and `sources_corruption_report.json` into one combined report (not sure why they need to be separate?). This would have some implications for the commenting code, but I'm sure I could keep it working the same either way (even if it doesn't actually work for me :)
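Not a real schema, just to illustrate the idea: a single combined report could keep the two existing sections side by side so both the comparator and the commenting step read from one file. Class and field names below are made up for illustration (Java 16+ records).

```java
import java.util.List;

// Hypothetical shape for a single combined acceptance report, merging what
// acceptance_report.json and sources_corruption_report.json contain today.
public record CombinedAcceptanceReport(
    List<RuleComparison> newErrors,        // rules producing new errors and how many datasets they affect
    SourcesCorruption sourcesCorruption) { // corrupted vs. total sources for the run

  public record RuleComparison(String noticeCode, int affectedDatasetsCount) {}

  public record SourcesCorruption(int corruptedSources, int totalSources) {}
}
```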
I agree with you - I thought the same thing looking at the `checkRuleValidity` method. About combining the reports: I don't think they need to be separate; that's how it was before I started working on the validator, so merging them works for me. @isabelle-dr, do you have any objection?
> How valuable do you think the comment is?
I'd say pretty valuable. The initial idea was to have something user-friendly to see the impact of a PR, and how this impact might change as the contributor pushes commits. It makes it easy to trace things back. See it kind of in action in #1083. The scope of this test changed as we were working on it (see this comment, which followed a discussion with @asvechnikov2), and I see value in this validator having:
- tests to make sure everything is working properly
- the ability to analyze the impact of a new piece of code on production data - this is more of a feature to me, whereas the above is about safety & stability.
These two "tests" might contain similar checks (e.g. see if many additional / disappearing errors) but we won't want to run them at the same time, and their output might not be used in the same way. The first one is simply blocking a merge, whereas the second could be used ad-hoc by the developer, and the results would be looked at in the review process with follow-up actions (inform agencies and consumers concerned, make it a major release, etc).
No strong opinion on combining the report, this seems like a good idea.
Also related: https://github.com/MobilityData/gtfs-validator/issues/1233