gtfs-validator icon indicating copy to clipboard operation
gtfs-validator copied to clipboard

Clarify @Required annotations for validation

Open briandonahue opened this issue 2 years ago • 2 comments

Describe the problem

Currently the @Required annotation is used on the schema files both at a Class level (for required files) and the Property level (for required fields). This can be confusing and recently, we in #1198 we had a scenario where a column was required, but it was not required that the field have data for the column. We discussed adding a new annotation when a column Header is required, but the column data can be empty (not required), which led to discussion of the existing @Required annotation.

Proposed solution

I think for additional clarity, it would make sense to break the original @Required header into two: @RequiredFile - when the file can not be missing from the feed (valid on the schema Class) @RequiredValue - when a value is required for the field ( @RequiredColumn - when the column (header) is required, but the field (value) is allowed to be empty

Alternatives you've considered

Leave @Required as-is, and just add the new @ RequiredColumn annotation for the new case.

Additional context

No response

briandonahue avatar Feb 28 '23 20:02 briandonahue

Thanks for opening your first issue in this project! If you haven't already, you can join our slack and join the #gtfs-validators channel to meet our awesome community. Come say hi :wave:!

Welcome to the community and thank you for your engagement in open source! :tada:

welcome[bot] avatar Feb 28 '23 20:02 welcome[bot]

@isabelle-dr mentioned another scenario in #1350 regarding a warning for a RecommendedColumn. I think adding several more specific annotations may be useful.

Currently we have the following:

Current:

Rule Annotation Target Example Type Occurs Notes
File Required @Required Class Agency Error Feed Dual purpose annotation may cause confusion
Column & Value Required @Required Field agencyName Error File & Record Dual purpose annotation may cause confusion
Column & Value Optional @RequiredColumn Field transferType Error File

I propose splitting the @Required annotation into @RequiredFile and @RequiredValue annotations, and adding @RecommendedColumn and potentially a @RecommendedValue annotation would be clearer for readability and testability:

Proposal:

Rule Annotation Target Example Type Occurs Notes
File Required @RequiredFile Class Agency Error Feed
Column Required @RequiredColumn Field transferType Error File Column must exist, values are optional
Value Required @RequiredValue Field agencyName Error Record Column must exist, value is required for each record
Column Recommended @RecommendedColumn Field timepoint Warning File Recommended to include column
Value Recommended @RecommendedValue Field ? Warning Record Recommended to include value

One potential consideration is documenting these in RULES.md could be confusing? E.g. if multiple fields in multiple files are marked with @RequiredValue we would have to illustrate them all in the required_value error section, with links to the correct area of the specification. As opposed to having a custom error notice for each situation which gives the ability to provide a specific example/link for each error.

From a development standpoint I think the proposal is much clearer as long as we feel we can document the errors effectively.

briandonahue avatar Mar 13 '23 14:03 briandonahue