gtfs-validator
gtfs-validator copied to clipboard
Best Practice: Include agency_id in routes.txt (WARNING)
Including agency_id
in routes.txt
is a best practice and should generate a warning even though it is not required if there is only one.
Implementatin thoughts: Rather than returning here in this override, generate a WARNING.
https://github.com/MobilityData/gtfs-validator/blob/52434454b747bb4085038f27563e98fee6018ca5/main/src/main/java/org/mobilitydata/gtfsvalidator/validator/TripAgencyIdValidator.java#L46-L59
Where is this best practice coming from and what is the rationale? Does that also include a requirement to provide agency_id
in agency.txt
even for a single agency?
To be honest, I fully support your proposal. I am just looking for a strong rationale to convince the transit feeds to fix that warning.
Where is this best practice coming from and what is the rationale?
I think that @e-lo refers to this part of the best practices:
Field Name Recommendation agency_id Must be included if it is defined in agency.txt
.
Does that also include a requirement to provide agency_id in agency.txt even for a single agency?
According to best practices for agency.txt
, yes:
Field Name Recommendation agency_id Should be included, even if there is only one agency in the feed. (See also recommendation to include agency_id
inroutes.txt
andfare_attributes.txt
)
Just to clarify, I see a recommendation in Best Practices for agency.txt
to include agency_id
even if only one agency is specified (link), but I don't see similar text for routes.txt
. Did that text get dropped at some point? Does it need to be added?
See my comment here 🙃
@isabelle-dr Hello! I am taking a look at this. It seems like the most comprehensive solution would be:
- generate a warning if there is a single agency in
agency.txt
butagency_id
is not provided - generate a warning if
agency_id
is missing inroutes.txt
and a single agency is defined inagency.txt
- generate an error if
agency_id
is missing inroutes.txt
but is defined inagency.txt
(with either one or multiple agencies defined inagency.txt
)
Am I missing anything?
@briandonahue I'd clarify that for (3) that I don't believe it's an error for agency_id
to be missing from routes.txt
if only a single agency is defined in agency.txt
with a corresponding agency_id
value. It's only an error in the multi-agency case.
@bdferris-v2 OK, I was looking at this comment (don't think I have access to see the original repo page)
I think that @e-lo refers to this part of the best practices:
Field Name Recommendation
agency_id Must be included if it is defined in
agency.txt
.Does that also include a requirement to provide agency_id in agency.txt even for a single agency?
But I can remove that validation.
@briandonahue thanks for working on this issue. You have access to the repo, but this link is broken because we removed this part of the Best Practices, and I believe this was a mistake (see my comment here).
- generate a warning if there is a single agency in agency.txt but agency_id is not provided
Correct, and this comes from the agency.txt in the GTFS Best Practices.
I recommend using the @recommended
annotation in the schema file, just like what Kevin did in #1156, which will lead to the missing_recommended_field
notice. The "References" and "Affected files" sections of the documentation for this notice will have to be updated, because currently it's only used on fields in feed_info.txt
- generate a warning if agency_id is missing in routes.txt and a single agency is defined in agency.txt
Correct, although there is no mention of this anymore in the GTFS Best Practices for the reason I shared above. I will open a PR in the Best Practices repo shortly to fix that. Meanwhile, I propose you still add the validation, refer to the routes.txt when you update RULES.md, and we will just wait to have this fixed on the Best Practice repo before we merge yours.
- generate an error if agency_id is missing in routes.txt but is defined in agency.txt (with either one or multiple agencies defined in agency.txt)
This is a good catch! It's a portion of the spec we didn't have before. As @bdferris-v2 flagged, we have an error only if we have multiple agencies. This refers to the routes.txt specification.
I hope this is clear, let us know if you'd like to discuss how to implement 2
and 3
, pick the names for the notices, etc.
@isabelle-dr I've created a draft PR, thank you for the pointers on where to update the documentation, I need to do that before it's ready for review.
I recommend using the
@recommended
annotation in the schema file, just like what Kevin did in https://github.com/MobilityData/gtfs-validator/pull/1156
Would this be confusing since it is recommended in the case of a single agency, but required when more than one? I see a @ConditionallyRequired
attribute on there now... I'm still learning, so thank you for your explanations!
it is recommended in the case of a single agency, but required when more than one
I think you might be mixing the routes.txt
and agency.txt
files? They both have agency_id
.
Here is another way to see this:
-
agency.agency_id
should be included ifagency.txt
is provided (WARNING - and I suggest we re-useMissingRecommendedField
with the same logic we applied forfeed_info.txt
in #1156) -
routes.agency_id
should be included ifagency.txt
contains at least 1 record, or 1 value for `agency_id (WARNING) -
routes.agency_id
must be included ifagency.txt
contains more than one value foragency_id
(ERROR)
Writing this, I realize that we have different options for how to implement 2 and 3.
Option 1 - create a new MissingAgencyId
notice, which could be an ERROR or a WARNING based on the conditions.
⚠️ I am not in favor of this option, because we have a longer-term vision of separating the notice from the severity, and allowing users to specify a different severity for a rule (ex: Cal-ITP might want to require routes.agency_id
at all times, so they would want this to be an error regardless of the number of agencies defined in agency.txt
).
Option 2 - create two new custom rules, similarly than what we have for feed_expiration_date_7_days and feed_expiration_date_30_days
Option 3 - find a way to trigger MissingRecommendedField
(which is a WARNING) for case 2 and MissingRequiredField
for case 3(which is an ERROR).
Let me know if you see this in any other way, happy to take this discussion to your PR :)
@isabelle-dr I've removed custom errors, deferring to the existing MissingRecommendedField
and MissingRequiredField
depending on single or multiple agencies scenarios. I was looking at updating the Best Practices documentations, and added some questions here
@briandonahue not sure where you are at with this but Alexei implemented a similar logic here, based on the following statement in the spec:
fare_attributes.agency_id
is Required if multiple agencies are defined in agency.txt