gtfs-validator
[REQUEST] Implement feed start and expiration date (GTFS rule)
Is your feature request related to a problem? Please describe.
A feed must have a valid start date and expiration date. This is a GTFS rule implemented in the Google Python validator and listed under Google Type Error as TYPE_FEED_EXPIRATION_DATE, TYPE_FEED_FUTURE_SERVICE, TYPE_FEED_HAS_VERY_SHORT_SERVICE and TYPE_EXPIRED_FEED_HAS_VERY_SHORT_SERVICE.
Describe the solution you'd like
Current Google GTFS validator behaviour: it checks that the feed does not start in the future and does not expire within 60 days.
Describe alternatives you've considered
Additional context
Lines 49, 50, 54 and 55 in Error support priorities: https://docs.google.com/spreadsheets/d/1vqe6wq7ctqk1EhYkgtZ0_TbcQ91vccfs2daSjn20BLE/edit#gid=0
@maximearmstrong This is looking at feed_info.txt feed_start_date and feed_end_date, right?
Does the Python tool put any parameters on what a "valid" date is? For example, does it need to be in the future from now (or past, for start)? Or within the next 10 years? Or does it just validate the YYYYMMDD format?
@barbeau Yes. In the Python validator, feed_start_date must be in the future, and feed_end_date must be more than 60 days away.
@maximearmstrong Thanks, that's interesting.
@timMillet this is another rule we'll need to look at in terms of what's considered current GTFS best practice.
@barbeau I agree. AFAIK, there are no such rules anywhere in the spec or in the Best Practices. The only things specified in the Best Practices are:
- "At any time, the published GTFS dataset should be valid for at least the next 7 days, and ideally for as long as the operator is confident that the schedule will continue to be operated." and
- "If possible, the GTFS dataset should cover at least the next 30 days of service."
Why was the Google Validator issuing errors when feed_start_date was not in the future, and feed_end_date was not more than 60 days away?
I'm guessing this rule goes way back to the early days of GTFS when producers were consistently publishing a GTFS dataset every 3-4 months, with no overlap between feed service periods.
Obviously industry practices have changed significantly since then, with some agencies publishing updates daily, which may be an update of a feed (in which case feed_start_date would be in the past) and it may only be valid for the next 30 days (in which case feed_end_date would be less than 60 days away).
Interesting! I agree with your last paragraph.
IMHO, a past date for feed_start_date should be allowed. feed_end_date should cover at least the next 7 days, but it would be great if it covered the next 30 days. So less than 8 days (current day + 7 days): error; between 8 and 30 days: notice; 31 days (current day + 30 days) and more: nothing.
Btw: at my previous job, I noticed that most of the feeds providing feed_end_date and feed_start_date were providing a schedule for the next 2 months. Now I know who was responsible for that 😛
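To make the proposed feed_end_date thresholds concrete, here is a rough Python sketch. It is only an illustration, not the validator's implementation: the function name `classify_feed_end_date` and the returned labels are made up for this example, and it assumes the current day counts as day 1 (so "current day + 7 days" is the 8th day covered).

```python
from datetime import date, datetime


def classify_feed_end_date(feed_end_date: str, today: date) -> str:
    """Classify a YYYYMMDD feed_end_date against the proposed thresholds.

    Hypothetical helper: the name and the labels "error", "notice" and "ok"
    are assumptions made for this sketch.
    """
    end = datetime.strptime(feed_end_date, "%Y%m%d").date()
    # Number of days covered, counting the current day as day 1 (assumption).
    days_covered = (end - today).days + 1
    if days_covered < 8:
        return "error"   # does not cover the next 7 days
    if days_covered <= 30:
        return "notice"  # covers 7 days, but not the next 30
    return "ok"          # 31 days and more: nothing to report


if __name__ == "__main__":
    today = date(2020, 1, 1)
    for value in ("20200105", "20200115", "20200301"):
        print(value, classify_feed_end_date(value, today))
```

A feed_end_date in the past yields a negative day count, so it falls into the "error" branch as well.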
@timMillet are you aware of any use case where only one of the following fields would be provided: feed_start_date, feed_end_date?
I'm not aware of any - it's actually strange that these fields aren't explicitly conditionally required, in that if you provide one you must provide both.
Indeed, should we open a new issue to flag this as a warning? @barbeau
@lionel-nj Yes, I would flag one of the two fields being empty as a warning for now. However, I also think the spec should be updated to explicitly say if one field is provided the other must be too. That seems to be the intent from the description, but it's not explicitly defined.
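As a sketch of the "one of the two fields is empty" warning discussed here, something like the following could work. This is a hedged illustration only: `check_feed_date_pair` and the message wording are hypothetical, and the row is assumed to be a plain dict of column name to string value read from feed_info.txt.

```python
from typing import Dict, List, Optional


def check_feed_date_pair(feed_info_row: Dict[str, Optional[str]]) -> List[str]:
    """Return a warning if exactly one of the two feed date fields is set.

    Hypothetical helper: the function name and the warning text are
    assumptions made for this sketch, not part of any existing validator.
    """
    start = (feed_info_row.get("feed_start_date") or "").strip()
    end = (feed_info_row.get("feed_end_date") or "").strip()
    # Both set or both empty is fine; only a mismatch is flagged.
    if bool(start) == bool(end):
        return []
    missing = "feed_end_date" if start else "feed_start_date"
    provided = "feed_start_date" if start else "feed_end_date"
    return [f"warning: {provided} is provided but {missing} is empty"]


if __name__ == "__main__":
    print(check_feed_date_pair({"feed_start_date": "20200101", "feed_end_date": ""}))
    print(check_feed_date_pair({"feed_start_date": "20200101", "feed_end_date": "20200301"}))
```

Returning a list keeps the check easy to combine with other notices; an empty list means nothing to report.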
Some of these rules have been implemented in the linked PR. However, some questions remain. I invite you to read the PR description :)
@barbeau , @timMillet https://github.com/MobilityData/gtfs-validator/pull/270
Re-opening, since TYPE_FEED_EXPIRATION_DATE seems to be the only one of these currently supported.