
📄🚀 – options for defining requirements for end-uses of TIDES data

Open botanize opened this issue 2 years ago • 4 comments

Describe the feature you want and how it meets your needs or solves a problem

People want a way of standardizing which optional fields are required for various end-uses of TIDES data. How do I tell a vendor I need data in TIDES format, with at least the fields required for NTD service supplied reporting?

Describe the solution you'd like

In order of preference: Fork Repo, then Require Everything, then Feature Flags.

Describe alternatives you've considered

  • Fork Repo: fork the TIDES repo, change the spec to make the fields you need required.
    • Pros:
      • forker maintains control over requirements
      • validation is easy with existing tools
    • Cons:
      • requirements not standardized across agencies
  • Feature Flags: add a features property to each field in the table specs. features is an array of strings naming the features that require the field, e.g., "features": [ "Playback", "NTDServiceSupplied" ].
    • Pros:
      • standardizes requirements for common end-uses
    • Cons:
      • more difficult to add or remove fields from the spec
      • requires building a validator that supports the feature flags
      • must know the requirements a priori
      • tools that produce the same output (e.g., NTD service supplied) could have different requirements depending on methods, or their own optional features (e.g., a departure prediction engine that predicts dwell time from APC data has very different requirements from one that predicts dwell time from historical dwells)
  • Require Everything: require all tables and fields unless the vendor can demonstrate that they are not applicable to the system.
    • Pros:
      • simple
    • Cons:
      • self-certification of compliance can be problematic
      • validation would require forking the TIDES spec and setting the required constraint based on the vendor-negotiated requirements
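
For illustration, here is a minimal Python sketch of how the Feature Flags and Fork Repo options could meet: it reads a hypothetical features property from each field and produces a derived schema with the Table Schema required constraint set, which existing tools could then validate against. The trimmed-down vehicle_locations schema, the feature assignments, and the function names are all assumptions made for this example, not part of the TIDES spec.

```python
import copy

# Hypothetical, trimmed-down table schema in Table Schema style.
# The "features" property is the proposal from this issue, not part of TIDES,
# and the feature assignments below are illustrative assumptions only.
VEHICLE_LOCATIONS = {
    "name": "vehicle_locations",
    "fields": [
        {"name": "timestamp", "type": "datetime",
         "features": ["Playback", "NTDServiceSupplied"]},
        {"name": "vehicle_id", "type": "string",
         "features": ["Playback", "NTDServiceSupplied"]},
        {"name": "latitude", "type": "number", "features": []},
        {"name": "trip_id_performed", "type": "string",
         "features": ["NTDServiceSupplied"]},
    ],
}


def require_for_features(schema, wanted):
    """Return a copy of schema with required=True on every field
    flagged for at least one of the wanted features."""
    derived = copy.deepcopy(schema)  # leave the source schema untouched
    wanted = set(wanted)
    for field in derived["fields"]:
        if wanted & set(field.get("features", [])):
            field.setdefault("constraints", {})["required"] = True
    return derived


def required_field_names(schema):
    """List the names of fields whose required constraint is set."""
    return [
        f["name"]
        for f in schema["fields"]
        if f.get("constraints", {}).get("required")
    ]


playback_schema = require_for_features(VEHICLE_LOCATIONS, ["Playback"])
ntd_schema = require_for_features(VEHICLE_LOCATIONS, ["NTDServiceSupplied"])
```

The derived schema is the equivalent of a fork generated on demand, so it inherits the cons listed above: the feature assignments still have to be known a priori.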

Additional context and sample data

Describing the features required for a playback tool is a good example of the pitfalls of setting requirements based on features.

A Playback tool can use every field of the vehicle_locations, passenger_events and fare_transactions tables, as well as additional event data that aren't (yet) part of the TIDES spec, and it doesn't require some of the required fields, like trip_id_performed. The only absolutely required fields of vehicle_locations are probably timestamp and vehicle_id, since vehicle position may not always be available (position is optional in GTFS-realtime VehiclePositions).

You may want a field to be required but still allow nulls when information isn't available. For example, you might require latitude and longitude but allow them to be null when GPS is unavailable. Frictionless doesn't allow this: null or missing values are not permitted in required fields.
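
As a rough sketch of the distinction, a standalone check could treat "the column must be present" and "the value must be non-null" as two separate requirements. This is plain Python, not Frictionless behavior, and the function name, the column sets, and the sample rows are hypothetical.

```python
def check_rows(rows, must_exist, must_have_value):
    """Return a list of (row_index, column, problem) tuples.

    must_exist: columns that must be present in every row; nulls allowed.
    must_have_value: columns that must be present and non-null.
    """
    problems = []
    for i, row in enumerate(rows):
        # sorted() keeps the report order deterministic
        for col in sorted(must_exist | must_have_value):
            if col not in row:
                problems.append((i, col, "missing column"))
            elif col in must_have_value and row[col] in (None, ""):
                problems.append((i, col, "null value"))
    return problems


# Made-up sample rows for illustration.
rows = [
    # GPS unavailable: position columns present but null, which is accepted
    {"timestamp": "2022-12-15T21:12:00Z", "vehicle_id": "1001",
     "latitude": None, "longitude": None},
    # empty timestamp: rejected
    {"timestamp": "", "vehicle_id": "1001",
     "latitude": 44.9, "longitude": -93.2},
    # longitude column absent entirely: rejected
    {"timestamp": "2022-12-15T21:13:00Z", "vehicle_id": "1002",
     "latitude": 44.9},
]

problems = check_rows(
    rows,
    must_exist={"latitude", "longitude"},
    must_have_value={"timestamp", "vehicle_id"},
)
```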

Finally, adding feature flags complicates changes to the spec. If a field has a feature flag and we decide it should be removed, does that mean the feature will break? If we want to add a field, do we need to figure out which features would require it? How do feature flags interact with versioning? There's a desire for a stable document for RFP requirements, but what happens when you discover an optional field is required for a feature? Do you have to update the version?

botanize, Dec 15 '22 21:12