
Data Retention Requirements

Open quicklywilliam opened this issue 4 years ago • 1 comments

Is your feature request related to a problem? Please describe.

With MDS 1.2 and the Requirements API, it is possible for the first time for Agencies to specify which data they need, and which data should not be shared. This is a major step forward for Data Minimization by allowing for the creation of best practices and example Requirements files that describe how to make minimal data requests for common use cases.

A related step I'd love to see MDS take is to describe data retention practices. This could cover some or all of the following capabilities:

  • Allowing Agencies to specify a minimum duration for data retained on operators' systems
  • Allowing Agencies to specify a maximum duration for data retained on their own (or a chosen aggregator's) systems
  • Allowing operators to specify a maximum duration that they retain data, and/or
  • Allowing Agencies to specify a maximum duration for data retained on operators' systems

These requirements could be expressed at the API, endpoint and perhaps even the field level. Here are some example use cases:

  • An agency specifies that they require status events to be retained by operators (i.e., queryable) for at least 6 months.
  • An agency specifies that they will hold onto metrics data for up to five years, but will only hold onto disaggregate trip data for 1 month.
  • An agency interested in trying Geography Driven Events specifies that they will hold onto GDE event data for up to one year, and they will hold onto full event data for one month (to allow for the verification and evaluation of GDE data).
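To make the use cases above a bit more concrete, here is a minimal Python sketch of how retention rules at the API, endpoint, and field level might be modeled and checked. Everything in it is an assumption for illustration: `RetentionRule`, its field names, the `holder` values, and the day counts standing in for "6 months", "five years", and "1 month" are all hypothetical, and none of it is part of the Requirements API today.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class RetentionRule:
    """Hypothetical retention rule; these field names do not exist in MDS."""
    holder: str                      # "operator" or "agency": whose systems hold the data
    api: str                         # illustrative API name, e.g. "provider"
    endpoint: Optional[str] = None   # None: the rule covers the whole API
    field: Optional[str] = None      # None: the rule covers the whole endpoint
    min_days: Optional[int] = None   # data must stay queryable at least this long
    max_days: Optional[int] = None   # data must be deleted after this long


def most_specific_rule(rules: Iterable[RetentionRule], holder: str, api: str,
                       endpoint: str, field: Optional[str] = None) -> Optional[RetentionRule]:
    """Prefer a matching field-level rule, then endpoint-level, then API-level."""
    best, best_score = None, (-1, -1)
    for rule in rules:
        if rule.holder != holder or rule.api != api:
            continue
        if rule.endpoint not in (None, endpoint) or rule.field not in (None, field):
            continue
        score = (int(rule.endpoint is not None), int(rule.field is not None))
        if score > best_score:
            best, best_score = rule, score
    return best


def must_keep(rule: RetentionRule, recorded_at: datetime, now: datetime) -> bool:
    """True while the record is still inside the minimum retention window."""
    return rule.min_days is not None and now - recorded_at < timedelta(days=rule.min_days)


def must_delete(rule: RetentionRule, recorded_at: datetime, now: datetime) -> bool:
    """True once the record has outlived the maximum retention window."""
    return rule.max_days is not None and now - recorded_at > timedelta(days=rule.max_days)


# The first two use cases above, with months and years approximated in days (an assumption).
EXAMPLE_RULES = [
    RetentionRule(holder="operator", api="provider", endpoint="status_changes", min_days=183),
    RetentionRule(holder="agency", api="metrics", max_days=5 * 365),
    RetentionRule(holder="agency", api="provider", endpoint="trips", max_days=30),
]

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    trips_rule = most_specific_rule(EXAMPLE_RULES, "agency", "provider", "trips")
    print(must_delete(trips_rule, now - timedelta(days=45), now))
```

Resolving to the most specific matching rule is just one possible design. It lets an agency set a broad default for an API and tighten it for a sensitive endpoint or field, but other precedence schemes would work too.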

A note on aggregate data

My intention with this issue is to allow for explicit requirements around the retention of raw (particularly sensitive) data being delivered via MDS. This seemed like a good place to draw the line for now, but aggregate data can also sometimes be sensitive. It's an open, related question whether and how MDS might someday attempt to describe the retention of data that has been aggregated, analyzed, etc.

Describe the solution you'd like

I'm imagining this as an extension to the Requirements API.

Is this a breaking change

Probably, insofar as it might allow for the creation of breaking requirements on operator implementations. Possibly this could be avoided with a conditional period, as we allowed for with Requirements in 1.2.

Impacted Spec

For which spec is this feature being requested?

  • requirements

Additional context

See also #608 and #646.

quicklywilliam avatar Aug 30 '21 21:08 quicklywilliam

We may be able to add some of these items to MDS 2.0 with PR #813, but I'm not sure how best to add them. Do you have ideas about which are both the highest priority to add and straightforward to implement, @quicklywilliam?

schnuerle avatar Dec 15 '22 13:12 schnuerle