Data-Product-Specification
Add retentionTime information
At the moment, the DP descriptor exposes only the startDate of the dataset, but not every dataset has an infinite retention time. A field is needed to represent the time window during which data is kept available in the dataset. The proposed solution is to add a retentionTime field at the same level as startDate.
I agree with the proposal. Very often, data exposed by an output port has a fixed retention time that ranges from days (in the case of output ports of type "events") to years (for output ports of type "Files").
Currently the specification has only the startDate field to hold this information, and its value could be either a date or a time interval.
It would be beneficial to reserve startDate for the timestamp of the oldest data initially published on the output port, and to add retentionTime for specifying the time interval of the retention policy (if any).
With these two fields it is also possible to express transient states. For example, an output port with a retention time of 1 year can first be published with 1 month of data. It will have a startDate fixed to 1 month before the date of publication, and retentionTime = 1Y. During the first year the output port is "accumulating data" and startDate indicates the oldest data published. After the first year the retention policy kicks in, and the oldest available data is determined by retentionTime, all without changing the descriptor.
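To make the transient-state reasoning concrete, here is a minimal Python sketch of how a consumer might derive the oldest available data from the two fields. It is only an illustration: the function names `oldest_available` and `is_accumulating` are hypothetical, and the retention is assumed to be expressed as a `timedelta` (365 days standing in for "1Y"), since the format of retentionTime is not decided in this issue.

```python
from datetime import date, timedelta
from typing import Optional


def oldest_available(start_date: date,
                     retention: Optional[timedelta],
                     today: date) -> date:
    """Return the date of the oldest data a consumer can expect to find.

    start_date: oldest data initially published (the startDate field).
    retention:  retention window (the proposed retentionTime field),
                or None when data is kept indefinitely.
    """
    if retention is None:
        # No retention policy: everything since startDate is still there.
        return start_date
    # Data older than (today - retention) has been dropped, but nothing
    # older than startDate was ever published.
    return max(start_date, today - retention)


def is_accumulating(start_date: date,
                    retention: Optional[timedelta],
                    today: date) -> bool:
    """True while the port holds less history than the retention window."""
    if retention is None:
        return True
    return today - retention < start_date


# Hypothetical example: published on 2023-02-01 with one month of data.
start = date(2023, 1, 1)
one_year = timedelta(days=365)  # assumption: "1Y" approximated as 365 days

# Six months after publication the port is still accumulating data.
print(oldest_available(start, one_year, date(2023, 8, 1)))
print(is_accumulating(start, one_year, date(2023, 8, 1)))

# Two years later retention has kicked in; the descriptor is unchanged.
print(oldest_available(start, one_year, date(2025, 1, 1)))
print(is_accumulating(start, one_year, date(2025, 1, 1)))
```

The descriptor values stay constant in both calls; only the current date changes, which is the point of keeping startDate and retentionTime as separate fields.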
CC: @agile-lab @erond