
Duplicate timestamps in metrics

Open mistycheney opened this issue 1 year ago • 1 comment

There are duplicate timestamps in many metrics. Some of these duplicates have the same value, but often the same timestamp appears in multiple rows with different values. Usually in such cases, one of these rows has a valid value and the remaining rows are 0. Can I just take the non-zero row as the correct row to use for this timestamp? Is this expected, given how you collected and compiled the data? Thanks.

mistycheney, Mar 27 '23 01:03

Thank you for your interest in the GAIA dataset. First, this situation is generally expected: the data collection process involves some uncertainty, so multiple records can end up under the same timestamp. Second, some metrics in the GAIA dataset (mainly those whose filenames start with "system") were recorded without tags, so data from different time series were recorded together. In this case, the metric data need to be aggregated in a way that suits the specific metric. For example, the "system_network_out_dropped" metric can be aggregated with the sum function.
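As a rough illustration of that kind of aggregation, here is a minimal Python sketch. It assumes a metric file with `timestamp` and `value` columns; the sample rows, the column names, and the helper `aggregate_by_timestamp` are hypothetical and not taken from the dataset or its tooling.

```python
import csv
import io
from collections import defaultdict


def aggregate_by_timestamp(rows, agg=sum):
    """Collapse all rows that share a timestamp into one value using `agg`."""
    grouped = defaultdict(list)
    for ts, value in rows:
        grouped[ts].append(value)
    # Sort by timestamp so the result reads as an ordered time series.
    return {ts: agg(values) for ts, values in sorted(grouped.items())}


# Hypothetical sample data; a real metric file may use different columns.
sample_csv = """timestamp,value
1625133600000,0
1625133600000,3
1625133660000,2
1625133660000,2
1625133720000,5
"""

reader = csv.DictReader(io.StringIO(sample_csv))
rows = [(int(r["timestamp"]), float(r["value"])) for r in reader]

# Sum across rows, as suggested for "system_network_out_dropped".
summed = aggregate_by_timestamp(rows, agg=sum)

# Alternative: keep the largest value per timestamp, which here matches
# "take the non-zero row" when the other rows are 0.
peak = aggregate_by_timestamp(rows, agg=max)
```

Which function fits depends on the metric. Summing seems reasonable when the rows come from different untagged series of a counter-like metric, but it would double-count a record that was simply written twice, so exact duplicates may need to be dropped first.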

Xander-cloudwise, Apr 12 '23 07:04