GAIA-DataSet
Duplicate timestamps in metrics
There are duplicate timestamps in many metrics. Some of these duplicates have the same value, but often the same timestamp appears in multiple rows with different values. Usually in such cases, one of the rows has a valid value and the remaining rows are 0. Can I just take the non-zero row as the correct one for that timestamp? Is this expected given how you collected and compiled the data? Thanks.
Thank you for your interest in the GAIA-DataSet. First, this situation is normal in general: uncertainty in the data collection process can cause multiple records to be written under the same timestamp. Second, some metrics in the GAIA dataset (mainly those whose filenames start with "system") were recorded without tags, so data from different time series ended up recorded together. In such cases, you need to aggregate the metric data in a way that suits the specific metric. For example, the "system_network_out_dropped" metric can be aggregated with the sum function.
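As a rough illustration of the aggregation described above, here is a minimal Python sketch. It assumes each metric row can be read as a (timestamp, value) pair; that layout, the function name `aggregate_duplicates`, and the sample numbers are all hypothetical and not taken from the dataset.

```python
from collections import defaultdict


def aggregate_duplicates(rows, agg=sum):
    """Collapse rows that share a timestamp into one value per timestamp.

    rows: iterable of (timestamp, value) pairs (assumed layout).
    agg:  function applied to the list of values seen for each timestamp.
    """
    grouped = defaultdict(list)
    for timestamp, value in rows:
        grouped[timestamp].append(value)
    # Return the result ordered by timestamp.
    return {ts: agg(values) for ts, values in sorted(grouped.items())}


# Hypothetical sample: untagged series written under the same timestamps.
rows = [
    (1625133600000, 3.0),
    (1625133600000, 0.0),
    (1625133630000, 0.0),
    (1625133630000, 5.0),
    (1625133660000, 2.0),
]

# Sum, as suggested for "system_network_out_dropped".
summed = aggregate_duplicates(rows)
# A different aggregation can be passed in; max here is only an example.
peak = aggregate_duplicates(rows, agg=max)
```

When every duplicate row other than one is 0, summing gives the same result as keeping the non-zero row, so the two approaches agree in that case. Which function fits other metrics depends on what the metric measures.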