Support Delete Series by time range
Currently Promscale just errors out if we send a request to delete by time: https://github.com/timescale/promscale/blob/master/pkg/api/delete.go#L50
I haven't fully understood Promscale's implementation, but it should be possible to construct SQL that deletes by time range. I wonder if there is a plan for Promscale to add that support.
As noted in Promscale's documentation, the delete_series endpoint does not support deleting data with a time range:
NOTE The start and end timestamp options are not currently supported. The delete_series HTTP API endpoint lets you delete the metric series only across an entire time range.
This section of the documentation provides guidance on how to delete data by time using SQL: https://docs.timescale.com/promscale/latest/manage-data/delete-data/#delete-metric-data
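For illustration, the SQL route comes down to a plain DELETE against the metric's hypertable. A minimal sketch, assuming a hypothetical metric named cpu_usage (Promscale stores each metric in its own table under the prom_data schema, with a time column):

-- Delete all samples of the (hypothetical) cpu_usage metric in a time window.
DELETE FROM prom_data.cpu_usage
WHERE time >= '2022-01-01 00:00:00+00'
  AND time <  '2022-01-02 00:00:00+00';

Note that, as the rest of this thread shows, such a DELETE fails on compressed chunks unless they are decompressed first.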
@zfy0701 could you share more details about why you want to delete by time?
@ramonguiu
Hi, we are trying to use Promscale to monitor some metrics derived from a blockchain in real time. However, a blockchain is an eventually consistent system: parts of the chain can change from time to time, which invalidates parts of the metrics computed from it, and we need to replace them with updated values.
We saw some articles from TimescaleDB about supporting blockchain data, but they mostly deal with offline data that doesn't need to change.
@ramonguiu a follow-up on this topic: I think I can delete data by running SQL directly. When I first tried, I occasionally got an error like this: ERROR: cannot update/delete rows from chunk "_hyper_29_5623_chunk" as it is compressed
Then I searched the Promscale docs: https://docs.timescale.com/promscale/latest/manage-data/delete-data/#delete-metric-data, which say I should decompress the data first. But when I do that, using
SELECT decompress_chunk(show_chunks('prom_data.xxxx'));
it often reports an error complaining that the latest chunk is uncompressed (therefore I can't decompress the whole thing).
It seems like I have to manually figure out which chunks to decompress. Is there an easier way of doing it? Thanks!
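For reference, one way to manually identify just the compressed chunks is TimescaleDB's informational view. A sketch, keeping the xxxx placeholder from above:

-- List only the chunks of the metric's hypertable that are currently compressed.
SELECT chunk_schema, chunk_name
FROM timescaledb_information.chunks
WHERE hypertable_schema = 'prom_data'
  AND hypertable_name = 'xxxx'
  AND is_compressed;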
@zongjiasushi it looks like you should be able to pass true as a second argument to decompress_chunk in order to tell it to skip uncompressed chunks (documentation here).
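Concretely, that turns the earlier call into something like this (a sketch; the second argument is decompress_chunk's if_compressed flag, so already-uncompressed chunks are skipped instead of raising an error):

-- Decompress all chunks, skipping any that are already uncompressed.
SELECT decompress_chunk(show_chunks('prom_data.xxxx'), true);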
I would add that, depending on the time range you're decompressing, you could end up (temporarily) using a large amount of storage to decompress and recompress chunks, and decompressing and recompressing is expensive in itself. If you need to replace data, it would probably be more efficient to first drop the affected chunks completely and then recompute and write the data for the whole time period. The default chunk size in Promscale is 8 hours, so if the data you're rewriting covers less than 8 hours you'll usually only be decompressing the most recent chunk; if you're writing data over much longer periods you'll be touching more and more chunks.
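For the drop-and-recompute approach, a minimal sketch using TimescaleDB's drop_chunks, again with the xxxx placeholder and an assumed 24-hour window; note that only chunks lying entirely within the given range are dropped, so you would then re-ingest that whole window:

-- Drop every chunk that lies entirely within the last 24 hours (assumed window),
-- then recompute and rewrite the data for that period.
SELECT drop_chunks('prom_data.xxxx', newer_than => now() - interval '24 hours');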
If you're interested in understanding this better, I can give you more insights. If you're happy with sticking to decompress_chunk then that's fine too.
Thanks a lot for the insights! I will try it out first and let you know how it goes!
@zongjiasushi Do you have an update for us?
Hi, it mostly worked, with occasional "can't decompress" errors, but a retry would almost always succeed. I will follow up if it happens again! Feel free to close the issue for now.