mimir
Support deleting or rewriting series in long-term storage.
Is your feature request related to a problem? Please describe.
I would like the ability to delete, edit, or replace samples that have already been committed to long-term storage in these cases:
- There is sensitive data in label values that needs to be redacted. In this case I might like to re-write the label values to a one-way hash of the original value, to keep the series but remove the sensitive information.
- There are sensitive label keys that need to be redacted. In this case, I'd probably like to delete series with any values for that label key entirely.
- A recording rule was recording incorrect results because the query was incorrect. After fixing the recording rule, there will be a discontinuity, and the user needs to remember the reason the discontinuity exists. Best practice here might be to record a new series after updating the recording rule, but that may not be feasible because of upstream uses of the recorded series. In this case, I'd like to delete values before the discontinuity or perhaps replace them with corrected values for the series.
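The first two cases can be sketched at the label-set level. This is only an illustration of the intended transformation, not a Mimir or Prometheus API: `redact_labels`, `drop_series_with_keys`, the `salt` parameter, and the `user_email` label are all hypothetical names, and a real tool would have to apply the same logic while rewriting TSDB blocks.

```python
import hashlib


def redact_labels(labels, sensitive_keys, salt=b""):
    """Return a copy of `labels` with sensitive values replaced by a SHA-256 hex digest.

    Hypothetical helper: the series keeps its identity (same value -> same hash),
    but the original label value is no longer stored.
    """
    redacted = {}
    for key, value in labels.items():
        if key in sensitive_keys:
            redacted[key] = hashlib.sha256(salt + value.encode("utf-8")).hexdigest()
        else:
            redacted[key] = value
    return redacted


def drop_series_with_keys(series, forbidden_keys):
    """Return only the label sets that carry none of the forbidden label keys."""
    return [
        labels
        for labels in series
        if not any(key in labels for key in forbidden_keys)
    ]


series = [
    {"__name__": "http_requests_total", "user_email": "a@example.com"},
    {"__name__": "http_requests_total", "job": "api"},
]
rewritten = [redact_labels(s, {"user_email"}) for s in series]
kept = drop_series_with_keys(series, {"user_email"})
```

Note that an unsalted hash of a low-entropy value (such as an email address from a known list) can be reversed by brute force, so a secret salt is probably needed for real redaction.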
Describe the solution you'd like
If the Prometheus delete series API was supported, I could address the minimum needs of these cases by removing sensitive or incorrect samples.
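For reference, this is roughly what a call to the delete series endpoint looks like against a plain Prometheus server with the admin API enabled (`--web.enable-admin-api`). The sketch only builds the request; the base URL, the `user_email` matcher, and the helper name `build_delete_series_request` are placeholders, and nothing here implies that Mimir serves this endpoint today.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_delete_series_request(base_url, matchers, start=None, end=None):
    """Build a POST request for Prometheus' /api/v1/admin/tsdb/delete_series.

    `matchers` is a list of series selectors, sent as repeated match[] parameters.
    `start` and `end` are optional RFC 3339 or Unix timestamps bounding the deletion.
    """
    params = [("match[]", m) for m in matchers]
    if start is not None:
        params.append(("start", start))
    if end is not None:
        params.append(("end", end))
    url = base_url.rstrip("/") + "/api/v1/admin/tsdb/delete_series?" + urlencode(params)
    return Request(url, method="POST")


# Placeholder target: a local Prometheus, deleting every series that has a
# non-empty user_email label, up to the given end time.
req = build_delete_series_request(
    "http://localhost:9090",
    ['{user_email!=""}'],
    end="2024-01-01T00:00:00Z",
)
# Sending it would be: urllib.request.urlopen(req)
```

In Prometheus this call only writes tombstones; the samples are physically removed at a later compaction or by calling the clean tombstones endpoint, so an equivalent for long-term storage would presumably need a similar two-step design.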
For the more complex case: A Mimir feature that makes it possible to replace TSDB blocks in the block index could also be combined with some utilities to download and modify TSDB blocks outside of Mimir itself. The rewrite utilities could be provided by a separate project if they'd be broadly useful to the Prometheus ecosystem. This feature is slightly different than TSDB remote block upload in that it combines new block upload and block deletion into an atomic operation so that the replacement and original block are never queried at the same time by a store-gateway instance.
Describe alternatives you've considered
We could Rube Goldberg together some existing tools to achieve something similar:
- Download the block index.
- Block-by-block, download a copy and edit it. This would involve the same sort of block edit tools as in the sketch above.
- Delete the block from long-term storage. I don't know how this affects the bucket index.
- Use TSDB remote block upload to replace the block.
Is there any progress on this or an ETA?
FWIW, I found this guide on how to do this in Thanos: https://thanos.io/tip/operating/modify-objstore-data.md/ - it might serve as good inspiration for how to do the same in Mimir?
On the few occasions when we needed this functionality, we used the bucket rewrite tool from Thanos. That only works when one has direct access to the blocks in long-term storage, so it is not usable by end users.
Is this something that will be supported in the future?
@zhehao-grafana this is quite similar to the project in planning about removing sensitive data from the object store. Do you think we would support this, at least as a CLI, anytime soon?
I think we would like to support it in the future, but it is not on the immediate plan. If any community member in Mimir has time to start the work, we will find people to help review and push it.
Would love to see that feature implemented as well. We have a lot of use cases for it to be honest. All the mentioned ones + fixing backfilled data that was wrongly uploaded.
+1
I would like to see this feature implemented as well.
+1