influxdb
InfluxDB delete with predicate works for _measurement, but not for specific _field
Steps to reproduce:
Here is the code I use (in Python) to try to delete a specific _field:
import requests

# INFLUX_URL, ORG and MY_TOKEN are defined elsewhere.
def delete_data(data: dict, org: str, bucket: str, token: str, db_url=INFLUX_URL) -> requests.Response:
    # POST a delete request (time range plus predicate) to the v2 delete endpoint.
    url = f'{db_url}/api/v2/delete?org={org}&bucket={bucket}'
    headers = {
        'Authorization': f'Token {token}',
        'Content-type': 'application/json'
    }
    response = requests.post(url, json=data, headers=headers)
    response.raise_for_status()
    return response

data = {
    "start": "2020-12-01T00:00:00Z",
    "stop": "2021-12-20T00:00:00Z",
    "predicate": "_field=\"my_field\""
}
r = delete_data(data=data, org=ORG, bucket='my_bucket', token=MY_TOKEN)
I also tried with curl, same story:
curl --request POST "<INFLUX_URL>/api/v2/delete/?org=<org>&bucket=<bucket>" -H 'Authorization: Token <MY_TOKEN>' -H 'Content-Type: application/json' --data '{
"start": "2020-12-01T00:00:00Z",
"stop": "2020-12-20T00:00:00Z",
"predicate": "_field=\"my_field\""
}'
This does not delete the _field.
The same code works just fine with _measurement.
Expected behavior: I expect the data to be deleted.
Actual behavior: Nothing.
Environment info:
- System info: Linux 4.15.0-128-generic x86_64
- InfluxDB version: I am running it through docker: quay.io/influxdb/influxdb:v2.0.3 (when I do influxd version it does not show the docker instance)
- Other relevant environment details: Container runtime is up 7 days.
@snenkov can you provide some extra info to help me set up a realistic reproduction? Ideally:
- Example line-protocol inputs that would be affected by your delete call (ideally with multiple field=value pairs in their field-sets)
- Expected output of querying the bucket after executing the delete on the inputs
Here is some example data: 2020-12-22-21-42_chronograf_data.csv.zip
Once the delete is executed, there should be no _field of close_price (so each row with this field should be gone).
(Disclaimer: not an InfluxQL expert)
I think delete-with-predicate might be running into trouble because the InfluxQL engine pivots _field values into column names, so a search for the _field tag finds nothing. I.e. you wouldn't write this query in V1:
select * from price_volume_1_min where _field = "close_price"
Instead, you'd probably write:
select "close_price" from price_volume_1_min
I'm not sure of the best way to extend support for deleting via _field. SQL has 2 separate APIs for these use-cases (DELETE WHERE vs. DROP COLUMN); we might want to take a similar route, or else rearrange our query logic so the filters run before data is pivoted.
Thanks @danxmoran. This confirms the bug then. So, what are the options to delete it now? Is there work which needs to be done on the backend, or some loophole that we can use for now?
I'm honestly not sure. I'll raise it internally, but I'm not sure how much attention it'll get until after the holiday season since so many people are on vacation.
Okay cool. Thanks. Yeah now is kind of a laid-back time. :) It’s important to have it fixed at some point. Otherwise we can’t really delete data :/
#6150 is the (ancient) parallel from the 1.x line
I cannot believe this does not exist. It's quite crazy : / Thanks for linking to the #6150 . I hope it gets resolved soon.
Thanks to Dan for pointing me here.
This is a major problem. Not only is a broken feature documented as something that works, some people (such as myself) are totally blocked from the most basic database functionality; deleting data points in a time series. I am blocked.
Since I picked up the 2.x version, I have not yet once successfully deleted any individual data points. This feature is long overdue for a released product.
Agreed. It is documented as working and it seems a necessary feature when working with time series.
Any update on this? I also recently ran into this when attempting to onboard a new dataset where the value type was set incorrectly. It seems I now have to delete the entire measurement, as I cannot do a _field="name" delete.
The docs clearly state that "Delete predicates can use any column or tag except _time or _value." This should be updated to add _field, as this clearly does not work.
Still nothing on this after such a long time? It's like creating a file system and releasing it without a way to delete some of the files.
We need a way to clean up data that is no longer needed or, in my case, was added by mistake. A fundamental feature in any database, I might say.
curl --request POST "influx:/api/v2/delete?org=Org&bucket=heating" --header 'Authorization: Token api' --header 'Content-Type: application/json' --data '{
"start": "2021-12-26T02:08:00Z",
"stop": "2022-01-09T02:08:03Z",
"predicate": "_time=\"2022-01-06T17:05:47Z\" "
}'
Time tag not working.
I don't get it. I was able to delete a specific _field in InfluxDB Cloud but not on 2.0.7. Is this really NOT supported in the OSS version?
If so, please update the documentation here: https://docs.influxdata.com/influxdb/v2.0/reference/syntax/delete-predicate/ which states that it cannot be used with _time or _value but doesn't mention that it doesn't work with _field.
This issue has been silent for so long, how come this has not even been implemented yet? Deleting data based on the common name seems like the most basic of features, doesn't it?
I could work around the issue by saving the measurement to a temporary measurement, deleting the original measurement, and then restoring the measurement from the copy.
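The workaround above could be planned roughly as follows. This is a hedged sketch, not a tested procedure: the helper names, the `_tmp` suffix, and the choice to filter out the unwanted field during the first copy are assumptions, and the generated Flux has not been run against a live server. It only builds the Flux scripts and delete payloads; executing them, and verifying the copy before deleting anything, is left to the reader.

```python
def build_copy_flux(bucket, src, dst, start, stop, drop_field=None):
    """Return a Flux script that copies measurement `src` into `dst`.

    If `drop_field` is given, points of that field are left out of the copy.
    """
    condition = f'r._measurement == "{src}"'
    if drop_field is not None:
        condition += f' and r._field != "{drop_field}"'
    return "\n".join([
        f'from(bucket: "{bucket}")',
        f'  |> range(start: {start}, stop: {stop})',
        f'  |> filter(fn: (r) => {condition})',
        f'  |> set(key: "_measurement", value: "{dst}")',
        f'  |> to(bucket: "{bucket}")',
    ])


def build_measurement_delete(measurement, start, stop):
    """Return a delete payload that uses a _measurement predicate."""
    return {
        "start": start,
        "stop": stop,
        "predicate": f'_measurement="{measurement}"',
    }


def plan_field_removal(bucket, measurement, field, start, stop, tmp_suffix="_tmp"):
    """List the four steps: copy out, delete original, copy back, delete temp."""
    tmp = measurement + tmp_suffix  # hypothetical temporary measurement name
    return [
        ("flux", build_copy_flux(bucket, measurement, tmp, start, stop, drop_field=field)),
        ("delete", build_measurement_delete(measurement, start, stop)),
        ("flux", build_copy_flux(bucket, tmp, measurement, start, stop)),
        ("delete", build_measurement_delete(tmp, start, stop)),
    ]


steps = plan_field_removal(
    "my_bucket", "price_volume_1_min", "close_price",
    "2020-12-01T00:00:00Z", "2021-12-20T00:00:00Z",
)
for kind, body in steps:
    print(kind, body, sep="\n", end="\n\n")
```

Each "delete" payload has the same shape as the request bodies posted earlier in this thread, and only relies on the _measurement predicate that is reported to work.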
Is there any update on this? This is a really bad bug and one of the things that bug me a lot in V2.0.
I get it that you need to make money and cannot offer all features in the OSS version. But this is a really really basic feature, which IMHO should be included.
Still does not work in 2.6.1
Seems to be still fucked; I do
# influx delete --org mini31 --bucket solar --start 2020-01-01T00:00:00Z --stop 2023-02-19T00:00:00Z --predicate '_field="batt_level"'
# influx version
Influx CLI 2.6.1 (git: 61c5b4d) build_date: 2022-12-29T15:41:09Z
and nothing happens.
InfluxDB OSS 2.x does not support deleting by field, as noted in the documentation: https://docs.influxdata.com/influxdb/v2.6/write-data/delete-data/#cannot-delete-data-by-field
This is a limitation of the underlying storage engine used in 2.x.
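Since such deletes are reported in this thread to do nothing without raising an error, a client-side guard can at least fail loudly. The following is a minimal sketch: `build_delete_payload` and `UNSUPPORTED_COLUMNS` are hypothetical names, and the check is a simple, deliberately conservative regex (it also rejects a quoted value that is exactly `_field`), not a real predicate parser.

```python
import re

# Columns that, per the docs and this thread, cannot be used in a
# delete predicate on InfluxDB OSS 2.x.
UNSUPPORTED_COLUMNS = ("_field", "_time", "_value")


def build_delete_payload(start, stop, predicate):
    """Build a delete request body, rejecting predicates on unsupported columns."""
    for column in UNSUPPORTED_COLUMNS:
        # \b keeps "my_field" from matching "_field".
        if re.search(rf"\b{re.escape(column)}\b", predicate):
            raise ValueError(
                f"delete predicates on {column} are not supported in InfluxDB OSS 2.x"
            )
    return {"start": start, "stop": stop, "predicate": predicate}


ok = build_delete_payload(
    "2020-12-01T00:00:00Z", "2021-12-20T00:00:00Z",
    '_measurement="price_volume_1_min"',
)
print(ok)

try:
    build_delete_payload(
        "2020-12-01T00:00:00Z", "2021-12-20T00:00:00Z", '_field="my_field"'
    )
except ValueError as err:
    print("rejected:", err)
```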
This is a limitation of the underlying storage engine used in 2.x.
Yes, everyone knows. Still, this is not a limitation; it is a bug from a user's point of view. This is a really simple operation, which worked in 1.x, and in 2.x it just fails silently.
I can recommend that everyone look for alternatives. Such ridiculous limitations, together with the powerful but weird-ass query language not suitable for non-techies, make Influx 2.x just not worth considering anymore.
InfluxDB OSS 2.x does not support deleting by field, as noted in the documentation: https://docs.influxdata.com/influxdb/v2.6/write-data/delete-data/#cannot-delete-data-by-field
That's a nice way to handle bugs: just change the specification instead of fixing the issue... (Yes, sometimes bugs in design require significant changes.)
I was also looking into why I could not delete using a _field predicate. Thanks for ending this little quest with a clear (yet surprising) answer!
So this will never get addressed, huh??
When looking at how to structure your data, the use of 'measurement' is very intuitive. However, if you can never use 'measurement' to simplify data deletion, then why is it still here? While tags are far less intuitive to use for structuring data, they may be the only way to clean up data more easily.
This is the most frustrating aspect of InfluxDB. The general outcome is that I don't bother cleaning things up as much as I should. I am considering restructuring to use tags instead of measurement as a result, but suspect that this comes at a cost in efficiency. But if I need to go to this length to make deleting data simpler, then I should probably look at other options for working with time series data, given that the solutions have been evolving.
It would be good if someone fixed this "bug". Yes, it is a bug if you can't perform such a simple task as deleting points from a field. Deleting was implemented in earlier versions, so I can't see any reason not to implement such a basic function in the latest versions. Don't throw dirt on an otherwise great product.