InfluxDB delete with predicate works for _measurement, but not for specific _field

Open newskooler opened this issue 4 years ago • 26 comments

Steps to reproduce: List the minimal actions needed to reproduce the behavior.

Here is the code I use (in python) to try and delete a specific _field:

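import requests

def delete_data(data: dict, org: str, bucket: str, token: str, db_url=INFLUX_URL) -> requests.Response:
    # POST a delete request (start, stop, predicate) to the InfluxDB v2 delete endpoint.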
    url = f'{db_url}/api/v2/delete?org={org}&bucket={bucket}'
    headers = {
        'Authorization': f'Token {token}',
        'Content-type': 'application/json'
    }
    response = requests.post(url, json=data, headers=headers)
    response.raise_for_status()
    return response

data = {
    "start": "2020-12-01T00:00:00Z",
    "stop": "2021-12-20T00:00:00Z",
    "predicate": "_field=\"my_field\""
}
r = delete_data(data=data, org=ORG, bucket='my_bucket', token=MY_TOKEN)

I also tried with curl - same story:

curl --request POST "<INFLUX_URL>/api/v2/delete/?org=<org>&bucket=<bucket>" -H 'Authorization: Token <MY_TOKEN>'   -H 'Content-Type: application/json'   --data '{
    "start": "2020-12-01T00:00:00Z",
    "stop": "2020-12-20T00:00:00Z",
    "predicate": "_field=\"my_field\""
  }'

This does not delete the _field. The same code works just fine with _measurement.

Expected behavior: I expect the data to be deleted.

Actual behavior: Nothing.

Environment info:

  • System info: Linux 4.15.0-128-generic x86_64
  • InfluxDB version: I am running it through docker: quay.io/influxdb/influxdb:v2.0.3 (when I do influxd version it does not show the docker instance)
  • Other relevant environment details: Container runtime is up 7 days.

newskooler avatar Dec 22 '20 18:12 newskooler

@snenkov can you provide some extra info to help me set up a realistic reproduction? Ideally:

  • Example line-protocol inputs that would be affected by your delete call (ideally with multiple field=value pairs in their field-sets)
  • Expected output of querying the bucket after executing the delete on the inputs

danxmoran avatar Dec 22 '20 19:12 danxmoran

Here is some example data: 2020-12-22-21-42_chronograf_data.csv.zip

Once the delete is executed, there should be no _field of close_price (so each row with this field should be gone).

newskooler avatar Dec 22 '20 19:12 newskooler

(Disclaimer: not an InfluxQL expert)

I think delete-with-predicate might be running into trouble because the InfluxQL engine pivots _field values into column names, so a search for the _field tag finds nothing. i.e. you wouldn't write this query in V1:

select * from price_volume_1_min where _field = "close_price"

Instead, you'd probably write:

select "close_price" from price_volume_1_min

I'm not sure of the best way to extend support for deleting via _field. SQL has 2 separate APIs for these use-cases (DELETE WHERE vs. DROP COLUMN); we might want to take a similar route, or else rearrange our query logic so the filters run before data is pivoted.
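The pivot hypothesis above can be illustrated with a toy model. This is only a sketch of the idea, not how the InfluxDB engine is implemented: the sample rows and the `pivot` helper are hypothetical, and the measurement and field names are borrowed from the example data in this thread.

```python
from collections import defaultdict

# Hypothetical rows in the unpivoted shape: one row per field value.
rows = [
    {"_time": "2020-12-22T10:00:00Z", "_measurement": "price_volume_1_min",
     "_field": "close_price", "_value": 101.5},
    {"_time": "2020-12-22T10:00:00Z", "_measurement": "price_volume_1_min",
     "_field": "volume", "_value": 300},
]

def pivot(rows):
    """Turn one-row-per-field data into one-row-per-timestamp data."""
    grouped = defaultdict(dict)
    for r in rows:
        key = (r["_time"], r["_measurement"])
        grouped[key][r["_field"]] = r["_value"]
    return [{"_time": t, "_measurement": m, **fields}
            for (t, m), fields in grouped.items()]

pivoted = pivot(rows)

# Before the pivot, a filter on _field matches the close_price row.
before = [r for r in rows if r.get("_field") == "close_price"]
# After the pivot, field names are column names and no row has a "_field" key,
# so the same filter matches nothing.
after = [r for r in pivoted if r.get("_field") == "close_price"]
```

In this model a predicate on `_field` only has something to match if it runs before the pivot, which is the reordering suggested above.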

danxmoran avatar Dec 22 '20 20:12 danxmoran

Thanks @danxmoran. This confirms the bug then. So, what are the options to delete it now? Is there work which needs to be done on the backend, or some loophole that we can use for now?

newskooler avatar Dec 22 '20 21:12 newskooler

I'm honestly not sure. I'll raise it internally, but I'm not sure how much attention it'll get until after the holiday season since so many people are on vacation.

danxmoran avatar Dec 22 '20 21:12 danxmoran

Okay cool. Thanks. Yeah now is kind of a laid-back time. :) It’s important to have it fixed at some point. Otherwise we can’t really delete data :/

newskooler avatar Dec 22 '20 21:12 newskooler

#6150 is the (ancient) parallel from the 1.x line

danxmoran avatar Dec 22 '20 22:12 danxmoran

I cannot believe this does not exist. It's quite crazy : / Thanks for linking to the #6150 . I hope it gets resolved soon.

newskooler avatar Dec 23 '20 12:12 newskooler

Thanks to Dan for pointing me here.

This is a major problem. Not only is a broken feature documented as something that works, some people (such as myself) are totally blocked from the most basic database functionality: deleting data points in a time series. I am blocked.

Since I picked up the 2.x version, I have not yet once successfully deleted any individual data points. This feature is long overdue for a released product.

ski2day avatar Jan 05 '21 17:01 ski2day

Agreed. It is documented as working and it seems a necessary feature when working with time series.

MarcoPignati avatar Jan 25 '21 21:01 MarcoPignati

Any update on this? I also recently ran into this when attempting to onboard a new dataset where the value type was set incorrectly. It seems I now have to delete the entire measurement, as I cannot do a _field="name" delete.

The docs clearly state that delete predicates can use any column or tag except _time or _value. This should be updated to add _field, as this clearly does not work.

Aqualie avatar Dec 28 '21 20:12 Aqualie

Still nothing on this after such a long time? It's like creating a file system and releasing it without a way to delete some of the files.

We need a way to clean up data that is no longer needed or, in my case, was added by mistake. A fundamental feature in any database, I might say.

Znubbis avatar Jan 21 '22 16:01 Znubbis

curl --request POST "influx:/api/v2/delete?org=Org&bucket=heating" --header 'Authorization: Token api' --header 'Content-Type: application/json' --data '{
    "start": "2021-12-26T02:08:00Z",
    "stop": "2022-01-09T02:08:03Z",
    "predicate": "_time=\"2022-01-06T17:05:47Z\" "
  }'

Time tag not working.

mishop avatar Jan 23 '22 02:01 mishop

I don't get it. I was able to delete a specific _field in InfluxDB Cloud but not on 2.0.7. Is this really NOT supported in the OSS version? If so, please update documentation here: https://docs.influxdata.com/influxdb/v2.0/reference/syntax/delete-predicate/ which states that it cannot be used with _time or _value but it doesn't mention that it doesn't work with _field.

dg-eparizzi avatar Feb 16 '22 16:02 dg-eparizzi

This issue has been silent for so long, how come this has not even been implemented yet? Deleting data based on the common name seems like the most basic of features, doesn't it?

DutchEllie avatar Aug 26 '22 13:08 DutchEllie

I could work around the issue by saving the measurement to a temporary measurement, deleting the original measurement, and then restoring the measurement from the copy.
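A rough sketch of that copy, delete and restore sequence, written as a Python helper that only builds the steps and sends nothing. Several things here are assumptions rather than part of the workaround as described: the helper name `plan_field_removal`, the temporary measurement name `tmp_copy`, dropping the unwanted field while copying, and the final cleanup of the temporary measurement. The Flux uses the standard `from`, `range`, `filter`, `set` and `to` functions, and the delete bodies follow the `/api/v2/delete` payload shown at the top of this issue. It has not been run against a live server.

```python
def plan_field_removal(bucket, measurement, field, start, stop,
                       tmp_measurement="tmp_copy"):
    """Build (kind, payload) steps for removing one field from a measurement.

    Nothing is sent to InfluxDB; the caller decides how to execute each step.
    """
    def copy_query(src, dst, drop_field=None):
        keep = f'r._measurement == "{src}"'
        if drop_field is not None:
            # Assumption: leave the unwanted field behind while copying.
            keep += f' and r._field != "{drop_field}"'
        return (
            f'from(bucket: "{bucket}")\n'
            f'  |> range(start: {start}, stop: {stop})\n'
            f'  |> filter(fn: (r) => {keep})\n'
            f'  |> set(key: "_measurement", value: "{dst}")\n'
            f'  |> to(bucket: "{bucket}")'
        )

    def delete_body(name):
        # Deleting by _measurement is reported as working in this thread.
        return {"start": start, "stop": stop,
                "predicate": f'_measurement="{name}"'}

    return [
        ("flux", copy_query(measurement, tmp_measurement, drop_field=field)),
        ("delete", delete_body(measurement)),
        ("flux", copy_query(tmp_measurement, measurement)),
        ("delete", delete_body(tmp_measurement)),  # optional cleanup
    ]

steps = plan_field_removal(
    bucket="my_bucket",
    measurement="price_volume_1_min",
    field="close_price",
    start="2020-12-01T00:00:00Z",
    stop="2021-12-20T00:00:00Z",
)
```

Each `"flux"` step would be run as a query and each `"delete"` step posted to the delete endpoint, in order. Checking the temporary copy before running the first delete is advisable, since the original data is gone after that step.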

2opremio avatar Sep 26 '22 01:09 2opremio

Is there any update on this? This is a really bad bug and one of the things that bug me a lot in V2.0.

I get it that you need to make money and cannot offer all features in the OSS version. But this is a really really basic feature, which IMHO should be included.

eni23 avatar Dec 20 '22 09:12 eni23

Still does not work in 2.6.1

yozik04 avatar Feb 13 '23 16:02 yozik04

Seems to be still fucked; I do

# influx delete --org mini31 --bucket solar --start 2020-01-01T00:00:00Z --stop 2023-02-19T00:00:00Z --predicate '_field="batt_level"'
# influx version
Influx CLI 2.6.1 (git: 61c5b4d) build_date: 2022-12-29T15:41:09Z

and nothing happens.

rwb196884 avatar Feb 18 '23 19:02 rwb196884

InfluxDB OSS 2.x does not support deleting by field, as noted in the documentation: https://docs.influxdata.com/influxdb/v2.6/write-data/delete-data/#cannot-delete-data-by-field

This is a limitation of the underlying storage engine used in 2.x.
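Because such deletes fail silently, a client-side check before sending the request can at least surface the problem. The following is a minimal sketch, assuming the unsupported columns are exactly the ones named in this thread and the linked docs (`_field`, `_time`, `_value`); the helper name `unsupported_columns` is hypothetical and is not part of any InfluxDB client.

```python
import re

# Columns reported in this thread and the linked docs as unusable
# in InfluxDB OSS 2.x delete predicates.
UNSUPPORTED_COLUMNS = ("_field", "_time", "_value")

def unsupported_columns(predicate):
    """Return the unsupported columns that a delete predicate refers to."""
    # Blank out double-quoted literals so a value such as "_field" is not flagged.
    without_literals = re.sub(r'"(?:[^"\\]|\\.)*"', '""', predicate)
    return [c for c in UNSUPPORTED_COLUMNS
            if re.search(rf"\b{c}\b", without_literals)]

bad = unsupported_columns('_field="batt_level"')
ok = unsupported_columns('_measurement="solar"')
```

A caller could raise an error or log a warning when the returned list is not empty, instead of posting a delete that does nothing.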

sanderson avatar Feb 18 '23 19:02 sanderson

This is a limitation of the underlying storage engine used in 2.x.

Yes, everyone knows. Still, this is not a limitation, it is a bug from a user's point of view. This is a really simple operation, which worked in 1.x, and in 2.x it just fails silently.

I recommend that everyone look for alternatives. Such ridiculous limitations, together with the powerful but weird-ass query language that is not suitable for non-techies, make influx 2.x not worth considering anymore.

eni23 avatar Feb 19 '23 08:02 eni23

InfluxDB OSS 2.x does not support deleting by field, as noted in the documentation: https://docs.influxdata.com/influxdb/v2.6/write-data/delete-data/#cannot-delete-data-by-field

That's a nice way to handle bugs, just change the specification instead of fixing the issue... (yes, sometimes bugs in design require significant changes).

Subnum12 avatar Mar 05 '23 07:03 Subnum12

I was also trying to find out why I could not delete using a _field predicate. Thanks for ending this little quest with a clear (yet surprising) answer!

Tornix242 avatar Mar 20 '23 21:03 Tornix242

So this will never get addressed, huh??

NTong97 avatar Jul 05 '23 20:07 NTong97

When looking at how to structure your data, the use of 'measurement' is very intuitive. However, if you can never use 'measurement' to simplify data deletion, then why is it still here? While tags are far less intuitive to use for structuring data, they may be the only way to clean up data more easily.

This is the most frustrating aspect of InfluxDB. The general outcome is that I don't bother cleaning things up as much as I should. I am considering restructuring to use tags instead of measurement as a result, but suspect that this comes at a cost in efficiency. But if I need to go to this length to make deleting data simpler, then I should probably look at other options for working with time series data, given that the solutions have been evolving.

ski2day avatar Jul 05 '23 20:07 ski2day

It would be good if someone fixed this "bug". Yes, it is a bug if you can't perform such a simple task as deleting points from a field. Deleting was implemented in earlier versions, so I can't see any reason not to implement such a basic function in the latest versions. Don't throw dirt on an otherwise great product.

Roberto6969 avatar May 06 '24 07:05 Roberto6969