influxdb-python
Python & InfluxDB: Setting column of unique values as key for writing DataFrame to InfluxDB with some values having the same timestamp
I have the following DataFrame and am trying to write it into an InfluxDB measurement:
dataframe1
UNIQUE FK_TAG_ID VALUE
TIME
2016-07-11 06:01:04 0 1694 326699.995995
2016-07-11 06:01:04 1 1694 325999.999046
2016-07-11 06:01:04 2 1694 325300.002098
2016-07-11 06:01:04 3 1694 -1600.000076
2016-07-11 06:01:04 4 1694 -1099.999994
2016-07-11 06:01:04 5 1694 -1600.000076
2016-07-11 06:01:04 6 1694 -1600.000076
2016-07-11 06:01:04 7 1694 327699.995041
2016-07-11 06:01:04 8 1694 326699.995995
2016-07-11 06:01:04 9 1694 326500.010490
2016-07-11 06:03:12 10 1694 325500.011444
2016-07-11 06:03:12 11 1694 325300.002098
2016-07-11 06:03:12 12 1694 324300.003052
2016-07-11 06:03:12 13 1694 324000.000954
2016-07-11 06:03:12 14 1694 323499.989510
2016-07-11 06:03:12 15 1694 322600.007057
2016-07-11 06:03:12 16 1694 322300.004959
2016-07-11 06:03:12 17 1694 321399.998665
2016-07-11 06:03:12 18 1694 321099.996567
2016-07-11 06:03:12 19 1694 320600.008964
... ... ...
2016-07-11 21:29:04 9090 1743 305200.004578
2016-07-11 21:31:12 9091 1743 305200.004578
2016-07-11 21:31:12 9092 1743 305699.992180
2016-07-11 21:31:12 9093 1743 305699.992180
2016-07-11 21:33:20 9094 1743 305200.004578
2016-07-11 21:33:20 9095 1743 305200.004578
2016-07-11 21:33:20 9096 1743 305699.992180
Since some of the values share the same timestamp, I am trying to use the UNIQUE column as a workaround to prevent them from overwriting each other, as described here: https://docs.influxdata.com/influxdb/v0.13/troubleshooting/frequently_encountered_issues/#writing-duplicate-points
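To illustrate the mechanism those docs describe (hypothetical points, not my real data): two points with identical measurement, tag set, and timestamp silently overwrite each other, while a distinct tag value keeps both:

# Hypothetical illustration of the duplicate-point problem.
# Both points share measurement, tag set and timestamp, so the
# second write silently replaces the first:
point_a = {
    "measurement": "table",
    "time": "2016-07-11T06:01:04Z",
    "tags": {"FK_TAG_ID": "1694"},
    "fields": {"VALUE": 326699.995995},
}
point_b = {
    "measurement": "table",
    "time": "2016-07-11T06:01:04Z",
    "tags": {"FK_TAG_ID": "1694"},
    "fields": {"VALUE": 325999.999046},  # replaces point_a
}

# Adding the UNIQUE value as a tag makes the two series distinct,
# so both points are kept:
point_a["tags"]["UNIQUE"] = "0"
point_b["tags"]["UNIQUE"] = "1"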
I use the following code:
from influxdb import DataFrameClient

client = DataFrameClient(host, port, user, password, dbname)
tags = {'tag1': dataframe1[['UNIQUE']], 'tag2': dataframe1[['FK_TAG_ID']]}
client.write_points(dataframe1, "table", protocol='json', tags=tags)
When I execute the code, the following stack trace is displayed:
File "C:/Users/Chris/Desktop/balenciagaAI-master/AzureWebAppJson.py", line 173, in <module>
client.write_points(dataframe1, "table", protocol = protocol, tags = tags)
File "C:\Users\Chris\Anaconda3\lib\site-packages\influxdb\_dataframe_client.py", line 137, in write_points
protocol=protocol)
File "C:\Users\Chris\Anaconda3\lib\site-packages\influxdb\client.py", line 468, in write_points
tags=tags, protocol=protocol)
File "C:\Users\Chris\Anaconda3\lib\site-packages\influxdb\client.py", line 533, in _write_points
protocol=protocol
File "C:\Users\Chris\Anaconda3\lib\site-packages\influxdb\client.py", line 312, in write
headers=headers
File "C:\Users\Chris\Anaconda3\lib\site-packages\influxdb\client.py", line 271, in request
raise InfluxDBClientError(response.content, response.status_code)
InfluxDBClientError: 413: {"error":"Request Entity Too Large"}
I am more or less following the approach of tagging columns as keys described in this comment:
https://github.com/influxdata/influxdb-python/issues/286#issuecomment-333391449
But there seems to be something I'm missing when declaring the "UNIQUE" column as a tag key for the InfluxDB measurement.
Dear @K1Zeit,
the "Request Entity Too Large" error indicates that the HTTP request body might be too large for InfluxDB to handle. I don't know if there is a server-side limit which can be reconfigured, but using the batch_size parameter of the write_points() method also works like a charm. See also https://github.com/earthobservations/wetterdienst/issues/235 and https://github.com/earthobservations/wetterdienst/pull/245.
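A minimal sketch of that suggestion, assuming the same client, DataFrame, and tags as in your snippet (the batch size of 1000 is an arbitrary example value to tune):

client = DataFrameClient(host, port, user, password, dbname)

# Split the write into chunks of 1000 points each, so that every
# single HTTP request body stays below the server's size limit:
client.write_points(dataframe1, "table", tags=tags, protocol='json',
                    batch_size=1000)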
With kind regards, Andreas.