influxdb-python
InfluxDB-python version 5.3.0 chunk=True
- InfluxDB-python version: 5.3.0
- Python version: 3.7.4
- Operating system version: macOS 10.14.5
msgpack.exceptions.ExtraData: unpack(b) received extra data
Traceback (most recent call last):
File "/Users/dong/Desktop/mosaic-research/analysis/analysis.py", line 17, in <module>
public_book = mosaic_client.public_book(exchange=exchange, instrument=instrument, ts_start=ts, ts_end=ts+save_interval, depth=1)
File "/Users/dong/Desktop/mosaic-research/py_mosaic_client/py_mosaic_client/mosaic_client.py", line 74, in public_book
result = self.client.query(f'SELECT * FROM "l2_book-{exchange}" WHERE time > {ts_start} AND time <= {ts_end}', chunked=True, chunk_size=10000)
File "/Users/dong/opt/anaconda3/lib/python3.7/site-packages/influxdb/client.py", line 518, in query
expected_response_code=expected_response_code
File "/Users/dong/opt/anaconda3/lib/python3.7/site-packages/influxdb/client.py", line 352, in request
raw=False)
File "msgpack/_unpacker.pyx", line 209, in msgpack._cmsgpack.unpackb
msgpack.exceptions.ExtraData: unpack(b) received extra data.
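For context (my reading, not stated in the thread): `msgpack.unpackb` decodes exactly one object and raises `ExtraData` if bytes remain afterwards, so feeding it a chunked response body — several serialized objects concatenated — would plausibly fail exactly like this. The stdlib `json` module shows the same failure mode for concatenated documents, and `raw_decode` shows the chunk-by-chunk fix:

```python
import json

# Two concatenated JSON documents, analogous to a chunked response body
# that carries one serialized object per chunk.
body = '{"results": 1}{"results": 2}'

try:
    json.loads(body)  # a one-shot decoder expects a single document
except json.JSONDecodeError as exc:
    error = exc.msg

assert error == "Extra data"

# Decoding document by document with raw_decode works:
decoder = json.JSONDecoder()
docs, idx = [], 0
while idx < len(body):
    obj, idx = decoder.raw_decode(body, idx)
    docs.append(obj)
assert docs == [{"results": 1}, {"results": 2}]
```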
I think it may be related to this PR: https://github.com/influxdata/influxdb-python/commit/c903d73efcf49b4e340490072d777d8f34ac8e1c
Thanks for reporting this @xiandong79, I'll investigate ASAP. I should have added a test to the dataframe_client for this.
I can take a look too if that helps, I haven't come across that issue though.
Version 5.2.3 works well.
I'm having the same issue querying from both Influx 1.7.10 and 1.7.7. Interestingly, with Influx 1.0.2 the bug is not present.
There are a lot of differences between 5.2.3 and 5.3.0, which is why we stepped a minor release instead of a point release.
@hrbonz if you want to take a look that would be AWESOME!
I am getting a different error, but seemingly from a similar place.
- InfluxDB-python version: 5.3.0
- Python version: 3.7.4
- Operating system version: Ubuntu 16.04
influxdb/client.py in request(self, url, method, params, data, stream, expected_response_code, headers)
350 packed=response.content,
351 ext_hook=_msgpack_parse_hook,
--> 352 raw=False)
353 else:
354 response._msgpack = None
msgpack/_unpacker.pyx in msgpack._cmsgpack.unpackb()
`UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 8: invalid continuation byte`
Similar with a SHOW DIAGNOSTICS query:
- InfluxDB 1.7.6
- InfluxDB-python version: 5.3.0
- Python version: 3.6.9
- Operating system version: Ubuntu 18.04
python3
Python 3.6.9 (default, Apr 18 2020, 01:56:04)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from influxdb import client
>>> influxdb_client = client.InfluxDBClient("192.168.10.6", "8086")
>>> influxdb_client.ping()
'1.7.6'
>>> influxdb_client.query('SHOW DIAGNOSTICS')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/vagrant/.local/lib/python3.6/site-packages/influxdb/client.py", line 518, in query
expected_response_code=expected_response_code
File "/home/vagrant/.local/lib/python3.6/site-packages/influxdb/client.py", line 352, in request
raw=False)
File "msgpack/_unpacker.pyx", line 213, in msgpack._cmsgpack.unpackb
ValueError: Unpack failed: incomplete input
- With InfluxDB-python version: 5.2.3 the bug is not present.
I can confirm query with chunked=True does not work on 5.3.0.
Hello Team, Any workaround for this issue?
Sure, use <5.3.0.
having the same issue - any progress?
Hi, having the same issue. Any solution?
- Debian GNU/Linux 9.4 (stretch)
- Python 2.7.13
- InfluxDB 1.8.3
- influxdb-python 5.3.1
- msgpack 1.0.2
msgpack.exceptions.ExtraData: unpack(b) received extra data.
Traceback (most recent call last):
  File "/code/apps/FuelChangeoverPlot.py", line 179, in exportFromDb
    data = data_fetcher.fetch_fuel_change_over_plot(start_time=rangeStart, end_time=rangeEnd)
  File "/code/db_interface/data_fetcher.py", line 35, in fetch_fuel_change_over_plot
    df_dict = db_connector.query_for_single_measurement_range(
  File "/code/db_interface/db_connector.py", line 80, in query_for_single_measurement_range
    df_dict = client.query(
  File "/usr/local/lib/python3.9/site-packages/influxdb/_dataframe_client.py", line 199, in query
    results = super(DataFrameClient, self).query(query, **query_args)
  File "/usr/local/lib/python3.9/site-packages/influxdb/client.py", line 521, in query
    response = self.request(
  File "/usr/local/lib/python3.9/site-packages/influxdb/client.py", line 358, in request
    response._msgpack = msgpack.unpackb(
  File "msgpack/_unpacker.pyx", line 202, in msgpack._cmsgpack.unpackb
There are actually two issues here:
- Unpack issue when using msgpack. I did not debug this further, but here's a workaround that works for me: use JSON instead of msgpack. This can be forced using:
  client = InfluxDBClient(host, port, u, p, db, headers={'Accept': 'application/json'}, gzip=True)
- Even when using the above, DataFrameClient does not work. This is because DataFrameClient was not updated along with commit c903d73.
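A fuller sketch of that first workaround, with hypothetical connection details (host, port, and database name are placeholders for your own setup):

```python
from influxdb import InfluxDBClient

# Hypothetical connection details; adjust for your environment.
client = InfluxDBClient(
    host="localhost",
    port=8086,
    database="mydb",
    # Forcing JSON responses means the msgpack deserialization path
    # is never exercised, sidestepping the unpack errors above.
    headers={"Accept": "application/json"},
    # gzip compensates for JSON being more verbose than msgpack on the wire.
    gzip=True,
)
```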
- Debian GNU/Linux bullseye/sid
- Python 3.9.2
- influxdb-python master branch
- InfluxDB 1.8.5 and 1.7.3
- msgpack 1.0.2

Run my test scripts with export MSGPACK_PUREPYTHON=1 to use the pure-Python implementation of msgpack rather than the C one; easier for debugging.
Analysis
I've looked into this issue today, it looks to me like a combination of two problems:
- I forgot to add the stream/chunk changes to DataFrameClient because I didn't even realize it existed; I'll submit a PR with the proper changes for it.
- If I run a SHOW DIAGNOSTICS query with headers set to accept JSON, I get the following:
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "build",
"columns": [
"Branch",
"Build Time",
"Commit",
"Version"
],
"values": [
[
"1.7",
"",
"ff383cdc0420217e3460dabe17db54f8557d95b6",
"1.7.8"
]
]
},
{
"name": "config",
"columns": [
"bind-address",
"reporting-disabled"
],
"values": [
[
"127.0.0.1:8098",
true
]
]
},
{
"name": "config-coordinator",
"columns": [
"log-queries-after",
"max-concurrent-queries",
"max-select-buckets",
"max-select-point",
"max-select-series",
"query-timeout",
"write-timeout"
],
"values": [
[
"0s",
0,
0,
0,
0,
"0s",
"10s"
]
]
},
{
"name": "config-cqs",
"columns": [
"enabled",
"query-stats-enabled",
"run-interval"
],
"values": [
[
true,
false,
"1s"
]
]
},
{
"name": "config-data",
"columns": [
"cache-max-memory-size",
"cache-snapshot-memory-size",
"cache-snapshot-write-cold-duration",
"compact-full-write-cold-duration",
"dir",
"max-concurrent-compactions",
"max-index-log-file-size",
"max-series-per-database",
"max-values-per-tag",
"series-id-set-cache-size",
"wal-dir",
"wal-fsync-delay"
],
"values": [
[
1073741824,
26214400,
"10m0s",
"4h0m0s",
"/var/lib/influxdb/data",
0,
1048576,
1000000,
100000,
100,
"/var/lib/influxdb/wal",
"0s"
]
]
},
{
"name": "config-httpd",
"columns": [
"access-log-path",
"bind-address",
"enabled",
"https-enabled",
"max-connection-limit",
"max-row-limit"
],
"values": [
[
"",
":8096",
true,
false,
0,
0
]
]
},
{
"name": "config-meta",
"columns": [
"dir"
],
"values": [
[
"/var/lib/influxdb/meta"
]
]
},
{
"name": "config-monitor",
"columns": [
"store-database",
"store-enabled",
"store-interval"
],
"values": [
[
"_internal",
true,
"10s"
]
]
},
{
"name": "config-precreator",
"columns": [
"advance-period",
"check-interval",
"enabled"
],
"values": [
[
"30m0s",
"10m0s",
true
]
]
},
{
"name": "config-retention",
"columns": [
"check-interval",
"enabled"
],
"values": [
[
"30m0s",
true
]
]
},
{
"name": "config-subscriber",
"columns": [
"enabled",
"http-timeout",
"write-buffer-size",
"write-concurrency"
],
"values": [
[
true,
"30s",
1000,
40
]
]
},
{
"name": "network",
"columns": [
"hostname"
],
"values": [
[
"db01"
]
]
},
{
"name": "runtime",
"columns": [
"GOARCH",
"GOMAXPROCS",
"GOOS",
"version"
],
"values": [
[
"amd64",
2,
"linux",
"go1.11"
]
]
},
{
"name": "system",
"columns": [
"PID",
"currentTime",
"started",
"uptime"
],
"values": [
[
10884,
"2021-04-26T09:59:01.187859258Z",
"2021-04-26T08:10:39.214602676Z",
"1h48m21.973256582s"
]
]
}
]
}
]
}
When running without any headers, we get msgpack back with the following:
b'\x81\xa7results\x91\x82\xacstatement_id\x00\xa6series\x9e\x83\xa4name\xa5build\xa7columns\x94\xa6Branch\xaaBuild Time\xa6Commit\xa7Version\xa6values\x91\x94\xa31.7\xa0\xd9(ff383cdc0420217e3460dabe17db54f8557d95b6\xa51.7.8\x83\xa4name\xa6config\xa7columns\x92\xacbind-address\xb2reporting-disabled\xa6values\x91\x92\xae127.0.0.1:8098\xc3\x83\xa4name\xb2config-coordinator\xa7columns\x97\xb1log-queries-after\xb6max-concurrent-queries\xb2max-select-buckets\xb0max-select-point\xb1max-select-series\xadquery-timeout\xadwrite-timeout\xa6values\x91\x97\x00\x00\x00\x00\x83\xa4name\xaaconfig-cqs\xa7columns\x93\xa7enabled\xb3query-stats-enabled\xacrun-interval\xa6values\x91\x93\xc3\xc2\x83\xa4name\xabconfig-data\xa7columns\x9c\xb5cache-max-memory-size\xbacache-snapshot-memory-size\xd9"cache-snapshot-write-cold-duration\xd9 compact-full-write-cold-duration\xa3dir\xbamax-concurrent-compactions\xb7max-index-log-file-size\xb7max-series-per-database\xb2max-values-per-tag\xb8series-id-set-cache-size\xa7wal-dir\xafwal-fsync-delay\xa6values\x91\x9c\xb6/var/lib/influxdb/data\x00\xd2\x00\x0fB@\xd2\x00\x01\x86\xa0d\xb5/var/lib/influxdb/wal\x83\xa4name\xacconfig-httpd\xa7columns\x96\xafaccess-log-path\xacbind-address\xa7enabled\xadhttps-enabled\xb4max-connection-limit\xadmax-row-limit\xa6values\x91\x96\xa0\xa5:8096\xc3\xc2\x00\x00\x83\xa4name\xabconfig-meta\xa7columns\x91\xa3dir\xa6values\x91\x91\xb6/var/lib/influxdb/meta\x83\xa4name\xaeconfig-monitor\xa7columns\x93\xaestore-database\xadstore-enabled\xaestore-interval\xa6values\x91\x93\xa9_internal\xc3\x83\xa4name\xb1config-precreator\xa7columns\x93\xaeadvance-period\xaecheck-interval\xa7enabled\xa6values\x91\x93\xc3\x83\xa4name\xb0config-retention\xa7columns\x92\xaecheck-interval\xa7enabled\xa6values\x91\x92\xc3\x83\xa4name\xb1config-subscriber\xa7columns\x94\xa7enabled\xachttp-timeout\xb1write-buffer-size\xb1write-concurrency\xa6values\x91\x94\xc3\xd1\x03\xe8(\x83\xa4name\xa7network\xa7columns\x91\xa8hostname\xa6values\x91\x91\xa4db01\x8
3\xa4name\xa7runtime\xa7columns\x94\xa6GOARCH\xaaGOMAXPROCS\xa4GOOS\xa7version\xa6values\x91\x94\xa5amd64\x02\xa5linux\xa6go1.11\x83\xa4name\xa6system\xa7columns\x94\xa3PID\xabcurrentTime\xa7started\xa6uptime\xa6values\x91\x94\xd1*\x84\xc7\x0c\x05\x00\x00\x00\x00`\x86\x8e\xe5\x12J\xde\xab\xc7\x0c\x05\x00\x00\x00\x00`\x86u\x7f\x0c\xca\x93\xb4\xb21h48m22.092293879s'
Both should represent the same data, but the config-coordinator structure doesn't include all the values:
\x83\xa4name\xb2config-coordinator\xa7columns\x97\xb1log-queries-after\xb6max-concurrent-queries\xb2max-select-buckets\xb0max-select-point\xb1max-select-series\xadquery-timeout\xadwrite-timeout\xa6values\x91\x97\x00\x00\x00\x00\x83\xa4name\xaaconfig-cqs
Near the end of the string we have \x97, which declares a 7-entry 'fixarray', but only four zeroes (\x00) follow before the \x83 that should start the next data structure ('config-cqs').
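The fixarray accounting described above can be checked with a few lines of stdlib Python (no msgpack dependency); the helper name `fixarray_len` is mine, not part of any library:

```python
# msgpack encodes small arrays as a single 'fixarray' header byte in the
# range 0x90-0x9f, whose low nibble is the element count.

def fixarray_len(header: int) -> int:
    """Element count declared by a msgpack fixarray header byte."""
    if not 0x90 <= header <= 0x9F:
        raise ValueError("not a fixarray header")
    return header & 0x0F

# The config-coordinator 'values' row announces 7 elements...
assert fixarray_len(0x97) == 7

# ...but in the dump only four positive-fixint zeroes (one 0x00 byte each)
# appear before 0x83 opens the next map ('config-cqs'), so the array is
# truncated and the unpacker fails.
dump = bytes([0x97, 0x00, 0x00, 0x00, 0x00, 0x83])
declared = fixarray_len(dump[0])
provided = dump[1:].index(0x83)  # bytes before the next map header
assert (declared, provided) == (7, 4)
```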
For this reason, I believe the bug actually exists server side. A similar issue might be triggered when doing a regular query, but I couldn't figure it out. I'm also not very comfortable with Go, so I couldn't find where this is implemented in the server.
This behavior appeared soon after my commit because 7fb5e946062dd36a84801e4a03012a3c032a70db changed the default headers to request msgpack instead of the default JSON.
Summary
- I should push a PR to implement the fixed chunked behavior in DataFrameClient.
- I suspect there is a bug with the msgpack implementation server side but can't help with this. I think someone with better Go knowledge should dig into that one. @sebito91
Tried to do the request directly with curl and still got a messed-up msgpack answer showing the same issue:
$ curl -G 'http://localhost:8096/query' --data-urlencode q='SHOW DIAGNOSTICS' --header "Accept: application/x-msgpack" --header "Content-Type: application/json" -u root --output response.txt
@hrbonz @sebito91 Maybe I am asking a silly question here: is the above fix part of the currently released library or a future release? If a future release, when is it expected?
As I tested today, I still get the issue below: msgpack.exceptions.ExtraData: unpack(b) received extra data.
Same issue here: msgpack/_unpacker.pyx in msgpack._cmsgpack.unpackb()
ExtraData: unpack(b) received extra data.
For any future readers:
- The error persists in 5.3.1 and in 5.3.0 as well.
- This query works and doesn't throw msgpack.exceptions.ExtraData: unpack(b) received extra data without adding additional headers like {'Accept': 'application/json'}, while still using msgpack, I believe. Please note that I am not using the DataFrameClient. This query in my case fetches around 6.67 million points and takes 403.592 seconds.
import time
import pandas as pd
from influxdb import InfluxDBClient

client = InfluxDBClient(host=host, port=port, username=user, password=password, database=dbname)
start_time = time.monotonic()
res = pd.DataFrame(client.query("select * from X where time > now() - 30m", chunked=True).get_points())
end_time = time.monotonic()
with outlock:  # outlock: a threading.Lock shared across the worker threads
    print("Result from {} took {}".format(host, end_time - start_time))
    print(res)
Versions used:
- python --version = Python 3.7.8
- influxdb.__version__ = 5.2.3
Still get ExtraData: unpack(b) received extra data., but after trying @KirannBhavaraju's suggestion it worked! The only thing I did was remove the chunk_size=xxxx argument.
client = InfluxDBClient(blah blah)
result = client.query(q, chunked=True)
python = "^3.8" influxdb = "5.3.1"
Only thing I did was to remove the chunk_size=xxxx argument.
Responses will be chunked by series or by every 10,000 points, whichever occurs first. https://docs.influxdata.com/influxdb/v1.7/guides/querying_data/#chunking