json.decoder.JSONDecodeError during find_all_packages

Open paugier opened this issue 1 year ago • 6 comments

On a GitLab CI, I get a traceback using PDM (https://github.com/pdm-project/pdm/issues/2532). I think that the problem is related to unearth. The exception can be reproduced with unearth alone:

$ python3.9 -c "from unearth import PackageFinder as F; f = F(index_urls=['https://pypi.org/simple/']); print(list(f.find_all_packages('flit-core')))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/finder.py", line 295, in find_all_packages
    self._find_packages(package_name, allow_yanked), hashes=hashes or {}
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/finder.py", line 275, in _find_packages
    return sorted(all_packages, key=self._sort_key, reverse=True)
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/collector.py", line 135, in collect_links_from_location
    yield from _collect_links_from_index(session, location)
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/collector.py", line 85, in parse_json_response
    data = json.loads(page.content)
  File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.9/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 10236 (char 10235)

Interestingly, (i) this code runs fine locally (and even locally in the same Docker image used for the CI) and (ii) I can install packages with pip in the Gitlab CI.

System (please complete the following information):

  • unearth version: 0.12.1
  • Python version: 3.9
  • OS: Linux

Additional context

This appears to be the cause of https://github.com/pdm-project/pdm/issues/2532

paugier avatar Jan 02 '24 23:01 paugier

Since it is not reproducible, can you inspect the response content at exactly this line:

  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/collector.py", line 85, in parse_json_response
    data = json.loads(page.content)

Print page.content and you can probably figure out what the problem is.

frostming avatar Jan 03 '24 00:01 frostming

  File "/builds/fluiddyn/unearth/src/unearth/collector.py", line 88, in parse_json_response
    raise RuntimeError(page.content)
RuntimeError: b'{"files":[{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc1-py2.py3-none-any.whl","hashes":{"sha256":"1d717e7336997feed076c4f5dbdbe9ce45062e680f2b1de319b4c759f809a561"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84110,"upload-time":"2019-11-17T20:54:08.119802Z","url":"https://files.pythonhosted.org/packages/12/4f/8a0a7b2033b8a80451d214a289aecf486afdfb8e155b25986b0cbd3eb6e8/flit_core-2.0rc1-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc1.tar.gz","hashes":{"sha256":"d78f4b5b8fb2b484a98974b6da8d0edc8e7af55f60da7f40e0a9ddd2c36a5932"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22702,"upload-time":"2019-11-17T20:54:10.798088Z","url":"https://files.pythonhosted.org/packages/7f/8c/583b4412da71153ec70ed78341983c242a234d47abcfc8485284c6bb7b48/flit_core-2.0rc1.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc2-py2.py3-none-any.whl","hashes":{"sha256":"35a83504f509fcfd19bc53859d938cf2ad3385a2a19bfeb1745d1c957d39115c"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84124,"upload-time":"2019-11-17T21:01:19.976000Z","url":"https://files.pythonhosted.org/packages/d0/72/0fe258ce61fa1b59adb6c76a701b19a96e0033fbe054b297a2012e33ad44/flit_core-2.0rc2-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc2.tar.gz","hashes":{"sha256":"b34eef2a6da426c659b5bbfc7a18cbfba2a72bbf7dc20d75a15fd5fc90c1d937"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":22697,"upload-time":"2019-11-17T21:01:22.111432Z","url":"https://files.pythonhosted.org/packages/2f/1b/41ac0da91712d9c3e7d06a6e1eb7dfe616c96a14e4036e9b9c37ea9ee6f8/flit_core-2.0rc2.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc3-py2.py3-none-any.whl","hashes":{"sha256":"9c5e882e51ddb4206626f576f0a8217ebdf011ab34aeb9d4bb91f101cad03981"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84135,"upload-time":"2019-11-19T09:38:40.419494Z","url":"https://files.pythonhosted.org/packages/6a/ff/be83d749ff1ad481b09e1e6069178c2d5d6c56a10b493353f0cc405e8475/flit_core-2.0rc3-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc3.tar.gz","hashes":{"sha256":"207a70987a60e67c475955996813ed95d485f97eee288d03fc04bff01b2c56b8"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22527,"upload-time":"2019-11-19T09:38:42.261378Z","url":"https://files.pythonhosted.org/packages/2d/36/bcd4bfb529261a27f113eb2e6fb9f5e5aed4d0b79be59a76ce65689a1892/flit_core-2.0rc3.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0-py2.py3-none-any.whl","hashes":{"sha256":"6315800ae208f0f1de1ee89997e16f69dacc5e18d3fd2a65e4e518e3d78dbdda"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84102,"upload-time":"2019-11-23T09:24:17.224079Z","url":"https://files.pythonhosted.org/packages/dc/81/1f336b50c81e5345aafe7469e4f4c1104faa82b76e6e9885456b47d898fe/flit_core-2.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.tar.gz","hashes":{"sha256":"8e91d877c663b16e70d88a2f652bc9e0ae71501cbb81c5ab8d48c838e731ba80"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":22527,"upload-time":"2019-11-23T09:24:19.013574Z","url":"https://files.pythonhosted.org/packages/ec/cc/60e05480a5bf4b44ee1dbd179ca715ca4d192597d054e8c97bc0403060e8/flit_core-2.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.1-py2.py3-none-any.whl","hashes":{"sha256":"1eb2bf3fd805560ed3ad6abca365a03681d1bf1f7d80707dc3bc3ce6833d52f4"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84353,"upload-time":"2019-11-23T13:23:44.884523Z","url":"https://files.pythonhosted.org/packages/36/6a/b0e5ba2ad9d801887c8df7095535635292ce9b97f63cbb86f2b4d96dfebf/flit_core-2.0.1-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.1.tar.gz","hashes":{"sha256":"96e7708bc88c03b58e0d35f1171197737e701e29a901a8b49c13d3fd21866560"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22616,"upload-time":"2019-11-23T13:23:46.654078Z","url":"https://files.pythonhosted.org/packages/0e/3d/e9b28cd1d220ca635234e37567099bf4d50ea0a98a77b360b8d8042352e6/flit_core-2.0.1.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.2-py2.py3-none-any.whl","hashes":{"sha256":"c49546abb6afe371a13b78a2595d5afe1c0cd0aaa9dd753d800cd21259e51222"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84764,"upload-time":"2019-11-23T13:40:37.588869Z","url":"https://files.pythonhosted.org/packages/66/d2/c520657053052af580573e32aeafe50a9f68fc77c5d87ff551ca856d2aa3/flit_core-2.0.2-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.2.tar.gz","hashes":{"sha256":"9efcdca4ae84fd4d831e18d3cdb85a0b4f211a52d4b832408ff9a65bcc309928"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":22764,"upload-time":"2019-11-23T13:40:39.483209Z","url":"https://files.pythonhosted.org/packages/89/cf/a76f37dfded167e97936b8d53308abe5a8d00b97d417a6a405e69167e685/flit_core-2.0.2.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.1.0-py2.py3-none-any.whl","hashes":{"sha256":"c6dff661e9e290d51084cefc38b0971d692290e8a352d0b6cec6006be764b4d1"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":39162,"upload-time":"2019-11-26T09:48:48.530001Z","url":"https://files.pythonhosted.org/packages/b6/b0/50719ef7d12cd39ccfa4e48abb593764c8e4a6d0d9bdf7815be1949142ff/flit_core-2.1.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.1.0.tar.gz","hashes":{"sha256":"d2ebad9351c34083c16388d1df64a6e19579affcec02bfc05746714eef9f82fb"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22978,"upload-time":"2019-11-26T09:48:49.922785Z","url":"https://files.pythonhosted.org/packages/6c/6a/f945cf72957752ba0655260a8cb9c1139ea134c5f4b104bc48027349a6f4/flit_core-2.1.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.2.0-py2.py3-none-any.whl","hashes":{"sha256":"4df2b9b43f00764a81e7ea742829749183a7f5a9e360fa5c3a9e8643dadd716a"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":40023,"upload-time":"2020-01-14T10:57:57.314090Z","url":"https://files.pythonhosted.org/packages/25/4c/0b1ed660937d96ed192c376d3983dd7b052b887c8041ae020c950c0d06f0/flit_core-2.2.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.2.0.tar.gz","hashes":{"sha256":"4efb8bffc1a04d8e550e877f0c9acf53109a021cc27c2a89b1b467715dc1d657"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":23131,"upload-time":"2020-01-14T10:57:59.011481Z","url":"https://files.pythonhosted.org/packages/77/72/5dda5dc417a4e702e0d7e4a77e9802792a0e4a2daec2aeed915ead7db477/flit_core-2.2.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.3.0-py2.py3-none-any.whl","hashes":{"sha256":"a8f8904b534966712390e0a2e434cd33f76037730a0aaed299a286f9e18cac2b"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":40020,"upload-time":"2020-04-08T08:04:01.308900Z","url":"https://files.pythonhosted.org/packages/4b/3c/82798771fc1fd978c9225c5ae25eef45cb23b0df4728f208024a5b57901f/flit_core-2.3.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.3.0.tar.gz","hashes":{"sha256":"a50bcd8bf5785e3a7d95434244f30ba693e794c5204ac1ee908fc07c4acdbf80"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22995,"upload-time":"2020-04-08T08:04:02.852440Z","url":"https://files.pythonhosted.org/packages/bb/92/e51c58d463ebbabb7b226662655cef6d17d3b4b83f570b08f6be0fe2b1b8/flit_core-2.3.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.0.0-py3-none-any.whl","hashes":{"sha256":"a787754978cfe3c192a5fc6baf2179ae85b05395804de7d7fe2864d9431e8d03"},"requires-python":">=3.4","size":36921,"upload-time":"2020-09-06T10:57:29.444835Z","url":"https://files.pythonhosted.org/packages/a8/66/67758f788959c2557c4d0f80e4895c3c0802873be95b82a5213ea39542d7/flit_core-3.0.0-py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.0.0.tar.gz","hashes":{"sha256":"a465052057e2d6d957e6850e9915245adedfc4fd0dd5737d0791bf3132417c2d"},"requires-python":">=3.4","size":22037,"upload-time":"2020-09-06T10:57:30.734781Z","url":"https://files.pythonhosted.org/packages/0e/b9/040baf94b40c80081bbecbd90365a5d7765a1c07e31b6c949838cc4c93d1/flit_core-3.0.0.tar.gz","yanked":false},{"core-metadata":f
alse,"data-dist-info-metadata":false,"filename":"flit_core-3.1.0-py3-none-any.whl","hashes":{"sha256":"1d06e64a6af7e1fd1496563b160df29dd32714e00b473f3b763f6e6810476517"},"requires-python":">=3.4","size":38715,"upload-time":"2021-03-01T15:36:57.289033Z","url":"https://files.pythonhosted.org/packages/ed/0c/50352b127c0936cd59dd762db41d0e17986401c42ba613fa502e926d33ec/flit_core-3.1.0-py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.1.0.tar.gz","hashes":{"sha256":"22ff73be39a2b3c9e0692dfbbea3ad4a9d127e5733736a87dbb8ddcbf7309b1e"},"requires-python":">=3.4","size":22706,"upload-time":"2021-03-01T15:36:58.522778Z","url":"https://files.pythonhosted.org/packages/4c/8f/bed80c03f71cb3a2935882f391b53d2510c359191e5e0361650fa02d1365/flit_core-3.1.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.2.0-py3-none-any.whl","hashes":{"sha256":"6f25843e908dfc3e907b6b9ee71e3d185bcb5aebab8c3431e4e34c261e5ff1b5"},"requires-python":">=3.4","size":45693,"upload-time":"2021-03-21T21:20:19.175500Z","url":"http'

page.content ends with

[...],{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.2.0-py3-none-any.whl",
"hashes":{"sha256":"6f25843e908dfc3e907b6b9ee71e3d185bcb5aebab8c3431e4e34c261e5ff1b5"},
"requires-python":">=3.4","size":45693,"upload-time":"2021-03-21T21:20:19.175500Z","url":"http'

So indeed, the JSON text is unterminated.

Questions:

  • why do I get this unterminated response?
  • how can unearth avoid that or complete the loading of this data when it happens?
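
One way to start on the first question is to check whether the body is shorter than what the server declared. The following is only a minimal sketch, not anything unearth does: diagnose_body is a hypothetical helper, and it assumes the caller passes the raw body bytes and a plain dict of response headers with the exact key spellings used below.

```python
import json
from typing import Mapping


def diagnose_body(body: bytes, headers: Mapping[str, str]) -> str:
    """Return a short diagnosis of an index response body (hypothetical helper)."""
    try:
        json.loads(body)
        parse_error = None
    except ValueError as exc:  # JSONDecodeError and UnicodeDecodeError are both ValueErrors
        parse_error = str(exc)

    declared = headers.get("Content-Length")
    encoding = headers.get("Content-Encoding", "identity").lower()
    # Content-Length counts bytes on the wire, so it is only comparable to the
    # decoded body when no compression was applied.
    if declared is not None and declared.isdigit() and encoding == "identity":
        if int(declared) != len(body):
            return f"truncated: got {len(body)} of {declared} bytes"
    if parse_error is not None:
        return f"invalid JSON: {parse_error}"
    return "ok"


if __name__ == "__main__":
    # A body cut off in the middle of a URL, like the one shown above.
    print(diagnose_body(b'{"url":"http', {"Content-Length": "100"}))
```

With requests, something like diagnose_body(resp.content, dict(resp.headers)) could be called right after the get. A length mismatch would suggest that the connection or a proxy in between cut the body short, rather than a parsing problem.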

paugier avatar Jan 03 '24 07:01 paugier

Note that I can reproduce the exception with this simple code:

from datetime import datetime
from requests import Session

session = Session()

print("before get:", datetime.now())
resp = session.get(
    "https://pypi.org/simple/flit-core/",
    headers={
        "Accept": "application/vnd.pypi.simple.v1+json",
        "Cache-Control": "no-cache",
    },
    timeout=120,
)
print("after get:", datetime.now())

print(resp)
print(resp.content[-400:])
print(resp.json()["versions"])

which gives

before get: 2024-01-03 10:22:01.244619
after get: 2024-01-03 10:22:01.277003
<Response [200]>
b'g/packages/4c/8f/bed80c03f71cb3a2935882f391b53d2510c359191e5e0361650fa02d1365/flit_core-3.1.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.2.0-py3-none-any.whl","hashes":{"sha256":"6f25843e908dfc3e907b6b9ee71e3d185bcb5aebab8c3431e4e34c261e5ff1b5"},"requires-python":">=3.4","size":45693,"upload-time":"2021-03-21T21:20:19.175500Z","url":"http'
Traceback (most recent call last):
  File "/home/appuser/.local/lib/python3.9/site-packages/requests/models.py", line 960, in json
    return complexjson.loads(self.content.decode(encoding), **kwargs)
  File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.9/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 10236 (char 10235)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/builds/fluiddyn/fluidsim/tmp_bug_unearth.py", line 20, in <module>
    print(resp.json()["versions"])
  File "/home/appuser/.local/lib/python3.9/site-packages/requests/models.py", line 968, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Unterminated string starting at: line 1 column 10236 (char 10235)

Note that everything works with pip in the same environment (GitLab CI). In particular, pip index versions flit-core prints the correct data.

paugier avatar Jan 03 '24 10:01 paugier

> print(resp.content[-400:])

Did this line play an important role in reproducing the issue?

frostming avatar Jan 04 '24 02:01 frostming

> Did this line play an important role in reproducing the issue?

No. This line was only to visualize what happens, i.e. the response is truncated.

paugier avatar Jan 04 '24 08:01 paugier

> No. This line was only to visualize what happens, i.e. the response is truncated.

If requests itself can reproduce this, why not ask there? I don't think requests has any behavior that can be tweaked via arguments to work around this.
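
Short of a requests argument, a caller could still re-fetch when the body does not parse. This is only a sketch under assumptions: fetch_json_with_retry and its fetch callable are hypothetical names, not part of requests or unearth, and retrying only helps if the truncation is intermittent.

```python
import json
import time
from typing import Any, Callable


def fetch_json_with_retry(
    fetch: Callable[[], bytes], attempts: int = 3, delay: float = 0.0
) -> Any:
    """Call fetch() until its result parses as JSON, up to `attempts` times."""
    last_error = None
    for attempt in range(attempts):
        body = fetch()
        try:
            return json.loads(body)
        except ValueError as exc:  # covers json.JSONDecodeError
            last_error = exc
            if attempt + 1 < attempts:
                time.sleep(delay)  # optional pause before asking again
    raise RuntimeError(
        f"response still invalid after {attempts} attempts"
    ) from last_error


if __name__ == "__main__":
    # Simulated responses: the first is truncated, the second is complete.
    bodies = iter([b'{"url":"http', b'{"files": []}'])
    print(fetch_json_with_retry(lambda: next(bodies)))
```

With the session from the reproduction script, fetch could be a lambda that returns session.get(...).content. If every attempt returns the same truncated body, the cause is more likely in the network path of the CI runner than in the client.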

frostming avatar Jan 04 '24 08:01 frostming