gdelt-doc-api
JSONDecodeError
For certain dates I receive a JSONDecodeError, followed by an AttributeError: 'ValueError' object has no attribute 'pos'. Does this mean there are no news articles available for the selected day? If so, is there a way to access GDELT directly to get the data for that date?
Thanks for the help!
I just had the same thing happen:
Traceback (most recent call last):
File "C:\Python311\Lib\site-packages\gdeltdoc\helpers.py", line 15, in load_json
result = json.loads(json_message)
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\json\__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\json\decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Invalid \escape: line 1 column 99103 (char 99102)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Python311\Lib\site-packages\gdeltdoc\helpers.py", line 15, in load_json
result = json.loads(json_message)
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\json\__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\json\decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
^^^^^^^^^^^^^^^^^^^^^^
ValueError: Exceeds the limit (4300) for integer string conversion: value has 248854 digits; use sys.set_int_max_str_digits() to increase the limit
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\boss\Dropbox (ASU)\merck grant\gdelt-search.py", line 74, in <module>
new_articles = gd.article_search(f)
^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\site-packages\gdeltdoc\api_client.py", line 79, in article_search
articles = self._query("artlist", filters.query_string)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\site-packages\gdeltdoc\api_client.py", line 168, in _query
return load_json(response.content, self.max_depth_json_parsing)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\site-packages\gdeltdoc\helpers.py", line 27, in load_json
return load_json(json_message=new_message, max_recursion_depth=max_recursion_depth,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\site-packages\gdeltdoc\helpers.py", line 20, in load_json
idx_to_replace = int(e.pos)
^^^^^
AttributeError: 'ValueError' object has no attribute 'pos'
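Reading the traceback, the final AttributeError seems to come from load_json reading e.pos on an exception that is a plain ValueError (the integer-digit limit), not a JSONDecodeError; only the latter carries a pos attribute. Below is a minimal sketch of a loader that retries only on JSONDecodeError. It assumes the repair strategy is to blank out the offending character, and load_json_lenient and max_fixes are hypothetical names, not part of gdeltdoc.

```python
import json


def load_json_lenient(raw, max_fixes=100):
    """Parse JSON, blanking out characters that the decoder rejects.

    Hypothetical helper for illustration only; not gdeltdoc's API.
    """
    text = raw.decode("utf-8", errors="replace") if isinstance(raw, bytes) else raw
    for _ in range(max_fixes):
        try:
            return json.loads(text)
        except json.JSONDecodeError as e:
            # Only JSONDecodeError has .pos; any other ValueError (such as the
            # integer-digit limit) is not caught here and propagates unchanged.
            text = text[:e.pos] + " " + text[e.pos + 1:]
    raise ValueError(f"could not repair JSON after {max_fixes} attempts")
```

With this shape, the integer-conversion ValueError would surface with its own message instead of being masked by an AttributeError.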
Thanks for the reports! I'll do some digging and figure this out.
@networks1 @pdb159 could you give me an example query that gives this error?
Running this should reproduce it. I can't remember the exact days. The first was in late September I think. There were a couple in December too.
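import time
from datetime import datetime, timedelta

import pandas as pd
from gdeltdoc import GdeltDoc, Filters, near

gd = GdeltDoc()  # assumed setup; not shown in the original snippet

# One query per day from 2020-09-01 through 2020-12-31
date_generated = pd.date_range('2020-09-01', '2020-12-31', freq="D").strftime("%Y-%m-%d").tolist()
api_timeout = 5  # seconds to wait between requests

for dt in date_generated:
    start_date = dt
    end_date = (datetime.strptime(dt, "%Y-%m-%d") + timedelta(days=1)).strftime("%Y-%m-%d")
    f = Filters(
        # keyword = kw,
        near = near(20, "COVID", "vaccine"),
        start_date = start_date,
        end_date = end_date,
        num_records = 250,
        country = "US"
    )
    new_articles = gd.article_search(f)
    time.sleep(api_timeout)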
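Since I can't remember the exact days, something like the following could narrow them down. It is only a sketch: find_failing_days and search_one_day are hypothetical names, and search_one_day stands in for whatever function builds the Filters for a single day and calls gd.article_search.

```python
from datetime import date, timedelta


def find_failing_days(search_one_day, start, end):
    """Call search_one_day(day) for each day in [start, end] and
    return (ISO date, error summary) pairs for the days that raise."""
    failures = []
    day = start
    while day <= end:
        try:
            search_one_day(day)
        except (ValueError, AttributeError) as exc:
            # JSONDecodeError is a subclass of ValueError, so it is caught too.
            failures.append((day.isoformat(), f"{type(exc).__name__}: {exc}"))
        day += timedelta(days=1)
    return failures
```

Calling find_failing_days(my_search, date(2020, 9, 1), date(2020, 12, 31)) would then list only the dates worth reporting, where my_search is the caller's own per-day query function.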