
JSONDecodeError is not JSON serializable

Open budiantoip opened this issue 1 year ago • 3 comments

logparser reported an error in my Docker container:

[2024-12-26 12:09:50,917] ERROR    in logparser.logparser: Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/logparser/logparser.py", line 467, in run
    data = self.handle_logfile(log_path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/logparser/logparser.py", line 255, in handle_logfile
    self.parse_appended_log(data, appended_log)
  File "/usr/lib/python3.11/site-packages/logparser/logparser.py", line 315, in parse_appended_log
    self.logger.debug("Parsed data_ from appended_log:\n%s", self.json_dumps(data_))
                                                             ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/logparser/logparser.py", line 284, in json_dumps
    return json.dumps(obj, ensure_ascii=False, indent=4, sort_keys=sort_keys)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
          ^^^^^^^^^^^
  File "/usr/lib/python3.11/json/encoder.py", line 202, in encode
    chunks = list(chunks)
             ^^^^^^^^^^^^
  File "/usr/lib/python3.11/json/encoder.py", line 432, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.11/json/encoder.py", line 406, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.11/json/encoder.py", line 406, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.11/json/encoder.py", line 439, in _iterencode
    o = _default(o)
        ^^^^^^^^^^^
  File "/usr/lib/python3.11/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type JSONDecodeError is not JSON serializable

FYI, my container uses the alpine image.
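
The crash happens because `json.dumps` raises a `TypeError` whenever it meets an object it has no encoding for, such as an exception instance stored in the data dict. A common defensive pattern (not necessarily what logparser ends up doing) is to pass `default=str`, so unexpected values are stringified instead of crashing the logging call. A minimal sketch, with a made-up `Unparseable` exception standing in for the stored `JSONDecodeError`:

```python
import json

class Unparseable(Exception):
    """Stand-in for any non-serializable object that ends up in the data dict."""
    pass

data = {"error": Unparseable("bad stats"), "count": 1}

# Without default=, this raises:
#   TypeError: Object of type Unparseable is not JSON serializable
# default=str converts any non-serializable value to its string form instead.
print(json.dumps(data, ensure_ascii=False, indent=4, default=str))
```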

budiantoip, Dec 26 '24

Can you update settings.py or run logparser with "-v" to print more info?

https://github.com/my8100/logparser/blob/ed7948b271884af68eb3bb13fa9ee51a4892552c/logparser/settings.py#L79-L80

my8100, Dec 26 '24

When I ran logparser with the "-v" option inside the container, I only got this output:

****************************************************************************************************
Loading settings from /usr/lib/python3.11/site-packages/logparser/settings.py
****************************************************************************************************


****************************************************************************************************
Visit stats at: http://127.0.0.1:6800/logs/stats.json
****************************************************************************************************

{"downloader/request_bytes": 279,
 "downloader/request_count": 1,
 "downloader/request_method_count/GET": 1,
 "downloader/response_bytes": 379076,
 "downloader/response_count": 1,
 "downloader/response_status_count/200": 1,
 "elapsed_time_seconds": 82.607657,
 "feedexport/success_count/FileFeedStorage": 2,
 "finish_reason": "finished",
 "finish_time": "datetime.datetime(2024, 12, 27, 0, 2, 58, 283984, tzinfo=datetime.timezone.utc)",
 "item_scraped_count": 442,
 "items_per_minute": None,
 "log_count/DEBUG": 448,
 "log_count/INFO": 15,
 "log_count/WARNING": 2,
 "memusage/max": 119156736,
 "memusage/startup": 110751744,
 "response_received_count": 1,
 "responses_per_minute": None,
 "scheduler/dequeued": 1,
 "scheduler/dequeued/memory": 1,
 "scheduler/enqueued": 1,
 "scheduler/enqueued/memory": 1,
 "start_time": "datetime.datetime(2024, 12, 27, 0, 1, 35, 676327, tzinfo=datetime.timezone.utc)"}
Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/logparser/common.py", line 163, in parse_crawler_stats
    return json.loads(text)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 12 column 22 (char 486)

However, I did get the "JSONDecodeError is not JSON serializable" message by filtering the container logs with a command like this:

docker logs -f my-scrapper 2>&1 | grep "JSONDecodeError is not JSON serializable"

And, the output would be like this:

TypeError: Object of type JSONDecodeError is not JSON serializable
TypeError: Object of type JSONDecodeError is not JSON serializable
TypeError: Object of type JSONDecodeError is not JSON serializable
TypeError: Object of type JSONDecodeError is not JSON serializable
TypeError: Object of type JSONDecodeError is not JSON serializable

I believe the "JSONDecodeError is not JSON serializable" message was triggered by scrapydweb. Hopefully, the complete logparser output above gives you a hint, but let me know if you need any additional information.
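
(The two tracebacks above appear to be one failure chain: Scrapy's stats are logged as a Python dict repr containing literals like `None`, which is not valid JSON, so `parse_crawler_stats` raises `JSONDecodeError`; if that exception object is then kept in the data dict, the later `json.dumps` call fails with the `TypeError`. A minimal reproduction of this chain, as an assumption about the cause rather than logparser's actual code:)

```python
import json

# Python dict repr, as Scrapy logs its stats -- single quotes and None
# are not valid JSON, so json.loads rejects it.
stats_text = "{'items_per_minute': None, 'log_count/DEBUG': 448}"

try:
    data = json.loads(stats_text)
except json.JSONDecodeError as e:
    data = {"crawler_stats": e}  # the exception object itself is stored

try:
    json.dumps(data)
except TypeError as e:
    print(e)  # Object of type JSONDecodeError is not JSON serializable
```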

budiantoip, Dec 27 '24

Please try the latest version: pip install git+https://github.com/my8100/logparser.git@master or pip install --upgrade logparser

my8100, Jan 01 '25