
datetime not JSON serializable when using Data integration with Scrapy

Open oyhel opened this issue 1 year ago • 2 comments

First of all, amazing project!

I have enabled data integration by adding `'crawlab.CrawlabPipeline': 888` to my list of pipelines. The (Scrapy) project runs without problems when the pipeline is disabled. When it is enabled, I get the following error message:

  File "/usr/local/lib/python3.10/dist-packages/twisted/internet/defer.py", line 892, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "/usr/local/lib/python3.10/dist-packages/scrapy/utils/defer.py", line 307, in f
    return deferred_from_coro(coro_f(*coro_args, **coro_kwargs))
  File "/usr/local/lib/python3.10/dist-packages/crawlab/scrapy/pipelines.py", line 10, in process_item
    save_item(result)
  File "/usr/local/lib/python3.10/dist-packages/crawlab/result.py", line 74, in save_item
    get_result_service().save_item(*items)
  File "/usr/local/lib/python3.10/dist-packages/crawlab/result.py", line 23, in save_item
    self.save(list(items))
  File "/usr/local/lib/python3.10/dist-packages/crawlab/result.py", line 36, in save
    self._save(_items)
  File "/usr/local/lib/python3.10/dist-packages/crawlab/result.py", line 50, in _save
    data = json.dumps({
  File "/usr/lib/python3.10/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.10/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.10/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type datetime is not JSON serializable
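
The same error can be reproduced without Scrapy or crawlab, which suggests it comes from the plain `json.dumps` call in the traceback rather than from anything spider-specific. A minimal sketch, assuming a made-up item whose field names (`title`, `scraped_at`) are placeholders:

```python
import datetime
import json

# Hypothetical item payload; the field names are placeholders, not from my spider.
item = {"title": "example", "scraped_at": datetime.datetime(2023, 8, 1, 14, 8)}

try:
    # The default encoder has no rule for datetime objects, so this raises.
    json.dumps(item)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```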

Can this be fixed with something like:

import datetime
import json

now = datetime.datetime.now()

def serialize_datetime(obj):
    # Fallback for json.dumps: called only for objects it cannot encode itself
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    raise TypeError("Type not serializable")

json.dumps(now, default=serialize_datetime)
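
As a stopgap on the spider side, it might also be possible to convert datetime fields before the item reaches `crawlab.CrawlabPipeline`. A minimal sketch, assuming dict-like items and a hypothetical pipeline class named `DatetimeToIsoPipeline` (not part of crawlab or Scrapy), registered with a priority lower than 888 so that Scrapy runs it first:

```python
import datetime
import json


class DatetimeToIsoPipeline:
    """Hypothetical helper pipeline; not part of crawlab or Scrapy."""

    def process_item(self, item, spider):
        # Assumes a dict-like item (plain dict or scrapy.Item).
        # Only top-level fields are converted; nested values are left as-is.
        for key, value in list(item.items()):
            # datetime.datetime subclasses datetime.date, so this covers both.
            if isinstance(value, datetime.date):
                item[key] = value.isoformat()
        return item


# settings.py sketch ("myproject.pipelines" is a placeholder module path):
# ITEM_PIPELINES = {
#     "myproject.pipelines.DatetimeToIsoPipeline": 800,  # lower value runs first
#     "crawlab.CrawlabPipeline": 888,
# }

# Quick check outside Scrapy with a made-up item:
item = {"title": "example", "scraped_at": datetime.datetime(2023, 8, 1, 14, 8)}
cleaned = DatetimeToIsoPipeline().process_item(item, spider=None)
payload = json.dumps(cleaned)  # serializes, since the datetime is now a string
```

This only sidesteps the problem for top-level fields in my own project; a `default=` hook inside crawlab's `json.dumps` call would still be the more general fix.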

oyhel avatar Aug 01 '23 14:08 oyhel

Thanks for your input. Will implement this in the next version.

tikazyq avatar Aug 02 '23 04:08 tikazyq

It looks like this patch still hasn't been applied.

glacierck avatar Nov 06 '23 02:11 glacierck