File error in parallel mode

Open dreadatour opened this issue 7 months ago • 1 comments

When a datachain query runs in parallel mode and a file operation (prefetch, download, or cache) fails, the resulting exception cannot be unpickled, and the following error is raised instead:

  File "/Users/vlad/.virtualenvs/datachain/lib/python3.13/site-packages/multiprocess/queues.py", line 138, in get_nowait
    return self.get(False)
           ~~~~~~~~^^^^^^^
  File "/Users/vlad/.virtualenvs/datachain/lib/python3.13/site-packages/multiprocess/queues.py", line 125, in get
    return _ForkingPickler.loads(res)
           ~~~~~~~~~~~~~~~~~~~~~^^^^^
  File "/Users/vlad/.virtualenvs/datachain/lib/python3.13/site-packages/dill/_dill.py", line 303, in loads
    return load(file, ignore, **kwds)
  File "/Users/vlad/.virtualenvs/datachain/lib/python3.13/site-packages/dill/_dill.py", line 289, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/vlad/.virtualenvs/datachain/lib/python3.13/site-packages/dill/_dill.py", line 444, in load
    obj = StockUnpickler.load(self)
TypeError: FileError.__init__() missing 1 required positional argument: 'message'
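
A `TypeError` like the one above is typical of an exception class whose `__init__` takes more than one required argument but passes only a formatted string to `Exception.__init__`. Pickle rebuilds the exception by calling the class with `self.args`, which then holds a single value. Below is a minimal sketch of that mechanism using only the standard library. `BrokenFileError` and `PicklableFileError` are hypothetical stand-ins, not datachain's actual `FileError`, and the `__reduce__` override is just one possible remedy, not necessarily what datachain does.

```python
import pickle


class BrokenFileError(Exception):
    """Hypothetical stand-in for an exception with a two-argument __init__."""

    def __init__(self, path: str, message: str):
        self.path = path
        self.message = message
        # Only the formatted string is stored in self.args.
        super().__init__(f"Error in file {path}: {message}")


class PicklableFileError(BrokenFileError):
    """Same exception, but it tells pickle which arguments to rebuild it with."""

    def __reduce__(self):
        return type(self), (self.path, self.message)


def roundtrip(exc: Exception) -> Exception:
    """Pickle and unpickle an exception, as a multiprocessing queue would."""
    return pickle.loads(pickle.dumps(exc))


def try_roundtrip(exc: Exception) -> str:
    """Return a short description of whether the round trip succeeded."""
    try:
        return f"ok: {roundtrip(exc)}"
    except TypeError as err:
        return f"failed: {err}"


print(try_roundtrip(BrokenFileError("a.txt", "not found")))
print(try_roundtrip(PicklableFileError("a.txt", "not found")))
```

In this sketch the first call fails while unpickling, because the class is called with one argument instead of two, and the second call succeeds. The failure happens in the process that reads from the queue, which matches the traceback ending in `multiprocess/queues.py`.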

How to reproduce:

import datachain as dc

def process_file(file: dc.File) -> dc.File:
    file.path = "."
    return file

def process_path(file2: dc.File) -> int:
    print(file2.path)
    return len(file2.path)

(
    dc.read_storage("s3://bucket/")
    .limit(10)
    .settings(prefetch=0)
    .map(file2=process_file)
    .settings(prefetch=1, parallel=2)
    .map(path_len=process_path)
    .save("test")
)

Note: in distributed mode (SaaS-related) there are no logs, and the job gets stuck.

Update: in the CLI in parallel mode, the job sometimes gets stuck too, without any logs or errors.

dreadatour, May 29 '25 15:05

The unpickling error was fixed in https://github.com/iterative/datachain/pull/1126, but the job still gets stuck in parallel mode and in SaaS.

dreadatour, May 29 '25 16:05

@dreadatour is this fixed? Feel free to reopen if not.

shcheklein, Jun 15 '25 18:06