TypeError: __init__() got an unexpected keyword argument 'callback_timeout' ERROR 2022-07-04 13:39:43 +0200 master-replica-0 NoneType: None

Open RobinGeibel opened this issue 3 years ago • 13 comments

Hi,

I am trying to train a PyTorch model on GCP, and in the meantime I am reading a CSV file from a Cloud Storage bucket.

I get the same error when reading the file in two different ways:

1 - simply using pd.read_csv:

code:

import os
import pandas as pd

GCS_BUCKET = "led-test-run"
GCS_BASE_ROOT = f"gs://{GCS_BUCKET}"

TRAIN_DIR = os.path.join(GCS_BASE_ROOT, "bigpatent_sample_corpus_BIG_train_df.csv")
VAL_DIR = os.path.join(GCS_BASE_ROOT, "bigpatent_sample_corpus_BIG_val_df.csv")
train_dataset = pd.read_csv(TRAIN_DIR)

error:

Exception ignored in: <function ClientSession.__del__ at 0x7f1ede3bd710>
ERROR 2022-07-04 13:39:43 +0200 master-replica-0 Traceback (most recent call last):
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/aiohttp/client.py", line 326, in __del__
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     if not self.closed:
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/aiohttp/client.py", line 963, in closed
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     return self._connector is None or self._connector.closed
ERROR 2022-07-04 13:39:43 +0200 master-replica-0 AttributeError: 'ClientSession' object has no attribute '_connector'
ERROR 2022-07-04 13:39:43 +0200 master-replica-0 ERROR:root:Traceback (most recent call last):
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     "__main__", mod_spec)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     exec(code, run_globals)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/trainer/task.py", line 78, in <module>
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     main()
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/trainer/task.py", line 74, in main
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     experiment.run(args)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/trainer/experiment.py", line 66, in run
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     train_dataset, test_dataset = utils.load_data()
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/trainer/utils.py", line 65, in load_data
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     train_dataset = pd.read_csv(TRAIN_DIR)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py", line 610, in read_csv
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     return _read(filepath_or_buffer, kwds)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py", line 462, in _read
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     parser = TextFileReader(filepath_or_buffer, **kwds)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py", line 819, in __init__
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     self._engine = self._make_engine(self.engine)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py", line 1050, in _make_engine
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py", line 1867, in __init__
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     self._open_handles(src, kwds)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/pandas/io/parsers.py", line 1368, in _open_handles
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     storage_options=kwds.get("storage_options", None),
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/pandas/io/common.py", line 563, in get_handle
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     storage_options=storage_options,
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/pandas/io/common.py", line 334, in _get_filepath_or_buffer
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     filepath_or_buffer, mode=fsspec_mode, **(storage_options or {})
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/fsspec/core.py", line 476, in open
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     **kwargs,
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/fsspec/core.py", line 306, in open_files
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     expand=expand,
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/fsspec/core.py", line 657, in get_fs_token_paths
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     fs = cls(**options)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/fsspec/spec.py", line 76, in __call__
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     obj = super().__call__(*args, **kwargs)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/opt/conda/lib/python3.7/site-packages/gcsfs/core.py", line 270, in __init__
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     self.loop, get_client, callback_timeout=self.callback_timeout
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/fsspec/asyn.py", line 66, in sync
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     raise return_result
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/fsspec/asyn.py", line 26, in _runner
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     result[0] = await coro
ERROR 2022-07-04 13:39:43 +0200 master-replica-0   File "/root/.local/lib/python3.7/site-packages/fsspec/implementations/http.py", line 29, in get_client
ERROR 2022-07-04 13:39:43 +0200 master-replica-0     return aiohttp.ClientSession(**kwargs)
ERROR 2022-07-04 13:39:43 +0200 master-replica-0 TypeError: __init__() got an unexpected keyword argument 'callback_timeout'
ERROR 2022-07-04 13:39:43 +0200 master-replica-0 NoneType: None

2 - using gcsfs.GCSFileSystem:

code:

fs = gcsfs.GCSFileSystem(project='master-thesis-351911')
with fs.open('led-test-run/bigpatent_sample_corpus_BIG_train_df.csv') as f:
    train_dataset = pd.read_csv(f)

error:

The replica master 0 exited with a non-zero status of 1.
ERROR 2022-07-04 14:27:34 +0200 service Traceback (most recent call last):
ERROR 2022-07-04 14:27:34 +0200 service   File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
ERROR 2022-07-04 14:27:34 +0200 service     "__main__", mod_spec)
ERROR 2022-07-04 14:27:34 +0200 service   File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
ERROR 2022-07-04 14:27:34 +0200 service     exec(code, run_globals)
ERROR 2022-07-04 14:27:34 +0200 service   File "/root/.local/lib/python3.7/site-packages/trainer/task.py", line 3, in <module>
ERROR 2022-07-04 14:27:34 +0200 service     from trainer import experiment
ERROR 2022-07-04 14:27:34 +0200 service   File "/root/.local/lib/python3.7/site-packages/trainer/experiment.py", line 13, in <module>
ERROR 2022-07-04 14:27:34 +0200 service     from trainer import model, metadata, utils
ERROR 2022-07-04 14:27:34 +0200 service   File "/root/.local/lib/python3.7/site-packages/trainer/utils.py", line 24, in <module>
ERROR 2022-07-04 14:27:34 +0200 service     fs = gcsfs.GCSFileSystem(project='master-thesis-351911')
ERROR 2022-07-04 14:27:34 +0200 service   File "/root/.local/lib/python3.7/site-packages/fsspec/spec.py", line 76, in __call__
ERROR 2022-07-04 14:27:34 +0200 service     obj = super().__call__(*args, **kwargs)
ERROR 2022-07-04 14:27:34 +0200 service   File "/opt/conda/lib/python3.7/site-packages/gcsfs/core.py", line 270, in __init__
ERROR 2022-07-04 14:27:34 +0200 service     self.loop, get_client, callback_timeout=self.callback_timeout
ERROR 2022-07-04 14:27:34 +0200 service   File "/root/.local/lib/python3.7/site-packages/fsspec/asyn.py", line 66, in sync
ERROR 2022-07-04 14:27:34 +0200 service     raise return_result
ERROR 2022-07-04 14:27:34 +0200 service   File "/root/.local/lib/python3.7/site-packages/fsspec/asyn.py", line 26, in _runner
ERROR 2022-07-04 14:27:34 +0200 service     result[0] = await coro
ERROR 2022-07-04 14:27:34 +0200 service   File "/root/.local/lib/python3.7/site-packages/fsspec/implementations/http.py", line 29, in get_client
ERROR 2022-07-04 14:27:34 +0200 service     return aiohttp.ClientSession(**kwargs)
ERROR 2022-07-04 14:27:34 +0200 service TypeError: __init__() got an unexpected keyword argument 'callback_timeout'

RobinGeibel avatar Jul 04 '22 13:07 RobinGeibel

I'm having the same problem here. Mine had something to do with the function call being not asynchronous.

gcs = gcsfs.GCSFileSystem(token='anon')

TypeError                                 Traceback (most recent call last)
Input In [169], in <cell line: 1>()
----> 1 gcs = gcsfs.GCSFileSystem(token='anon')

File /opt/anaconda3/lib/python3.9/site-packages/fsspec/spec.py:68, in _Cached.__call__(cls, *args, **kwargs)
     66     return cls._cache[token]
     67 else:
---> 68     obj = super().__call__(*args, **kwargs)
     69     # Setting _fs_token here causes some static linters to complain.
     70     obj._fs_token_ = token

File /opt/anaconda3/lib/python3.9/site-packages/gcsfs/core.py:269, in GCSFileSystem.__init__(self, project, access, token, block_size, consistency, cache_timeout, secure_serialize, check_connection, requests_timeout, requester_pays, asynchronous, loop, callback_timeout, **kwargs)
    267 self.callback_timeout = callback_timeout
    268 if not asynchronous:
--> 269     self._session = sync(
    270         self.loop, get_client, callback_timeout=self.callback_timeout
    271     )
    272     weakref.finalize(self, sync, self.loop, self.session.close)
    273 else:

File /opt/anaconda3/lib/python3.9/site-packages/fsspec/asyn.py:65, in sync(loop, func, timeout, *args, **kwargs)
     63     raise FSTimeoutError from return_result
     64 elif isinstance(return_result, BaseException):
---> 65     raise return_result
     66 else:
     67     return return_result

File /opt/anaconda3/lib/python3.9/site-packages/fsspec/asyn.py:25, in _runner(event, coro, result, timeout)
     23     coro = asyncio.wait_for(coro, timeout=timeout)
     24 try:
---> 25     result[0] = await coro
     26 except Exception as ex:
     27     result[0] = ex

File /opt/anaconda3/lib/python3.9/site-packages/fsspec/implementations/http.py:29, in get_client(**kwargs)
     28 async def get_client(**kwargs):
---> 29     return aiohttp.ClientSession(**kwargs)

TypeError: __init__() got an unexpected keyword argument 'callback_timeout'
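One possible reading of the traceback above is that the caller passes callback_timeout to a sync helper that does not treat it as its own parameter, so the keyword travels through **kwargs until a constructor rejects it. The sketch below reproduces only that mechanism with hypothetical stand-ins (FakeClientSession, and simplified sync and get_client); it is not the real fsspec, gcsfs, or aiohttp code.

```python
import asyncio


class FakeClientSession:
    """Hypothetical stand-in for an HTTP session that knows no callback_timeout."""

    def __init__(self, *, headers=None):
        self.headers = headers


async def get_client(**kwargs):
    # Like the last frame of the traceback: every keyword goes to the session.
    return FakeClientSession(**kwargs)


def sync(func, *args, timeout=None, **kwargs):
    # Simplified stand-in: only `timeout` is consumed here; any other
    # keyword (such as callback_timeout) is forwarded to `func` unchanged.
    return asyncio.run(asyncio.wait_for(func(*args, **kwargs), timeout))


def reproduce():
    """Return the error message raised when the stray keyword is forwarded."""
    try:
        sync(get_client, callback_timeout=None)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```

If this reading is right, the error says less about the session class itself and more about the caller and the helper disagreeing on the keyword name, which is what a version mismatch between two packages would produce.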

HaynesStephens avatar Aug 31 '22 00:08 HaynesStephens

Please check your versions - fsspec 2022.8.0 is out, but gcsfs has not yet gone through the process.
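A minimal sketch for checking this from inside the failing environment, assuming Python 3.8+ for importlib.metadata (the first traceback in this thread comes from Python 3.7, where it is not in the standard library). The helper name describe is made up for illustration. It reports the installed version and the file each package is imported from, which seems relevant here because the first traceback loads gcsfs from /opt/conda/... and fsspec from /root/.local/....

```python
import importlib.util
from importlib import metadata  # standard library from Python 3.8 on


def describe(names=("gcsfs", "fsspec", "aiohttp")):
    """Map each package name to (installed version, import location)."""
    report = {}
    for name in names:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = None  # no installed distribution with this name
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        report[name] = (version, spec.origin if spec else None)
    return report


if __name__ == "__main__":
    for name, (version, origin) in describe().items():
        print(f"{name}: version={version} location={origin}")
```

Two packages resolving to different site-packages directories would be one hint that a user-level install is shadowing part of the conda environment.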

martindurant avatar Aug 31 '22 01:08 martindurant

>>> gcsfs.__version__
'0.7.2'
>>> fsspec.__version__
'2022.02.0'

HaynesStephens avatar Aug 31 '22 16:08 HaynesStephens

0.7.2 is very old, you should not be using that. https://pypi.org/project/gcsfs/0.7.2/

martindurant avatar Aug 31 '22 16:08 martindurant

Strange, that's the version that was installed by default when I executed conda install -c conda-forge gcsfs

HaynesStephens avatar Aug 31 '22 17:08 HaynesStephens

Perhaps check your conda config? Is this perhaps an old version of python?

> conda create -n temp python=3.8 gcsfs -c conda-forge

...
## Package Plan ##

  environment location: /Users/mdurant/conda/envs/temp

  added / updated specs:
    - gcsfs
    - python=3.8

...
The following NEW packages will be INSTALLED:

...
  fsspec             conda-forge/noarch::fsspec-2022.7.1-pyhd8ed1ab_0
  gcsfs              conda-forge/noarch::gcsfs-2022.7.1-pyhd8ed1ab_0
...

(those are the latest versions on conda-forge)

martindurant avatar Aug 31 '22 17:08 martindurant

I'm using Python 3.9.12. Could it be I need Python 3.8?

HaynesStephens avatar Aug 31 '22 17:08 HaynesStephens

3.9 is the same. You must have some previously pinned packages in your system. Try installing gcsfs==2022.7.1 explicitly, and conda/mamba will tell you what other things need updating for that to happen.

martindurant avatar Aug 31 '22 17:08 martindurant

My terminal can't seem to solve the environment necessary to install the latest version. Is there any way to make the solve an easier process?

conda install -c conda-forge gcsfs=2022.7.1=pyhd8ed1ab_0
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: / failed with initial frozen solve. Retrying with flexible solve.

HaynesStephens avatar Aug 31 '22 17:08 HaynesStephens

You do not need the exact build number, which may depend on python and/or OS version.

conda install -c conda-forge gcsfs==2022.7.1

martindurant avatar Aug 31 '22 18:08 martindurant

Hmm, same responses. Not sure what the fix is here.

conda install -c conda-forge gcsfs==2022.7.1
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

HaynesStephens avatar Aug 31 '22 19:08 HaynesStephens

I am still experiencing the same exception TypeError: __init__() got an unexpected keyword argument 'callback_timeout', even though I am already on newer versions:

gcsfs==2023.1.0
fsspec==2023.1.0

@martindurant - is the fix already included in the versions above?

c21 avatar Mar 06 '23 18:03 c21

@c21 , there was no fix, I don't know why this might be happening to you. Perhaps you can use pdb to dig around and see where the extra kwarg is coming from? callback_timeout does not exist anywhere in this repo :|
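As a lighter alternative to stepping through with pdb, a small introspection sketch can show whether a given callable declares a keyword itself or would only pass it along through **kwargs. The helper names declares_parameter and forwards_extra_kwargs are made up for illustration; only inspect.signature from the standard library is used.

```python
import inspect


def declares_parameter(func, name):
    """True if `name` is an explicitly declared parameter of `func`."""
    return name in inspect.signature(func).parameters


def forwards_extra_kwargs(func):
    """True if `func` has a **kwargs catch-all that could pass keywords on."""
    return any(
        p.kind is inspect.Parameter.VAR_KEYWORD
        for p in inspect.signature(func).parameters.values()
    )
```

With both packages importable, running these on the sync function that appears in the tracebacks (fsspec.asyn.sync) would show whether the installed fsspec handles callback_timeout itself. If declares_parameter is False and forwards_extra_kwargs is True, the keyword would be forwarded to get_client, and the next thing to look at is which installed file still passes it.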

martindurant avatar Mar 06 '23 19:03 martindurant