
BUG: modin.pandas.read_csv: "FileNotFoundError: [Errno 2] No such file or directory"

Open • frank0532 opened this issue 11 months ago • 1 comment

Modin version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest released version of Modin.

  • [X] I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)

Reproducible Example

from dask.distributed import Client
import modin.pandas as pd
import modin.config as modin_cfg

dask_client = Client('192.168.10.15:8786')
modin_cfg.Engine.put('dask')
modin_cfg.StorageFormat.put('pandas')

pd.read_csv('E:/gd/sim/df.csv')

Issue Description

When I run the code above to read a CSV file, I get an error: FileNotFoundError: [Errno 2] No such file or directory: 'E:/gd/sim/df.csv'. In fact, the file "E:/gd/sim/df.csv" exists and can be read with pandas.read_csv.

Expected Behavior

I want to know how to read CSV files that are stored on the local disk.
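
One way to narrow this down, based on the error log in this report: the client-side frames come from D:\programs\miniconda\Lib\site-packages, while the frames raised inside the Dask task come from C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages. That suggests the file is opened by a worker running in a different environment, possibly on another machine, where 'E:/gd/sim/df.csv' may not exist. The following is a minimal sketch for checking that, not a confirmed diagnosis. probe_path is a hypothetical helper name and is not part of Modin or Dask; the commented-out part assumes the scheduler address from the example above is reachable.

```python
import os
import socket


def probe_path(path):
    """Report whether `path` exists on the machine that runs this function."""
    return {
        "host": socket.gethostname(),
        "exists": os.path.exists(path),
    }


# Local check: this runs on the client machine only.
print(probe_path("E:/gd/sim/df.csv"))

# Remote check (assumes dask.distributed is installed and the scheduler
# from the reproducible example is reachable). Client.run executes the
# function on every worker and returns a dict keyed by worker address.
#
# from dask.distributed import Client
# client = Client("192.168.10.15:8786")
# print(client.run(probe_path, "E:/gd/sim/df.csv"))
```

If any worker reports "exists": False, the path is valid only on the client, and the file would likely need to sit at a location that every worker can open, for example a shared network path.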

Error Logs

Traceback (most recent call last):
  File "D:\programs\miniconda\Lib\site-packages\IPython\core\interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in <module>
    pd.read_csv('E:/gd/sim/df.csv')
  File "D:\programs\miniconda\Lib\site-packages\modin\utils.py", line 613, in wrapped
    return func(*params.args, **params.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\logging\logger_decorator.py", line 144, in run_and_log
    return obj(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\pandas\io.py", line 226, in read_csv
    return _read(**kwargs)
           ^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\pandas\io.py", line 116, in _read
    pd_obj = FactoryDispatcher.read_csv(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\core\execution\dispatching\factories\dispatcher.py", line 207, in read_csv
    return cls.get_factory()._read_csv(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\core\execution\dispatching\factories\factories.py", line 272, in _read_csv
    return cls.io_cls.read_csv(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\logging\logger_decorator.py", line 144, in run_and_log
    return obj(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\core\io\file_dispatcher.py", line 165, in read
    if not AsyncReadMode.get() and hasattr(query_compiler, "dtypes"):
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\core\storage_formats\pandas\query_compiler.py", line 380, in dtypes
    return self._modin_frame.dtypes
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\core\dataframe\pandas\dataframe\dataframe.py", line 424, in dtypes
    dtypes = self._dtypes.get()
             ^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\core\dataframe\pandas\metadata\dtypes.py", line 924, in get
    self._value = self._value()
                  ^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\core\io\text\text_file_dispatcher.py", line 946, in <lambda>
    dtypes=lambda: cls.get_dtypes(dtypes_ids, column_names),
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\logging\logger_decorator.py", line 144, in run_and_log
    return obj(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\core\storage_formats\pandas\parsers.py", line 257, in get_dtypes
    partitions_dtypes = cls.materialize(dtypes_ids)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\modin\core\execution\dask\common\engine_wrapper.py", line 141, in materialize
    return client.gather(future)
           ^^^^^^^^^^^^^^^^^^^^^
  File "D:\programs\miniconda\Lib\site-packages\distributed\client.py", line 2566, in gather
    return self.sync(
           ^^^^^^^^^^
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\modin\core\execution\dask\common\engine_wrapper.py", line 44, in _deploy_dask_func
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\modin\logging\logger_decorator.py", line 144, in run_and_log
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\modin\core\storage_formats\pandas\parsers.py", line 362, in parse
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\modin\logging\logger_decorator.py", line 144, in run_and_log
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\modin\core\storage_formats\pandas\parsers.py", line 181, in generic_parse
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\modin\core\io\file_dispatcher.py", line 98, in __enter__
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\fsspec\core.py", line 147, in open
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\fsspec\core.py", line 105, in __enter__
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\fsspec\spec.py", line 1310, in open
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\fsspec\implementations\local.py", line 200, in _open
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\fsspec\implementations\local.py", line 364, in __init__
  File "C:\Users\Everpine\AppData\Roaming\Python\Python312\site-packages\fsspec\implementations\local.py", line 369, in _open
FileNotFoundError: [Errno 2] No such file or directory: 'E:/gd/sim/df.csv'

Installed Versions

INSTALLED VERSIONS

commit : 3e951a63084a9cbfd5e73f6f36653ee12d2a2bfa
python : 3.12.7
python-bits : 64
OS : Windows
OS-release : 11
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : Chinese (Simplified)_China.936

Modin dependencies

modin : 0.32.0
ray : 2.40.0
dask : 2024.12.1
distributed : 2024.12.1

pandas dependencies

pandas : 2.2.3
numpy : 2.1.3
pytz : 2024.2
dateutil : 2.9.0.post0
pip : 24.2
Cython : None
sphinx : None
IPython : 8.31.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2024.12.0
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.5
lxml.etree : None
matplotlib : 3.10.0
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.5
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 18.1.0
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.14.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : 0.23.0
tzdata : 2024.2
qtpy : None
pyqt5 : None

frank0532, Jan 07 '25 07:01

Can I take this issue?

tsafacjo, Feb 02 '25 00:02