modin
BUG: The csv file cannot be read if there are square brackets in the csv file path or full path.
Modin version checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest released version of Modin.
- [ ] I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)
Reproducible Example
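import modin.pandas as pd  # the report uses Modin with MODIN_ENGINE="dask"

# The square brackets in the directory names trigger the failure.
df = pd.read_csv(
    "/home/user/[DE]/[PROJECT]/total_data_231127.csv",
    names=['seq', 'A', 'B'],
    usecols=['A', 'B'],
    sep="\t",
    dtype=str,
    na_values=['\\N'],
)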
Issue Description
At first I thought it was a Korean (UTF-8) problem, so I ran several tests to track down the error. (I am using the Korean version of Ubuntu 22.04, and the file path includes Korean characters.)
After removing the Korean characters, I tried read_csv with the full file path, but the same error occurred. Below is the test screen using square brackets (with "MODIN_ENGINE" = "dask").
I also tried these tests just in case, but they did not help.
For now, everything works well after removing the square brackets. However, I thought it would be better to report this, so I'm filing a bug report. Thank you.
Expected Behavior
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/[DE]/[PROJECT]/total_data_231127.csv'
Error Logs
Replace this line with the error backtrace (if applicable).
Installed Versions
- Ubuntu 22.04.2 LTS (Korean)
- conda 22.9.0
- jupyter notebook 6.5.3
- pip list print ... dask 2023.5.0 modin 0.23.1.post0 modin-spreadsheet 0.1.2 ...
cc @anmyachev
Hello @JacobKwon! Thanks for your contribution, and sorry for the slow response.
First of all, I would like to clarify: do square brackets work if you use pandas instead of Modin? (You may have already tried.)
Hello @anmyachev, a late response is fine. I know the time difference is considerable. 😉
As you suspected, I had already tested pandas read_csv. After seeing your answer, I ran the test once more so that I could attach the images to this reply. 👍
- Using Pandas
- Using Modin
~~Actually, it's my first bug report on Github, so I'm worried that I might have made a mistake. 🙄~~ Thank you.
@JacobKwon I can reproduce the problem and seem to have found the cause: Modin uses the fsspec library in cases where pandas doesn't. See https://github.com/fsspec/filesystem_spec/issues/1476
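
A minimal sketch of why square brackets can break path lookup, assuming the fsspec code path applies glob-style matching to the path (as the linked issue suggests). It does not call Modin or fsspec; it only uses Python's standard `glob` module to show the same class of behavior. The helper name `demo_bracket_path` and the file name are hypothetical.

```python
import glob
import os
import tempfile


def demo_bracket_path():
    """Create a file under a literal '[DE]' directory and look it up via glob."""
    with tempfile.TemporaryDirectory() as root:
        folder = os.path.join(root, "[DE]")
        os.makedirs(folder)
        path = os.path.join(folder, "total_data.csv")
        with open(path, "w") as f:
            f.write("1\ta\tb\n")

        # The file exists on disk under its literal name.
        exists = os.path.exists(path)

        # Glob-style matching reads "[DE]" as "one character, D or E",
        # so the literal directory name "[DE]" does not match.
        raw_matches = glob.glob(path)

        # glob.escape() wraps the special characters so they match literally.
        escaped_matches = glob.glob(glob.escape(path))

        return exists, raw_matches, escaped_matches, path


if __name__ == "__main__":
    exists, raw_matches, escaped_matches, path = demo_bracket_path()
    print("exists on disk:", exists)
    print("glob without escaping:", raw_matches)
    print("glob with glob.escape:", escaped_matches)
```

If the same interpretation happens inside the reader, a file that exists can still be reported as missing, which would be consistent with the FileNotFoundError above. Whether escaping the path is an acceptable workaround for Modin's `read_csv` is not confirmed here; renaming the directories, as the reporter did, is the workaround known to work.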