BUG: cannot .read() from Windows samba share
I have a problem reading files from a shared Windows Samba storage.
I can read the catalog .yml file and all the metadata, but when I call .read() it replaces the first slash of the path with c: and cannot read the content of the file.
Minimal reproducible example:
from pathlib import Path
import intake
DATA_PATH = Path(r"\\path\to\windows\samba\share")
cat = intake.open_catalog(DATA_PATH / "cat.yml")
cat.test_numpy.read()
Content of cat.yml:
description: Samba share test
sources:
  test_numpy:
    description: Test numpy
    driver: numpy
    args:
      path: "{{ CATALOG_DIR }}/numpy.npy"
When I run print(cat.test_numpy), everything looks OK (except for the number of slashes at the beginning and before numpy.npy):
sources:
  test_numpy:
    args:
      path: /path/to/windows/samba/share//numpy.npy
    description: Test numpy
    driver: intake.source.npy.NPySource
    metadata:
      catalog_dir: /path/to/windows/samba/share/
But it crashes on cat.test_numpy.read():
FileNotFoundError: [Errno 2] No such file or directory: 'c:/path/to/windows/samba/share/numpy.npy'
As far as I understand, it replaces the first slash in the path //path/to/windows/samba/share/numpy.npy with C:, and therefore cannot access the file.
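To illustrate why the number of leading slashes matters, here is a minimal sketch using only the standard library. It uses the placeholder path from this report and does not call intake or fsspec, so it only shows how Windows path rules treat the two spellings, not where intake drops the slash. The variable names are made up for the illustration.

```python
import ntpath
from pathlib import PureWindowsPath

# Placeholder UNC path from the report, written with forward slashes.
unc = "//path/to/windows/samba/share/numpy.npy"
# The same path with a single leading slash, as in the printed catalog.
collapsed = "/path/to/windows/samba/share/numpy.npy"

# ntpath applies Windows path rules on any platform.
unc_drive, unc_rest = ntpath.splitdrive(unc)
collapsed_drive, collapsed_rest = ntpath.splitdrive(collapsed)

print(repr(unc_drive))        # the UNC "drive" is the //host/share part
print(repr(collapsed_drive))  # empty: no drive, so Windows falls back to the current one

# A catalog directory string that keeps the UNC prefix (illustrative only).
catalog_dir = (
    PureWindowsPath(r"\\path\to\windows\samba\share\cat.yml").parent.as_posix() + "/"
)
print(catalog_dir)
```

With a single leading slash the path has no drive component, so Windows resolves it against the current drive, which would match the c:/ prefix in the error above.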
OS: Windows 10
intake: 0.6.8 and commit e362825 from git
intake-xarray: 0.4.1
fsspec: 2023.5.0
python: 3.10.11
Before digging too far, would you mind trying with all of the "\" characters in your path replaced by "/"?
I have tried with Path("//path/to/windows/samba/share") / "numpy.npy" and "//path/to/windows/samba/share/numpy.npy", and both give the same error.
Would you mind trying with "file://" prefixed to your path? Otherwise, the following diff may do it, but I am not on Windows right now to check:
--- a/intake/catalog/local.py
+++ b/intake/catalog/local.py
@@ -503,17 +503,9 @@ def register_plugin_module(mod):
 def get_dir(path):
-    if "://" in path:
-        protocol, _ = split_protocol(path)
-        out = get_filesystem_class(protocol)._parent(path)
-        if "://" not in out:
-            # some FSs strip this, some do not
-            out = protocol + "://" + out
-        return out
-    path = make_path_posix(os.path.join(os.getcwd(), os.path.dirname(path)))
-    if path[-1] != "/":
-        path += "/"
-    return path
+    protocol, _ = split_protocol(path)
+    out = get_filesystem_class(protocol)._parent(path)
+    return out
or
--- a/intake/catalog/local.py
+++ b/intake/catalog/local.py
@@ -503,17 +503,14 @@ def register_plugin_module(mod):
 def get_dir(path):
-    if "://" in path:
-        protocol, _ = split_protocol(path)
-        out = get_filesystem_class(protocol)._parent(path)
-        if "://" not in out:
-            # some FSs strip this, some do not
-            out = protocol + "://" + out
-        return out
-    path = make_path_posix(os.path.join(os.getcwd(), os.path.dirname(path)))
-    if path[-1] != "/":
-        path += "/"
-    return path
+    protocol, _ = split_protocol(path)
+    out = get_filesystem_class(protocol)._parent(path)
+    if "://" not in out and protocol:
+        # some FSs strip this, some do not
+        out = protocol + "://" + out
+    if out[-1] != "/":
+        out += "/"
+    return out
Thank you for your response. I tested the diffs and both of them fix the problem. Well done!
The notation "file://path/to/windows/samba/share" does not work with any of the options. I cannot even read the metadata; it fails with the same kind of error:
FileNotFoundError: [Errno 2] No such file or directory: 'c:/path/to/windows/samba/share/cat.yml'
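As a side note on the "file://" spelling, the sketch below shows how Python's generic URL parser splits that string. This is only an illustration with the placeholder path from this thread. Nothing here confirms that fsspec parses the string the same way, but it shows that the leading "//" of the UNC path is no longer distinguishable once "file://" is put directly in front of "path/...".

```python
from urllib.parse import urlsplit

# Placeholder URL from this thread: "file://" directly followed by the host name.
parts = urlsplit("file://path/to/windows/samba/share/cat.yml")

print(parts.scheme)  # the protocol part
print(parts.netloc)  # first segment is taken as the host
print(parts.path)    # the remainder starts with a single slash
```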