BUG: cannot .read() from Windows samba share

Open kadykov opened this issue 2 years ago • 5 comments

I have a problem reading files from a shared Windows samba storage. I can read the catalog .yml file and all the metadata, but when I call .read(), the first slash of the path is replaced with c: and the content of the file cannot be read.

Minimal reproducible example:

from pathlib import Path

import intake

DATA_PATH = Path(r"\\path\to\windows\samba\share")
cat = intake.open_catalog(DATA_PATH / "cat.yml")
cat.test_numpy.read()

Content of the cat.yml

description: Samba share test
sources:
  test_numpy:
    description: Test numpy
    driver: numpy
    args:
      path: "{{ CATALOG_DIR }}/numpy.npy"

When I run print(cat.test_numpy), everything looks fine (except for the number of slashes at the beginning and before numpy.npy):

sources:
  test_numpy:
    args:
      path: /path/to/windows/samba/share//numpy.npy
    description: Test numpy
    driver: intake.source.npy.NPySource
    metadata:
      catalog_dir: /path/to/windows/samba/share/

But it crashes on cat.test_numpy.read():

FileNotFoundError: [Errno 2] No such file or directory: 'c:/path/to/windows/samba/share/numpy.npy'

As far as I understand, it replaces the first slash in the path //path/to/windows/samba/share/numpy.npy with C:, and therefore cannot access the file.
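The suspected mechanism can be illustrated with the standard library alone. This is a minimal sketch, not intake's actual code path: it assumes a hypothetical working directory C:\Users\me and uses ntpath (the Windows flavour of os.path, importable on any OS) to show what happens when a path that has lost one of its two leading slashes is joined with the working directory.

```python
import ntpath  # Windows path rules, usable on any operating system

# Hypothetical working directory, chosen only for illustration.
cwd = "C:\\Users\\me"

# UNC-style path with both leading slashes intact.
unc = "//path/to/windows/samba/share/numpy.npy"
# The same path after one leading slash has been dropped.
stripped = "/path/to/windows/samba/share/numpy.npy"

# A UNC path carries its own "drive" (//path/to), so the join keeps it as is.
joined_unc = ntpath.join(cwd, unc)

# A path with a single leading slash is rooted but has no drive,
# so the join borrows the drive letter of the working directory.
joined_stripped = ntpath.join(cwd, stripped)

print(joined_unc)
print(joined_stripped)
```

Under these assumptions the first join leaves the share path untouched, while the second yields C:/path/to/windows/samba/share/numpy.npy, which has the same shape as the path in the error message.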

OS: Windows 10
intake: 0.6.8 and commit e362825 from git
intake-xarray: 0.4.1
fsspec: 2023.5.0
python: 3.10.11

kadykov avatar May 19 '23 08:05 kadykov

Before digging too far, would you mind trying with all of your r"\" characters in the path replaced by "/"?

martindurant avatar May 19 '23 13:05 martindurant

I have tried with Path("//path/to/windows/samba/share") / "numpy.npy" and "//path/to/windows/samba/share/numpy.npy", and both give the same error.

kadykov avatar May 19 '23 14:05 kadykov

Would you mind trying with "file://" prefixed to your path? Otherwise, the following diff may do it, but I am not on windows right now to check:

--- a/intake/catalog/local.py
+++ b/intake/catalog/local.py
@@ -503,17 +503,9 @@ def register_plugin_module(mod):


 def get_dir(path):
-    if "://" in path:
-        protocol, _ = split_protocol(path)
-        out = get_filesystem_class(protocol)._parent(path)
-        if "://" not in out:
-            # some FSs strip this, some do not
-            out = protocol + "://" + out
-        return out
-    path = make_path_posix(os.path.join(os.getcwd(), os.path.dirname(path)))
-    if path[-1] != "/":
-        path += "/"
-    return path
+    protocol, _ = split_protocol(path)
+    out = get_filesystem_class(protocol)._parent(path)
+    return out

martindurant avatar May 19 '23 19:05 martindurant

or

--- a/intake/catalog/local.py
+++ b/intake/catalog/local.py
@@ -503,17 +503,14 @@ def register_plugin_module(mod):


 def get_dir(path):
-    if "://" in path:
-        protocol, _ = split_protocol(path)
-        out = get_filesystem_class(protocol)._parent(path)
-        if "://" not in out:
-            # some FSs strip this, some do not
-            out = protocol + "://" + out
-        return out
-    path = make_path_posix(os.path.join(os.getcwd(), os.path.dirname(path)))
-    if path[-1] != "/":
-        path += "/"
-    return path
+    protocol, _ = split_protocol(path)
+    out = get_filesystem_class(protocol)._parent(path)
+    if "://" not in out and protocol:
+        # some FSs strip this, some do not
+        out = protocol + "://" + out
+    if out[-1] != "/":
+        out += "/"
+    return out
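
The logic of this second variant can be sketched without fsspec. The helper names below (split_protocol_sketch, get_dir_sketch) are hypothetical stand-ins written for illustration, not the fsspec or intake functions; the sketch only assumes that the parent directory is taken by plain string splitting, that the protocol is re-added when present, and that a trailing slash is guaranteed. The point is that the current working directory is never joined in, so a leading "//" survives.

```python
def split_protocol_sketch(path):
    """Hypothetical stand-in: return (protocol, rest); protocol is None if absent."""
    if "://" in path:
        protocol, rest = path.split("://", 1)
        return protocol, rest
    return None, path


def get_dir_sketch(path):
    """Return the parent directory of path as a string ending in '/'."""
    # Normalise backslashes so Windows-style input is handled uniformly.
    protocol, rest = split_protocol_sketch(path.replace("\\", "/"))
    # Take everything before the last slash; a bare file name has parent ".".
    parent = rest.rsplit("/", 1)[0] if "/" in rest else "."
    if protocol:
        # Re-attach the protocol that was split off above.
        parent = protocol + "://" + parent
    if not parent.endswith("/"):
        parent += "/"
    return parent


print(get_dir_sketch("//path/to/windows/samba/share/cat.yml"))
print(get_dir_sketch("s3://bucket/dir/cat.yml"))
```

With this approach the share path keeps both leading slashes, and a remote URL keeps its protocol prefix.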

martindurant avatar May 19 '23 19:05 martindurant

Thank you for your response. I tested the diffs and both of them fix the problem. Well done!

The notation "file://path/to/windows/samba/share" does not work at all, with either diff or without them. I cannot even read the metadata; it fails with the same kind of error:

FileNotFoundError: [Errno 2] No such file or directory: 'c:/path/to/windows/samba/share/cat.yml'

kadykov avatar May 20 '23 11:05 kadykov