Supporting pathlib's Path objects in FileDataStream

Open pnshinde opened this issue 5 years ago • 5 comments

Fixes #269. pathlib's Path objects can be converted to strings by casting, and vice versa. I added a check in FileDataStream's init function that converts a Path object to a string. I also wrote a test for this, but calling FileDataStream.read_csv() on a Path object produces the following error (using tool=None/'pandas'):

[Screenshot of the error: Screen Shot 2019-11-28 at 7 34 09 PM]

@ganik do I need to define my own schema when the file path is passed as a Path object, even though the contents of the file should be the same? I'm a little stuck here and any help would be appreciated!

pnshinde avatar Nov 29 '19 04:11 pnshinde

CLA assistant check
All CLA requirements met.

msftclas avatar Nov 29 '19 04:11 msftclas

Will this work with Python 2.7? It looks like pathlib was added in version 3.4.

pieths avatar Dec 02 '19 18:12 pieths

It looks like you're right: Path objects won't work with Python 2.7. I tried several different things to get it to work, with no luck. Do you have any suggestions on how to proceed? What if we checked the system Python version and used Path objects only when it is 3.4 or later?

pnshinde avatar Dec 08 '19 04:12 pnshinde

> It looks like you're right: Path objects won't work with Python 2.7. I tried several different things to get it to work, with no luck. Do you have any suggestions on how to proceed? What if we checked the system Python version and used Path objects only when it is 3.4 or later?

Checking for a particular Python version is done in some other parts of the code (including the snippet in my previous comment). Search for six.PY2 in the code base for more examples.

pieths avatar Dec 12 '19 22:12 pieths
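A version gate along the lines discussed above might look like the following sketch. It uses the standard library's `sys.version_info` instead of `six.PY2` so that it runs without extra dependencies. The names `to_str_path` and `_PATH_TYPES` are hypothetical and are not taken from the NimbusML code base.

```python
import sys

# pathlib entered the standard library in Python 3.4, so import it only there.
if sys.version_info >= (3, 4):
    from pathlib import Path
    _PATH_TYPES = (Path,)
else:
    # On older interpreters there is no Path type to recognise.
    _PATH_TYPES = ()


def to_str_path(path):
    """Return str(path) for pathlib.Path objects; pass anything else through.

    Hypothetical helper for illustration only.
    """
    # isinstance() with an empty tuple is always False, so this is safe
    # on interpreters where pathlib is unavailable.
    if isinstance(path, _PATH_TYPES):
        return str(path)
    return path
```

A constructor that accepts a file path could call such a helper once on its argument before handing the value to code that expects a plain string.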

  1. For Python < 3.4, you can use pathlib2, but I don't think that is the best practice. BTW, I think Path.resolve() should be called by the user, not by the library.

  2. According to PEP 519, you can implement a function like os.fspath (added in Python 3.6) with the expression path.__fspath__() if hasattr(path, "__fspath__") else path. The definition of a path-like object is actually:

    A path-like object is either a str or bytes object representing a path, or an object implementing the os.PathLike protocol.

    so we shouldn't use pathlib.Path to determine whether an object is path-like.

  3. I'm not sure what the best practice is for testing. Maybe you can go with pathlib2, or implement a class with __fspath__.

ianlini avatar Jan 10 '20 07:01 ianlini
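The PEP 519 approach from point 2, together with the testing idea from point 3, can be sketched roughly as below. This is a Python 3 illustration only and not the library's implementation: the function name `fspath_compat` and the class `DummyPath` are hypothetical, and a Python 2 version would also need to handle `unicode`.

```python
def fspath_compat(path):
    """Rough stand-in for os.fspath(), following the protocol in PEP 519.

    Hypothetical helper for illustration only.
    """
    # str and bytes are already path-like and are returned unchanged.
    if isinstance(path, (str, bytes)):
        return path
    # Look the method up on the type, as PEP 519 describes.
    path_type = type(path)
    if not hasattr(path_type, "__fspath__"):
        raise TypeError(
            "expected str, bytes or path-like object, not "
            + path_type.__name__
        )
    result = path_type.__fspath__(path)
    if not isinstance(result, (str, bytes)):
        raise TypeError("__fspath__() must return str or bytes")
    return result


class DummyPath(object):
    """Minimal path-like class for tests; does not depend on pathlib."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw
```

Because the check is on `__fspath__` and not on `pathlib.Path`, any object that implements the protocol is accepted. One caveat: `pathlib.Path` only gained `__fspath__` in Python 3.6, so on 3.4 and 3.5 a `Path` would still need a `str()` conversion.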