AttributeError for 'fspath' with str and bytes in WrapperBase

Open jokokojote opened this issue 1 year ago • 2 comments

Describe the bug

When a parameter is passed from one pipeline component to another, it is wrapped in a proxy (<LazyEvalWrapper>). Calling os.fspath on the wrapped parameter fails if the underlying value is of type str or bytes:

File "/opt/jupyter-server/venv-jupyter-server/lib/python3.11/site-packages/clearml/utilities/proxy_object.py", line 334, in method
    mtd = getattr(obj, name)
          ^^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute '__fspath__'

This happens because str and bytes do not implement __fspath__. Instead, os.fspath (see the pure-Python os._fspath) checks for these types explicitly and returns them unchanged as valid paths:

if isinstance(path, (str, bytes)):
    return path

That is, when os.fspath is called on the wrapped parameter (of type str / bytes), it is not recognised as one of these types (because it is still wrapped), so __fspath__ is called on it. WrapperBase proxies this call, but then tries to call __fspath__ on the underlying object as if it were an ordinary path-like object.
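
The mechanism can be reproduced without ClearML. The following is a minimal sketch, assuming a proxy that blindly forwards __fspath__ to the wrapped object. MiniProxy and try_fspath are hypothetical names used only for illustration; this is not ClearML's implementation:

```python
import os
from pathlib import PurePosixPath


class MiniProxy:
    """Hypothetical stand-in for a lazy wrapper (not ClearML's WrapperBase)."""

    def __init__(self, factory):
        self._factory = factory  # callable that produces the wrapped value

    def __fspath__(self):
        obj = self._factory()
        # Forward blindly, as a generic proxy would. This fails for str and
        # bytes, which have no __fspath__ method of their own.
        return getattr(obj, "__fspath__")()


def try_fspath(value):
    """Return os.fspath(value), or a description of the AttributeError."""
    try:
        return os.fspath(value)
    except AttributeError as exc:
        return "AttributeError: {}".format(exc)


# A plain str is accepted directly by os.fspath.
print(try_fspath("/tmp/data"))
# A wrapped real path-like object works, because it implements __fspath__.
print(try_fspath(MiniProxy(lambda: PurePosixPath("/tmp/data"))))
# A wrapped str fails inside the proxied __fspath__ call.
print(try_fspath(MiniProxy(lambda: "/tmp/data")))
```

Only the last call fails: the proxy itself is not a str, so os.fspath falls through to the proxy's __fspath__, which then looks for a method that str does not have.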

To reproduce

See this minimal example code:

from clearml import PipelineDecorator

@PipelineDecorator.component(cache=False,  return_values=['path'])
def dummy_path_provider():
    path = "/root/.clearml/cache/storage_manager/datasets/ds_966214ba3cbe4ae8ade5599a670a8d0b"  # provide some path e.g. after dataset download
    return path

@PipelineDecorator.component(cache=False) 
def dummy_path_consumer(path):
    import os
    
    search_path = os.path.join(path, '*.csv') # NOT WORKING, AttributeError: 'str' object has no attribute '__fspath__'
    # Do something with the path, e.g. search / load .csv files

@PipelineDecorator.pipeline(name="pipeline",
                            project="test",
                            add_pipeline_tags=True)
def pipeline():
    path = dummy_path_provider()
    dummy_path_consumer(path)

PipelineDecorator.debug_pipeline()
pipeline()

Expected behaviour

Pipeline component parameters of type str and bytes should be usable with the path protocol.

Environment

  • Server type: self hosted
  • ClearML SDK Version: 1.13.2
  • ClearML Server Version: WebApp: 1.12.1-397 • Server: 1.12.1-397 • API: 2.26
  • Python Version: 3.11.4
  • OS: Linux

jokokojote avatar Nov 17 '23 20:11 jokokojote

I considered making a pull request to address this issue, but I'm not entirely sure what the best solution is. One straightforward way would be to add specific handling for __fspath__ in the make_method function of WrapperBase, like so:

def make_method(name):
    def method(self, *args, **kwargs):
        # Existing code...

        # Special handling for __fspath__ when the object is a string or bytes
        if name == '__fspath__' and isinstance(obj, (str, bytes)):
            return obj

        # Rest of the existing code...

However, you know your architecture best. I am happy to make a pull request if desired.
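
The special case proposed above can be tried in isolation. This is only a sketch under stated assumptions: SketchProxy and its _resolve method are hypothetical, and make_method here is a simplified stand-in rather than the actual function in proxy_object.py:

```python
import os


def make_method(name):
    """Simplified, hypothetical method factory for a proxy class."""

    def method(self, *args, **kwargs):
        obj = self._resolve()
        # Proposed special case: str and bytes are already valid paths,
        # so return them instead of looking up a missing __fspath__.
        if name == "__fspath__" and isinstance(obj, (str, bytes)):
            return obj
        return getattr(obj, name)(*args, **kwargs)

    return method


class SketchProxy:
    """Hypothetical lazy proxy (not ClearML's WrapperBase)."""

    def __init__(self, factory):
        self._factory = factory

    def _resolve(self):
        # Evaluate the wrapped value on demand.
        return self._factory()

    __fspath__ = make_method("__fspath__")


wrapped = SketchProxy(lambda: "/tmp/data")
print(os.fspath(wrapped))                # the underlying str
print(os.path.join(wrapped, "*.csv"))    # same result as joining the plain str
```

With this special case, os.path.join accepts the wrapped value because os.fspath receives a real str back from the proxy.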

jokokojote avatar Nov 17 '23 20:11 jokokojote

Hi @jokokojote! We will need something a little more robust to solve this issue. The root of the problem is that isinstance doesn't evaluate the proxy object properly, so we will need to find a solution for that.
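
For context, one general Python technique for making isinstance see through a proxy is to expose the wrapped value's class via a __class__ property. The sketch below illustrates that technique only; it is not the fix adopted by ClearML, and PlainProxy and ClassAwareProxy are hypothetical names:

```python
class PlainProxy:
    """Hypothetical proxy that does nothing special for isinstance."""

    def __init__(self, factory):
        self._factory = factory


class ClassAwareProxy:
    """Hypothetical proxy that reports the wrapped value's class."""

    def __init__(self, factory):
        self._factory = factory

    @property
    def __class__(self):
        # isinstance() falls back to __class__ when the real type does not match.
        return type(self._factory())


plain = PlainProxy(lambda: "/tmp/data")
aware = ClassAwareProxy(lambda: "/tmp/data")

print(isinstance(plain, str))  # False
print(isinstance(aware, str))  # True
print(type(aware) is str)      # False: the real type is still the proxy class
```

Note that this only affects checks that consult __class__. Code that inspects the real type, such as type(obj) comparisons or C-level type checks, still sees the proxy class, so this alone may not be sufficient.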