
`Dataset.add_files` doesn't support multiple wildcards, although its documentation states otherwise

Open antifriz opened this issue 2 years ago • 2 comments

This is what `add_files` looks like at the time of writing:

def add_files(
    self,
    path,  # type: Union[str, Path, _Path]
    wildcard=None,  # type: Optional[Union[str, Sequence[str]]]
    local_base_folder=None,  # type: Optional[str]
    dataset_path=None,  # type: Optional[str]
    recursive=True,  # type: bool
    verbose=False  # type: bool
):

Both the type comment and the docstring state that `wildcard` can be a list of strings (wildcards):

"""
...
:param wildcard: add only specific set of files.
            Wildcard matching, can be a single string or a list of wildcards)
...
"""

In practice a list isn't supported, because the first argument of both `Path.glob` and `Path.rglob` must be a single pattern string and cannot be a list of strings. See here.
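To illustrate the gap, here is a minimal sketch of how a string-or-list wildcard could be expanded by globbing once per pattern and merging the results. `glob_many` is a hypothetical helper written for this example; it is not clearml's implementation and may differ from however the library handles it.

```python
from pathlib import Path


def glob_many(root, wildcard=None, recursive=True):
    """Return sorted files under `root` matching any of the given wildcards.

    Hypothetical helper: `wildcard` may be None (match everything),
    a single pattern string, or a sequence of pattern strings.
    """
    root = Path(root)

    # Normalize the argument to a list of patterns, because
    # Path.glob / Path.rglob accept exactly one pattern per call.
    if wildcard is None:
        patterns = ["*"]
    elif isinstance(wildcard, str):
        patterns = [wildcard]
    else:
        patterns = list(wildcard)

    finder = root.rglob if recursive else root.glob

    # A set removes files matched by more than one pattern.
    matched = set()
    for pattern in patterns:
        matched.update(p for p in finder(pattern) if p.is_file())
    return sorted(matched)
```

Calling `glob_many("data", ["*.txt", "*.csv"])` would then collect both file types in one pass over the patterns (the directory name here is only an example).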

The use case I'd like to see supported is a large root directory from which only a subset of files should be added to the dataset. I'd like to pass the list of wildcards and have a single call to `add_files` do the rest.
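Until a list is accepted, one possible workaround is to call `add_files` once per pattern. The sketch below assumes only the `path` and `wildcard` parameters shown in the signature above; `add_files_for_patterns` is a hypothetical helper, not part of clearml, and `dataset` stands for any object that exposes `add_files`.

```python
def add_files_for_patterns(dataset, path, wildcards, **kwargs):
    """Call dataset.add_files once for each wildcard pattern.

    Hypothetical helper: extra keyword arguments (for example
    recursive or verbose) are forwarded unchanged to every call.
    """
    # Accept a single string as well as a sequence of strings.
    if isinstance(wildcards, str):
        wildcards = [wildcards]

    for pattern in wildcards:
        dataset.add_files(path=path, wildcard=pattern, **kwargs)
```

This keeps the calling code to a single line, at the cost of one directory scan per pattern.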

antifriz, Mar 18 '22 00:03

Thanks for catching that :) We'll make sure to fix this!

erezalg, Mar 20 '22 16:03

Hello @antifriz, we've just released clearml 1.4.0, which fixes this issue. Let us know if it works as expected!

erezalg, May 05 '22 17:05