
Dataset add_files behavior

Open • dimka11 opened this issue 7 months ago • 1 comment

Describe the bug

I specified a directory to add via the path parameter, but instead of the directory itself, its files were added to the root of the dataset.

To reproduce

dataset.add_files(path=r'C:\New folder')
dataset.upload()
dataset.finalize()

Expected behaviour

The docs say: path (Union[str, Path, _Path]) – Add a folder/file to the dataset. That is not what happens in my case. The folder itself should be added to the root of the dataset, not the files inside the folder.

Also, symlinks to directories don't work (at least under Windows). To add multiple folders, I had to copy them into a new folder and add that new folder, which is time-consuming.

Environment

  • ClearML SDK Version 1.13.2
  • Python 3.9.2
  • Windows 11

dimka11 • Dec 09 '23 16:12

@dimka11 the path parameter to Dataset.add_files() specifies where to look for the files to be added. To also control where in the dataset the files will be stored, you can use the dataset_path parameter, e.g.:

dataset.add_files(path=r'C:\New folder', dataset_path='New folder')
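For the multiple-folder case, the same dataset_path parameter could probably be used once per folder, which may avoid copying everything into a temporary parent folder. The following is only a minimal sketch: add_folders is a hypothetical helper (not part of ClearML), and the folder paths, dataset name, and project name in the usage comments are placeholders.

```python
from pathlib import Path


def add_folders(dataset, folders):
    """Add each folder to the dataset under its own folder name.

    `dataset` is assumed to expose add_files(path=..., dataset_path=...),
    as clearml.Dataset does. This helper itself is hypothetical.
    """
    added = []
    for folder in folders:
        folder = Path(folder)
        # path: where to look for the files on disk.
        # dataset_path: where the files are stored inside the dataset.
        dataset.add_files(path=str(folder), dataset_path=folder.name)
        added.append(folder.name)
    return added


# Possible usage (placeholder names and paths):
# from clearml import Dataset
# dataset = Dataset.create(dataset_name="example", dataset_project="examples")
# add_folders(dataset, [r"C:\data\train", r"C:\data\val"])
# dataset.upload()
# dataset.finalize()
```

With this approach each folder keeps its own name as a top-level directory in the dataset. Folder names are taken from the last path component, so two source folders with the same name would end up in the same dataset directory.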

Re: symlinks - Can you provide an example of the directory structure you were trying, what failed and how you are working around it?

ainoam • Dec 10 '23 16:12