Automatically upload big files using DVC

Open simonlsk opened this issue 1 year ago • 0 comments

When uploading a new directory with the CLI

dagshub upload <repo> <local-dir-path> <remote-dir-path>

The directory is uploaded using DVC. When a single file is uploaded with the same command, it is always uploaded with Git. It would be nice to have a size threshold (e.g. 5 MB) above which a file is automatically uploaded using DVC instead.
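A minimal sketch of what such a threshold check could look like, just to make the proposal concrete. The names `choose_backend` and `SIZE_THRESHOLD_BYTES` are hypothetical and are not part of the dagshub client; the 5 MB value is only the example figure from above.

```python
import os

# Hypothetical default; 5 MB is only the example value suggested in this issue.
SIZE_THRESHOLD_BYTES = 5 * 1024 * 1024


def choose_backend(path: str, threshold: int = SIZE_THRESHOLD_BYTES) -> str:
    """Return "dvc" or "git" for a local path (hypothetical helper)."""
    if os.path.isdir(path):
        # Directories are uploaded with DVC, as described above.
        return "dvc"
    # Single files: use DVC only when the file reaches the size threshold.
    return "dvc" if os.path.getsize(path) >= threshold else "git"
```

Whether the comparison should be inclusive, and whether the threshold should be configurable per repo, are open design choices.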

The interesting question is how to prevent the repo from growing into a long list of individually DVC-tracked files, and how to make sure the user stores big files in DVC directories in a way that makes sense:

.
├── data  <-- dvc
│   ├── preprocessed
│   │   └──  003.png <-- single file
│   └── raw
├── models <-- dvc
└── src <-- git
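One possible heuristic, sketched under assumptions: before tracking a big file on its own, check whether its remote path already falls under a DVC-tracked directory (like `data` in the tree above) and add it there instead. The helper `dvc_parent` and the idea that the list of DVC-tracked directories is known up front are both assumptions for illustration, not existing dagshub or DVC behavior.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def dvc_parent(remote_path: str, dvc_dirs: Iterable[str]) -> Optional[str]:
    """Return the DVC-tracked directory containing remote_path, if any.

    Hypothetical helper: dvc_dirs is assumed to be the repo's list of
    directories already tracked by DVC, given as repo-relative paths.
    """
    target = PurePosixPath(remote_path)
    for d in dvc_dirs:
        # target.parents holds every ancestor directory of the file.
        if PurePosixPath(d) in target.parents:
            return d
    return None


# With the layout above, the single file lands inside the tracked "data" directory:
print(dvc_parent("data/preprocessed/003.png", ["data", "models"]))
```

A file under `src` would get `None` here, which is the case where the tool would still have to decide between Git and a standalone DVC file.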

simonlsk, May 20 '23 15:05