Epic: limit cache validation to absolute necessity
We validate the cache too often. Almost every operation (status/repro/checkout/etc.) does it every time, and even though these operations don't rehash the files, validating the cache still takes a long time for large datasets. This is mostly a leftover from our old hardlink/symlink defaults.
- [ ] introduce `dvc cache/remote check` (or some other name)
- [x] don't validate on diff
- [x] don't validate on checkout (https://github.com/iterative/dvc/pull/9572)
- [ ] don't validate on status (unless explicitly asked?)
- [ ] don't validate on repro
- [ ] ...
@efiop Do we mention that it happens in the docs anywhere? For 3.0, I guess we can mention in the release notes that we will stop guaranteeing this behavior for performance reasons, and then we can keep working on it after release. WDYT?