dvc commit takes too long
Bug Report
commit: takes too long
Description
Running dvc commit takes too long: it can take more than an hour. It still takes a long time even when I run dvc commit immediately after the previous dvc commit has finished, without any change to the tracked files.
Reproduce
- git pull
- dvc pull
- dvc commit
Expected
I expect dvc commit to finish quickly when few or no files have changed.
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.10.2 (pip)
---------------------------------
Platform: Python 3.9.13 on Windows-10-10.0.22000-SP0
Supports:
webhdfs (fsspec = 2022.5.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
s3 (s3fs = 2022.5.0, boto3 = 1.21.21)
Additional Information (if any):
I use Microsoft Windows 11 Home 10.0.22000 Build 22000.
The project folder and the local repository are on a network mount. I recently changed laptops, which changed the drive letter from E to D. Suspecting this was the cause, I tried i) changing the drive letter from D back to E, ii) removing the existing folder containing the local repository and recreating it with a fresh git pull/dvc pull, and iii) setting state.dir and index.dir to locations on the C drive. None of these fixed the problem.
I have the same problem with multiple projects on the same device.
For reference, the issue appears to be related to slow copyfile on checkout: the read takes 89% of the copyfile runtime, while the actual write takes 7%.
(cprof report available in support email chain)
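The read/write split quoted above can be checked independently of DVC. The following is only a minimal sketch, not DVC's own copy routine: timed_copy is a hypothetical helper that copies a file in chunks and accumulates the time spent in read() and write() separately. The demo uses a throwaway temporary file; to test the network mount, one would pass a path on that mount as the source instead. Note that the write time covers buffered writes only and does not include flushing to disk.

```python
import os
import tempfile
import time


def timed_copy(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunks.

    Returns (read_seconds, write_seconds, bytes_copied).
    Hypothetical helper for illustration; not part of DVC.
    """
    read_s = 0.0
    write_s = 0.0
    total = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            t0 = time.perf_counter()
            chunk = fin.read(chunk_size)
            t1 = time.perf_counter()
            read_s += t1 - t0
            if not chunk:  # end of file reached
                break
            fout.write(chunk)
            write_s += time.perf_counter() - t1
            total += len(chunk)
    return read_s, write_s, total


if __name__ == "__main__":
    # Demo on a local temporary file (assumption: replace src with a
    # file on the network mount to measure the real case).
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(4 * 1024 * 1024))
        read_s, write_s, total = timed_copy(src, dst)
        print(f"copied {total} bytes: read {read_s:.4f}s, write {write_s:.4f}s")
```

If the read time dominates for a source on the network mount, that would be consistent with the profile described above, where the mount rather than the local write is the bottleneck.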
Is the cache type copy? If it is, I'd say the issue is with unnecessary relinking.
The cache type priority is the default, reflink,copy. In practice copy is the one that takes effect, because I use Windows.
How can I stop the unnecessary relinking?