dvc commit takes too long
Bug Report
commit: takes too long
Description
Running dvc commit takes too long. It can take more than an hour. It still takes a long time even if I run dvc commit immediately after the previous dvc commit is finished, without any change in tracked files.
Reproduce
- git pull
- dvc pull
- dvc commit
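To put a number on the slowdown, the last step can be wrapped in a small timer. This is only a sketch: `time_command` is a hypothetical helper, not part of DVC, and it assumes `dvc` is on the PATH and that the script is run from the project folder.

```python
import subprocess
import time


def time_command(args):
    """Run a command and return (elapsed_seconds, return_code).

    `args` is a list such as ["dvc", "commit"]. The command's own
    output is passed through to the console unchanged.
    """
    start = time.perf_counter()
    completed = subprocess.run(args, check=False)
    elapsed = time.perf_counter() - start
    return elapsed, completed.returncode
```

Calling `time_command(["dvc", "commit"])` twice in a row, with no changes to tracked files in between, would show whether the second run is as slow as the first.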
Expected
I expect that dvc commit does not take long when there are few file changes or no change.
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.10.2 (pip)
---------------------------------
Platform: Python 3.9.13 on Windows-10-10.0.22000-SP0
Supports:
webhdfs (fsspec = 2022.5.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
s3 (s3fs = 2022.5.0, boto3 = 1.21.21)
Additional Information (if any):
I use Microsoft Windows 11 Home 10.0.22000 Build 22000.
The project folder and the local repository are on a network mount. I recently changed laptops, which changed the drive letter from E to D. Because this could be the cause, I tried i) changing the drive letter from D back to E, ii) removing the existing folder with the local repository and recreating it with a fresh git pull/dvc pull, and iii) setting state.dir and index.dir to locations on the C drive. Nevertheless, I encounter the same problem.
I have the same problem with multiple projects on the same device.
For reference, the issue appears to be related to slow copyfile on checkout: the read takes 89% of the copyfile runtime, while the actual write takes 7%.

(cprof report available in support email chain)
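A read/write split like the one above can be checked independently of DVC with a chunked copy that times each side separately. This is a minimal sketch, not DVC's copyfile implementation: `timed_copy` and `chunk_size` are hypothetical names, and the write time only covers handing data to the OS, so buffered writes may look faster than they really are.

```python
import time


def timed_copy(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunks; return (read_seconds, write_seconds).

    Time spent in read() and write() is accumulated separately so the
    two can be compared, e.g. for a source file on a network mount.
    """
    read_s = 0.0
    write_s = 0.0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            t0 = time.perf_counter()
            chunk = fin.read(chunk_size)
            t1 = time.perf_counter()
            read_s += t1 - t0
            if not chunk:
                # End of file: the final empty read is still counted.
                break
            fout.write(chunk)
            write_s += time.perf_counter() - t1
    return read_s, write_s
```

Running it on one large tracked file, with the source on the network mount and the destination on a local drive, should show whether reading from the mount dominates in the same way as in the profile.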
Is the cache type copy? If it is, I'd say the issue is with unnecessary relinking.
The cache type is left at the default priority, reflink,copy. Since I use Windows, copy is the one that takes effect.
How can I stop the unnecessary relinking?