`checking available disk space` is the slowest operation
- on doing `pacman -Syu`, the `checking available disk space` is almost always the slowest step - and i always wonder why? like what makes it take sooo much time? and if it can be improved
- it's performing nothing like downloading, extracting, copying, searching etc...
:: Starting full system upgrade...
resolving dependencies...
looking for conflicting packages...
Packages (63) ...
Total Download Size: 138.77 MiB
Total Installed Size: 886.21 MiB
Net Upgrade Size: -1.71 MiB
:: Proceed with installation? [Y/n]
:: Retrieving packages...
...
Total (63/63) ...
(63/63) checking keys in keyring ...
(63/63) checking package integrity ...
(63/63) loading package files ...
(63/63) checking for file conflicts ...
(63/63) checking available disk space ...
:: Processing package changes...
We have had these reports before, but the problem is that it is only slow for some users. Here it is instant for example.
We need to find a way to reproduce.
> We need to find a way to reproduce.
how can i help?
oh, the msys on my system is in a custom location (/d/msys64), on a drive partitioned at 449 GB - 342 free on an HDD.
if i recall correctly (might be wrong too), it was fast when it was in the default location (/c/msys64) - a disk of capacity 118 GB on SSD.
Note that the checking can be disabled in the pacman config. That's just a workaround of course.
Years ago when I had checked it on very old HDD it was caused by Microsoft Defender. Even though I had added MSYS2 directory to the exceptions. Temporarily disabling the Defender made it much faster. I think Pacman might be doing something that is normal on Unix but slow when emulated by Cygwin on Windows/NTFS and also causes AV software to do a lot of unnecessary work.
> I think Pacman might be doing something that is normal on Unix but slow when emulated by Cygwin on Windows/NTFS
yeah...
- the `du` for calculating disk usage is also suppppperrrr slow: `du -hd1 -t500MiB "$LOCALAPPDATA"` took 20 minutes - whereas doublecmd does the equivalent in less than a minute (via "Show Occupied Space" `S-M-Ret` < Commands < Menu bar). see attached screenshot.
$ # diskusage -hd1 -t500MiB
$ du --human-readable --max-depth=1 --threshold=500MiB "$LOCALAPPDATA"
du: cannot read directory './ElevatedDiagnostics': Permission denied
3.1G ./FXHOME
756M ./GitHubDesktop
763M ./JetBrains
du: cannot read directory './Microsoft/Windows/INetCache/Low/Content.IE5': Permission denied
2.1G ./Microsoft
du: cannot read directory './Packages/B9ECED6F.ASUSBatteryHealthCharging_qmba6cd70vzyy/SystemAppData/Helium/Cache': Permission denied
du: cannot read directory './Packages/B9ECED6F.ASUSKeyboardHotkeys_qmba6cd70vzyy/SystemAppData/Helium/Cache': Permission denied
1.1G ./Packages
740M ./pip
637M ./Programs
1.3G ./Vivaldi
14G .
Screenshot from doublecmd :
self hiding as off topic: as this ain't about slowness of calculating disk size feel free to open this in a dedicated discussion if you are feeling generous & want to explain this 😋
whereas the doublecmd does this in less than a minute:
P.S. also note the differences in the output sizes,
$ du --help | grep -Eni "apparent|default"
8: --apparent-size print apparent sizes, rather than disk usage; although
9: the apparent size is usually smaller, it may be
15: -b, --bytes equivalent to '--apparent-size --block-size=1'
33: -P, --no-dereference don't follow any symbolic links (this is the default)
54:Otherwise, units default to 1024 bytes (or 512 if POSIXLY_CORRECT is set).
$ echo $DU_BLOCK_SIZE, $BLOCK_SIZE and $BLOCKSIZE, or $POSIXLY_CORRECT
, and , or
$ # -d0 eq. --max-depth=0 eq. --summary eq. -s
$ # -B1024 eq --block-size=1024 eq neutral/identity/default option
$ du "$LOCALAPPDATA/FXHOME" -d0 -B1024
3148496 FXHOME
$ du "$LOCALAPPDATA/FXHOME" -d0 --apparent -B1024
3148084 FXHOME
$ # Only this one matches with "Size" field in windows Properties dialog
$ du "$LOCALAPPDATA/FXHOME" -d0 -b
3223637515 FXHOME
$ qalc 3148496*1024, 3148084*1024
[3148496 * 1024, 3148084 * 1024] = [3224059904, 3223638016]
how come `-b` i.e. `--apparent --block-size=1` (= 3223637515) is not equal to `--apparent --block-size=1024` * 1024 (= 3148084 * 1024 = 3223638016) ??
Compare this with the values shown in windows Properties dialog for the directory:
Size: 3.00 GB (3,223,637,515 bytes)
Size on disk: 3.00 GB (3,223,990,272 bytes)
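A likely explanation for the small gap, as far as I know: GNU `du` rounds each reported total up to a whole number of blocks, so the `-B1024` figure cannot be multiplied back into the exact byte count. The sketch below only redoes the arithmetic with the numbers from the outputs above; `ceil_div` is a helper name made up for this example.

```shell
# Numbers copied from the du outputs above.
apparent_bytes=3223637515   # du -d0 -b
apparent_kib=3148084        # du -d0 --apparent -B1024
disk_kib=3148496            # du -d0 -B1024 (disk usage, not apparent size)

# Hypothetical helper: divide $1 by $2 and round up to a whole block.
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Apparent size in 1024-byte blocks, rounded up; compare with apparent_kib.
ceil_div "$apparent_bytes" 1024

# Bytes added by rounding up to a whole 1024-byte block.
echo $(( apparent_kib * 1024 - apparent_bytes ))

# Disk usage in bytes; this is the figure to compare with "Size on disk".
echo $(( disk_kib * 1024 ))
```

So the `--apparent -B1024` value agrees with `-b` once rounding is taken into account, while the plain `-B1024` value measures allocated disk space and is a different quantity.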
$ # Only this one matches with "Size" field in windows Properties dialog
$ du "$LOCALAPPDATA/FXHOME" -d0 -b
i thought maybe this is fetching from windows, so, could it be fast? but no, i retried after closing all the shells - and it is still taking time.
takeaways:
- closing the shell seems to clear the cache - so, it seems that original behaviour of the command is shown again.
important as executing `du` on [edit: some list of subdirs of] same argument again in same session yields result instantly - the idea of "maybe it's fast" is wrong
i also took this opportunity to run time on it, i have omitted the output of du here, as it's exactly same as above with values in different units. and excuse [edit: and avoid] the conflicting options used together 😅 --human-readable -b
[edit: have replaced -b with --apparent-size]
$ time du --human-readable --max-depth=1 --threshold=500MiB "$LOCALAPPDATA" --apparent-size
du: cannot read directory './ElevatedDiagnostics': Permission denied
3223637515 ...
...
real 19m12.818s
user 0m6.703s
sys 1m7.219s
Some ideas:
- use Strace or Procmon to see what / how many IO operations are happening
- Du is not doing the same thing as Pacman, but there might be similarities
- Pacman's algorithm doesn't seem to be doing anything egregious other than stat-ing every file that's going to be removed or replaced; if that's the issue, I'd assume installing new packages is going to be very fast, but upgrading and removing, especially with many files, can be very slow
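One rough way to test the stat-ing theory without running a real upgrade is to stat every file that an installed package owns and time that alone. This is only a sketch and does not reproduce pacman's actual disk space check; `stat_listed_files` is a made-up helper name, and the package in the usage line is just an example.

```shell
# Hypothetical helper: read one path per line on stdin, check each path
# with the shell's built-in test (which stats the path without starting
# a new process), and print how many of them exist.
stat_listed_files() {
    count=0
    while IFS= read -r path; do
        if [ -e "$path" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Possible usage, where `pacman -Qlq` lists the paths owned by a package: `time pacman -Qlq bash | stat_listed_files`. The built-in test is used instead of the external `stat` program so that process creation, which is itself slow under Cygwin/MSYS2, does not dominate the timing.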
- is there some way to force pacman to perform only this action of "checking available disk space" ?? as i think that coupling it to upgrading will hugely hinder me in investigating this issue in context of pacman.
- as for `du` - yeah sure, i will try to learn the aforementioned things (strace/procmon) and use them over du.
- also, i noticed that `\time` outputs in this format (executed on a dummy command):
0.03user 0.14system 0:00.17elapsed 94%CPU (0avgtext+0avgdata 9920maxresident)k
0inputs+0outputs (2633major+0minor)pagefaults 0swaps
is this what you are pointing towards? would this be enough?
@goyalyashpal, here are my results, about a minute and a half:
$ time du --human-readable --max-depth=1 --threshold=500MiB "$LOCALAPPDATA" -b
1263105728 C:\Users\saukrs\AppData\Local/0install.net
3674538443 C:\Users\saukrs\AppData\Local/Autodesk
du: cannot read directory 'C:\Users\saukrs\AppData\Local/ElevatedDiagnostics': Permission denied
du: cannot read directory 'C:\Users\saukrs\AppData\Local/Google/Chrome/User Data/CertificateRevocation/8365': Permission denied
du: cannot read directory 'C:\Users\saukrs\AppData\Local/Google/Chrome/User Data/CertificateRevocation/8367': Permission denied
du: cannot read directory 'C:\Users\saukrs\AppData\Local/Google/Chrome/User Data/OptimizationHints/421': Permission denied
6783482254 C:\Users\saukrs\AppData\Local/Google
du: cannot read directory 'C:\Users\saukrs\AppData\Local/Microsoft/Windows/INetCache/Low/Content.IE5': Permission denied
2406862526 C:\Users\saukrs\AppData\Local/Microsoft
du: cannot read directory 'C:\Users\saukrs\AppData\Local/Programs/Opera/.opera/1DCCC0B63140': Permission denied
du: cannot read directory 'C:\Users\saukrs\AppData\Local/Temp/WinSAT': Permission denied
15325201831 C:\Users\saukrs\AppData\Local
real 1m34.078s
user 0m3.968s
sys 0m31.968s
Note that immediately rerunning the command didn't change the timing:
$ time du --human-readable --max-depth=1 --threshold=500MiB "$LOCALAPPDATA" -b
...
du: cannot read directory 'C:\Users\saukrs\AppData\Local/Programs/Opera/.opera/1DCCC0B63140': Permission denied
du: cannot read directory 'C:\Users\saukrs\AppData\Local/Temp/WinSAT': Permission denied
15327079606 C:\Users\saukrs\AppData\Local
real 1m32.405s
user 0m4.640s
sys 0m37.343s
Maybe that's due to MP Realtime Protection being off here at the moment:
$ powershell '$prefs = Get-MpPreference; $prefs.DisableRealtimeMonitoring'
True
Maybe you should check that and disable it if it's enabled:
$ sudo powershell 'Set-MpPreference -DisableRealtimeMonitoring $true'
Note also that a run of an elevated du is a bit shorter:
$ time sudo du --human-readable --max-depth=1 --threshold=500MiB "$LOCALAPPDATA" -b
1263105728 C:\Users\saukrs\AppData\Local/0install.net
3674538443 C:\Users\saukrs\AppData\Local/Autodesk
6785324008 C:\Users\saukrs\AppData\Local/Google
2406862776 C:\Users\saukrs\AppData\Local/Microsoft
15326500357 C:\Users\saukrs\AppData\Local
real 1m12.072s
user 0m0.015s
sys 0m0.015s
I used gsudo for that.
@elieux commented 1 hour ago:
> - Du is not doing the same thing as Pacman, but there might be similarities
> - Pacman's algorithm doesn't seem to be doing anything egregious other than stat-ing every file that's going to be removed or replaced;
My impression of MSYS2 is that stat-ing is quite slow. Maybe not as slow as @goyalyashpal is experiencing on his Windows box, but still annoying.
BTW, Midipix is like 4x faster:
midipix@DESKTOP-O7JE7JE ~
$ time du --human-readable --max-depth=1 --threshold=500MiB 'C:\Users\saukrs\AppData\Local' -b
1269876416 C:\Users\saukrs\AppData\Local/0install.net
3695960523 C:\Users\saukrs\AppData\Local/Autodesk
du: cannot access 'C:\Users\saukrs\AppData\Local/Google/Chrome/User Data/CrashpadMetrics.pma': No such file or directory
6803580813 C:\Users\saukrs\AppData\Local/Google
2426903812 C:\Users\saukrs\AppData\Local/Microsoft
15399724254 C:\Users\saukrs\AppData\Local
real 0m18.005s
user 0m0.000s
sys 0m0.000s
PS. Pressing Alt-Enter for this directory in the Explorer exhibits calculation times around 26 seconds:
This timing is more similar to Midipix (which uses NTAPI to get the metadata) than to MSYS2 (or Cygwin for that matter).
> My impression of MSYS2 is that `stat`-ing is quite slow. Maybe not as slow as @goyalyashpal is experiencing on his Windows box, but still annoying.
My recollection of why stat sucks is that for any stat of foo it will also try to stat foo.exe and at least see if foo.lnk exists (for the symlinks-emulated-via-lnk-file feature). Perhaps whatever caches exist don't cache an error (ENOENT) result; that would make it even worse...
I've posted very short test results at https://github.com/msys2/msys2-pacman/issues/32#issuecomment-1973629270. TL;DR: I recommend installing MSYS2 inside Dev Drives.
Getting the stat of files in Windows is very slow when AV/malware detectors are running. The solution I use when checking the stat of many files is to do a few of them in parallel. This becomes a lot faster and does not seem to have a negative impact on system performance. I am guessing that Windows is just waiting for something inside each stat request. This is not a problem that is unique to MSYS. The same happens when opening files for reading and/or writing: there is a significant delay when opening the file, which I guess is related to the stat problem.
As my code for this is written in C# we need a C implementation that can be used in msys. A small library for bulk stating and bulk opening of files that use async io or threads for this.
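Before writing any C, the parallel idea can be tried from the shell. The following is a rough sketch, assuming GNU `find`, `xargs` and `stat` are available; `parallel_stat` and the default of 8 workers are illustrative choices and are not taken from the C# code mentioned above.

```shell
# Hypothetical helper: stat every regular file under a directory using
# several stat processes at once, and print the number of files stat-ed.
# Usage: parallel_stat DIR [WORKERS]
parallel_stat() {
    dir=$1
    workers=${2:-8}
    # -print0 / -0 keep file names with spaces or newlines intact;
    # -P runs up to $workers stat processes at the same time;
    # -n 64 hands each stat process a batch of at most 64 paths;
    # -r skips running stat at all when no files are found.
    find "$dir" -type f -print0 |
        xargs -0 -r -P "$workers" -n 64 stat --format=%s -- |
        wc -l
}
```

Comparing `time parallel_stat "$LOCALAPPDATA" 1` with `time parallel_stat "$LOCALAPPDATA" 8` would give a first hint whether parallelism helps on a given machine. Note that `find` itself also has to examine each file, so this is only an approximation.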
I also found the following issue, which seems to indicate that stat-ing a file actually opens it to get a file handle before getting the information, but that there is a new API in Windows 11 that doesn't open the file and is therefore much faster (https://github.com/libuv/libuv/pull/4327).
Perhaps using libuv is the solution if there is a mingw version where it is compiled to use the Windows API?
This however doesn't solve this issue for older versions of Windows but that may be something that we can live with. It also doesn't solve that opening a file is slow.
Ideally, this would be integrated into Cygwin. Using Win32 API for that is really far from ideal because you'd have to do conversions between different path formats yourself.
Also, Cygwin's stat will actually read from files in order to determine the 'execute' bit (is it an MZ or #!) in the default case here of noacl
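To illustrate why that makes stat more expensive than a pure metadata lookup, here is a small shell sketch of the same kind of header check. It is not Cygwin's implementation, only an illustration of the idea that the first bytes of the file have to be read; `has_exec_header` is a made-up name.

```shell
# Hypothetical helper: succeed if FILE starts with "MZ" (Windows PE
# executable) or "#!" (script with an interpreter line).
has_exec_header() {
    # Read only the first two bytes; drop NUL bytes so the shell does
    # not complain about them inside the command substitution.
    magic=$(head -c 2 -- "$1" 2>/dev/null | tr -d '\0')
    case $magic in
        MZ | '#!') return 0 ;;
        *) return 1 ;;
    esac
}
```

Each such check means opening the file and reading from it, which is exactly the kind of access that real-time antivirus scanning tends to hook.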
> We have had these reports before, but the problem is that it is only slow for some users. Here it is instant for example.
> We need to find a way to reproduce.
I'm on w11 pro, with nvme drives only c={120GB, 10GB free}, d={350GB, 66GB free}, msys2 is installed at D:\msys64. This thing takes minutes. Not 1-2, but 5-10 minutes.
> Note that the checking can be disabled in the pacman config. That's just a workaround of course.
Please, @lazka how can I disable it??
done... (it was around 70% when I wrote previous message). So, to complete last 20-30% it took around 8 minutes. It was more than 20mins total for sure. While it was doing the freespace check, only MsMpEng.exe (msft antivirus) was taking CPU (one core only). I did not add msys64 dir to av exclude folders (perhaps adding it, would avoid the issue).
> Please, how can I disable it??
Comment out the `CheckSpace` option in /etc/pacman.conf
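For example, a minimal sketch of that edit with GNU `sed`, assuming the option sits on a line of its own as `CheckSpace`. The helper name `disable_checkspace` is made up here, and the function keeps a `.bak` copy of the file before changing it.

```shell
# Hypothetical helper: comment out the CheckSpace line in a pacman config.
# Usage: disable_checkspace /etc/pacman.conf
disable_checkspace() {
    conf=$1
    # Keep a backup copy next to the original before editing in place.
    cp -- "$conf" "$conf.bak"
    # Prefix an uncommented "CheckSpace" line with '#'.
    sed -i 's/^[[:space:]]*CheckSpace[[:space:]]*$/#CheckSpace/' "$conf"
}
```

Removing the leading `#` again (or restoring the `.bak` copy) re-enables the check. As noted earlier in the thread, this is a workaround and not a fix.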
i am running linux (nixos) with the same NTFS drive, and computing disk space (in normal usage) is equally (figuratively) slow in both the nemo and the doublecmd file explorers.
so i think it is some inherent problem with unixy things when dealing with NTFS.
$ nix run nixpkgs#inxi -- --filter --system --extra # -zSx
System:
Kernel: 6.12.34 arch: x86_64 bits: 64 compiler: gcc v: 14.3.0
Desktop: Cinnamon v: 6.4.7 Distro: NixOS 25.11 (Xantusia)
I'm experiencing this as well but FWIW after waiting ten minutes for this to finish, turning off Windows Defender real-time protection made the space check finish nearly instantly.