
"checking available disk space" is slow

Open lazka opened this issue 2 years ago • 5 comments

See https://github.com/msys2/MSYS2-packages/issues/4176

We should look into why it can be slow.

lazka avatar Nov 16 '23 16:11 lazka

Here is the code: https://github.com/msys2/msys2-pacman/blob/4cfaf53950c1e2bbef7262e2e9b608f4f5a280d5/lib/libalpm/diskspace.c#L421

lazka avatar Nov 19 '23 11:11 lazka

I tested this with MSYS2 installed on an ST1000DX002 drive and reproduced the problem (D:/msys64 was added to the MS Defender exclusion list). "checking available disk space" took a long time, and I could hear the disk head doing a lot of work (100% disk usage in Task Manager). In Performance Monitor, most of the disk usage was attributed to System.

I repeated the experiment with MSYS2 extracted to a Windows 11 Dev Drive created as a VHDX on the same drive. This time "checking available disk space" was so fast that I almost missed the moment it started and finished, even though the MSYS2 directory was not added to the MS Defender exclusion list. So the cause is either the combination of Cygwin and NTFS, or MS Defender.

So I ran more tests, and here are the results:

  • ST1000DX002:
# MSYS2 located on Dev Drive VHDX on NTFS partition on HDD (without MS Defender exclusion)
$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real    0m34.083s
user    0m2.984s
sys     0m12.843s

# MSYS2 located on Dev Drive VHDX on NTFS partition on HDD (with MS Defender exclusion)
$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real    0m32.008s
user    0m3.451s
sys     0m13.723s

# MSYS2 located on NTFS partition on HDD (without MS Defender exclusion)
$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real    1m26.384s
user    0m3.749s
sys     0m43.123s

# MSYS2 located on NTFS partition on HDD (with MS Defender exclusion)
$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real    1m45.756s # not a mistake, ran the test again (see below); `checking available disk space` was slow
user    0m3.920s
sys     0m42.916s

real    1m30.206s # no idea what is going on; `checking available disk space` wasn't slow but not super fast either
user    0m3.968s
sys     0m43.593s
  • 980 PRO 2TB (about 60% filled):
# MSYS2 located on Dev Drive VHDX on NTFS partition on NVMe (without MS Defender exclusion)
$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real    0m27.353s
user    0m3.046s
sys     0m13.015s

# MSYS2 located on Dev Drive VHDX on NTFS partition on NVMe (with MS Defender exclusion)
$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real    0m26.005s
user    0m3.015s
sys     0m13.484s

# MSYS2 located on NTFS partition on NVMe (without MS Defender exclusion)
$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real    0m47,567s
user    0m3,295s
sys     0m22,402s

# MSYS2 located on NTFS partition on NVMe (with MS Defender exclusion)
$ time pacman -S mingw-w64-ucrt-x86_64-toolchain --noconfirm

real    0m43,201s
user    0m3,625s
sys     0m24,155s

Before the install I made sure the archives are cached with pacman -Sw mingw-w64-ucrt-x86_64-toolchain --noconfirm and mingw-w64-ucrt-x86_64-toolchain is not installed in all cases. I think my words "NTFS is the worst main FS used by a modern OS" have been proven (at least in this case). I also find it surprising how little performance was gained on Dev Drive with NVMe compared to HDD; maybe buffering in RAM played a significant role?
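To put the timings above side by side, here is a minimal sketch that converts the "real" values printed by `time` into seconds and computes how many times slower one run was than another. The helper names `parse_time` and `slowdown` are made up for this illustration. The parser accepts both "." and "," as the decimal separator because the NVMe runs on plain NTFS were printed with a comma.

```python
import re


def parse_time(text):
    """Convert bash `time` output such as '1m26.384s' to seconds.

    Both '.' and ',' are accepted as the decimal separator, since the
    separator depends on the locale of the shell that printed the value.
    """
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


def slowdown(slow, fast):
    """Return how many times longer `slow` took than `fast`."""
    return parse_time(slow) / parse_time(fast)


if __name__ == "__main__":
    # "real" times copied from the runs without MS Defender exclusion.
    hdd_dev_drive = "0m34.083s"
    hdd_ntfs = "1m26.384s"
    nvme_dev_drive = "0m27.353s"
    nvme_ntfs = "0m47,567s"

    hdd_ratio = slowdown(hdd_ntfs, hdd_dev_drive)
    nvme_ratio = slowdown(nvme_ntfs, nvme_dev_drive)
    media_ratio = slowdown(hdd_dev_drive, nvme_dev_drive)

    print(f"HDD:  plain NTFS vs Dev Drive: {hdd_ratio:.2f}x")
    print(f"NVMe: plain NTFS vs Dev Drive: {nvme_ratio:.2f}x")
    print(f"Dev Drive: HDD vs NVMe:        {media_ratio:.2f}x")
```

These are ratios of total wall-clock time for the whole install, so they do not isolate the "checking available disk space" step.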

mati865 avatar Mar 01 '24 17:03 mati865

Did we ever attempt to prove why it's so slow? My understanding was that the Cygwin stat call was responsible. Looking at the code, however, the only stat is https://github.com/msys2/msys2-pacman/blob/2eabe53cc265d1a8c86e6621b40ca7250747d7bb/lib/libalpm/diskspace.c#L251, and I think that should only be invoked when a package is already installed. Otherwise, it just adds up the sizes from the packages' file lists (grouped by mount point) and uses statvfs to see if there's enough free space. That should be quick.
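A rough sketch of the cheap path described above, in Python rather than the C of libalpm, and not a port of diskspace.c: sum the blocks needed by each file, grouped by mount point, then compare against the free blocks reported by statvfs. All function names here are hypothetical, and using `f_bavail` as the free-block figure is an assumption of this sketch.

```python
import os
from collections import defaultdict


def find_mount_point(path, mount_points):
    """Return the longest mount point that contains `path`.

    Assumes `mount_points` includes "/" so an absolute path always matches.
    """
    candidates = [
        m for m in mount_points
        if path == m or path.startswith(m.rstrip("/") + "/")
    ]
    return max(candidates, key=len)


def blocks_needed_per_mount(files, mount_points, block_size_of):
    """Sum the blocks needed by (path, size) entries, grouped by mount point."""
    needed = defaultdict(int)
    for path, size in files:
        mount = find_mount_point(path, mount_points)
        block_size = block_size_of(mount)
        needed[mount] += -(-size // block_size)  # ceiling division
    return dict(needed)


def has_enough_space(needed, free_blocks_of):
    """Check that every mount point has at least the blocks it needs."""
    return all(free_blocks_of(m) >= blocks for m, blocks in needed.items())


def statvfs_block_size(mount):
    # One statvfs call per lookup; no per-file stat is involved.
    return os.statvfs(mount).f_bsize


def statvfs_free_blocks(mount):
    return os.statvfs(mount).f_bavail


if __name__ == "__main__":
    # Made-up file list standing in for a package's file list.
    files = [("/usr/bin/example", 1_500_000), ("/usr/lib/libexample.a", 10)]
    needed = blocks_needed_per_mount(files, ["/"], statvfs_block_size)
    print(needed, has_enough_space(needed, statvfs_free_blocks))
```

The point of the sketch is that nothing here touches individual files on disk, so the cost should not depend on how slow per-file stat calls are.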

jeremyd2019 avatar Jun 30 '24 05:06 jeremyd2019

I hacked up the pacman callback that fires when the disk space check starts and ends, so that it prints the time the check took in ms (https://github.com/jeremyd2019/msys2-pacman/commit/b1180c2ef724d757b26ee98062cb3064d23e69d7). I then tried installing mingw-w64-i686-clang in an environment that didn't have any of the i686 toolchain, and the check consistently took 4 ms. After installing, I attempted to reinstall all the same packages. The first time it took 1099 ms, then 274 ms and 271 ms.

Disabling the calculate_removed_size function so that it just returns 0 (https://github.com/jeremyd2019/msys2-pacman/commit/a29ba4d47ade6f1813a5a825946ea1853055ccc6) makes reinstalls consistently take 4 ms again.
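For anyone who wants to repeat this kind of measurement, the idea can be sketched as follows. This is an illustration in Python, not the code from the linked commits; the event names "diskspace_start" and "diskspace_done" and the `DiskspaceTimer` class are hypothetical stand-ins for whatever the real callback receives.

```python
import time


class DiskspaceTimer:
    """Measure the time between a start event and the matching done event."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the timer can be tested
        self._started = None
        self.elapsed_ms = None

    def on_event(self, event):
        if event == "diskspace_start":
            self._started = self._clock()
        elif event == "diskspace_done" and self._started is not None:
            self.elapsed_ms = round((self._clock() - self._started) * 1000)
            self._started = None
            print(f"checking available disk space took {self.elapsed_ms} ms")


if __name__ == "__main__":
    timer = DiskspaceTimer()
    timer.on_event("diskspace_start")
    time.sleep(0.05)  # stands in for the actual disk space check
    timer.on_event("diskspace_done")
```

A monotonic clock is used so that system clock adjustments during the check cannot skew the result.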

Anyway, that was consistent with what I expected, but not with what @mati865 seemed to be saying above (emphasis mine):

Before the install I made sure the archives are cached with pacman -Sw mingw-w64-ucrt-x86_64-toolchain --noconfirm and mingw-w64-ucrt-x86_64-toolchain is not installed in all cases.

jeremyd2019 avatar Jul 02 '24 00:07 jeremyd2019

Interesting, either something has changed in recent updates or there is some branch that wasn't hit in your case.

mati865 avatar Jul 02 '24 06:07 mati865