irssi
very fragmented DCC on NTFS
I have a problem with very, very fragmented files when downloading via DCC. I just don't know how much (if at all) Irssi is responsible for this on my setup. Unfortunately I use Windows and NTFS, and launch irssi via cygwin.
Please, don't judge.
To illustrate how fragmented the files (yellow color) are: https://i.imgur.com/PccE8sw.png
After zooming on middle-ish cluster: https://i.imgur.com/TQ3uNBV.png
All yellow parts are one file, approx. 1.3 GB. Getting it via DCC is very fast, that I can tell; reading from it, that's a nightmare. When DCC'ing multiple files they interweave freely, because why shouldn't they.
I looked through the /set options for dcc but found nothing even remotely helpful. It looks like irssi is trying to write as many small chunks of data in as many small packets as possible, and the filesystem can't help but make the file as fragmented as it can get.
We could preallocate!
Apparently there's fallocate and posix_fallocate - the former is Linux-specific and the latter is standard, but if the latter is used with an unsupported filesystem in glibc, it will do a really shitty/slow emulation of it. So usually fallocate is preferred because it will fail if not supported. And a typical filesystem that doesn't support this is ntfs-3g.
Which sounds like there's no way to fix this, except this isn't linux!
While searching for this I found this rsync patch, which became this commit with no proper attribution, and apparently cygwin specifically is the only platform where posix_fallocate is preferred and handles ntfs just fine.
The man page also documents this nicely:
--preallocate
This tells the receiver to allocate each destination file to its eventual size before writing data to the file. Rsync will only use the real filesystem-level preallocation support provided by Linux’s fallocate(2) system call or Cygwin’s posix_fallocate(3), not the slow glibc implementation that writes a null byte into each block.
Without this option, larger files may not be entirely contiguous on the filesystem, but with this option rsync will probably copy more slowly. If the destination is not an extent-supporting filesystem (such as ext4, xfs, NTFS, etc.), this option may have no positive effect at all.
This seems to be a default-off option for rsync, and since irssi's primary use case isn't file transfer, we could have a simpler implementation, IMO: if a /set is set, posix_fallocate(). If it's slow then the user can just turn it off.
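A rough sketch of what that could look like in C, for illustration only. The helper name dcc_preallocate and the `enabled` flag (standing in for a hypothetical boolean /set option) are assumptions, not existing irssi code; the only real API used is posix_fallocate(), which returns an error number directly instead of setting errno. It assumes the file size announced in the DCC SEND offer is passed in as `size`.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper, not part of irssi: reserve `size` bytes for the
 * file behind `fd` before any DCC data is written to it.
 * `enabled` stands in for a hypothetical boolean /set option.
 * Returns 0 on success or when preallocation is skipped, otherwise the
 * error number reported by posix_fallocate(). */
static int dcc_preallocate(int fd, off_t size, bool enabled)
{
    int err;

    /* Setting is off, or the sender gave no usable size: do nothing. */
    if (!enabled || size <= 0)
        return 0;

    /* posix_fallocate() may be interrupted by a signal; retry then. */
    do {
        err = posix_fallocate(fd, 0, size);
    } while (err == EINTR);

    return err;
}
```

A caller would probably treat a non-zero return as non-fatal and carry on with the transfer without preallocation, which keeps the "just turn it off if it's slow" behaviour simple.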
Preallocating space surely does sound like the perfect solution for my peculiar case. Where do I sign?
I assume you call/start cygwin with --preallocate
As a workaround you can always copy the file to a new copy after the transfer is finished.
@vague666 no such flag in cygwin, it's for rsync only.
@ailin-nemui that's what I've been doing, along with downloading stuff not on Windows and NTFS.
I've made a patch for an IRC library that allocates space for incoming DCC files, to check if the idea holds: https://i.imgur.com/z28hQhx.png and it's perfect. For the time being I won't be using irssi but this library.