very fragmented DCC on NTFS

Open CatPlanet opened this issue 5 years ago • 5 comments

I have a problem with very, very fragmented files when downloading via DCC. I just don't know how much (if at all) Irssi is responsible for it on my setup. Unfortunately, I use Windows with NTFS and launch irssi via Cygwin.

Please, don't judge.

To illustrate how fragmented the files (yellow) are: https://i.imgur.com/PccE8sw.png

After zooming on middle-ish cluster: https://i.imgur.com/TQ3uNBV.png

All the yellow parts are one file, approx. 1.3 GB. Downloading it via DCC is very fast, that much I can tell; reading from it is a nightmare. When DCC'ing multiple files, they interleave freely, because why shouldn't they.

I looked through the /set options for DCC, but found nothing even remotely helpful. It looks like irssi writes the data in as many small chunks as possible, and the filesystem can't help but scatter them as sparsely as it can.

CatPlanet, Jan 15 '20 18:01

We could preallocate!

Apparently there's fallocate and posix_fallocate: the former is Linux-specific and the latter is standard, but if the latter is used on an unsupported filesystem, glibc will do a really shitty/slow emulation of it. So fallocate is usually preferred, because it fails if not supported. And a typical filesystem that doesn't support this is ntfs-3g.

Which sounds like there's no way to fix this, except this isn't linux!

While searching for this I found this rsync patch, which became this commit with no proper attribution, and apparently Cygwin specifically is the only platform where posix_fallocate is preferred and handles NTFS just fine.

The man page also documents this nicely:

--preallocate
       This tells the receiver to allocate each destination file to its
       eventual size before writing data to the file. Rsync will only use
       the real filesystem-level preallocation support provided by Linux’s
       fallocate(2) system call or Cygwin’s posix_fallocate(3), not the slow
       glibc implementation that writes a null byte into each block.

       Without this option, larger files may not be entirely contiguous on
       the filesystem, but with this option rsync will probably copy more
       slowly. If the destination is not an extent-supporting filesystem
       (such as ext4, xfs, NTFS, etc.), this option may have no positive
       effect at all.

This seems to be a default-off option for rsync, and since irssi's primary use case isn't file transfer, we could have a simpler implementation, IMO: if a /set is set, posix_fallocate(). If it's slow then the user can just turn it off.
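A minimal sketch of that idea, assuming a boolean setting that gates a single posix_fallocate() call when the incoming file is opened. The names `dcc_preallocate`, `dcc_reserve_space` and `dcc_open_for_receive` are hypothetical and are not existing irssi settings or functions; the plain global stands in for whatever the real /set lookup would be.

```c
#define _POSIX_C_SOURCE 200809L  /* declares posix_fallocate() */
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a boolean /set; off by default, like
 * rsync's --preallocate. Not an existing irssi setting. */
static bool dcc_preallocate = false;

/* Reserve `size` bytes for the open file `fd`.
 * Returns 0 on success or when preallocation is switched off,
 * otherwise the error number reported by posix_fallocate(). */
static int dcc_reserve_space(int fd, off_t size)
{
	assert(fd >= 0);

	if (!dcc_preallocate || size <= 0)
		return 0;

	/* posix_fallocate() returns the error directly, it does not set errno */
	return posix_fallocate(fd, 0, size);
}

/* Open the destination file and try to reserve the announced size.
 * Returns the file descriptor, or -1 if the file cannot be opened. */
static int dcc_open_for_receive(const char *path, off_t size)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;

	/* A failed preallocation is not fatal: the transfer still works,
	 * the file may just end up fragmented as before. */
	(void) dcc_reserve_space(fd, size);
	return fd;
}
```

The return value of the reservation is deliberately ignored when opening the file, so an unsupported filesystem only loses the benefit. On glibc the call may fall back to the slow emulation mentioned above, which is the case where the user would turn the setting off.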

dequis, Jan 15 '20 22:01

Preallocating space surely does sound like the perfect solution for my peculiar case. Where do I sign?

CatPlanet, Jan 17 '20 14:01

I assume you call/start cygwin with --preallocate

vague666, Jan 20 '20 09:01

as a workaround you can always copy the file to a new copy after the transfer is finished

ailin-nemui, Jan 20 '20 09:01

@vague666 no such flag in Cygwin, it's for rsync only.

@ailin-nemui that's what I've been doing, along with downloading stuff somewhere other than Windows and NTFS.

I've made a patch for an IRC library that preallocates space for incoming DCC files, to check whether the idea holds: https://i.imgur.com/z28hQhx.png and it's perfect. For the time being I'll be using this library instead of irssi.

CatPlanet, Jan 20 '20 12:01