
Copying files is slow and slows to a crawl over time for large numbers of files

Open elementaryBot opened this issue 7 years ago • 24 comments

Copying a lot of files via Pantheon Files becomes slower and slower over time.

I've created 250,000 100-byte files on tmpfs for testing, and kicked off copying to another tmpfs. It started off at speeds over 100Kb/s but halfway through it's just 4Kb/s (!) and dropping.

Profiling with sysprof shows that all this time is spent in g_list_last(), which probably means that we're abusing a linked list somewhere and that it has to walk the entire list of already copied files, one by one, for each next file copied.
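
The suspected pattern can be sketched in a few lines. This is a hypothetical Python illustration, not the actual Files code (which uses GLib's GList from C/Vala); `Node`, `append_walking` and `append_with_tail` are made-up names. It counts how many existing nodes are visited while appending n items, first by walking to the end of the list on every append (which is what a g_list_last()-style lookup has to do), then by remembering the tail.

```python
class Node:
    """One cell of a minimal singly linked list (illustrative only)."""

    def __init__(self, value):
        self.value = value
        self.next = None


def append_walking(n):
    """Append n items, locating the tail by walking the list each time.

    Returns the number of existing nodes visited; this grows quadratically.
    """
    head = None
    visits = 0
    for i in range(n):
        node = Node(i)
        if head is None:
            head = node
            continue
        cur = head
        visits += 1  # the head itself is visited
        while cur.next is not None:
            cur = cur.next
            visits += 1
        cur.next = node
    return visits


def append_with_tail(n):
    """Append n items while keeping a reference to the tail.

    Each append touches exactly one existing node, so visits grow linearly.
    """
    tail = None
    visits = 0
    for i in range(n):
        node = Node(i)
        if tail is not None:
            visits += 1  # only the current tail is touched
            tail.next = node
        tail = node
    return visits
```

Appending n items by walking costs n(n-1)/2 node visits, so 250,000 files would mean roughly 31 billion visits, against about 250,000 with a tail reference. That would be consistent with throughput dropping as the copy progresses, assuming the list really is appended to once per file.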

Testcase:

    mkdir ~/created-files ~/copy-here
    sudo mount -t tmpfs -o size=1G,mode=0777 tmpfs ~/created-files
    sudo mount -t tmpfs -o size=1G,mode=0777 tmpfs ~/copy-here
    cd ~/created-files
    split -b 100 SOME-BIG-FILE

Then open Pantheon Files and copy the "created-files" folder into "copy-here".
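
If no suitably large file is at hand for `split`, the same kind of test data could be generated directly. The following is a minimal sketch, not part of the original report; `create_small_files` and the `file-NNNNNN` naming are assumptions made for illustration.

```python
import os


def create_small_files(directory, count, size=100):
    """Create `count` files of `size` bytes each inside `directory`.

    Returns the number of files written.
    """
    os.makedirs(directory, exist_ok=True)
    payload = b"x" * size  # content is irrelevant for the copy test
    for i in range(count):
        path = os.path.join(directory, f"file-{i:06d}")
        with open(path, "wb") as f:
            f.write(payload)
    return count
```

Calling `create_small_files(os.path.expanduser("~/created-files"), 250_000)` after mounting the tmpfs should produce a data set comparable to the one described above.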

This is a synthetic test case, but I had over 250,000 files during my last backup for OS reinstallation, so this is a real-life scenario.

ProblemType: Bug
DistroRelease: elementary OS 0.3
Package: pantheon-files 0.1.5.1+r1680+pkg35~ubuntu0.3.1 [origin: LP-PPA-elementary-os-daily]
ProcVersionSignature: Ubuntu 3.13.0-43.72-generic 3.13.11.11
Uname: Linux 3.13.0-43-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.6
Architecture: amd64
CrashDB: pantheon_files
CurrentDesktop: Pantheon
Date: Sun Dec 21 04:42:10 2014
ExecutablePath: /usr/bin/pantheon-files
GsettingsChanges:
InstallationDate: Installed on 2014-12-10 (10 days ago)
InstallationMedia: elementary OS 0.3 "Freya" - Daily amd64 (20141209)
SourcePackage: pantheon-files
UpgradeStatus: No upgrade log present (probably fresh install)

Launchpad Details: #LP1404588 Sergey "Shnatsel" Davidoff - 2014-12-21 01:53:22 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

pantheon-files-daemon also uses a lot of memory (250Mb).

Launchpad Details: #LPC Sergey "Shnatsel" Davidoff - 2014-12-21 04:41:05 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

In addition, this state makes every operation in Files very slow. Even starting pantheon-files is very slow while pantheon-files-daemon is in this state.

Launchpad Details: #LPC Sergey "Shnatsel" Davidoff - 2014-12-21 04:42:04 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

Replacing GList with GSequence data structure might be a way to hotfix this without changing huge amounts of code.

Launchpad Details: #LPC Sergey "Shnatsel" Davidoff - 2014-12-24 21:55:20 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

A bounty of $100 has been placed on this bug.

Launchpad Details: #LPC Jeremy Wootten - 2015-02-18 14:55:15 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

Might this be related to a similar issue concerning very slow file transfers to a USB stick?

Launchpad Details: #LPC Giulio Sant - 2015-02-23 16:46:20 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

I have changed the bug description to clarify that the bounty relates to obtaining significant improvement in file copying performance in general, not just for file numbers of the order of 100,000. Even with comparatively small numbers of files (100 - 1000) Files is very much slower than other well known file managers. I have increased the bounty to reflect the widened scope.

Launchpad Details: #LPC Jeremy Wootten - 2015-03-01 08:04:25 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

Not solved, but it was improved somewhat, so I'm bumping it from the milestone.

Launchpad Details: #LPC Cody Garver - 2015-03-25 11:33:10 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

I assume both source and destination were open in a Files view during the copy? Different tabs or different windows? Icon View or other?

Launchpad Details: #LPC Jeremy Wootten - 2016-04-09 10:22:46 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

So I ran a couple benchmarks to see if I could figure out what the problems might be here. Wrote a simple program, basically a "cp" clone using g_file_copy to benchmark copy speeds against "cp" itself. What I found is that g_file_copy has very similar performance to "cp" (copied 10,000 small files at about 160kB/sec), so no problems there. Seems more like this has to do with all the queuing and locking going on in the file manager. Been swapping out various data structures and benchmarking and seeing some small performance increases. Removing some locking from the deep counter and switching out the marlin file queue for a thread-safe GAsyncQueue improved things a bit. I've been getting between 40kB/sec to 60kB/sec with those changes. It might also be worth swapping out the GIOScheduler stuff since that is deprecated. Not sure if that will bring any speed increase with it.
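
For anyone wanting to repeat this kind of measurement, a rough stand-in for such a benchmark might look like the sketch below. It is not the program described above: it uses Python's `shutil.copyfile` rather than g_file_copy, and `benchmark_copy` is a hypothetical name, so absolute numbers will not be directly comparable.

```python
import os
import shutil
import time


def benchmark_copy(src_dir, dst_dir):
    """Copy every regular file in src_dir to dst_dir, one at a time.

    Returns (files_copied, bytes_copied, seconds_elapsed).
    """
    os.makedirs(dst_dir, exist_ok=True)
    files = 0
    total = 0
    start = time.perf_counter()
    with os.scandir(src_dir) as entries:
        for entry in entries:
            # Skip subdirectories and symlinks; only plain files are timed.
            if entry.is_file(follow_symlinks=False):
                shutil.copyfile(entry.path, os.path.join(dst_dir, entry.name))
                total += entry.stat(follow_symlinks=False).st_size
                files += 1
    elapsed = time.perf_counter() - start
    return files, total, elapsed
```

Dividing the returned byte count by the elapsed seconds gives a baseline throughput for plain sequential copying, which can then be compared with what the file manager achieves on the same directories.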

Launchpad Details: #LPC Matt Spaulding - 2016-10-12 20:53:15 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

Matt: Thanks for having a go at this. Just for clarity, the target is to get Files to be at least comparable to other popular file managers in this aspect of performance, say within 75%? This assumes that other features like "undo" that might affect speed are also comparable.

Launchpad Details: #LPC Jeremy Wootten - 2016-10-13 10:46:49 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

Okay, thank you for the clarification. Which file managers should I run comparisons against? At least in my tests with Nautilus, its copy speed with large numbers of files is very poor, comparable to what we're seeing with Files.

Launchpad Details: #LPC Matt Spaulding - 2016-10-13 15:09:16 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

Matt: I was thinking of Thunar and PCManFM primarily, although I admit I have not done a comparison recently. I assumed Nautilus was superior at the time the bug was filed, but perhaps things have changed. It is a fairly old bug now.

If Files is (now) already comparable to the best file managers under the conditions quoted in the bug then I would be willing to change the target to a more modest improvement and/or fixing of associated memory leakages.

Launchpad Details: #LPC Jeremy Wootten - 2016-10-13 17:34:41 +0000

elementaryBot avatar Jun 18 '17 15:06 elementaryBot

Not sure if Launchpad comments get forwarded here, but I've got a patch if anyone would like to try it out and comment. See https://github.com/vjr/files/commit/c972549cf42ac68cacc24b1bed1d080f5542d2c1

vjr avatar Jun 20 '17 05:06 vjr

Does this still need to be resolved?

ghost avatar Sep 06 '18 07:09 ghost

Yes, I can still reproduce this behavior sometimes with Juno.

elegaanz avatar Sep 06 '18 07:09 elegaanz

Need to ensure that files are not being swapped out as tmpfs fills up.

jeremypw avatar Oct 25 '18 09:10 jeremypw

Does this only happen for copying or also for moving or linking?

jeremypw avatar Oct 25 '18 09:10 jeremypw

Is the destination folder being displayed? If so, some of the delay may be caused by processing FileMonitor signals and updating the associated async directory object and the display widget, especially for GtkIconView, which gets very sluggish with a large number of file items.

jeremypw avatar Oct 25 '18 09:10 jeremypw

I found recently (version 4.1.9 on Juno) that repeated use of <Ctrl>A, <Ctrl>C and <Ctrl>V to create large numbers of (empty) files (doubling the number each cycle) becomes very slow after a few cycles (about 10).

jeremypw avatar Jul 30 '19 09:07 jeremypw

I am not a developer; however, has using the tar command ever been considered? I have used it to efficiently copy large numbers of files. The following articles elaborate on its utility:

Note that the last article demonstrates that enabling noatime in particular for the file system sped up the process, which may be a worthy consideration.

ghost avatar Jan 31 '20 19:01 ghost

I agree that for very heavy file manipulation tasks special tools are better than Files which at the moment is more suitable for general file browsing and operations on small numbers of files. Things like the color-tag plugin and the undo manager contribute an increasingly large overhead as the number of files increases.

jeremypw avatar Jan 31 '20 21:01 jeremypw

@jeremypw Why stop at “general file browsing,” though? Would it not be wonderful if Files were better equipped for universal applications rather than being adequate for “small” file operations? To be able to use the first party applications even under heavy loads would lessen one's dependence on the Terminal or third parties; for example, using Files to perform large copy operations, using Photos to edit very large photographs, or using Music to manage a library of several thousands of songs.

ghost avatar Feb 02 '20 10:02 ghost

Absolutely, but everything is limited by the developer time and abilities available. All developers are free to contribute new abilities to Files and improve the existing ones, but none are paid to do so. Also, there is sometimes a trade-off between ease of use and efficiency. For example, it is useful to have unlimited undo and color tagging, but these slow down file operations. An app that focuses purely on moving files may not implement them.

jeremypw avatar Feb 02 '20 10:02 jeremypw

I understand. I did not mean to presume that the elementary OS team is swimming in cash, employees, and/or volunteers. I do hope that a solution will be found for this issue.

ghost avatar Feb 02 '20 11:02 ghost