T 4.0.2 & 4.0.3 macOS "Unable to save resume file / Can't Get File too many open files (24)"

Open chaseholden opened this issue 2 years ago • 36 comments

What is the issue?

The T 4.0.2 Mac client shows the error "Unable to save resume file: too many open files" (see issue #3028).

  • macOS Ventura 13.3 and 13.4 on M1 Ultra ARM64
  • over 3K open active magnets - 1TB free space & 64GB RAM
  • Attempting to reduce Global max connections down to even 40 does not help.
  • Workaround: pause and resume each magnet or use Transmission 3.0
  • Note FWIW: all files originated from Magnet links
  • See or reopen Issue #3028
T402-Bug-TooManyOpenFiles

Which application of Transmission?

macOS app

Which version of Transmission?

4.0.2 Public Release build 2a57b17031

chaseholden avatar Apr 13 '23 04:04 chaseholden

In T 4.0.3 (public release, build 6b0e49bbb2), I am now getting what seems to be a related error: "Error: Can't Get File … Too many open files (24)". Let me know what I can do to help troubleshoot. I'm not a dev but I have installed Xcode.

Screenshot 2023-04-16 at 5 36 19 PM

chaseholden avatar Apr 16 '23 23:04 chaseholden

I also see this issue on macOS with Transmission 4.0.3 and Ventura 13.3.1 on arm64.

michaeltin avatar May 02 '23 00:05 michaeltin

caleb@caleb-server:~/scripts$ transmission-daemon --version
transmission-daemon 4.0.3 (6b0e49bbb2)
caleb@caleb-server:~/scripts$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.3 LTS
Release:	22.04
Codename:	jammy
caleb@caleb-server:~/scripts$ lsof -np 60239 | wc -l
872
caleb@caleb-server:~/scripts$ cat /proc/sys/fs/file-max
9223372036854775807

Transmission web client shows 2,000 transfers under "Show: All." Active shows 97.

Many transfers, regardless of state, show one of two errors:

Error: Unable to save resume file: Too many open files
Error: Couldn't get '....': Too many open files (24)

komali2 avatar Aug 28 '23 05:08 komali2

@komali2 The message "Error: Couldn't get '....': Too many open files (24)" stood out to me: '....' should contain the file path of one of the files in the torrent. This implies that Transmission might be parsing some .torrent files incorrectly.

If the torrent is in the public domain, could you upload it here?

tearfur avatar Sep 15 '23 08:09 tearfur

@komali2 The message "Error: Couldn't get '....': Too many open files (24)" stood out to me: '....' should contain the file path of one of the files in the torrent. This implies that Transmission might be parsing some .torrent files incorrectly.

If the torrent is in the public domain, could you upload it here?

Oh, sorry, that was me manually changing the message. I get that message thousands of times, one for each of my torrent files. The torrents are private to my organization, so unfortunately I can't share them. The real message does indeed carry the path of the files in the torrent, though.

What I noticed though is I haven't gotten the error since. It may have been that some other program was also accessing thousands (millions?) of files simultaneously at some point for some massive move operation or something? There's a lot going on on that server, honestly probably too much lol.

komali2 avatar Sep 15 '23 10:09 komali2

Same problem in version 4.0.4 for me. Same message on thousands of files. I saved the torrent and removed it for thousands of complete active seeds, about 4 terabytes of files' worth, a few hundred at a time, and it's still a problem. I'm at a loss.

Is there a way to downgrade to the old version?

Mac Mini M2 Pro, 16 GB RAM, 512 GB SSD, Ventura 13.5.2. External 16 TB HDD connected via USB.

Screenshot 2023-09-15 at 10 44 06 AM

gradu8ed avatar Sep 18 '23 16:09 gradu8ed

I had a look at the code from libtransmission. AFAICT, Transmission is supposed to have at most 32 file descriptors open for torrent files at any time, with the following exceptions:

  1. File descriptors used for copying torrent files (e.g. when moving a torrent to a new location) do not count.
  2. File descriptors used for verifying torrent files do not count.

32 open files shouldn't even come close to triggering that error.

I don't see anything wrong with the code so far. If you'd like to help look for the cause, you can start from cache.h, the functions listed here are the main entry points to file IO code:

https://github.com/transmission/transmission/blob/6ead147620680259248bdebb764dae19f0cb8933/libtransmission/cache.h#L36-L42

It's also possible that some of you simply have a very small per-process open file limit. On Linux and macOS, you can check your open file limit with the command ulimit -a, for example:

$ ulimit -a
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 127102
max locked memory           (kbytes, -l) 8192
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 127102
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
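
The same per-process limit can also be read from code. Here is a minimal sketch in Python, assuming a Unix-like system where the standard resource module is available. The helper names open_file_limit and describe are made up for illustration and are not part of Transmission.

```python
import resource


def open_file_limit():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


def describe(limit):
    """Render a limit the way ulimit does ("unlimited" for RLIM_INFINITY)."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = open_file_limit()
    # Exceeding the soft limit is what produces "Too many open files" (errno 24).
    print(f"open files: soft={describe(soft)} hard={describe(hard)}")
```

Note that this reports the limit of the process that runs it. A daemon started by a service manager may well have a different limit from your interactive shell, so treat the number as a hint rather than proof.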

tearfur avatar Sep 21 '23 08:09 tearfur

You’re over my head; I am but a simple user with no programming experience. I’ve been using Transmission for more than a decade and this was never an issue in version 3. I used all of the same settings from v3 in v4. I can’t put a finger on exactly when this started being a problem for me. I was seeding about 2,500 files and downloading another 1,500+. For a while I figured that there was some kind of limit on the total number of files that Transmission would allow. So I’d save the torrent for 400-500 transfers and then remove them, and it would be OK for a week or two, and then I’d do it again, and again. I’m not sure what to do at this point. I’m uneasy about jumping into Terminal commands, though.

Is there a way to downgrade to version 3?

Matt

gradu8ed avatar Sep 24 '23 01:09 gradu8ed

Any solution to this? I have 63 transfers (though some of them do have many files) and am getting this error. Using Transmission 4.0.4 on a Mac Mini M1 running Ventura 13.5.2.

Transmission is set to 200 global maximum connections and 60 Maximum connections for new transfers.

$ lsof -n | grep -i transmiss | wc -l
     150
$ sysctl -a | grep maxfile
kern.maxfiles: 122880
kern.maxfilesperproc: 61440
$ ulimit -a
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8176
-c: core file size (blocks)         0
-v: address space (kbytes)          unlimited
-l: locked-in-memory size (kbytes)  unlimited
-u: processes                       2666
-n: file descriptors                4092

rgaufman avatar Sep 26 '23 08:09 rgaufman

Same issue on Transmission 4.0.4 and Synology DSM 7.1.1.

I found that Transmission has /dev/urandom open a lot of times:

$ ls -l  /proc/[pid of transmission]/fd
total 0
lrwx------  1 sc-transmission transmission 64 Oct  2 22:39 0 -> /dev/null
lrwx------  1 sc-transmission transmission 64 Oct  2 22:39 1 -> /dev/null
l-wx------  1 sc-transmission transmission 64 Oct  2 22:39 10 -> 'pipe:[61446]'
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 100 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1000 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1001 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1002 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1003 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1004 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1005 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1006 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1007 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1008 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1009 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 101 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1010 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1011 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1012 -> /dev/urandom
lr-x------  1 sc-transmission transmission 64 Oct  2 22:39 1013 -> /dev/urandom
...

$ ls -l  /proc/[pid of transmission]/fd|grep /dev/urandom|wc -l
973

Maybe some method opens /dev/urandom to generate random bytes and forgets to close it?
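
For anyone who wants to repeat the count above without grep, here is a rough sketch in Python that groups a process's descriptors by what they point at. It assumes a Linux-style /proc/<pid>/fd directory of symlinks; the function name count_fd_targets is hypothetical.

```python
import os
from collections import Counter


def count_fd_targets(fd_dir):
    """Count how many entries in fd_dir point at each target.

    fd_dir is a directory of symlinks, such as /proc/<pid>/fd on Linux.
    """
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # Entry vanished (descriptor closed meanwhile) or is not a symlink.
            continue
        counts[target] += 1
    return counts
```

For example, count_fd_targets("/proc/1234/fd").most_common(5) would list the five most frequent targets for a hypothetical PID 1234. A single target such as /dev/urandom dominating that list is the pattern shown in the listing above.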

mihirukiss avatar Oct 02 '23 15:10 mihirukiss

@mihirukiss Hmm, nice find!

This is a big hint. Does your Synology NAS by chance use OpenSSL 1.x? If it does, then your case looks a lot like https://github.com/haproxy/haproxy/commit/56996dabe67b484b7c0e90192539c57e60483751.

tearfur avatar Oct 03 '23 01:10 tearfur

@tearfur The OpenSSL version on my NAS is 1.1.1t.

$ openssl version
OpenSSL 1.1.1t  7 Feb 2023

I haven't updated DSM since July. Transmission worked fine on my NAS before I updated it last month. (Sorry, I can't remember which version I used before the update.)

mihirukiss avatar Oct 03 '23 04:10 mihirukiss

Hmmm, looks like the commit I linked isn't related here: According to the commit message, the bug it fixed is triggered by calling execvp(), but we don't do anything like that in our code.

Knowing that a lot of those open files are /dev/urandom is still very useful though. I haven't got an idea where the bug could be yet, but it should help narrow the search by a lot.

tearfur avatar Oct 03 '23 07:10 tearfur

You’re over my head; I am but a simple user with no programming experience. I’ve been using Transmission for more than a decade and this was never an issue in version 3. I used all of the same settings from v3 in v4. I can’t put a finger on exactly when this started being a problem for me. I was seeding about 2,500 files and downloading another 1,500+. For a while I figured that there was some kind of limit on the total number of files that Transmission would allow. So I’d save the torrent for 400-500 transfers and then remove them, and it would be OK for a week or two, and then I’d do it again, and again. I’m not sure what to do at this point. I’m uneasy about jumping into Terminal commands, though. Is there a way to downgrade to version 3? Matt

@gradu8ed

Try what was recommended: check your system's open file limit with ulimit -a. It's possible that's where your issue is. Then you can try raising that limit.

Transmission is FOSS. With great freedom comes great responsibility. You alone are responsible for managing the software on your machine and your machine's settings. I'm sure you're more capable of figuring it out than you give yourself credit for. That's part of the point of FOSS: to empower ALL users, not just programmers.

Don't be uneasy about jumping into terminal commands. Keep your important stuff backed up and you'll be just fine. Working in the terminal can eventually be fun (I do it every day, basically all day) and bonus you look like a badass hacker for doing something as simple as moving a file lol.

komali2 avatar Nov 22 '23 03:11 komali2

The problem went away for me after upgrading to macOS Sonoma 14.1.1. It seems Apple fixed it? Is anyone still experiencing this?

rgaufman avatar Nov 22 '23 12:11 rgaufman

It still happens on my Synology, so it's probably not a macOS bug.

I think it happens when I add new torrents. I installed a fresh Transmission 4.0.2 in Docker, and so far I haven't had that problem there. From what I remember, I didn't have this problem with 4.0.2 in the main install either, so it was probably introduced in 4.0.3 or 4.0.4.

Metal-Snake avatar Nov 22 '23 18:11 Metal-Snake

Same here, I've had the problem for months (I couldn't date the first occurrence exactly). 3000+ torrents loaded on a Transmission 4.0.5 instance hosted on a Synology DS418 with ARM architecture (ARM Cortex-A53 / ARMv8-A). The common point with the ticket initiator seems to be ARM architecture. I noticed that Transmission showed the error 2 times out of 3 when starting, and when this happened, it limited other applications' writes to the disk (I didn't do any tests, I just noticed this behaviour). It might help if other users could confirm their CPU architecture.

And, same as @mihirukiss, a lot of /dev/urandom descriptors are open:

$ ls -l  /proc/11394/fd|grep /dev/urandom|wc -l
940
$ cat /volume1/@appdata/transmission/transmission.log
[2024-01-03 00:22:55.408] inf session.cc:646 Transmission version 4.0.5 (a6fe2a64aa) starting (session.cc:646)
[2024-01-03 00:23:53.166] WRN net.cc:153 Couldn't create socket: Too many open files (24) (net.cc:153)
[2024-01-03 02:46:55.343] ERR variant.cc:1102 Couldn't save '/volume1/@appdata/transmission/resume/0810b448729ca44e8326e1e88d0d2599931966f7.resume': Too many open files (24) (variant.cc:1102)
[2024-01-03 03:16:54.741] ERR variant.cc:1102 Couldn't save '/volume1/@appdata/transmission/stats.json': Too many open files (24) (variant.cc:1102)
[2024-01-03 03:38:34.854] ERR file.ext Couldn't get '/volume1/path/file.ext': Too many open files (24) (inout.cc:156)
$ ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 6432
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 6432
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
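
To see which parts of Transmission report the error most often, the log excerpt above can be tallied by the "(file:line)" marker at the end of each line. This is only a sketch in Python, assuming every line ends with that marker as in the excerpt; the function name tally_emfile is hypothetical.

```python
import re
from collections import Counter

# Trailing "(source.cc:123)" marker that ends each log line in the excerpt.
SOURCE_AT_END = re.compile(r"\(([\w.]+:\d+)\)\s*$")


def tally_emfile(lines):
    """Count "Too many open files (24)" log lines per reporting source location."""
    counts = Counter()
    for line in lines:
        if "Too many open files (24)" not in line:
            continue
        match = SOURCE_AT_END.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

Passing it the lines of transmission.log gives a per-location count. In the excerpt, socket creation, resume/stats saving and file reads all fail the same way, which fits the process as a whole running out of descriptors rather than one code path misbehaving.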

peacheriii avatar Jan 03 '24 04:01 peacheriii

Same here, I've had the problem for months (I couldn't date the first occurrence exactly). 3000+ torrents loaded on a Transmission 4.0.5 instance hosted on a Synology DS418 with ARM architecture (ARM Cortex-A53 / ARMv8-A). The common point with the ticket initiator seems to be ARM architecture.

My Synology is x86 and has this problem. I switched from the native Synology package (version 4.0.4) to the Docker package and am using version 4.0.2 now, and the problem is gone.

Metal-Snake avatar Jan 03 '24 09:01 Metal-Snake

Same here. I'm running 4.0.5-26 on a Synology DS920+ DSM 7.2.1-69057 Update 3, 20GB mem

Patrick010 avatar Jan 16 '24 21:01 Patrick010

@tearfur, we've noted users on Synology experiencing this issue since v4.0.2 in some cases. I am working with a user to help diagnose the issue (https://github.com/SynoCommunity/spksrc/issues/5999) and wanted to know if there was any specific information beyond what I have requested that would be useful to help resolve this issue.

mreid-tt avatar Feb 02 '24 01:02 mreid-tt

wanted to know if there was any specific information beyond what I have requested that would be useful to help resolve this issue.

@mreid-tt I'm also poking around in the dark here, I still don't understand this problem very well. I do have a hunch however.

For everyone that's experiencing this problem, did you configure any torrent scripts? You are using a script if you have set any of the script-torrent-*-enabled settings to true.

Running those scripts does use the execvp() system call, so this might be a problem similar to the one that https://github.com/haproxy/haproxy/commit/56996dabe67b484b7c0e90192539c57e60483751 fixes.
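
If you are not sure whether any script is enabled, a quick check of the settings file can tell you. The following is a minimal sketch in Python, assuming the settings are stored as a flat JSON object (as in Transmission's settings.json); the function names are hypothetical, and the path to the file depends on your installation.

```python
import json
import re

# Matches keys such as "script-torrent-done-enabled".
SCRIPT_KEY = re.compile(r"^script-torrent-.+-enabled$")


def enabled_scripts(settings):
    """Return the sorted script-torrent-*-enabled keys whose value is true."""
    return sorted(k for k, v in settings.items() if SCRIPT_KEY.match(k) and v is True)


def enabled_scripts_in_file(path):
    """Load a JSON settings file and report which torrent scripts are enabled."""
    with open(path, encoding="utf-8") as f:
        return enabled_scripts(json.load(f))
```

An empty list means no torrent script is enabled, so the execvp() theory would not apply to that setup.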

@mikedld Do you have any ideas? This seems to be related to the crypto libraries.

tearfur avatar Feb 02 '24 02:02 tearfur

@tearfur I don't use scripts here:

    "script-torrent-added-enabled": false,
    "script-torrent-added-filename": "",
    "script-torrent-done-enabled": false,
    "script-torrent-done-filename": "",
    "script-torrent-done-seeding-enabled": false,
    "script-torrent-done-seeding-filename": "",

peacheriii avatar Feb 02 '24 02:02 peacheriii

Bummer, so that's a miss. 😊

tearfur avatar Feb 02 '24 03:02 tearfur

Hey @mikedld, when you have a moment, I'd value your insights on this matter. It's been impacting certain Synology users since v4.0.2 in specific instances. If there's any specific diagnostic information I could assist in gathering, please let me know. Your input would be greatly appreciated.

mreid-tt avatar Feb 04 '24 12:02 mreid-tt

I downgraded to 4.0.2 on my Synology system and the too many open files error went away.

flerbert12983 avatar Feb 10 '24 23:02 flerbert12983

Hey @mikedld, I'm circling back on this matter. As mentioned earlier, a user demonstrated that this issue wasn't present in v4.0.2, which you assisted me in bringing to the Synology community. Do you have any insights into what might have altered in the newer versions?

mreid-tt avatar Feb 20 '24 12:02 mreid-tt

Since it was demonstrated to not only affect macOS, could it help to remove the "os:mac" label to signify this bug's wider reach and – perhaps – higher urgency?

fabrykowski avatar Mar 24 '24 17:03 fabrykowski

Can confirm that downgrading from 4.0.3 to 4.0.2 fixed this issue. This has been reported multiple times, but the title of this issue lists 4.0.2 as affected.

dhqcz avatar May 12 '24 04:05 dhqcz

This issue went away for me after turning off remote access via the web interface. I suspect either keeping all torrents remotely accessible was putting the app under undue stress, or hackers were spamming the open port attempting to gain access.

derrend avatar May 22 '24 09:05 derrend

~Can also confirm that downgrading to 4.0.2 fixes the issue; thanks for that everyone.

FWIW, I do also have the web interface available, however I do not have it exposed to the wide web.~

Elidrake avatar Jun 14 '24 13:06 Elidrake