Fetch error: Too many open files

Open mattmartini opened this issue 1 year ago • 10 comments

When trying to do a bogrep fetch I am getting the error below.

$ bogrep fetch
Error: Can't create file at /Users/USERNAME/Library/Application Support/bogrep/cache/78aa542f-52c1-4b5e-b475-15293854996a.txt: Too many open files (os error 24)
$ (140/8005)

I tried setting "max_concurrent_requests": 50, and still get this issue.

OS: Darwin 23.1.0 - macOS 14.1.1 (Sonoma)
version: bogrep 0.5.0

mattmartini avatar Nov 28 '23 23:11 mattmartini

Thanks. It seems that writes to the file system need to be limited as well (in addition to limiting the number of open network connections).
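For illustration, here is a minimal sketch of what limiting file writes could look like. It is written in Python with only the standard library and is not bogrep's actual (Rust) implementation; the names `write_cache_file`, `write_all`, `demo`, and `max_open_files` are hypothetical.

```python
import asyncio
import tempfile
from pathlib import Path


async def write_cache_file(path: Path, text: str, write_slots: asyncio.Semaphore) -> None:
    # Wait for a free slot so that at most `max_open_files` files are open at once.
    async with write_slots:
        # The blocking write runs in a worker thread; write_text closes the
        # file before the slot is released.
        await asyncio.to_thread(path.write_text, text)


async def write_all(cache_dir: Path, pages: dict[str, str], max_open_files: int) -> int:
    # Hypothetical limit, analogous to capping concurrent network requests.
    write_slots = asyncio.Semaphore(max_open_files)
    tasks = [
        write_cache_file(cache_dir / f"{name}.txt", text, write_slots)
        for name, text in pages.items()
    ]
    await asyncio.gather(*tasks)
    return len(tasks)


def demo(n_pages: int = 50, max_open_files: int = 8) -> int:
    # Write n_pages small files into a temporary directory and count them.
    pages = {f"page-{i}": f"content {i}" for i in range(n_pages)}
    with tempfile.TemporaryDirectory() as tmp:
        asyncio.run(write_all(Path(tmp), pages, max_open_files))
        return len(list(Path(tmp).glob("*.txt")))
```

All files still get written; the semaphore only bounds how many are open at the same moment.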

quambene avatar Nov 29 '23 02:11 quambene

On Linux systems, ulimit -n shows the number of open files allowed (see https://ss64.com/bash/ulimit.html). Often, the default is 1024. Could you check what the value is on your OS?
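The same value can also be read programmatically. A small sketch, assuming Python's standard `resource` module on Linux or macOS (the helper name `open_file_limits` is made up for this example):

```python
import resource


def open_file_limits() -> tuple[int, int]:
    # RLIMIT_NOFILE is the per-process limit on open file descriptors.
    # The soft limit is the one `ulimit -n` reports; the hard limit is the
    # ceiling up to which an unprivileged process may raise the soft limit.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


soft, hard = open_file_limits()
print(f"soft limit: {soft}, hard limit: {hard}")
```

A value equal to `resource.RLIM_INFINITY` means "unlimited".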

quambene avatar Nov 29 '23 04:11 quambene

$ ulimit -n
256

mattmartini avatar Nov 29 '23 05:11 mattmartini

Above fix did not solve the problem.

$ bogrep init
Imported 8007 bookmarks from 2 sources: /Users/USERNAME/Library/Application Support/Firefox/Profiles/8cycwilz.Everyday_Usage/bookmarkbackups/bookmarks-2023-11-29_9410_DCFJZCTgp91yEKP+oZsoDA==.jsonlz4, /Users/USERNAME/Library/Application Support/Google/Chrome/Default/Bookmarks
Error: Can't create file at /Users/USERNAME/Library/Application Support/bogrep/cache/97d40634-be8c-4332-987b-6bb9f576d2e4.txt: Too many open files (os error 24)
$ (197/8007)

mattmartini avatar Nov 29 '23 16:11 mattmartini

Sorry, it's not fixed yet. I will let you know when it's ready.

quambene avatar Nov 29 '23 22:11 quambene

Fixed on main branch. Will prepare release 0.6.0.

I set the default for max_idle_connections_per_host in settings.json from 100 to the more sensible 10. Idle connections were stuck in the connection pool.

Idle connections will be removed after 5 seconds (see idle_connections_timeout in settings.json).

See the documentation: https://docs.rs/bogrep/latest/bogrep/struct.Settings.html#fields

Please remove your bogrep config folder with the old settings.json before running bogrep.

If you still get a "Too many open files" error, try to decrease max_idle_connections_per_host to 5, and I will update the default accordingly.

quambene avatar Nov 30 '23 00:11 quambene

"max_concurrent_requests": 10,
"max_idle_connections_per_host": 5,

Seems to work (slowly). However, I got this error (I believe this is a Chrome bookmark):

Error: Can't get host for url: javascript:void(location.href='http://tinyurl.com/create.php?url='+encodeURIComponent(location.href))

mattmartini avatar Nov 30 '23 17:11 mattmartini

Pushed an improvement to main branch: https://github.com/quambene/bogrep/pull/63

Fetching will not be aborted any more if an expected error occurs, so you should be able to finish processing (but with a few warnings instead).
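As a rough illustration of "warn and continue" for bookmarks such as `javascript:` URLs, here is a sketch using Python's standard library. It is not the code from the pull request; `host_of` and `collect_hosts` are hypothetical helpers.

```python
import logging
from urllib.parse import urlsplit

logger = logging.getLogger("fetch_sketch")


def host_of(url: str) -> str:
    # javascript: bookmarklets parse with a scheme but without a host.
    host = urlsplit(url).hostname
    if not host:
        raise ValueError(f"Can't get host for url: {url}")
    return host


def collect_hosts(urls: list[str]) -> tuple[list[str], list[str]]:
    # Log a warning for URLs without a host and keep going, instead of
    # aborting the whole run on the first bad bookmark.
    hosts, skipped = [], []
    for url in urls:
        try:
            hosts.append(host_of(url))
        except ValueError as err:
            logger.warning("%s", err)
            skipped.append(url)
    return hosts, skipped
```

With the bookmarklet from this thread in the input, the run finishes and the bookmarklet ends up in `skipped`.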

You can try to set max_concurrent_requests to 100 again.

There is still something odd with macOS which I have to investigate. On Ubuntu, I'm able to fetch 500 concurrent requests, and it's finished quickly.

quambene avatar Nov 30 '23 20:11 quambene

Bumped max_concurrent_requests back up to 100. Still getting many "Too many open files" errors, some when creating the cache file and some when fetching a website.

Only 2783 cache files were created out of 8007 bookmarks. This seems very low. Sure, there are probably many dead links in the bookmarks, but not 71%.

[2023-11-30T23:37:34Z WARN bogrep::cmd::fetch] Can't create file at /Users/USERNAME/Library/Application Support/bogrep/cache/720450d1-9108-4e76-a2ff-1b1431cca0c5.txt: Too many open files (os error 24)

[2023-11-30T23:37:34Z WARN bogrep::cmd::fetch] Can't fetch website: error sending request for url (http://www.j.nurick.dial.pipex.com/Code/Perl/index.htm): error trying to connect: dns error: proto error: io error: Too many open files (os error 24)

[2023-11-30T23:36:44Z WARN bogrep::cmd::fetch] Can't fetch website: error sending request for url (http://support.apple.com/kb/HT1159#mac_pro): error trying to connect: dns error: proto error: io error: Too many open files (os error 24)

Dropped max_concurrent_requests to 20. Much slower (15:58 min to try to fetch 8007 URLs), but I only got warnings for javascript: bookmarks.

[2023-11-30T23:50:11Z WARN bogrep::cmd::fetch] Can't get host for url: javascript:void(location.href='http://tinyurl.com/create.php?url='+encodeURIComponent(location.href))

Retrieved 5447 out of 8007 bookmarks.

On another note, you should do point releases like v0.6.1 ;-)

Replaced package `bogrep v0.6.0 (/Users/USERNAME/Projects/BookMarks/bogrep)` with `bogrep v0.6.0 (/Users/USERNAME/Projects/BookMarks/bogrep)` (executable `bogrep`)

mattmartini avatar Dec 01 '23 00:12 mattmartini

Thanks for checking!

A 71% error rate is indeed too high. It is explained by the "Too many open files" errors, which prevent fetching and caching.

I would have expected that 100 concurrent requests would work without issues though.

For example, on Ubuntu I was fetching 500 concurrent requests successfully which is explained by:

500 open files + 500 open connections = 1000 linux sockets < 1024

where 1024 is the limit for open files on Ubuntu.

The same calculation doesn't seem to work for macOS, where we have:

100 open files + 100 open connections = 200 sockets < 256

I will dig a bit more into why the expected 100 for max_concurrent_requests is not working on macOS.
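The arithmetic above can be written out as a small Python sketch. It assumes, as the calculation does, one cache file plus one connection per in-flight request; the function names are made up for this example.

```python
def required_descriptors(max_concurrent_requests: int) -> int:
    # Assumption from the calculation above: each in-flight request holds
    # one open cache file and one open network connection.
    return 2 * max_concurrent_requests


def headroom(max_concurrent_requests: int, open_file_limit: int) -> int:
    # Descriptors left over under the `ulimit -n` value; negative means the
    # budget is exceeded.
    return open_file_limit - required_descriptors(max_concurrent_requests)
```

`headroom(500, 1024)` gives 24 for the Ubuntu case and `headroom(100, 256)` gives 56 for the macOS case, so by this model both should fit. The model counts only cache files and connections. The warnings earlier in the thread show DNS lookups failing with os error 24 as well, so other descriptors may also count against the limit; that is an assumption here, not a confirmed cause.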

Unfortunately, most releases include breaking changes, which is why I'm increasing the minor version. The next release includes a bugfix without breaking changes though, and will be v0.6.1 :)

quambene avatar Dec 01 '23 01:12 quambene

@mattmartini should be fixed in v0.10.0: https://github.com/quambene/bogrep?tab=readme-ov-file#usage

The default for open files and network sockets is now 500 each. The required file descriptor limit (i.e. 1000) will be set automatically (on Linux and macOS).
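For readers curious what "set automatically" can look like, here is a sketch of raising the soft limit from within a process, using Python's standard `resource` module. It is not bogrep's implementation, and `ensure_open_file_limit` is a hypothetical name.

```python
import resource


def ensure_open_file_limit(required: int) -> int:
    # Return the soft RLIMIT_NOFILE in effect after trying to raise it.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= required:
        return soft  # already high enough
    # An unprivileged process may raise its soft limit only up to the hard limit.
    target = required if hard == resource.RLIM_INFINITY else min(required, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # The OS refused the new value; keep the old limit.
        return soft
    return target


new_soft = ensure_open_file_limit(1000)  # 500 files + 500 sockets
print(f"soft limit for open files: {new_soft}")
```

The change applies only to the current process and its children, so the shell's own `ulimit -n` value stays as it was.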

quambene avatar Nov 30 '24 00:11 quambene