
Loading profiles breaks after some time

tux93 opened this issue 3 years ago • 19 comments

After some runtime nitter becomes unresponsive when trying to load profiles, and nginx only shows 504 Gateway Timeouts; direct links to single tweets continue to work.

After seeing discussion of similar symptoms on Matrix last night I updated Redis, but the issue persists.
I have now recompiled nitter without -d:release and enabled debug logging; the resulting log is attached.

The pattern I noticed is that this seems to be triggered by a batch of RSS feeds being refreshed in a short time. Once this state is entered, all further requests to profile or RSS endpoints time out, and the only way to recover is restarting nitter.

nitter.log

tux93 · Jan 18 '22

Sanity check:

  • Delete ~/.nimble to refresh dependencies, recompile
  • Run flushdb in redis-cli to clear Redis, restart nitter
  • Does it happen if you set redisConnections and redisMaxConnections to 0 in nitter.conf? (disables the connection pool)
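
For reference, the suggestion above maps onto the [Cache] section of nitter.conf roughly like this (a sketch modeled on nitter.conf.example; key names and defaults may differ in your version):

[Cache]
redisHost = "localhost"
redisPort = 6379
redisConnections = 0      # setting both of these to 0 disables the connection pool,
redisMaxConnections = 0   # as described in the suggestion above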

Useful info:

  • Nim version
  • Redis version
  • OS and hardware info (Linux? ARMv7?, etc)
  • Environment (VPS, local, AWS, etc)
  • Reproduction steps

zedeus · Jan 18 '22

Sanity check:

  • Delete ~/.nimble to refresh dependencies, recompile

Done

  • Run flushdb in redis-cli to clear Redis, restart nitter

Done

  • Does it happen if you set redisConnections and redisMaxConnections to 0 in nitter.conf? (disables the connection pool)

It has not happened again yet but I'd like to wait a little more to let some more feed refreshes pass

Useful info:

  • Nim version

# nim --version
Nim Compiler Version 1.6.2 [Linux: amd64]

  • Redis version

# redis-server --version
Redis server v=6.2.6 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=f969e617abf7f198
  • OS and hardware info (Linux? ARMv7?, etc)

openSUSE 15.3 x86_64

  • Environment (VPS, local, AWS, etc)

VPS with 8 Threads and 4GB RAM

  • Reproduction steps
  1. Start Nitter
  2. Add a number of feeds to an RSS reader (in my case Nextcloud News)
  3. Refresh the feeds periodically
  4. After some time feeds fail to refresh and requests time out; at that point profile pages also become inaccessible
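
For anyone without a Nextcloud News setup, a rough stand-in for steps 2–3 is to poll a handful of RSS endpoints in a loop and log which requests time out. A minimal sketch in Python; the instance URL and feed handles are placeholders, not the ones from this report:

import time
import urllib.error
import urllib.request

INSTANCE = "https://nitter.example.com"   # placeholder instance
HANDLES = ["user1", "user2", "user3"]     # placeholder feed handles

def poll_once() -> None:
    for handle in HANDLES:
        url = f"{INSTANCE}/{handle}/rss"
        start = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=60) as resp:
                resp.read()
            print(f"OK    {url} ({time.monotonic() - start:.1f}s)")
        except (urllib.error.URLError, TimeoutError) as exc:
            print(f"FAIL  {url} ({time.monotonic() - start:.1f}s): {exc}")

if __name__ == "__main__":
    while True:
        poll_once()           # refresh all feeds in a short burst
        time.sleep(15 * 60)   # wait for the next refresh cycle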

tux93 · Jan 18 '22

  • Does it happen if you set redisConnections and redisMaxConnections to 0 in nitter.conf? (disables the connection pool)

It has not happened again yet but I'd like to wait a little more to let some more feed refreshes pass

With those settings set to 0 it was stable for multiple refresh cycles; after setting them back to the default values, it happened again after only two feed refreshes.

I'll set them back to 0 and leave it overnight to make sure the issue doesn't simply take longer to appear.

tux93 · Jan 19 '22

With redisConnections and redisMaxConnections set to 0 it no longer seems to happen; it has been stable for a day now

tux93 · Jan 19 '22

For testing purposes I set redisConnections = 4 and redisMaxConnections = 8, which also led to the error and this time produced some async errors in the debug log

nitter_debug.log

tux93 · Jan 19 '22

That's interesting, thank you. Is your Redis config default? Do you use a password?

zedeus · Jan 19 '22

@tux93 I've pushed a commit that could potentially solve the issue, but I have no way to reproduce it so I can't test.

zedeus · Jan 20 '22

Updated and set the redis*Connections values back to default; let's see what happens

That's interesting, thank you. Is your Redis config default? Do you use a password?

No, no password; here's the config of my nitter Redis instance:

# cat /etc/redis/nitter.conf /etc/redis/common.conf
bind 127.0.0.1
port 6387
pidfile /run/redis/nitter.pid
logfile /var/log/redis/nitter.log
dir /var/lib/redis/nitter/

include /etc/redis/common.conf

protected-mode yes
tcp-backlog 511
unixsocketperm 770
timeout 600
tcp-keepalive 300
daemonize no
supervised systemd
loglevel notice
databases 16
always-show-logo no
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
rdb-del-sync-files no
replica-serve-stale-data yes
replica-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-diskless-load disabled
repl-disable-tcp-nodelay no
replica-priority 100
acllog-max-len 128
lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
replica-lazy-flush no
lazyfree-lazy-user-del no
oom-score-adj no
oom-score-adj-values 0 200 800
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
aof-use-rdb-preamble yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
stream-node-max-bytes 4096
stream-node-max-entries 100
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
dynamic-hz yes
aof-rewrite-incremental-fsync yes
rdb-save-incremental-fsync yes
jemalloc-bg-thread yes

tux93 · Jan 20 '22

Using the latest commit and your Redis config, I'm still unable to reproduce the issue. I tried reverting to before the changes I mentioned, but it still works fine despite opening ~100 profiles + RSS feeds and lists.

zedeus · Jan 20 '22

For me the latest commit has improved the situation: profile pages no longer become inaccessible, but some RSS requests still time out, which wasn't the case with the redis*Connections settings set to zero.

Jan 20 10:16:23 tuxVPS nextcloud-news-updater[531101]: 2022-01-20 10:16:23,335 - Nextcloud News Updater - ERROR - Command 'php -f /srv/www/owncloud/occ news:updater:update-feed fl4nn 605' returned 255 with output: 'cURL error 28: Operation timed out after 60001 milliseconds with 0 bytes received (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://nitter.squirrel.rocks/kiaun_ad/media/rss'                                                     
Jan 20 10:16:23 tuxVPS nextcloud-news-updater[531101]: 2022-01-20 10:16:23,335 - Nextcloud News Updater - ERROR - Command 'php -f /srv/www/owncloud/occ news:updater:update-feed fl4nn 604' returned 255 with output: 'cURL error 28: Operation timed out after 60001 milliseconds with 0 bytes received (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://nitter.squirrel.rocks/CasparRoo/media/rss'                                                    
Jan 20 10:16:23 tuxVPS nextcloud-news-updater[531101]: 2022-01-20 10:16:23,390 - Nextcloud News Updater - ERROR - Command 'php -f /srv/www/owncloud/occ news:updater:update-feed fl4nn 603' returned 255 with output: 'cURL error 28: Operation timed out after 60001 milliseconds with 0 bytes received (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://nitter.squirrel.rocks/Winterbalg/media/rss'                                                   
Jan 20 10:16:23 tuxVPS nextcloud-news-updater[531101]: 2022-01-20 10:16:23,461 - Nextcloud News Updater - ERROR - Command 'php -f /srv/www/owncloud/occ news:updater:update-feed fl4nn 599' returned 255 with output: 'cURL error 28: Operation timed out after 60001 milliseconds with 0 bytes received (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://nitter.squirrel.rocks/RoaryAndFriends/media/rss'                                              
Jan 20 10:16:28 tuxVPS nextcloud-news-updater[531101]: 2022-01-20 10:16:28,532 - Nextcloud News Updater - ERROR - Command 'php -f /srv/www/owncloud/occ news:updater:update-feed fl4nn 579' returned 255 with output: 'cURL error 28: Operation timed out after 60000 milliseconds with 0 bytes received (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://nitter.squirrel.rocks/SavajAD/media/rss'                                                      
Jan 20 10:16:30 tuxVPS nextcloud-news-updater[531101]: 2022-01-20 10:16:30,976 - Nextcloud News Updater - ERROR - Command 'php -f /srv/www/owncloud/occ news:updater:update-feed fl4nn 560' returned 255 with output: 'cURL error 28: Operation timed out after 60001 milliseconds with 0 bytes received (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://nitter.squirrel.rocks/SavajBunny/media/rss'                                                   
Jan 20 10:16:34 tuxVPS nextcloud-news-updater[531101]: 2022-01-20 10:16:34,878 - Nextcloud News Updater - ERROR - Command 'php -f /srv/www/owncloud/occ news:updater:update-feed fl4nn 543' returned 255 with output: 'cURL error 28: Operation timed out after 60000 milliseconds with 0 bytes received (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://nitter.squirrel.rocks/KarimVSA/media/rss'                                                     
Jan 20 10:16:35 tuxVPS systemd[1]: nextcloud-news.service: Succeeded.

I wonder if I could work around that by increasing the rssCache setting in nitter.conf and/or the timeout of the RSS reader
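
For reference, the RSS cache duration lives in the [Cache] section of nitter.conf; the key name below is taken from nitter.conf.example and the value is illustrative, so check it against your own config:

[Cache]
rssMinutes = 10    # how long RSS responses are cached, in minutes; raise to serve cached feeds longer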

tux93 · Jan 20 '22

RSS requests should never take more than 1 second at most, so increasing timeouts won't help. Increasing the cache time doesn't seem like it would help either, since the issue appears to occur when Nitter's Redis client pool gets filled up with bad connections, causing errors and timeouts. How that happens I don't know, but the aforementioned change reduces the amount of "weird stuff" I do with the Redis clients.

zedeus · Jan 20 '22

I've pushed an update to my Redis pool that adds error handling, discarding bad connections. It may help, but of course you shouldn't run into these errors to begin with.
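
The general pattern being described (on a command error, drop the pooled connection and replace it rather than returning it to the pool) looks roughly like the sketch below. This is plain Python with stand-in names, not nitter's actual Nim implementation or its Redis API:

import queue

class DiscardingPool:
    """Toy connection pool that discards connections which raise errors."""

    def __init__(self, factory, size):
        self._factory = factory              # callable that opens a new connection
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    def run(self, command):
        conn = self._idle.get()              # borrow a connection from the pool
        try:
            result = command(conn)           # e.g. lambda c: c.get("some-key")
        except ConnectionError:
            conn.close()                     # bad connection: do not reuse it
            self._idle.put(self._factory())  # replace it with a fresh one
            raise
        else:
            self._idle.put(conn)             # healthy connection goes back to the pool
            return result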

zedeus · Jan 20 '22

Sadly no, the timeouts are still there

tux93 · Jan 20 '22

In that case I recommend you just disable the Redis pool; then everything should be fine

zedeus · Jan 20 '22

Hi!

I am using an online RSS reader (Inoreader), and all nitter subscriptions give a timeout error as follows.

cURL error 28: Connection timed out after 30001 milliseconds

gqcao · May 27 '22

What instance?

zedeus · May 27 '22

Hi,

It's nitter.net. I also contacted Inoreader and got the following response.

Yes, we are receiving similar complaints in the past few days. It looks like the connection to this source times out for some reason. It could be a connectivity issue, geo-restriction, rate limit, or similar. We've moved the most popular feeds from that source imported here to proxy polling servers, and they are working again. But still, if you have an account in this service, we suggest you contact them and ask about that.

gqcao · May 27 '22

nitter.net has rate limiting for RSS, but you should only be affected if you have hundreds of feeds and Inoreader tries to fetch them all at once at intervals

zedeus · May 27 '22

Thanks. I moved to another instance and it seems to work so far.

gqcao · May 27 '22

FWIW, I'm seeing similar issues on my own Dockerized instance. After a while loading /<username> gets stuck and nginx barfs on gateway timeout. If I hit that same URL again a few seconds later, it does show up OK. Docker logs are full of API timeouts a-la: error: OSError, msg: Operation timed out or sometimes error: OSError, msg: Host is unreachable. There's considerably more spew than that, but it doesn't offer any more insight into what's failing. Restarting the dockerized instance seems to alleviate this problem for a while, but then it starts happening again.

1e100 · Nov 07 '22

Please reopen with more info or open a new issue if this is still a problem.

zedeus · Nov 27 '22