
[bug] Pleroma instances tend to massively overload my GtS instance for some reason

Open eeeeeta opened this issue 1 year ago • 18 comments

Describe the bug with a clear and concise description of what the bug is.

Due to some sort of incompatibility (perhaps #107), Pleroma instances tend to get stuck in a loop requesting the same things over and over from my instance. When I make a particularly popular post, lots of Pleromas doing this at once cause CPU load to spike to ~200%, the journal to get overwhelmed, and so on -- basically a small denial-of-service attack.

What's your GoToSocial Version?

v0.10.0 git-2470e21 (patched)

GoToSocial Arch

amd64 binary

What happened?

  • Make a post that gets boosted by a bunch of people
  • Notice GtS getting slower to respond
  • Have my Prometheus alerts tell me that the server is overheating and has its resources stretched
  • Check the logs to find lots of Pleroma spam

A random sample of 100 lines of it looks like this.

What you expected to happen?

Although this probably needs to be fixed Pleroma-side, it'd be nice to have some mitigation for this on the GtS end -- I currently just block the Pleroma User-Agent entirely, but some rate-limiting to detect this case might be useful.

How to reproduce it?

No response

Anything else we need to know?

No response

Jul 30 '23 16:07 eeeeeta

Huh, that's odd! Thanks, will investigate soon (tm).

Jul 31 '23 09:07 tsmethurst

@tsmethurst I just had a thought: what if we are also doing the same thing we found some Mastodon instances doing, and serving self-referential status collection pages?

this would be very ironic :')

Jul 31 '23 10:07 NyaaaWhatsUpDoc

I'm pretty sure this one is actually PEBKAC -- I had hacked up my GtS database to move some of my old posts over from another instance, but this resulted in the pinned toot being broken, and I think that's why the Pleromas carried on crashing and retrying in a big loop. Whoops! Apologies for the false alarm >.<

Aug 21 '23 12:08 eeeeeta

I lied! I pinned a non-cursed toot, left it a while, and came back to another sea of constantly retrying Pleromas in the logs :c

Aug 24 '23 20:08 eeeeeta

Hmmm.... could it be because you've got your domain (eta.st) serving webfinger responses directly rather than forwarding them to your GtS server? It does look like you're serving the correct data, but it might be confusing Pleroma specifically for whatever reason.

Aug 24 '23 22:08 tsmethurst

It seems closely linked to the pinned toot -- unpinning the toot makes the Pleromas stop looping, for example -- so I doubt that's it.

Sep 13 '23 11:09 eeeeeta

Does it persist if you have multiple toots pinned? Is it specifically with one pinned toot that it occurs?

Sep 13 '23 12:09 tsmethurst

Hey @eeeeeta, are you still seeing this with 0.12.1?

Oct 25 '23 16:10 tsmethurst

I just updated my own patchset to HEAD actually, so I'll pin a toot and see if I can get some pleromas stuck in it again...

Oct 27 '23 13:10 eeeeeta

Update: It's 2 days later and I just opened my gts logs to find about 10 Pleroma requests per second, so this is still an issue! Please let me know if there's anything more I can do to debug this :)

Oct 29 '23 22:10 eeeeeta

Alright, thanks for checking it out! I'll see if I can figure out just what it is about GtS pinned posts that pleroma doesn't like...

Oct 30 '23 10:10 tsmethurst

If it also affects Akkoma, I could try following you and take a look at the logs.

Oct 30 '23 10:10 bjo81

It only affects sufficiently old Akkoma; modern Akkoma versions don't seem to have an issue.

Oct 30 '23 10:10 eeeeeta

Ah really? So it's something that Akkoma identified as a bug recently and fixed?

Oct 30 '23 12:10 tsmethurst

Just a thought, btw, are you pinning toots that you 'imported' and then fiddled with in the database?

Nov 13 '23 10:11 tsmethurst

That was originally the case, but the problem exists even with non-'imported' toots.

Nov 13 '23 14:11 eeeeeta

I'm still not sure what to do with this one. Is it still happening or have the 'omas fixed it on their side?

Jan 22 '24 14:01 tsmethurst

Very much still happening -- pinned a post today and came back an hour later to find GtS using all the CPU on the machine :(

Feb 08 '24 16:02 eeeeeta

How about now? Just want to check if any of the code updates over the last 5 months have resolved this. Otherwise I'll close it and just chalk it up as an *oma bug, because I really don't think it's anything we're doing :thinking:

Jun 16 '24 10:06 tsmethurst

Seems fixed! I've had a pinned post for 2 days and haven't seen the problem recur; additionally, the handy new HTTP header block thing makes this easy to mitigate without having to recompile, so closing this seems like a good idea :)

Jun 18 '24 19:06 eeeeeta

(whoops, didn't actually mean to instantly close, but I guess that's fine?)

Jun 18 '24 19:06 eeeeeta