added lock when updating actor via UpdateActorHandler

Open asdfzdfj opened this issue 1 year ago • 11 comments

while working on incoming activity, the UpdateActorMessage could be fired more than once per actor, and even with #602 applied there could still be multiple UpdateActorHandler instances running at the same time, updating the same actor before the added guard could kick in

this adds a bit of locking in UpdateActorHandler, keyed per actor url, to try and ensure that only one handler is actually updating a given actor at a time; any further update actor message that arrives while the update is running is discarded

added a force parameter to UpdateActorMessage and the handler to force an actor update regardless of the #602 last-updated time (it still has to get past the lock, though)

the mbin:actor:update command also got some jank fixed, as it uses UpdateActorMessage to do its job as well, and the force parameter is also used there

also tried moving the fetched time guard from #602 up to the handler level; the guard itself is alright, but such a guard should reside at the higher level, before the update method is called
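for illustration, a minimal sketch of what the handler-level guard described above could look like, assuming Symfony's Lock component and Messenger; the collaborator names, message fields, and the one-hour freshness window here are placeholder assumptions rather than the actual PR code:

<?php

use Symfony\Component\Lock\LockFactory;
use Symfony\Component\Messenger\Attribute\AsMessageHandler;

#[AsMessageHandler]
class UpdateActorHandler
{
    public function __construct(
        private readonly LockFactory $lockFactory,
        private readonly ActivityPubManager $manager, // placeholder collaborator
    ) {
    }

    public function __invoke(UpdateActorMessage $message): void
    {
        // one lock per actor url, auto-expiring so a dead worker can't wedge the actor
        $lock = $this->lockFactory->createLock('actor:update:'.$message->actorUrl, ttl: 300);

        // non-blocking acquire: if another worker is already updating this actor,
        // discard this message instead of queueing up behind it
        if (!$lock->acquire()) {
            return;
        }

        try {
            $actor = $this->manager->findActorOrCreate($message->actorUrl);

            // fetched time guard moved up to the handler level; `force` bypasses it
            if (!$message->force && $actor->apFetchedAt > new \DateTimeImmutable('-1 hour')) {
                return;
            }

            $this->manager->updateActor($message->actorUrl);
        } finally {
            $lock->release();
        }
    }
}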

asdfzdfj avatar Mar 24 '24 07:03 asdfzdfj

Great idea to use locks for this. Didn't know they existed :D I would probably just have used a DB field for it, which is just bad in comparison.

I think the lock should be acquired when dispatching the message and then passed along with the actor in the message (see Lock#Serialization). With your solution you ensure that only one update actor job runs at a time for an actor, however it does not solve the problem of dispatching multiple update actor messages for the same actor, so we still have a strain on rabbit...

What do you think?
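for reference, a rough sketch of the pattern referred to above (the "serializing locks" section of the Symfony Lock docs), assuming a lock store whose keys are serializable (e.g. redis); the message shape and variable names are hypothetical, not code from this PR:

<?php

use Symfony\Component\Lock\Key;
use Symfony\Component\Lock\LockFactory;

// dispatch side: only send the message if we won the per-actor lock
$key = new Key('actor:update:'.$actorUrl);
$lock = $lockFactory->createLockFromKey($key, ttl: 300, autoRelease: false);

if ($lock->acquire()) {
    // the key travels inside the message; a duplicate dispatch for the same actor
    // fails to acquire and is simply dropped here, so rabbit never sees it
    $bus->dispatch(new UpdateActorMessage($actorUrl, key: $key));
}

// handler side: re-create the lock from the key and release it once the update is done
$lock = $lockFactory->createLockFromKey($message->key, autoRelease: false);
try {
    // ... update the actor ...
} finally {
    $lock->release();
}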

BentiGorlich avatar Mar 24 '24 09:03 BentiGorlich

And I just noticed that the default lock DSN that everybody uses atm is the flock store. I think we should think about changing the default (because it uses filesystem locks, which I doubt are great). I think I would lean towards redis... As your new code is the first and only bit of code that uses the lock component, it may result in errors due to a wrong setup, because it was just not needed beforehand.

BentiGorlich avatar Mar 24 '24 09:03 BentiGorlich

I think the lock should be acquired when dispatching the message and then passed along with the actor in the message (see Lock#Serialization). With your solution you ensure that only one update actor job runs at a time for an actor, however it does not solve the problem of dispatching multiple update actor messages for the same actor, so we still have a strain on rabbit...

What do you think?

while I was making this I couldn't find an example of the lock + key usage, and lock feature sets (including keys) basically depend on which store you use, so I opted to do it in a way that only needs the locks themselves to work and depends less on fancier features that may or may not be present with any particular store (also, I used the semaphore store for locks on my test instance for some inexplicable reason, and the key feature doesn't work with that store). maybe I'll see what I can come up with for that

though I do agree that this should be guarded from before the update actor message is even dispatched


And I just noticed that the default lock DSN that everybody uses atm is the flock store. I think we should think about changing the default (because it uses filesystem locks, which I doubt are great). I think I would lean towards redis... As your new code is the first and only bit of code that uses the lock component, it may result in errors due to a wrong setup, because it was just not needed beforehand.

I think the flock store that we have configured is fine, as the symfony caching components already use locks behind the scenes and so far that seems to be working fine (this is actually how I found out that symfony has locks: I happened to notice some debug log lines mentioning lock acquisition on a cache key while a caching call was running)

still, a small bit of docs on configuring lock stores, for anyone who wants a more durable lock store, might be nice. I'm leaning more towards DoctrineDbalPostgreSqlStore myself, same feature set as flock, but I think redis is fine too

asdfzdfj avatar Mar 24 '24 10:03 asdfzdfj

and lock feature sets (including keys) basically depend on which store you use

I guess the feature that is required is "sharing", right? It is not explicitly stated which one it is... DoctrineDbalPostgreSqlStore sounds good to me too, let's go with that

BentiGorlich avatar Mar 24 '24 10:03 BentiGorlich

You could even consider redis / keydb / dragonfly for these lock key objects. With redis you have support for both expiring (TTL) and sharing.

Also, depending on the number of locks it could be more performant than using postgresql.

melroy89 avatar Mar 24 '24 11:03 melroy89

perhaps it's just like what benti pointed out: the goal of this patch is to prevent multiple update actor handlers running on the same actor at once (and I've seen it happen on my instance), rather than to avoid dispatching multiple UpdateActorMessage in quick succession, as I find the former to be more feasible

anyway, I'm giving moving the lock up to the dispatch level a shot, to avoid dispatching multiple UpdateActorMessage in quick succession; the change is here in a separate branch

from a quick test it seems to work, however I'm not merging it into this patch right now as I'm not sure the added complexity is worth it:

  • the lock store MUST be reconfigured, otherwise the changes won't work: the lock keys must be serializable, and the following stores cannot have their lock keys serialized, which includes the flock store that's our current default and also the postgresql advisory locks I was aiming for, leaving the redis store as the only practical choice for us:
mbin % rg markUnserializable vendor/symfony/lock
vendor/symfony/lock/Store/FlockStore.php
132:        $key->markUnserializable();

vendor/symfony/lock/Store/SemaphoreStore.php
79:        $key->markUnserializable();

vendor/symfony/lock/Store/DoctrineDbalPostgreSqlStore.php
98:                $key->markUnserializable();
133:                $key->markUnserializable();

vendor/symfony/lock/Store/ZookeeperStore.php
72:        $key->markUnserializable();

vendor/symfony/lock/Store/PostgreSqlStore.php
88:                $key->markUnserializable();
124:                $key->markUnserializable();

vendor/symfony/lock/Key.php
58:    public function markUnserializable(): void
  • there's also this odd symfony serializer problem where it failed to deserialize the key object properly, complaining about missing values for the Key constructor, so I had to manually (un)serialize the keys in a roundabout way to get it working (a rough sketch of that workaround follows this list)
  • the payload for UpdateActorMessage is relatively light compared to other messages, so while it still strains rabbitmq, I personally find this somewhat more acceptable compared to the potential complexity I'd have to add to avoid the situation
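a hedged guess at what that roundabout manual (un)serialization could look like, using PHP's native serialize()/unserialize() so the messenger serializer only ever sees a plain string; the acquire/release flow is the same as in the earlier sketch, and the actual branch may do this differently:

<?php

use Symfony\Component\Lock\Key;

// dispatch side: instead of putting the Key object on the message (which the
// messenger serializer failed to denormalize), ship it as a php-serialized string
$key = new Key('actor:update:'.$actorUrl);
$bus->dispatch(new UpdateActorMessage($actorUrl, serializedKey: serialize($key)));

// handler side: rebuild the key and hand it back to the lock factory
$key = unserialize($message->serializedKey);
$lock = $lockFactory->createLockFromKey($key, autoRelease: false);
// ... update the actor, then release ...
$lock->release();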

asdfzdfj avatar Mar 24 '24 17:03 asdfzdfj

ok, I've extended it to try and avoid dispatching multiple UpdateActorMessage in the first place with the help of lock keys, provided that key serialization is available; if it isn't, it falls back to the handler level locking from the first patch

this way the existing deployments should continue to work and can 'opt in' to this locking later by setting LOCK_DSN="${REDIS_DNS}" in the .env, while not adding another hard dependency on redis just for this change (not in the mood to add another hard dependency to the list when this thing already uses/needs so much supporting infrastructure)

asdfzdfj avatar Mar 28 '24 10:03 asdfzdfj

ok, I just now came up with an even crazier proposition: if the main thing you want is to limit dispatching multiple update actor messages for the same actor, then just (rate) limit them:

rather than doing the lock + key serialization rigamarole when dispatching the actor update, I've instead added a rate limiter guard so that an (auto) update actor message is dispatched at most once every 5 min per actor, to limit bursts of multiple update messages being dispatched in quick succession, while still allowing a retry if the earlier dispatched message failed to update the actor; the outer fetched time guard should take care of the rest (this is based on personal observation that there seems to be some time locality to these bursts, and the rate period could probably be reduced to 1 min)

doing it this way means changing the lock store to one with key serialization support is no longer needed, it doesn't need special upgrade instructions, and (imo) the added guard logic is simpler and more straightforward
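a minimal sketch of such a rate limiter guard, assuming a symfony rate limiter configured as roughly 1 dispatch per 5 minutes per key; the class, limiter, and message parameter names here are made up for illustration, not the PR's actual code:

<?php

use Psr\Log\LoggerInterface;
use Symfony\Component\Messenger\MessageBusInterface;
use Symfony\Component\RateLimiter\RateLimiterFactory;

class UpdateActorDispatcher // hypothetical wrapper around dispatching UpdateActorMessage
{
    public function __construct(
        // assumes a limiter configured along the lines of: policy fixed_window, limit 1, interval '5 minutes'
        private readonly RateLimiterFactory $updateActorDispatchLimiter,
        private readonly MessageBusInterface $bus,
        private readonly LoggerInterface $logger,
    ) {
    }

    public function dispatchUpdate(string $actorUrl, bool $force = false): void
    {
        if (!$force) {
            // one limiter window per actor url; a burst of calls only dispatches once
            $limit = $this->updateActorDispatchLimiter->create($actorUrl)->consume();
            if (!$limit->isAccepted()) {
                // comparable to the debug lines quoted later in this thread
                $this->logger->debug('not dispatching updating actor for {actor}: one has been dispatched recently', [
                    'actor' => $actorUrl,
                    'retry' => $limit->getRetryAfter()->format(\DateTimeInterface::ATOM),
                ]);

                return;
            }
        }

        $this->bus->dispatch(new UpdateActorMessage($actorUrl, $force));
    }
}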

asdfzdfj avatar Mar 29 '24 10:03 asdfzdfj

Code wise it looks good to me. Did you test it thoroughly? If yes then I trust you that I do not need to test it myself.

I have had this on my own instance since the PR was created, and from what I can read in the debug log traces it seems to be working as intended

debug log sample for a random actor
[2024-05-03T02:13:32.892057+07:00] app.DEBUG: not dispatching updating actor for https://misskey.io/users/9hxecnw2ne: one has been dispatched recently {"actor":"https://misskey.io/users/9hxecnw2ne","retry":"2024-05-02T19:18:32+00:00"} []
[2024-05-03T02:13:32.892994+07:00] app.DEBUG: not dispatching updating actor for https://misskey.io/users/9hxecnw2ne: one has been dispatched recently {"actor":"https://misskey.io/users/9hxecnw2ne","retry":"2024-05-02T19:18:32+00:00"} []
[2024-05-03T02:13:33.017289+07:00] app.DEBUG: not dispatching updating actor for https://misskey.io/users/9hxecnw2ne: one has been dispatched recently {"actor":"https://misskey.io/users/9hxecnw2ne","retry":"2024-05-02T19:18:32+00:00"} []
[2024-05-03T02:13:33.267509+07:00] app.INFO: updating user https://misskey.io/users/9hxecnw2ne {"name":"https://misskey.io/users/9hxecnw2ne"} []
[2024-05-03T18:12:40.367327+07:00] app.DEBUG: not dispatching updating actor for https://misskey.io/users/9hxecnw2ne: one has been dispatched recently {"actor":"https://misskey.io/users/9hxecnw2ne","retry":"2024-05-03T11:17:40+00:00"} []
[2024-05-03T18:12:40.367923+07:00] app.DEBUG: not dispatching updating actor for https://misskey.io/users/9hxecnw2ne: one has been dispatched recently {"actor":"https://misskey.io/users/9hxecnw2ne","retry":"2024-05-03T11:17:40+00:00"} []
[2024-05-03T18:12:40.492754+07:00] app.DEBUG: not dispatching updating actor for https://misskey.io/users/9hxecnw2ne: one has been dispatched recently {"actor":"https://misskey.io/users/9hxecnw2ne","retry":"2024-05-03T11:17:40+00:00"} []
[2024-05-03T18:12:40.496183+07:00] app.INFO: updating user https://misskey.io/users/9hxecnw2ne {"name":"https://misskey.io/users/9hxecnw2ne"} []
[2024-05-03T19:32:16.696423+07:00] app.INFO: updating user https://misskey.io/users/9hxecnw2ne {"name":"https://misskey.io/users/9hxecnw2ne"} []
[2024-05-03T19:32:17.048946+07:00] app.DEBUG: not dispatching updating actor for https://misskey.io/users/9hxecnw2ne: one has been dispatched recently {"actor":"https://misskey.io/users/9hxecnw2ne","retry":"2024-05-03T12:37:16+00:00"} []

Just a summary of what we have here:

  • we rate limit the dispatches of UpdateActorMessages per actor url, so each actor url can trigger an UpdateActorMessage at most once every 5 minutes
  • we lock the actual updating part for 60 minutes per actor, so every actor is at most updated once every hour

that sounds about right

asdfzdfj avatar May 03 '24 13:05 asdfzdfj

we lock the actual updating part for 60 minutes per actor, so every actor is at most updated once every hour

I think this is not really correct, since you release the lock at the end 🤔

BentiGorlich avatar May 03 '24 13:05 BentiGorlich

we lock the actual updating part for 60 minutes per actor, so every actor is at most updated once every hour

I think this is not really correct, since you release the lock at the end 🤔

the lock is only meant to prevent more than one UpdateActorMessage worker from updating the same actor at the same time (though you could say it's superseded by the rate limiter that avoids sending multiple UpdateActorMessage in the first place); the apFetchedAt time guard is what limits the actor auto update frequency

asdfzdfj avatar May 03 '24 13:05 asdfzdfj