uwazi
Sync issues and slowness
Here I describe the issues I found in sync, along with ideas or potential solutions.
- ~~The sync worker loop logs in to the target tenant every time, even if there are no more syncs to send.~~
  - ~~Check whether there are changes before trying anything else.~~
- Sync and other workers loop over ALL tenants to check whether the feature flag is active.
  - We should probably have some way of calculating which tenants have the flag active and loop only over those.
- Every sync iteration processes 50 syncs sequentially, which takes more than 30 seconds per batch.
  - Process syncs in batch: either send all the requests in parallel, or change sync so that it supports requests with multiple syncs.
- Having more workers will not improve job speed. Sync (and other jobs) use DistributedLoop, a class that makes sure only 1 worker gets the job at a time, so there is no way to scale these jobs by adding workers.
  - I think DistributedLoop has no place anymore and we should stop using it. For this to be possible we should also rethink the jobs themselves so that they work properly when parallelized (not sure if this is already the case).
- ~~We have a 30s timeout between job executions.~~
  - ~~We should reduce this a lot unless there is a good reason for it.~~
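
To make the "all the requests in parallel" option concrete, here is a minimal sketch of a batch runner with a concurrency cap. `SyncChange`, `processBatch`, and the `send` callback are hypothetical names for illustration, not Uwazi's actual code, and it assumes the syncs in a batch do not have to be applied in strict order on the target.

```typescript
// Hypothetical shape of a pending sync; not Uwazi's actual type.
interface SyncChange {
  id: number;
}

// Runs `send` over every item with at most `limit` requests in flight.
// Results are returned in the same order as the input items.
async function processBatch<T, R>(
  items: T[],
  limit: number,
  send: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  const worker = async (): Promise<void> => {
    while (next < items.length) {
      // Claim an index before awaiting; JS is single-threaded, so this is race free.
      const index = next;
      next += 1;
      results[index] = await send(items[index]);
    }
  };

  // Start up to `limit` workers that pull from the shared index.
  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

If the target needs changes applied in order, this would not be safe as is, and the alternative of one request carrying multiple syncs is probably the better fit.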
On point 2: didn't we consider putting the sync parameters into the shared DB at some point?
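
As a rough illustration of what that could enable for the tenant loop: if the flag were readable from the shared tenant records, the worker could select the relevant tenants up front instead of visiting every tenant. The `Tenant` shape and the `featureFlags.sync` field below are assumptions for the sketch, not the real schema.

```typescript
// Hypothetical tenant record as it might be stored in the shared DB.
interface Tenant {
  name: string;
  featureFlags?: { sync?: boolean };
}

// Returns only the tenants that have the (assumed) sync flag enabled,
// so the worker loop never has to visit the others.
function tenantsWithSync(tenants: Tenant[]): Tenant[] {
  return tenants.filter((tenant) => tenant.featureFlags?.sync === true);
}

// Example: only "a" would be visited by the sync loop.
const active = tenantsWithSync([
  { name: "a", featureFlags: { sync: true } },
  { name: "b", featureFlags: { sync: false } },
  { name: "c" },
]);
console.log(active.map((tenant) => tenant.name));
```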
On point 4: I think the new job queue should be able to do this. @fnocetti?
Not only does it, but I have also pitched in the past that the job queue could, and IMO should, replace every use of DistributedLoop.
Some of these (like the 30 sec delay) have already been addressed. Can you please describe what is missing so that we can create individual issues to work on them? Thanks. cc @daneryl
The only thing addressed is the 30 sec timeout for sync, which we changed to 1 second. It was not changed for the other jobs. Everything else remains an issue.
I will add an extra issue related to sync.
- ~~Sync tries to log in to the target even when there is nothing new to sync. Since the timeout is now 1 sec, we are spamming login attempts to ourselves.~~
Point one is addressed in #6873