meilisearch
Huge latency when importing a dump due to the task queue
Original issue: I tried importing a dump containing 1M tasks, which took 3 minutes on main. It should be faster.
Context: At the time, I was trying to debug something else and hand-crafted a dump with 1M finished tasks in it. When I tried importing it, I didn’t know what was going on, but I found it waaaaay too slow (and I still think it is way too slow).
Updated:
The initial investigation I made at the time, which brought me to this PR, showed that most of the time was spent committing changes we didn't need and writing things to disk on the go. So, as explained in the PR description, the fix consisted of:
- Stop committing the changes between each task import
- Stop deserializing + serializing every bitmap for every task
The dump import became slow again after merging the fix for: https://github.com/meilisearch/meilisearch/issues/3596.
The issue is that the insert_task_datetime still deserializes + inserts + re-serializes multiple bitmaps per task to be inserted. That's slow as hell.
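To make the cost concrete, here is a toy model of that per-task read-modify-write pattern. It is only a sketch, not the actual meilisearch code: the `BTreeMap` stands in for the database, a plain list of little-endian `u32` IDs stands in for a serialized roaring bitmap, and `insert_one_by_one` is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Stand-in for a serialized bitmap: task IDs as little-endian u32 values.
/// (Assumption: the real code stores roaring bitmaps instead.)
fn serialize(ids: &[u32]) -> Vec<u8> {
    ids.iter().flat_map(|id| id.to_le_bytes()).collect()
}

fn deserialize(bytes: &[u8]) -> Vec<u32> {
    bytes
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

/// Inserts tasks one at a time: for every task, the whole bitmap for its
/// timestamp is read back, extended by one ID, and written out again.
/// Returns the total number of bytes serialized along the way.
fn insert_one_by_one(store: &mut BTreeMap<i64, Vec<u8>>, tasks: &[(i64, u32)]) -> usize {
    let mut written = 0;
    for &(timestamp, task_id) in tasks {
        let mut ids = store
            .get(&timestamp)
            .map(|bytes| deserialize(bytes))
            .unwrap_or_default();
        ids.push(task_id);
        let bytes = serialize(&ids);
        written += bytes.len();
        store.insert(timestamp, bytes);
    }
    written
}
```

In this model, four tasks sharing one timestamp serialize 4 + 8 + 12 + 16 = 40 bytes, where a single write of the final bitmap would serialize 16. The gap grows quadratically with the number of tasks per timestamp.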
We should find another, more straightforward way to do it, but keeping every single date in RAM doesn't seem like a good idea. Maybe with the help of a grenad, we could more easily generate a roaring bitmap of all the IDs that share the same insertion datetime.
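A rough sketch of that idea, under loud assumptions: an in-memory sort stands in for grenad's on-disk sorter, a `Vec<u32>` stands in for the roaring bitmap, and `write_each_bitmap_once` and its `write` callback are hypothetical names, not meilisearch or grenad APIs. The point is only that once the `(datetime, task ID)` pairs are sorted, all IDs for a given datetime are adjacent, so each bitmap can be built in one pass and written exactly once.

```rust
/// Sorts `(timestamp, task_id)` pairs, then streams over them and hands each
/// completed group of IDs to `write` exactly once. Only the group currently
/// being built is kept around while streaming.
/// Returns the number of writes performed (one per distinct timestamp).
fn write_each_bitmap_once<F>(mut entries: Vec<(i64, u32)>, mut write: F) -> usize
where
    F: FnMut(i64, &[u32]),
{
    // Assumption: in a real import this sort would be done on disk (e.g. by
    // a grenad sorter) instead of in RAM.
    entries.sort_unstable();

    let mut writes = 0;
    let mut current: Option<i64> = None;
    let mut group: Vec<u32> = Vec::new();

    for (timestamp, task_id) in entries {
        if current != Some(timestamp) {
            // The timestamp changed: flush the finished group, if any.
            if let Some(previous) = current {
                write(previous, &group);
                writes += 1;
                group.clear();
            }
            current = Some(timestamp);
        }
        group.push(task_id);
    }

    // Flush the last group.
    if let Some(previous) = current {
        write(previous, &group);
        writes += 1;
    }
    writes
}
```

Calling it with the pairs `(20, 3), (10, 1), (20, 2), (10, 0)` triggers two writes, one for timestamp 10 with IDs 0 and 1, and one for timestamp 20 with IDs 2 and 3, no matter how many tasks share each timestamp.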
Overall, this doesn't seem super hard to fix but will necessitate a little bit of investigation to fully understand again what's taking time.
The fix was completely defeated when we fixed https://github.com/meilisearch/meilisearch/issues/3596.
I’m going to re-open this for now, and we’ll see if there is another way to fix the issue.
Ping from triage:
We need more information:
- What is the bottleneck getting fixed here?
- What is the expected result vs the 3 minutes that are observed?
- What is the solution that was implemented and then reverted by #3596?
I finally took the time to update this issue. I answered your first and last questions in my first comment. And for:
What is the expected result vs the 3 minutes that are observed?
I expect the task queue import to be way faster. 3 minutes on my machine with a super fast SSD may not look like an issue, but on cloud instances, it actually takes a lot more time. We should just find a way to write every bitmap only once.