
Performance for larger queues in MySQL is not OK

Open · peterlundberg opened this issue 14 years ago · 4 comments

With 100000 messages, processing queue entries becomes time-consuming (roughly 2 seconds per pop(), due to filesorting and MySQL picking the visible index to use). This is not intended as a high-end queue, but perhaps clarifying in the docs the type of load it is expected to support would be a good idea.

One easy fix for MySQL is to use an index optimized for the reads actually performed. E.g. this will help a lot: "create unique index better_index ON ghettoq_message (visible, timestamp, id)"
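For anyone wanting to try this, a minimal sketch of applying the index and checking the plan from Django. The table and column names come from the index statement above, but the exact shape of the pop() query is an assumption on my part:

```python
from django.db import connection

def add_better_index():
    # Create the covering index suggested above; assumes the default
    # ghettoq_message table with visible, timestamp and id columns.
    cursor = connection.cursor()
    cursor.execute(
        "CREATE UNIQUE INDEX better_index "
        "ON ghettoq_message (visible, timestamp, id)"
    )

def explain_pop_query():
    # Rough way to check which index MySQL picks; the WHERE/ORDER BY here
    # is a guess at roughly what pop() issues, not the exact ghettoq query.
    cursor = connection.cursor()
    cursor.execute(
        "EXPLAIN SELECT id FROM ghettoq_message "
        "WHERE visible = 1 ORDER BY timestamp, id LIMIT 1"
    )
    for row in cursor.fetchall():
        print(row)
```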

peterlundberg · Jun 01 '10 14:06

It's not that it's slow, it's leaking! I've got a 1.3GB ghettoq_message table in just 2 days. Is it supposed to cleanup() itself?

I called Message.objects.cleanup() manually and it takes forever. It looks like Django's Model.objects.all().delete() doesn't work as you'd expect: it selects all records and then calls DELETE on batches of 100 IDs. How cool is that on a 500,000-row table?

jonozzz · Jun 11 '10 05:06

The message model was actually stolen^H^H^H^H^H^Htaken from django-queue-service (http://code.google.com/p/django-queue-service/)

There's probably heaps of ways to optimize this, but I will probably not spend much time doing this myself. This is because the database support is really only meant as a very basic solution for testing or for users with very basic requirements.

Of course, I'll accept any patches from someone willing to work on this. The index tip could be part of the documentation; are you willing to write a short blurb about it? Then I can include it in a FAQ or something for ghettoq.

@jonozzz I don't know to be honest, I spent a total of half an hour copy-pasting the code from django-queue-service, then roughly testing it :) I'm not sure how to go about this, but maybe it could collect the processed items every N pop()s or something.
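Purely as a sketch of that idea (the threshold, counter and model argument here are illustrative, not anything ghettoq actually has):

```python
from itertools import count

CLEANUP_EVERY = 1000   # illustrative threshold, not a real ghettoq setting
_pops = count(1)

def pop_with_periodic_cleanup(queue, message_model):
    # Hypothetical wrapper: pop as usual and, every CLEANUP_EVERY pops,
    # purge messages that have already been consumed. `message_model`
    # stands in for ghettoq's Message model; the visible=0 filter matches
    # the column discussed in this thread.
    message = queue.pop()
    if next(_pops) % CLEANUP_EVERY == 0:
        message_model.objects.filter(visible=0).delete()
    return message
```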

Django's delete() is certainly stupid; maybe it has to be that way for on_delete-type signals to be triggered or something. I think DELETE FROM .. WHERE .. is pretty much universally supported, so using a raw SQL query in this case shouldn't hurt.

ask · Jun 17 '10 10:06

In this case, DELETE FROM .. WHERE visible = 0 and then maybe an OPTIMIZE TABLE .. But cleaning up every N pop()s would seem a better idea. Once records are marked as visible=0, why are they needed in the DB anymore? I know little about the way this works; that's why I'm asking.
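For reference, a minimal sketch of that raw-SQL cleanup from Django. It assumes the ghettoq_message table and visible column named above, and accepts that Django's delete signals are bypassed; OPTIMIZE TABLE is MySQL-specific and can lock the table while it runs:

```python
from django.db import connection

def purge_consumed_messages():
    # Delete already-consumed rows in one statement instead of Django's
    # batched delete(), then ask MySQL to reclaim the freed space.
    cursor = connection.cursor()
    cursor.execute("DELETE FROM ghettoq_message WHERE visible = 0")
    cursor.execute("OPTIMIZE TABLE ghettoq_message")
```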

jonozzz · Jun 17 '10 16:06

Fixed in django-kombu v0.9.2 :)

ask · Feb 28 '11 23:02