
Closing the database connection

Open davecap opened this issue 12 years ago • 30 comments

I'm using django-rq on Heroku with the Heroku Postgres DB setup. When my RQ worker jobs finish, I always see this in the logs:

LOG:  could not receive data from client: Connection reset by peer
LOG:  unexpected EOF on client connection

Any idea if there's a way to close the database connection when a job ends? For example, I probably need to put this somewhere:

from django.db import connection
connection.close()

Perhaps it should just be in the worker function itself... but that seems like a hack... thoughts?

davecap avatar Mar 19 '13 06:03 davecap

I've never written a Heroku application before so this will be hard for me to debug. Does explicitly closing the connection on your queued function fix it? I did a little bit of googling and it seems that this problem is not specific to Django/RQ. There's a resque thread discussing the same problem https://github.com/defunkt/resque/issues/367

selwin avatar Mar 19 '13 12:03 selwin

Yea I saw that thread too... I'm surprised I haven't seen any traces of this problem with Django though. I was kind-of wondering if I could implement a similar fix to the gem mentioned in that thread. What they're doing is adding the disconnect on some sort of "after_job" callback. Does such a thing exist with RQ?

davecap avatar Mar 19 '13 17:03 davecap

No, there's no such thing as "after_job" in RQ. We can propose adding one, as well as the ability to use a different worker class since this is already in RQ's roadmap (implementing concurrent workers).

However, we first need to ensure that the proposed solution does indeed solve the problem. Can you check whether running "close_connection" in the function being queued fixes this?

selwin avatar Mar 19 '13 23:03 selwin

Yep. I just tested it in one of my async functions and I don't see the EOF errors.

davecap avatar Mar 27 '13 21:03 davecap

I've been doing some more thinking about this; here are a few ideas:

Alternative 1

Write a wrapper function for async functions. This doesn't require any changes to RQ, but could be tedious if every single enqueued Django function on Heroku needs to explicitly close the db connection.


from django.db import connection

def func():
    pass

def rq_func():
    func()
    connection.close()

queue.enqueue(rq_func)
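To make Alternative 1 less tedious, the per-function cleanup could be factored into a decorator. This is only a sketch, not django-rq code; `with_cleanup` is a made-up name, and in a Django project you would pass `django.db.connection.close` as the cleanup callable (a plain lambda stands in for it below so the example is self-contained):

```python
from functools import wraps

def with_cleanup(cleanup):
    """Run `cleanup` after the wrapped job finishes, even if it raises.

    In a Django project, pass django.db.connection.close as `cleanup`.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                cleanup()
        return wrapper
    return decorator

# Hypothetical usage, with a list standing in for the connection state:
closed = []

@with_cleanup(lambda: closed.append(True))
def add(a, b):
    return a + b
```

Each task function then only needs the decorator instead of an explicit `connection.close()` at the end.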

Alternative 2

Implement a job callback pattern. However, this idea was rejected in https://github.com/nvie/rq/issues/155, and I'm not a big fan of the notion of having callbacks in RQ either.

queue.enqueue(f, callback=close_connection)

Alternative 3

Add a job_class argument to Worker to allow the use of custom job classes. In this case, Django RQ can define its own Job class to deal with this issue. It would look something like:


from rq.job import Job
from django.db import connection

class ConnectionClosingJob(Job):

    def perform(self):
        result = super(ConnectionClosingJob, self).perform()
        connection.close()
        return result

# Somewhere in Django-RQ

worker = Worker(connection=connection, job_class=ConnectionClosingJob)

@nvie, what do you think?

selwin avatar Mar 28 '13 12:03 selwin

I'm experiencing the same issue. I'm glad I checked, because I had no idea what the cause of the issue was. I think alternative 2 or 3 would be good. I would think of it less as a callback than as a hook; I would ideally like to do other post-job housekeeping too.

acjay avatar Apr 15 '13 19:04 acjay

Just to add, I'm not sure Alternative 1 really works when func() is a bound method. In my use case, I'm binding a method at run time and queuing that method directly. I'm refactoring to try to pass a wrapper function instead that will recover the method reference job-side and do the necessary cleanup, but it's really gumming up the code.

acjay avatar Apr 15 '13 21:04 acjay

I don't understand the problem in the first place. Stripping away django-rq and RQ from the equation, what's the core of the problem we're looking at here? Isn't it that Django implicitly opens a connection, but never explicitly closes it until its Python process terminates (which never happens in an RQ worker situation)?

nvie avatar Apr 19 '13 10:04 nvie

From what I understand it's a harmless postgres error message that occurs when no Terminate message is received. I wonder if enabling "autocommit" mode for the psycopg2 DB backend would fix it.

davecap avatar Apr 20 '13 15:04 davecap
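For reference, the autocommit idea above would have looked roughly like this in a Django 1.5-era settings module. This is a sketch from memory, not from this thread: the `autocommit` key in `OPTIONS` is how the psycopg2 backend of that period exposed the setting (later Django versions are autocommit by default, so the knob no longer exists there), and the database name is made up:

```python
# settings.py sketch -- Django <= 1.5 with the psycopg2 backend.
# 'autocommit' in OPTIONS is the pre-1.6 way to enable autocommit mode;
# the database NAME here is purely illustrative.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mydb',
        'OPTIONS': {
            'autocommit': True,
        },
    }
}
```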

@nvie That's one manifestation of the problem, but to me the overall problem is that I want to be able to do other post-task work. In my case, I'm manually tracking tasks in flight so that I can prevent duplicate tasks with the same arguments from running. In the current paradigm, I have to schedule a wrapper function that calls the actual task function and performs this cleanup. But since there are multiple task functions that might be called, and since I can't pass function references as arguments, my wrapper function has to do some serious gymnastics to recreate the task function reference worker-side. The whole thing would be much easier if there were hooks for post-task work. I really like Alternative 3, personally.

acjay avatar Apr 23 '13 14:04 acjay

The problem, I believe, is that RQ forks its worker processes from the main process, and Django's database connection is shared from the parent process to the child process. http://stackoverflow.com/questions/8242837/django-multiprocessing-and-database-connections sums up the problem pretty well.

fangsterr avatar Sep 15 '13 23:09 fangsterr
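The shared-descriptor problem described above can be demonstrated without Django or Postgres at all. In this toy sketch (entirely illustrative; a `socketpair` stands in for the Postgres connection), parent and child both write into the same stream after `fork()`, which is exactly what corrupts a shared database connection:

```python
import os
import socket

# A socketpair stands in for the Django <-> Postgres connection.
server_end, client_end = socket.socketpair()

pid = os.fork()
if pid == 0:
    # Child (the "work horse") inherits the SAME file descriptor,
    # not a private copy of the stream.
    client_end.send(b"child-bytes;")
    os._exit(0)

os.waitpid(pid, 0)                # let the child write first
client_end.send(b"parent-bytes;")

data = b""
while b"parent-bytes;" not in data:
    data += server_end.recv(1024)
print(data)  # the "server" sees interleaved traffic from two processes
```

The fix RQ-side is to make sure no connection exists at fork time, so each process opens its own.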

We have decided to implement the fix in RQ itself instead of django-rq. RQ will allow the user to use a custom Worker class, where we can then tell the worker to close the db connection before forking.

There's an open issue tracking this feature on https://github.com/nvie/rq/issues/233 . Feel free to contribute/chime in on the discussion on the thread.

selwin avatar Sep 16 '13 01:09 selwin

I came here because I was worried with the could not receive data from client: Connection reset by peer on my django/django-rq app running on Heroku.

I applied the tip described here to create "clean" connections when the worker is launched and avoid the "forked connections" syndrome that seems to be the root cause: https://github.com/nvie/rq/issues/233#issuecomment-25010141

But I still get the reset connection message on Heroku. Did I miss something?

abulte avatar Jun 09 '14 13:06 abulte

@abulte mind showing me a snippet of your code that manages postgres connection? If I remember correctly @fdr also wanted to put this up on Heroku's docs so we should definitely work out a reliable way of doing this and properly document it.

selwin avatar Jun 11 '14 01:06 selwin

Here's my snippet, directly inspired from the mentioned link:

from rq import Worker
from django.db import close_connection

class AAWorker(Worker):

    def fork_and_perform_job(self, *args, **kwargs):
        close_connection()
        super(AAWorker, self).fork_and_perform_job(*args, **kwargs)

I should add that it did not solve the "connection reset" problem.

I would be very glad to have the heroku docs for doing that correctly ;-)

Thanks for your help.

abulte avatar Jun 11 '14 07:06 abulte

@abulte THANK YOU dude, this just saved me so much time and effort.

EDIT: Sorry, scratch that response, this didn't work for me, I think I have a different bug :/ Still thank you though lol.

foxx avatar Sep 03 '14 22:09 foxx

I went ahead and updated the Heroku docs like I promised I would. Somehow I missed the notification the first time around. Sorry for the long delay:

https://devcenter.heroku.com/articles/forked-pg-connections#django-rq

fdr avatar Sep 04 '14 23:09 fdr

@fdr thanks!


selwin avatar Sep 04 '14 23:09 selwin

@fdr hi. You seem to have updated the docs with the snippet I provided on https://github.com/ui/django-rq/issues/17#issuecomment-45710490. But this snippet does not solve the "connection reset" like I said... At least in my experience.

abulte avatar Sep 05 '14 08:09 abulte

Does it solve any other inconvenient symptom?

fdr avatar Sep 05 '14 14:09 fdr

AFAIK no.

abulte avatar Sep 05 '14 14:09 abulte


Oh. Pity, I guess I'll remove it, then. Thanks for clarifying that for me.

fdr avatar Sep 05 '14 14:09 fdr

Thanks for trying ;-) Maybe @selwin has a proper way to do this?

abulte avatar Sep 05 '14 14:09 abulte

I've never used Heroku before, so no :( Does closing the db connection right after the worker is started and after job completion help?

I'll reopen this thread since we don't have a proper solution for this issue.

selwin avatar Sep 05 '14 16:09 selwin


Heroku is irrelevant here, other than that I recommended opening the bug because I've seen people get mystified by file descriptor corruption caused by fork() a bunch of times...

So, any pg/fork based worker is affected unless one moves connection creation to post-fork.

fdr avatar Sep 06 '14 16:09 fdr
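The post-fork pattern fdr describes is roughly what connection pools do: open lazily, and re-open whenever the process id changes. A minimal illustrative sketch, assuming nothing about RQ or Django API (the `Connection` class and `get_connection` helper are hypothetical stand-ins):

```python
import os

class Connection:
    """Stand-in for a real DB connection object; purely illustrative."""
    def __init__(self):
        self.pid = os.getpid()  # remember which process opened it

_conn = None

def get_connection():
    # Lazy, post-fork-safe accessor: if the cached connection was opened
    # by another process (i.e. we are a fork child), discard it and open
    # a fresh one instead of reusing the inherited socket.
    global _conn
    if _conn is None or _conn.pid != os.getpid():
        _conn = Connection()
    return _conn
```

With this shape, a forked work horse never touches the parent's connection: its first query opens a new one.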

Ah ok, got it.


selwin avatar Sep 07 '14 01:09 selwin

I used this with django-rq:

In every function enqueued to run by a worker, I added the connection.close() at the end. Now the postgres logs do not have the "LOG: could not receive data from client: Connection reset by peer".

More importantly, I think those errors were causing the PG archive log to grow larger, so fixing this mattered for keeping the PG backups from running away.

sachinkagarwal avatar Mar 03 '15 15:03 sachinkagarwal

Any idea or update on this problem? I have the exact same issue and tried some of the hacks mentioned above, but nothing works on Heroku... I need to figure out how to create the connection post-fork. Any ideas? Otherwise I would have to completely drop django-rq and go for celery or something similar...

mpetyx avatar May 17 '16 14:05 mpetyx

Or you could drop Django :)

foxx avatar May 17 '16 14:05 foxx

FWIW I've been using this for years:

from django.db import connection
from rq.worker import Worker


class ConnectionClosingWorker(Worker):
    """
    RQ Worker that closes the database connection before forking.

    See also https://github.com/ui/django-rq/issues/17

    """

    def execute_job(self, job, queue):
        connection.close()
        super().execute_job(job, queue)

It's been too long since I wrote it to remember the details, but it solved this issue and hasn't caused problems in a largish rq deploy. By "largish", I mean "we're moving off Heroku because we hit their 300-dyno limit".

aaugustin avatar Apr 07 '19 11:04 aaugustin