
application start time

benoitc opened this issue 5 years ago • 19 comments

Some applications may take time to load or initialise, which can be an issue because the worker may be killed in the meantime. One way to fix that would be to notify the arbiter when the application is loaded (which can be done via the shared pipe).

To handle the case where the application takes too long to start, an application start time would be given. If the arbiter is not notified within that time, it would kill the worker and report the timeout as usual.
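To make the idea concrete, the user-facing side could look something like this (option name purely hypothetical, nothing is decided):

```python
# gunicorn.conf.py
timeout = 30                # existing heartbeat/request timeout
# app_start_timeout = 120  # hypothetical new option: time allowed to load the application
```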

This would fix #1658, #1757.

Thoughts?

benoitc avatar Nov 23 '19 00:11 benoitc

I think we should do it. We still need to decide whether the existing --timeout should act as a request timeout for non-sync workers. I think for now we should just document that it is not an application boot timeout and make the boot timeout a new option.

tilgovi avatar Nov 23 '19 23:11 tilgovi

We could even start with no timeout for application start. We could keep the current heartbeat behavior for --timeout but only start checking it once the worker "boots". That's a breaking change, but for most uses I would think health checks and the like ensure that we do not need a timeout for booting the application.

tilgovi avatar Nov 23 '19 23:11 tilgovi

Hello! We ran into an issue related to this with our gunicorn deployment. We set a smaller timeout for requests, but our workers sometimes take longer than that just to initialize. Ideally we would improve our application to initialize faster, but that looked like a decent amount of work, and modifying gunicorn to handle this case seemed relatively simple by comparison.

I made a small change to gunicorn that adds a boot-timeout setting. When it is set, the arbiter uses it instead of the existing timeout setting to decide whether workers that haven't booted yet have timed out. I am not familiar with the gunicorn codebase, so it is definitely possible I made a glaring mistake while working on this, but if you would like to look at the patch I wrote to add this setting, it is here: https://github.com/benhiller/gunicorn/pull/1 As another caveat, I may also not be aware of the various ways in which gunicorn can be used, so it's possible this change is not compatible with other gunicorn configurations.

I wanted to share that to see if you would be interested in merging it as a pull request to gunicorn. The solution suggested above of not applying the timeout to workers until they boot would definitely also help with our problem, but I went with this approach since it seemed less disruptive to the existing behavior: it only changes things for people who opt in by setting a new gunicorn flag.

Anyway, let me know what you think!

Edit: I believe that the above patch actually doesn't do what I thought it did due to my misunderstanding of how the arbiter and workers communicate. Will look into revising that, so for now you can probably ignore that link to the patch.

benhiller avatar Mar 09 '21 22:03 benhiller

Hi,

Such a feature would help me, and I might consider writing it myself, but I admit I have limited knowledge of the code base, and not a great deal of time to see it through.

Nonetheless, the way I understand things, @benoitc seems to be OK with a basic design along these lines:

  1. The worker may have (some kind of) booted / booting status
  2. The arbiter may check this in the murder_workers method to decide whether a special timeout configuration applies, as opposed to the current "unified" timeout

The crux of the issue, for me, is step 1. @benoitc mentioned a shared pipe between arbiter and worker, but I see no such pipe in the code base. What I do see is that the arbiter and worker share a communication channel based on the worker.tmp object (which internally, ultimately boils down to a file descriptor). The current means of communication through this file is setting its permissions, basically switching between 0 and 1 on every notify call.
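For reference, that mechanism looks roughly like this (heavily simplified, not a verbatim copy of gunicorn/workertmp.py):

```python
import os
import tempfile
import time


class WorkerTmp:
    """Heavily simplified version of gunicorn's WorkerTmp."""

    def __init__(self):
        fd, name = tempfile.mkstemp(prefix="wgunicorn-")
        os.unlink(name)        # keep the fd, drop the name
        self._fd = fd
        self.spinner = 0

    def notify(self):
        # Worker side: flip the permission bits just to bump the file's ctime.
        self.spinner = (self.spinner + 1) % 2
        os.fchmod(self._fd, self.spinner)

    def last_update(self):
        # Arbiter side: ctime tells us when the worker last notified.
        return os.fstat(self._fd).st_ctime


def worker_timed_out(tmp, timeout):
    # Roughly what murder_workers checks for each worker.
    return time.time() - tmp.last_update() > timeout
```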

Expanding on that, I would consider encoding, bitset style, the booted status of the worker using a higher bit of the file permissions (e.g. notify would switch between 0x0 and 0x1 as today, while booted would switch from 0x0_ to 0x1_).

If this still sounds OK, I would propose:

  1. Creating a configuration flag, let's call it worker_start_timeout, defaulting to the existing timeout
  2. Extending the internal worker.tmp object (i.e. the WorkerTmp class) to expose set_booted() and is_booted() methods, using bitset encoding of the file permissions as described above (see the sketch after this list)
  3. Modifying murder_workers to check the booted status and choose the actual timeout value accordingly.
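Here is an untested sketch of what steps 2 and 3 could look like (method names and the exact bit chosen are just a guess, and it assumes the extra mode bit round-trips through fchmod/fstat on the target platform):

```python
import os
import stat
import tempfile


class WorkerTmp:
    def __init__(self):
        fd, name = tempfile.mkstemp(prefix="wgunicorn-")
        os.unlink(name)
        self._fd = fd
        self.spinner = 0
        self.booted = False

    def _write_state(self):
        # Lowest bit = heartbeat spinner, next bit = "booted" flag.
        mode = self.spinner | (stat.S_IWOTH if self.booted else 0)
        os.fchmod(self._fd, mode)

    def notify(self):
        self.spinner = (self.spinner + 1) % 2
        self._write_state()

    def set_booted(self):
        self.booted = True
        self._write_state()

    def is_booted(self):
        # Read back from the shared inode, so the arbiter sees the worker's flag.
        return bool(os.fstat(self._fd).st_mode & stat.S_IWOTH)


def effective_timeout(worker, cfg):
    # Step 3: murder_workers picks the timeout based on the booted status.
    return cfg.timeout if worker.tmp.is_booted() else cfg.worker_start_timeout
```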

Sounds like a plan?

gueugaie avatar Mar 19 '21 18:03 gueugaie

I think @benhiller should submit this as a pull request. I notice a few small improvements that could be made, but I'd like to offer a full review on a PR here. But first, we need to answer one important question, I think.

Is a worker allowed to notify the arbiter that it is still booting, or is the boot timeout strict? If it's strict, we don't need any new status. We just need to know if there has been at least one call to notify, i.e. ctime > mtime. If it's not strict, then the worker may call notify during boot and we might need a separate status flag.
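If it's strict, the check could be as small as this (sketch only, untested):

```python
import os

def has_booted(tmp_fd):
    # At least one notify() means fchmod() already bumped st_ctime past st_mtime.
    st = os.fstat(tmp_fd)
    return st.st_ctime > st.st_mtime
```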

tilgovi avatar Mar 20 '21 21:03 tilgovi

Agreed about the strict / non-strict distinction, and the potential simplification of not creating a new status.

About @benhiller's proposed patch: unless I missed some IPC mechanism behind the scenes (which may very well be the case!), I do not see it working. It revolves around a "booted" flag set on the worker process, and I do not see how the arbiter process would ever see a change made in the worker process.

gueugaie avatar Mar 21 '21 10:03 gueugaie

@gueugaie that's correct.

tilgovi avatar Mar 21 '21 18:03 tilgovi

@tilgovi & @gueugaie - thank you for your comments and taking a look at my patch! You are right that the previous patch I linked to wasn't correct. I worked on an updated version of it which used a similar approach to what @gueugaie described above and will send a pull request for it in a bit. I tried the ctime > mtime check as well, since that seemed simpler, but it seems like the util.unlink call in WorkerTmp.__init__ results in the ctime and mtime on the temp file being different even before the first notify call.

From my usage of gunicorn, I think that a strict timeout for booting makes more sense: application initialization seems like a single indivisible chunk of work, so it isn't clear to me when the worker could call notify during it. However, I am really only judging this based on the way we use gunicorn in our codebase, so I'd definitely defer to your judgement on that!

benhiller avatar Mar 21 '21 19:03 benhiller

One thing I had in mind while thinking about this whole chmod / bit-masking implementation is: does it work on Windows? The Python reference guide states:

> Note
>
> Although Windows supports chmod(), you can only set the file’s read-only flag with it (via the stat.S_IWRITE and stat.S_IREAD constants or a corresponding integer value). All other bits are ignored.

So, it remains to be seen whether one can actually encode any information using file permissions there, beyond the timestamp update and the S_IWRITE/S_IREAD bits.

gueugaie avatar Mar 22 '21 08:03 gueugaie

I am not a big fan of changing the timeout behaviour of the workers; it makes things more complex. I am also not sure the proposed implementation would work on all filesystems.

I suggest making the check for this boot timeout distinct, and possibly adding a new file time to check instead if no better mechanism is found. That would make the logic simpler and separate the concerns.

I am also thinking that we could send a signal to the arbiter over the pipe for it, though. This could be the start of more features over time, like some kind of mechanism for sharing data between all workers, that sort of thing. If I have some time this week I will work on a patch for it.

benoitc avatar Mar 22 '21 16:03 benoitc

Agreed that a communication channel would be ideal for multiple uses. But sharing the (one and only) arbiter pipe with the (possibly) multiple workers seems dangerous to me, if that's what you have in mind.

https://docs.python.org/3/library/multiprocessing.html#exchanging-objects-between-processes

> Note that data in a pipe may become corrupted if two processes (or threads) try to read from or write to the same end of the pipe at the same time. 

gueugaie avatar Mar 22 '21 16:03 gueugaie

Hm, one option that uses an approach similar to what @tilgovi suggested with ctime & mtime, while avoiding chmod for this purpose, could be: call fstat on the temp file in the WorkerTmp constructor, store the initial ctime in an instance variable on WorkerTmp (which should be shared by both the arbiter and the worker, since this is before the fork), and then use that for comparisons when deciding whether or not notify has ever been called.
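Something along these lines (rough sketch; the real WorkerTmp wraps the fd in a file object, and the names here are approximate):

```python
import os
import tempfile


class WorkerTmp:
    def __init__(self):
        fd, name = tempfile.mkstemp(prefix="wgunicorn-")
        os.unlink(name)
        self._fd = fd
        # Captured before fork(), so the arbiter's and the worker's copies of
        # this object both carry the same baseline. The unlink() above already
        # happened, so its ctime bump is included in the baseline.
        self.initial_ctime = os.fstat(fd).st_ctime
        self.spinner = 0

    def notify(self):
        self.spinner = (self.spinner + 1) % 2
        os.fchmod(self._fd, self.spinner)

    def has_booted(self):
        # True once notify() has been called at least once.
        return os.fstat(self._fd).st_ctime > self.initial_ctime
```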

benhiller avatar Mar 22 '21 16:03 benhiller

> Agreed that a communication channel would be ideal for multiple uses. But sharing the (one and only) arbiter pipe with the (possibly) multiple workers seems dangerous to me, if that's what you have in mind.
>
> https://docs.python.org/3/library/multiprocessing.html#exchanging-objects-between-processes
>
> > Note that data in a pipe may become corrupted if two processes (or threads) try to read from or write to the same end of the pipe at the same time.

In my mind the pipe is only there to signal to the arbiter; the arbiter will manage the rest. In this case it is just about signalling whether a worker is ready, for example. So it's really N workers -> Arbiter. Only the arbiter reads. The communication channel between workers would be handled differently.
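A toy illustration of that shape, not gunicorn code (it relies on the fact that single-byte writes to a pipe are atomic on POSIX, so concurrent workers cannot interleave within a message):

```python
import os

# One pipe: many worker processes write, only the arbiter reads.
read_end, write_end = os.pipe()

READY = b"R"

def worker_signal_ready():
    # Called in a worker once the application has finished loading.
    os.write(write_end, READY)

def arbiter_wait_for_ready():
    # Called in the arbiter (in practice the fd would be selected/polled on).
    return os.read(read_end, 1) == READY
```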

benoitc avatar Mar 22 '21 17:03 benoitc

> Hm, one option that uses an approach similar to what @tilgovi suggested with ctime & mtime, while avoiding chmod for this purpose, could be: call fstat on the temp file in the WorkerTmp constructor, store the initial ctime in an instance variable on WorkerTmp (which should be shared by both the arbiter and the worker, since this is before the fork), and then use that for comparisons when deciding whether or not notify has ever been called.

that may work also :)

benoitc avatar Mar 22 '21 17:03 benoitc

> So it's really N workers -> Arbiter. Only the arbiter reads.

Only one reader, I agree, but since it's the same pipe instance, you fall into the "two processes (or threads) try to (...) write to the same end of the pipe at the same time" scenario, don't you?

gueugaie avatar Mar 22 '21 17:03 gueugaie

> > So it's really N workers -> Arbiter. Only the arbiter reads.
>
> Only one reader, I agree, but since it's the same pipe instance, you fall into the "two processes (or threads) try to (...) write to the same end of the pipe at the same time" scenario, don't you?

What I have in mind is something similar to imsg from OpenBSD, which is also used in tmux if I remember correctly. We could either use the C library and bind it, or port it to Python, which seems doable.

benoitc avatar Mar 22 '21 17:03 benoitc

If we land #1967 we could then adapt it to use the initial st_mtime as the basis for an application startup timeout.

tilgovi avatar Sep 19 '21 23:09 tilgovi

I just hit the same issue: a slow-starting app that can sometimes take a really long time to start up (many minutes if it needs to scale the underlying infrastructure).

Unfortunately, my only option at the moment is to disable the worker timeout by setting --timeout 0, but that is obviously not optimal. It would be better to have a startup timeout separate from the runtime timeout, as described in this issue.
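For anyone else hitting this, the workaround looks like this (the commented-out line is only what this issue asks for, it does not exist today):

```python
# gunicorn.conf.py
timeout = 0            # workaround: 0 disables the worker heartbeat timeout entirely
# start_timeout = 600  # wished-for separate startup timeout -- not a real option
```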

zoltan-fedor avatar Jan 23 '23 17:01 zoltan-fedor

What's going on? Is anyone still working on this?

lyksdu avatar Mar 25 '24 12:03 lyksdu

No activity in a while; closing. Feel free to create a new ticket if needed.

benoitc avatar Aug 06 '24 15:08 benoitc