Implement gunicorn access log format
This PR a very naïve and straightforward implementation of gunicorn access log format, which is currently ignored by uvicorn (see #527).
One notable omission is the WSGI variables: they should be rather straightforward to add, but I think it is important to agree on the approach first. And the approach goes as follows:
It puts the gunicorn log object all the way through to the RequestResponseCycle, and it unrolls gunicorn_log.access function
https://github.com/benoitc/gunicorn/blob/master/gunicorn/glogging.py#L331-L351
so that it is not necessary to recreate all the gunicorn primitives, like gunicorn.http.wsgi.Response and gunicorn.http.message.Request.
The timing is implemented via time.monotonic so that it works well w.r.t. DST transitions and leap seconds. The timing values are added to the asgi scope. And obviously, to be able to measure the timing, the logging had to be moved to the end of response processing.
~~Default uvicorn logging is kept intact to minimize the scope of this PR, and is only disabled when gunicorn_log is passed by gunicorn worker classes to avoid duplication.~~
GunicornSafeAtoms does several things:
- it provides the values for gunicorn log formatter in the most performant way I could imagine
- it aggregates response body size from ASGI messages
- and it does what gunicorn.glogging.SafeAtoms does to provide fallbacks for missing values
Edit by @Kludex : closes #773, closes #527
ref #773, ref #527, ref #389
@euri10 is there anything i can do to help get this in? or at least to help to get the discussion started, as i would assume there's a decision to be taken w.r.t. the approach.
@euri10 is there anything i can do to help get this in? or at least to help to get the discussion started, as i would assume there's a decision to be taken w.r.t. the approach.
back here from the issue discussion, I quickly looked on the PR, I can't pronounce on the gunicorn side of things, however what is pretty clear to me reading the PR is that the implementation is adding keys to the scope that are not in the asgi specification. (https://asgi.readthedocs.io/en/latest/specs/www.html#http-connection-scope).
I tried to dig a little in asgiref and didn't find a definitive answer as to whether or not it's "allowed", I would tend to say it is not, see https://github.com/django/asgiref/issues/178 and also https://github.com/django/asgiref/issues/112 for interesting points.
It would be better to have an implementation that does not violate the spec, if not possible this would need a discussion about that beforehand, I cant remember where but I think this was discussed before and frowned upon, maybe @tomchristie remembers.
From what I see, these are the added keys:
- request_start_time
- request_end_time
Would the extensions approach from the spec work here?
There are times when protocol servers may want to provide server-specific extensions outside of a core ASGI protocol specification, or when a change to a specification is being trialled before being rolled in. For this use case, we define a common pattern for extensions - named additions to a protocol specification that are optional but that, if provided by the server and understood by the application, can be used to get more functionality. This is achieved via an extensions entry in the scope dictionary, which is itself a dict. Extensions have a Unicode string name that is agreed upon between servers and applications.
Would the extensions approach from the spec work here?
I think yes that would be the most sensible thing to do to move things forward.
Heyy, now we're talking (literally), thanks!
I'll update the PR to use the extensions approach, or if you see another way to store the start/end timestamps within the scope of the request, i'm open to other ideas :+1:
@euri10 i have updated the code to use scope['extensions']['uvicorn_request_duration'] to store the timestamps, but like i said previously, if there's a better place to put it, just let me know.
you can use ./scripts/lint if you have linter issues, not sure if you saw it
I really want this feature, but I'm not sure this is the way to go.
Some points on my mind:
- I don't think we should have any reference to
gunicornon our source code besides the worker. - I think
uvicornshould have its way to customize the logger format, which for convenience it's a good idea to matchgunicorn's way. - I don't feel comfortable adding an extension on the ASGI scope just for the time, I think we can achieve that with a single variable, even if that variable needs to be sent to another object (cycle). :thinking:
Those are my concerns at first glance. What do you think? @euri10 @immerrr
I agree with the concerns listed above, as i mentioned earlier, it was done the way it was done to minimize the scope of change and to let the uvicorn team decide the fate of uvicorn's logging and formatting at your convenience.
If I'm reading this correctly, the requested changes are:
- move tracking of request-response timing to a variable
- turn the gunicorn logging format handling into uvicorn's native logging formatting
If those were to happen, would it make this PR "the way to go"?
If I'm reading this correctly, the requested changes are:
- move tracking of request-response timing to a variable
- turn the gunicorn logging format handling into uvicorn's native logging formatting
If those were to happen, would it make this PR "the way to go"?
Yeah, but also adding the --access-log-format on the CLI command (just being explicit here).
Besides if we can get the answer to "is it possible to read the gunicorns --access-log-format in the UvicornWorker?" it would be cool.
EDIT: For the last question, maybe self.cfg.access_log_format? Not sure...
Not sure if i understand the last question correctly, but i think the format is available as
self.log.cfg.access_log_format
On the following line: https://github.com/encode/uvicorn/pull/947/files#diff-7c8a9d71c98f0ed83c32a88a1b7e777ef85776c98869045d24c74e3f7714494aR41
And I am making use of it in this PR.
Ah, yeah! Perfect.
So, in summary:
- Store request-response timing in a variable (being an attribute of
RequestResponseCycle?). - As you said: "turn the
gunicornlogging format handling intouvicorn's native logging formatting". This is for convenience, my point is that we should have our formatting rules. - Create the
--access-log-formatonuvicorn, and on theUvicornWorkersend the"access_log_format": self.log.cfg.access_log_formatinstead of"gunicorn_log": self.log. In case we decide to not match the rules (exactly the same), we can create a converter on theUvicornWorkerthat will aim to match thegunicornwithuvicornrules. But with this idea present, we can make use of--access-log-formateither fromuvicornorgunicornonuvicorn.
Does it make sense? :thinking:
It can be made to work.
It's been a while since I wrote this, but i think the default uvicorn logging uses colours, whereas gunicorn doesn't support that, so we either need to disable colouring (break backward compat), or agree that colouring is not possible with custom log formats, or start bending gunicorn formatting features right away :)
It's been a while since I wrote this, but i think the default uvicorn logging uses colours, whereas gunicorn doesn't support that, so we either need to disable colouring (break backward compat), or agree that colouring is not possible with custom log formats, or start bending gunicorn formatting features right away :)
@euri10 thoughts?
Just started implementing the requested changes, and ran into another point of contention/agreement. The existing uvicorn logging is done when response is started, whereas gunicorn logging is done when the response is ended (to be able to provide all response headers and total request-response duration time).
Do you want to move the default uvicorn logging to the end of the cycle? Or perhaps do the access log before or after response depending on whether or not access-log-format was passed?
This was one more question I was trying to avoid with this PR by keeping the gunicorn logging format gunicorn-specific :)
Do you want to move the default uvicorn logging to the end of the cycle? Or perhaps do the access log before or after response depending on whether or not access-log-format was passed?
IMHO, we can move it. I also think that makes more sense to move it. But maybe we should take a look at how hypercorn does it or gunicorn itself with their main workers :thinking:
~~Oh, and another thing that should work right now and will not with the new suggested approach is the handling of gunicorn's --access-logfile. Current implementation takes the pre-configured logger object from gunicorn with all its handlers~~
https://github.com/encode/uvicorn/blob/4e7da5495d9713dcfb04d96ef82f828319e29e84/uvicorn/protocols/http/httptools_impl.py#L534-L536
~~And if the new version only uses the access log format, this behaviour will have to be reproduced later on.~~
EDIT: nvm, log file and levels are copied in workers.py
I have added a minimal test to exercise gunicorn worker logging, because it is true that previously there were no tests for this.
I have updated the branch according to the feedback, it's almost there:
- the old default logging format is kept as is
- the
--access-log-formatparameter still needs to be added to uvicorn itself
I have rebased to master and added --access-log-format, and looks like the test are passing:
https://github.com/immerrr/uvicorn/actions/runs/957952813
I'm not sure I like the result: now it looks like there are two ways to configure access log format, in the logging dict, that depends on the custom formatter class and a very specific set of parameters passed into the logging method, and then with the new command line parameter. Not sure, what's your preference on resolving this, I personally don't have a strong opinion on this.
@Kludex @euri10, I have enabled "allow edits from maintainers", so if you agree that it is going to be more efficient than the review-comment-fix-review-comment-fix... loop, feel free to tweak this to your liking yourselves.
I'm going to take a look at this later today, but I also want to have your opinion @euri10 .
Here you have hypercorn implementation: https://gitlab.com/pgjones/hypercorn/-/blob/master/src/hypercorn/logging.py This one is from gunicorn: https://github.com/benoitc/gunicorn/blob/master/gunicorn/glogging.py
Maybe we want to use a simple dictionary (same as both implementations above) instead of the decorator idea around the atoms? Looks simpler and less line consuming? :thinking:
@Kludex @euri10 please, let me know if I can help somehow to push this forward
@immerrr I have a suggestion to make it possible to pass custom fields to the access logger. Specifically, after #859 it became impossible to access any fields that one might have been storing on the scope.
We could achieve this by making it possible to configure which class is assigned to self.access_log_fields in the __init__ method of RequestResponseCycle:
class RequestResponseCycle:
def __init__(..., access_log_fields: Type[AccessLogFields], ...):
self.access_log_fields = access_log_fields(self.scope, self.timing)
...
We would then read that class from the config, much like the access_log_format.
A user could then inherit from the AccessLogFields to pass whatever fields they like to the access logger, including information stored on the scope:
class CustomAccessLogFields(AccessLogFields):
@_register_handler("my_field")
def my_field(self) -> Optional[str]:
return self.scope["my_field"]
Curious to hear your thoughts!
Curious to hear your thoughts!
To me it sounds like a nice and a straightforward solution. I would probably try to keep this PR focused on getting the gunicorn log formatter in first, and once that's done see if there is consensus on the way to extend the basic functionality with the core team (which I am not a part of).
@Kludex @euri10 please, let me know if I can help somehow to push this forward
Just for the record, we also need to update the documentation.
I have rebased the PR to latest master and taken a stab at the documentation. Please, let me know WDYT.
I have rebased the PR to latest master and taken a stab at the documentation.
Just updated the PR again to resolve merge conflicts. @Kludex @euri10 would be great if you had a look when you have a chance.
@br3ndonland any updates on this PR? This would be really helpful for improved o11y in gunicorn with uvicorn workers
@jaronoff97 there have been a couple of attempts (this PR and #1191), but nothing implemented directly in Uvicorn yet.
In the short-term, you could use asgi-logger from @Kludex.
Longer-term, we're looking at implementing the Gunicorn access log format in the Uvicorn 0.20 milestone.
@Kludex is already doing a lot of work for our Encode projects, so I can pick up some of the work on this if needed.