uvicorn icon indicating copy to clipboard operation
uvicorn copied to clipboard

Implement gunicorn access log format

Open immerrr opened this issue 4 years ago • 35 comments

This PR a very naïve and straightforward implementation of gunicorn access log format, which is currently ignored by uvicorn (see #527).

One notable omission is the WSGI variables: they should be rather straightforward to add, but I think it is important to agree on the approach first. And the approach goes as follows:

It puts the gunicorn log object all the way through to the RequestResponseCycle, and it unrolls gunicorn_log.access function

https://github.com/benoitc/gunicorn/blob/master/gunicorn/glogging.py#L331-L351

so that it is not necessary to recreate all the gunicorn primitives, like gunicorn.http.wsgi.Response and gunicorn.http.message.Request.

The timing is implemented via time.monotonic so that it works well w.r.t. DST transitions and leap seconds. The timing values are added to the asgi scope. And obviously, to be able to measure the timing, the logging had to be moved to the end of response processing.

~~Default uvicorn logging is kept intact to minimize the scope of this PR, and is only disabled when gunicorn_log is passed by gunicorn worker classes to avoid duplication.~~

GunicornSafeAtoms does several things:

  • it provides the values for gunicorn log formatter in the most performant way I could imagine
  • it aggregates response body size from ASGI messages
  • and it does what gunicorn.glogging.SafeAtoms does to provide fallbacks for missing values

Edit by @Kludex : closes #773, closes #527

immerrr avatar Feb 03 '21 15:02 immerrr

ref #773, ref #527, ref #389

immerrr avatar Feb 22 '21 17:02 immerrr

@euri10 is there anything i can do to help get this in? or at least to help to get the discussion started, as i would assume there's a decision to be taken w.r.t. the approach.

immerrr avatar Apr 15 '21 13:04 immerrr

@euri10 is there anything i can do to help get this in? or at least to help to get the discussion started, as i would assume there's a decision to be taken w.r.t. the approach.

back here from the issue discussion, I quickly looked on the PR, I can't pronounce on the gunicorn side of things, however what is pretty clear to me reading the PR is that the implementation is adding keys to the scope that are not in the asgi specification. (https://asgi.readthedocs.io/en/latest/specs/www.html#http-connection-scope).

I tried to dig a little in asgiref and didn't find a definitive answer as to whether or not it's "allowed", I would tend to say it is not, see https://github.com/django/asgiref/issues/178 and also https://github.com/django/asgiref/issues/112 for interesting points.

It would be better to have an implementation that does not violate the spec, if not possible this would need a discussion about that beforehand, I cant remember where but I think this was discussed before and frowned upon, maybe @tomchristie remembers.

euri10 avatar May 28 '21 07:05 euri10

From what I see, these are the added keys:

  • request_start_time
  • request_end_time

Would the extensions approach from the spec work here?

There are times when protocol servers may want to provide server-specific extensions outside of a core ASGI protocol specification, or when a change to a specification is being trialled before being rolled in. For this use case, we define a common pattern for extensions - named additions to a protocol specification that are optional but that, if provided by the server and understood by the application, can be used to get more functionality. This is achieved via an extensions entry in the scope dictionary, which is itself a dict. Extensions have a Unicode string name that is agreed upon between servers and applications.

chbndrhnns avatar May 28 '21 08:05 chbndrhnns

Would the extensions approach from the spec work here?

I think yes that would be the most sensible thing to do to move things forward.

euri10 avatar May 28 '21 08:05 euri10

Heyy, now we're talking (literally), thanks!

I'll update the PR to use the extensions approach, or if you see another way to store the start/end timestamps within the scope of the request, i'm open to other ideas :+1:

immerrr avatar May 28 '21 11:05 immerrr

@euri10 i have updated the code to use scope['extensions']['uvicorn_request_duration'] to store the timestamps, but like i said previously, if there's a better place to put it, just let me know.

immerrr avatar May 31 '21 14:05 immerrr

you can use ./scripts/lint if you have linter issues, not sure if you saw it

euri10 avatar May 31 '21 14:05 euri10

I really want this feature, but I'm not sure this is the way to go.

Some points on my mind:

  • I don't think we should have any reference to gunicorn on our source code besides the worker.
  • I think uvicorn should have its way to customize the logger format, which for convenience it's a good idea to match gunicorn's way.
  • I don't feel comfortable adding an extension on the ASGI scope just for the time, I think we can achieve that with a single variable, even if that variable needs to be sent to another object (cycle). :thinking:

Those are my concerns at first glance. What do you think? @euri10 @immerrr

Kludex avatar Jun 12 '21 16:06 Kludex

I agree with the concerns listed above, as i mentioned earlier, it was done the way it was done to minimize the scope of change and to let the uvicorn team decide the fate of uvicorn's logging and formatting at your convenience.

If I'm reading this correctly, the requested changes are:

  • move tracking of request-response timing to a variable
  • turn the gunicorn logging format handling into uvicorn's native logging formatting

If those were to happen, would it make this PR "the way to go"?

immerrr avatar Jun 13 '21 09:06 immerrr

If I'm reading this correctly, the requested changes are:

  • move tracking of request-response timing to a variable
  • turn the gunicorn logging format handling into uvicorn's native logging formatting

If those were to happen, would it make this PR "the way to go"?

Yeah, but also adding the --access-log-format on the CLI command (just being explicit here).

Besides if we can get the answer to "is it possible to read the gunicorns --access-log-format in the UvicornWorker?" it would be cool.

EDIT: For the last question, maybe self.cfg.access_log_format? Not sure...

Kludex avatar Jun 13 '21 09:06 Kludex

Not sure if i understand the last question correctly, but i think the format is available as

self.log.cfg.access_log_format

On the following line: https://github.com/encode/uvicorn/pull/947/files#diff-7c8a9d71c98f0ed83c32a88a1b7e777ef85776c98869045d24c74e3f7714494aR41

And I am making use of it in this PR.

immerrr avatar Jun 13 '21 09:06 immerrr

Ah, yeah! Perfect.

So, in summary:

  • Store request-response timing in a variable (being an attribute of RequestResponseCycle?).
  • As you said: "turn the gunicorn logging format handling into uvicorn's native logging formatting". This is for convenience, my point is that we should have our formatting rules.
  • Create the --access-log-format on uvicorn, and on the UvicornWorker send the "access_log_format": self.log.cfg.access_log_format instead of "gunicorn_log": self.log. In case we decide to not match the rules (exactly the same), we can create a converter on the UvicornWorker that will aim to match the gunicorn with uvicorn rules. But with this idea present, we can make use of --access-log-format either from uvicorn or gunicorn on uvicorn.

Does it make sense? :thinking:

Kludex avatar Jun 13 '21 10:06 Kludex

It can be made to work.

It's been a while since I wrote this, but i think the default uvicorn logging uses colours, whereas gunicorn doesn't support that, so we either need to disable colouring (break backward compat), or agree that colouring is not possible with custom log formats, or start bending gunicorn formatting features right away :)

immerrr avatar Jun 13 '21 10:06 immerrr

It's been a while since I wrote this, but i think the default uvicorn logging uses colours, whereas gunicorn doesn't support that, so we either need to disable colouring (break backward compat), or agree that colouring is not possible with custom log formats, or start bending gunicorn formatting features right away :)

@euri10 thoughts?

Kludex avatar Jun 13 '21 11:06 Kludex

Just started implementing the requested changes, and ran into another point of contention/agreement. The existing uvicorn logging is done when response is started, whereas gunicorn logging is done when the response is ended (to be able to provide all response headers and total request-response duration time).

Do you want to move the default uvicorn logging to the end of the cycle? Or perhaps do the access log before or after response depending on whether or not access-log-format was passed?

This was one more question I was trying to avoid with this PR by keeping the gunicorn logging format gunicorn-specific :)

immerrr avatar Jun 13 '21 16:06 immerrr

Do you want to move the default uvicorn logging to the end of the cycle? Or perhaps do the access log before or after response depending on whether or not access-log-format was passed?

IMHO, we can move it. I also think that makes more sense to move it. But maybe we should take a look at how hypercorn does it or gunicorn itself with their main workers :thinking:

Kludex avatar Jun 13 '21 16:06 Kludex

~~Oh, and another thing that should work right now and will not with the new suggested approach is the handling of gunicorn's --access-logfile. Current implementation takes the pre-configured logger object from gunicorn with all its handlers~~

https://github.com/encode/uvicorn/blob/4e7da5495d9713dcfb04d96ef82f828319e29e84/uvicorn/protocols/http/httptools_impl.py#L534-L536

~~And if the new version only uses the access log format, this behaviour will have to be reproduced later on.~~

EDIT: nvm, log file and levels are copied in workers.py

immerrr avatar Jun 13 '21 16:06 immerrr

I have added a minimal test to exercise gunicorn worker logging, because it is true that previously there were no tests for this.

immerrr avatar Jun 14 '21 10:06 immerrr

I have updated the branch according to the feedback, it's almost there:

  • the old default logging format is kept as is
  • the --access-log-format parameter still needs to be added to uvicorn itself

immerrr avatar Jun 21 '21 16:06 immerrr

I have rebased to master and added --access-log-format, and looks like the test are passing: https://github.com/immerrr/uvicorn/actions/runs/957952813

I'm not sure I like the result: now it looks like there are two ways to configure access log format, in the logging dict, that depends on the custom formatter class and a very specific set of parameters passed into the logging method, and then with the new command line parameter. Not sure, what's your preference on resolving this, I personally don't have a strong opinion on this.

@Kludex @euri10, I have enabled "allow edits from maintainers", so if you agree that it is going to be more efficient than the review-comment-fix-review-comment-fix... loop, feel free to tweak this to your liking yourselves.

immerrr avatar Jun 21 '21 17:06 immerrr

I'm going to take a look at this later today, but I also want to have your opinion @euri10 .

Here you have hypercorn implementation: https://gitlab.com/pgjones/hypercorn/-/blob/master/src/hypercorn/logging.py This one is from gunicorn: https://github.com/benoitc/gunicorn/blob/master/gunicorn/glogging.py

Maybe we want to use a simple dictionary (same as both implementations above) instead of the decorator idea around the atoms? Looks simpler and less line consuming? :thinking:

Kludex avatar Jun 26 '21 11:06 Kludex

@Kludex @euri10 please, let me know if I can help somehow to push this forward

immerrr avatar Aug 02 '21 17:08 immerrr

@immerrr I have a suggestion to make it possible to pass custom fields to the access logger. Specifically, after #859 it became impossible to access any fields that one might have been storing on the scope.

We could achieve this by making it possible to configure which class is assigned to self.access_log_fields in the __init__ method of RequestResponseCycle:

class RequestResponseCycle:
    def __init__(..., access_log_fields: Type[AccessLogFields], ...):
        self.access_log_fields = access_log_fields(self.scope, self.timing)
        ...

We would then read that class from the config, much like the access_log_format.

A user could then inherit from the AccessLogFields to pass whatever fields they like to the access logger, including information stored on the scope:

class CustomAccessLogFields(AccessLogFields):
    @_register_handler("my_field")
    def my_field(self) -> Optional[str]:
        return self.scope["my_field"]

Curious to hear your thoughts!

drderuiter avatar Aug 25 '21 13:08 drderuiter

Curious to hear your thoughts!

To me it sounds like a nice and a straightforward solution. I would probably try to keep this PR focused on getting the gunicorn log formatter in first, and once that's done see if there is consensus on the way to extend the basic functionality with the core team (which I am not a part of).

immerrr avatar Aug 25 '21 14:08 immerrr

@Kludex @euri10 please, let me know if I can help somehow to push this forward

immerrr avatar Oct 26 '21 09:10 immerrr

Just for the record, we also need to update the documentation.

I have rebased the PR to latest master and taken a stab at the documentation. Please, let me know WDYT.

immerrr avatar Nov 12 '21 17:11 immerrr

I have rebased the PR to latest master and taken a stab at the documentation.

Just updated the PR again to resolve merge conflicts. @Kludex @euri10 would be great if you had a look when you have a chance.

immerrr avatar Feb 10 '22 08:02 immerrr

@br3ndonland any updates on this PR? This would be really helpful for improved o11y in gunicorn with uvicorn workers

jaronoff97 avatar Oct 18 '22 14:10 jaronoff97

@jaronoff97 there have been a couple of attempts (this PR and #1191), but nothing implemented directly in Uvicorn yet.

In the short-term, you could use asgi-logger from @Kludex.

Longer-term, we're looking at implementing the Gunicorn access log format in the Uvicorn 0.20 milestone.

@Kludex is already doing a lot of work for our Encode projects, so I can pick up some of the work on this if needed.

br3ndonland avatar Oct 18 '22 21:10 br3ndonland