nginx-opentracing
nginx-opentracing copied to clipboard
Opentracing and proxy_cache_use_stale causing lots of log spam
Context
We are noticing a large amount of opentracing error logs for routes that have caching enabled. Example:
2020/12/15 22:39:01 [error] 13#13: *123456789 failed to expand opentracing_context_ for request REQUESTID123: lookup_span_context_value failed: could not find request trace, client: X.X.X.X, server: myserver.com, request: “GET /resource/id HTTP/1.1”, subrequest: “/resource/id”, host: “myserver.com”
which we noticed that these errors happen whenever NGINX is serving up STALE
cache responses:
[15/Dec/2020:22:39:01 +0000] “GET /resource/id HTTP/1.1" 200 1048 “App Client” <...> STALE
Here's what our NGINX config looks like and the specific route:
location / {
opentracing_tag method $request_method;
opentracing_tag uri $uri;
opentracing_propagate_context;
proxy_cache_background_update on; # this must be on in order to use proxy_cache_use_stale below
proxy_pass http://backend;
location ~ /resource/\w+$ {
proxy_cache_methods GET;
proxy_cache_bypass $bypass_cache;
proxy_ignore_headers Cache-Control;
proxy_cache_valid 200 60s;
proxy_cache_use_stale error updating http_500 http_502 http_503 http_504;
proxy_cache_lock on;
proxy_pass http://backend;
}
}
Furthermore, when we disabled proxy_cache_use_stale error updating http_500 http_502 http_503 http_504;
from this route, we stopped seeing these opentracing errors. We need to leave the proxy_cache_use_stale
directive enabled, so this is not an acceptable solution for us.
Questions
- Our running theory is that it seems like an http request that nginx can find a stale hit in it's cache triggers "a background subrequest to update an expired cache item, while a stale cached response is returned to the client." We think this background subrequest is what is triggering the error. ~a stale trace is on the cached response and the opentracing controller cannot tie it back to the current request.~ Is that completely off base? 😅
- Is there a way to disable or not attempt to expand the opentracing context when NGINX is issuing a background subrequest? ~is going to serve a stale cache hit?~
On further investigation, it appears that the errors go away if I add a custom header that we add within the route. I can't speak at all as to why this seemed to work though...
location / {
# ... the same directives as my original description
proxy_set_header My-Custom-Header $my_custom_variable;
location ~ /resource/\w+$ {
proxy_set_header My-Custom-Header $my_custom_variable;
# ... the same directives as my original description
}
}