Support for logging contextual metadata

Open yurishkuro opened this issue 5 years ago • 31 comments

Requirement

When my application is propagating context.Context object and that context include metadata about a distributed transaction, such as distributed tracing information (trace_id, span_id, etc.) from Jaeger or Zipkin, I want to be able to tag all logs with those metadata fields, similar to how MDC works in [Java loggers][https://logback.qos.ch/manual/mdc.html].
I want to be able to configure zap to forward logs to the opentracing.Span stored in the Context, i.e. translate logger.Info(ctx, fields...) to spanFrom(ctx).Log(fields)

What's preventing me from doing that in Zap

It is currently possible to do in Zap via With([]Fields) scoping of the Logger, however:

if it's done at the call sites, then it is easy to forget to do so
if the scoped Logger is created at the top of the request handler, then this logger needs to be passed through the application logic, which does not play well with dependency injection since many components tend to accept the logger only at initialization.

Proposal

Create a new type of logger, e.g. ContextLogger, similar to SugarLogger, which has methods similar to the regular Logger, but always taking context.Context as the first argument.
- example by @ramonza
- Could also create SugarContextLogger
Add mechanism for optional context.Context encoding in the Entry/Encoders.

-- Updated 2019-05-13 Added second requirement that came up in https://github.com/uber-go/zap/issues/654#issuecomment-442692349

Nov 28 '18 19:11 yurishkuro

@akshayjshah @prashantv please let me know if this is an acceptable direction, and I can follow up with full PRs.

Nov 29 '18 00:11 yurishkuro

Rather than adding new types which are essentially duplicates of the current interface, I think I'd prefer a new smaller type -- it accepts a context and returns the existing logger types:

type ContextLogger struct{...}

func (*ContextLogger) Ctx(context.Context) zap.Logger

We could have a duplicate that returns a SugaredLogger, or additional methods.

It also allows the user to select when to pay the cost of serializing the context, rather than being forced to pay it at every log call.

// if you only log a couple of times, you may want to take the cost of context serialization each time
ctxLogger.Ctx(ctx).Info(...)
ctxLogger.Ctx(ctx).Info(...)

// if you're logging many times, you can cache the serialized context in the logger
logger := ctxLogger.Ctx(ctx)
logger.Info(..)
logger.Info(..)
logger.Info(..)
logger.Warn(..)
logger.Error(..)

We can still add lint rules to ensure that if there's a ctx in scope, that the user uses a ContextLogger.

The ContextLogger type can also live outside of zap, nothing about it ties it to zap (it just needs to return a zap.Logger with fields with the context's contents).

Nov 29 '18 02:11 prashantv

@prashantv thanks. What you're suggesting is very similar to what I did in HotROD demo app long ago - https://github.com/jaegertracing/jaeger/blob/master/examples/hotrod/pkg/log/factory.go#L44

However, in HotROD we could also write the logs to the Span found in the Context, so that the logs could be viewed in Jaeger UI inside the actual trace. If ctxLogger.Ctx(ctx) returns *zap.Logger can that logger also have a dynamically attached encoder that will know about the span and write to it? I didn't mention it as a requirement in the ticket, but your proposal reminded me - some users found it very useful to be able to view the logs in Jaeger UI instead of log aggregation tool

Nov 29 '18 03:11 yurishkuro

@prashantv another downside of the ctxLogger.Ctx(ctx) wrapper is that you always have to pay the cost of fields generation and Logger allocation even if the resulting debug level is below the threshold.

Nov 29 '18 16:11 yurishkuro

I'd also prefer a smaller interface, like the one Prashant suggests (I believe we discussed this exact API in some Google Docs comments ages ago). I'm not sure we need a sugared variant; clogger.Ctx(ctx).Sugar() isn't much less typing than clogger.SugaredCtx(ctx), and it won't need to change if/when we need to introduce more variant loggers.

The implementation of this feature shouldn't require touching encodings or anything in the zapcore package at all. Instead, we can add a zap.Option that allows the user to describe how contexts should be serialized, and at the zap layer we can convert calls to ContextLogger.Ctx to their equivalent Logger.With.

type Contexter interface {
  Context(context.Context) zapcore.Field
}

func AddContext(c Contexter) Option { ... }

func (c *ContextLogger) Ctx(ctx context.Context) *Logger {
  // we'll need some more, but the basic idea is...
  return c.logger.With(c.logger.contextify(ctx))
}

The default Contexter can log the deadline and time remaining. Users of distributed tracing systems can provide a more complex implementation.

@prashantv another downside of the ctxLogger.Ctx(ctx) wrapper is that you always have to pay the cost of fields generation and Logger allocation even if the resulting debug level is below the threshold.

Definitely true of the single logger allocation. Any cost associated with creating more complex fields (simple fields stay on the stack) will have to paid either way - at least this approach allows us to pay that cost once, instead of once at each log site. On balance, I think that the single allocation is worth it.

If ctxLogger.Ctx(ctx) returns *zap.Logger can that logger also have a dynamically attached encoder that will know about the span and write to it?

I'd like this too - providing a single entry point to these two logging systems is a great idea. However, it's a bit more complex. There are at least three options to do this:

One option is to implement the zapcore.Core API on top of the tracing APIs. If we want to go down this road, then we should make the Contexter API a little different:

type Contexter interface {
  Context(context.Context) zap.Option
}

We'd use the WrapCore option to dynamically attach a new core to the logger, and the Fields option to add some extra data to the message. (We'd also need a way to compose options, but that's not difficult.)
The downside of this approach is that the new core won't have access to any of the fields that have already been added to the logger and we'll end up serializing everything twice.

Alternatively, we can implement the zapcore.WriteSyncer API on top of the tracing APIs. That's nice because it doesn't require re-serializing the data and it'll include any fields already added to the logger. However, we'd need to sort out a way to propagate the context into the internals of zapcore, which is going to be pretty complex. Downsides here are complexity, the need to introduce an io.Writer-style API to the Jaeger span, and the need to validate that the encodings used by the logger are also appropriate for the span (if you look at our internal zap plugin for Fx, we're already validating that all production logs are JSON). The final, and biggest, downside is that after all this work, we'll only have integrated these systems for Go programs that use zap.

The third (and to me, the best) option is to have our log-tailing agent send all messages that include a span and trace ID to the Jaeger agent, which can then treat them just like messages sent via the in-memory OpenTracing APIs. That's probably a bit more complex, but it solves this problem for all languages and logging libraries, and I think it'd also make it easier for other companies to get value out of Jaeger's logging functionality.

Nov 30 '18 00:11 akshayjshah

I have some interesting to this issue. so, I want to add traceId and spanId to fields from context with OpenCensus. It requires context from logger.Xxx(msg string, fields ... Field) timing. but current API can't it. traceId does not change during the request. but, spanId changes frequently and invisibly (from application code). (e.g. Database transactions often generate new spans.)

I hope to all Xxx(msg string, fields ... Field) like methods take ctx as an argument and it will be passed to Core.Write. But that requires a breaking change ....

May 20 '19 10:05 vvakame

I don't understand why this needs to be integrated into zap considering the kinds of fields you need to pull out of any context.Context is application specific. Therefore, adding it to zap would make it far less useable that declaring your own application specific func FromContext(c context.Context) *zap.SugaredLogger. Am I missing something?

I'm currently working on a feature that shares requirements with what we're after here, but it made the most sense to have it implemented specific to my application which has its own specific metadata keys.

On the other hand,

I suppose you'd be adding a zap.ContextWith(c context.Context, fields ...zap.Field) context.Context for decorating each context with fields, and then using that.

If so, that could allow other thirdparty libraries the opportunity to add their zap fields to the context. Is that the goal?

Dec 13 '19 18:12 colelawrence

I'm definitely finding late this issue, although its the same problem for me.

First of all (thats more for other people discovering this issue) - if You're using elastic apm for tracing, there is builtin functionality for zap logs - https://www.elastic.co/guide/en/apm/agent/go/master/builtin-modules.html#builtin-modules-apmzap

But in general, I think what would've been great - is to have context.Context field in zapcore.Entry structure with possibility to set it somehow for each entry. Unfortunately logger.With adds only fields and options do not operate with entry but rather with logger itself. I'm not sure how it should be done here - I don't have enough domain knowledge about zap

Having context for entry will allow to extract any required information from context either with custom encoder or with hook without any major break.

For example logrus has it - https://github.com/sirupsen/logrus/blob/master/entry.go#L67

Nov 01 '20 09:11 a-urth

I just found this issue after creating #898 but the ideas of having a generic way to set the context and then having some info extracting automatically stills holds. I came onto this regarding Opentelemetry and I have the same problem as the original poster, if you want to add trace_id and span_id in every logged entry you want it done automatically, having a way to just provide the context and then having a hook add the data seems the better way of doing this.

Dec 27 '20 14:12 schmurfy

I'm facing the same interest at the moment while building things that will be related to open-telemetry...

@schmurfy How did you solve the problem?

@a-urth Did you do anything or did you just migrated to logrus?

Jan 18 '21 02:01 marcellodesales

It is possible to extend Zap as follows:

new field that captures context, e.g. logger.Debug(fields.Context(ctx), ...)
an interface to extract metadata from the context, e.g.

type ContextEncoder interface {
    Encode(context.Context) []zap.Field
}

a modified core combining the above so that a log entry is enriched with context field before being serialized

This is how zap was used internally at Uber (when I was there), but unfortunately the extension code was not in the OSS. It was all done in additional modules, so modifying Zap itself was not necessary.

Jan 18 '21 03:01 yurishkuro

@marcellodesales we ended up generating a wrapper around the methods which do the real work, logging and span attributes are there. Not really satisfying but it works.

Jan 18 '21 19:01 schmurfy

@schmurfy any reference implementation maybe would be a great start for the community :) As we see more and more users interested in this feature...

Feb 06 '21 03:02 marcellodesales

https://github.com/uptrace/opentelemetry-go-extra/tree/main/otelzap

Feb 06 '21 14:02 vmihailenco

Here's what I was thinking in terms of adding custom fields with every log request. It uses a custom zapcore.Core to wrap the ioCore and adds a field of your choosing with each log request. Your custom logic in Write could grab the metadata out of the context or any other functionality of your choosing. Your custom logic would also be responsible for any kind of caching so as not to slow down every log request.

Cons to this approach would be serializing the added Fields with each log request. In my own use case I don't have a clean way of instantiating a "context logger" at the top of each request stack and using that throughout the application without breaking our DI pattern. Incurring the serialization hit in our case seemed worth it to be able to add the single WrapCore and then not have to worry about those added fields anywhere else.

Thoughts?

https://gist.github.com/jejikenwogu/2e605e467e188bba42707be7709d4bdb

Mar 21 '21 11:03 jejikenwogu

I believe just adding a Ctx(context.Context) will not be the most performant, because you need to call it every time you log something. In general a "Logger" has a static scope in a code (at least in most of the languages that I am aware of), so you will have a "Logger" per a component instance (probably enhanced with component name, etc) and "Context" has a request scope. So essentially we are trying to use a "static scoped object" to record any "request scope informations" (e.g. error, message, context, etc.), so I think it makes more clear to have all arguments in the same function call since they all have the same scope.

It may not be ideal, but probably the best option is to have something like "InfoC(ctx context.Context, ...)" and all the equivalents.

Oct 29 '21 19:10 bogdandrutu

how about just add []zap.Field to context.Context?

package zapctx

import (
	"context"

	"go.uber.org/zap"
)

type (
	contextKey struct{}
)

func With(ctx context.Context, log *zap.Logger) *zap.Logger {
	fs, _ := ctx.Value(contextKey{}).([]zap.Field)
	return log.With(fs...)
}

func Add(ctx context.Context, fields ...zap.Field) context.Context {
	if len(fields) == 0 {
		return ctx
	}

	fs, _ := ctx.Value(contextKey{}).([]zap.Field)
	return context.WithValue(ctx, contextKey{},
		append(fs, fields...),
	)
}

func Set(ctx context.Context, fields ...zap.Field) context.Context {
    // replace existing fields
}

Oct 29 '21 19:10 fxrlv

Same problem you force a call to With every time I want to do any logging, and cannot benefit from the optimization to not do unnecessary work when level is less than configured level, etc.

type ComponentA struct {
  logger *zap.Logger
  compB *ComponentB
}

func NewComponentA(logger *zap.Logger, compB *ComponentB) *ComponentA {
  return &ComponentA {
    logger: logger.With(zap.String("component", "A")),
    compB: compB,
  }
}

func (a *ComponentA) DoWork(ctx context.Context, req interface{}) error {
  // do some work.
  a.logger.DebugC(ctx, "Start compB.DoWork")
  err := a.compB.DoWork(ctx, req)
  a.logger.DebugC(ctx, "Finish compB.DoWork", zap.Error(err))
  return err
}

type ComponentB struct {
  logger *zap.Logger
}

func NewComponentB(logger *zap.Logger) *ComponentB {
  return &ComponentB {
    logger: logger.With(zap.String("component", "B")),
  }
}

func (a *ComponentA) DoWork(ctx context.Context, req interface{}) error {
  // do some work.
  a.logger.InfoC(ctx, "I am zen")
  return nil
}

Try to map your solution to this.

Oct 29 '21 19:10 bogdandrutu

another approach

package main

import (
	"context"

	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
)

type Valuer interface {
	Value(key interface{}) interface{}
}

func Context(key string, value Valuer) zap.Field {
	return zap.Object(key, valuer{value})
}

type (
	valuer struct {
		Valuer
	}

	contextKey struct{}
)

func (v valuer) MarshalLogObject(enc zapcore.ObjectEncoder) error {
	fs, _ := v.Value(contextKey{}).([]zap.Field)
	for i := range fs {
		fs[i].AddTo(enc)
	}

	return nil
}

func main() {
	log, _ := zap.NewDevelopment()

	ctx := context.WithValue(context.Background(),
		contextKey{}, []zap.Field{
			zap.String("foo", "from context"),
			zap.String("bar", "from context"),
		},
	)

	log.Info("message",
		zap.String("baz", "explicit"),
		//zap.Context("context", ctx),
		Context("context", ctx),
	)

	// output: {"baz": "explicit", "context": {"foo": "from context", "bar": "from context"}}
}

Oct 29 '21 20:10 fxrlv

also there is zap.Inline to inject implicit context fields without nested object

Oct 29 '21 20:10 fxrlv

@fxrlv indeed this idea seems tempting, but is not as clean as passing the Context as the first argument per "golang" recommendation. It is a bit of an anti-pattern to see the "Context" in the middle or at the end of a function call but seems that is the only option zap authors are willing to give.

Also it seems that the authors of this project do not really care about users asks, sorry but I have to say this since 40 "+1"s are ignored and no real solution is giving.

Nov 11 '21 18:11 bogdandrutu

iirc having Context(ctx) method that returns a special Field was the approach used internally at Uber. Combined with an extractor that can be attached to the logger/core itself, it allows the extractor to pull certain values from ctx and convert them into proper fields. The major downside was that there was no way to cache that since Go doesn't allow interface pointers as dictionary keys.

Nov 11 '21 19:11 yurishkuro

So, for 2022, what do you believe it is the best approach (performance wise) to add some fields at the beginning of the flow of a request?

Creating a new suggarLog withFields when creating a new context and store the logger in the context? which basically that this package does: I saw the package https://pkg.go.dev/github.com/juju/zaputil/zapctx or more simpler: https://github.com/northwesternmutual/kanali/blob/master/pkg/log/log.go Or store the []zap.Field?

Or is it out there something better?

For my case we just have REST incoming request which do several operation with RabbitMQ,DB,Caches, calling other external services, etc... and we want to track the whole flow.

May 09 '22 19:05 willthrom

const alreadyAdded = this.loggers.has(id) TypeError: this.loggers.has is not a function

why this error

Feb 08 '23 18:02 pradhumnsingh1

Same problem you force a call to With every time I want to do any logging, and cannot benefit from the optimization to not do unnecessary work when level is less than configured level, etc.

type ComponentA struct {
  logger *zap.Logger
  compB *ComponentB
}

func NewComponentA(logger *zap.Logger, compB *ComponentB) *ComponentA {
  return &ComponentA {
    logger: logger.With(zap.String("component", "A")),
    compB: compB,
  }
}

func (a *ComponentA) DoWork(ctx context.Context, req interface{}) error {
  // do some work.
  a.logger.DebugC(ctx, "Start compB.DoWork")
  err := a.compB.DoWork(ctx, req)
  a.logger.DebugC(ctx, "Finish compB.DoWork", zap.Error(err))
  return err
}

type ComponentB struct {
  logger *zap.Logger
}

func NewComponentB(logger *zap.Logger) *ComponentB {
  return &ComponentB {
    logger: logger.With(zap.String("component", "B")),
  }
}

func (a *ComponentA) DoWork(ctx context.Context, req interface{}) error {
  // do some work.
  a.logger.InfoC(ctx, "I am zen")
  return nil
}

Try to map your solution to this.

this is not a good way, because most of them time you need to get some context information from context and in the construction ( New) you usually ctx has not been passed. I'm talking about the case that you want to get trace_id and span_id from context and add it to logger fields.

May 23 '23 10:05 yuseferi

@willthrom storing the whole log in the context is not a good way for that, we did it a few years ago in a couple of projects, but to be honest we didn't have a better option. it works but I would say not the best case,

this is an example to store specific keys in the context and use it in the logger.

const loggerKey = "my-logger"

// Set zap.Logger as variable in context
func Set(ctx context.Context, logger *zap.Logger) context.Context {
	setup := logger
	fields := make([]zap.Field, 0)

	for _, key := range []string{local.TraceID, local.SpanID, local.ParentSpanID} {
		if value, ok := ctx.Value(key).(string); ok {
			fields = append(fields, zap.String(key, value))
		}
	}

	if len(fields) > 0 {
		setup = setup.With(fields...)
	}

	return context.WithValue(ctx, loggerKey, setup)
}

// Get zap.Logger from context
func Get(ctx context.Context) *zap.Logger {
	if logger, ok := ctx.Value(loggerKey).(*zap.Logger); ok {
		return logger
	}
	return zap.L()
}

May 25 '23 21:05 yuseferi

I've been using a pattern where I only store fields in the context, and have have a function that will decorate a logger with the fields of in the context:


type logKeyType struct{}

// Context stores a list of zap.Field in a context.
// If the parent context had any zap.Field's they will be kept.
func Context(ctx context.Context, fields ...zap.Field) context.Context {
	old := GetFields(ctx)
	fields = append(old, fields...)
	return context.WithValue(ctx, logKeyType{}, fields)
}

// GetFields returns the zap.Field's stored in the context or nil if none are found.
func GetFields(ctx context.Context) []zap.Field {
	fields, ok := ctx.Value(logKeyType{}).([]zap.Field)
	if !ok {
		return nil
	}
	return fields
}

// WithContext gets the zap.Field's from a context and adds them to an existing logger.
func WithContext(ctx context.Context, log *zap.Logger) *zap.Logger {
	return log.With(GetFields(ctx)...)
}

// With adds fields the context and return a logger with the same fields added
func With(ctx context.Context, log *zap.Logger, fields ...zap.Field) (context.Context, *zap.Logger) {
	ctx = Context(ctx, fields...)
	log = log.With(fields...)
	return ctx, log
}

The usage looks something like this:

log := /* any existing zap.Logger */

ctx := Context(context.Background(), zap.String("hello", "world"))

log.Info("Just a log") // will just use the logger as is

log = WithContext(ctx, log) // creates a new logger with all fields (if any) from the context added 
log.Info("A log with fields from the context")

May 27 '23 09:05 ptxmac

Hi @ptxmac , Actually, it's a better way than storing the whole logger object ( struct) in the context, just if you could improve it to not be a separate function and be a method on your customer wrapper around zap. logger would be a better option.

May 27 '23 10:05 yuseferi

Finally, I had time to create a simple package for this.

github.com/yuseferi/zax https://pkg.go.dev/github.com/yuseferi/zax


func main() {
    logger, _ := zap.NewProduction()
    ctx := context.Background()
    s := NewServiceA(logger)
    ctx = zax.Set(ctx, logger, []zap.Field{zap.String("trace_id", "my-trace-id")})
    s.funcA(ctx)
}

type ServiceA struct {
logger *zap.Logger
}

func NewServiceA(logger *zap.Logger) *ServiceA {
    return &ServiceA{
        logger: logger,
    }
}

func (s *ServiceA) funcA(ctx context.Context) {
    s.logger.Info("func A") // it does not contain trace_id, you need to add it manually
    zax.Get(ctx).Info("func A") // it will logged with "trace_id" = "my-trace-id"
}

feel free to contribute to make it better folks 😄 ❤️

Jun 08 '23 15:06 yuseferi

Hey guys, I just release version 2 of zax. it has about 15-20% improvement in performance. https://github.com/yuseferi/zax.

Basically instead of storing the whole logger in the context it just stores fields which cause better performance .

example:

func main() {
    logger, _ := zap.NewProduction()
    ctx := context.Background()
    s := NewServiceA(logger)
    ctx = zax.Set(ctx, zap.String("trace_id", "my-trace-id"))  
    // and if you want to add multiple of them at once
    //ctx = zax.Set(ctx, []zap.Field{zap.String("trace_id", "my-trace-id"),zap.String("span_id", "my-span-id")})
    s.funcA(ctx)
}

type ServiceA struct {
logger *zap.Logger
}

func NewServiceA(logger *zap.Logger) *ServiceA {
    return &ServiceA{
        logger: logger,
    }
}


func (s *ServiceA) funcA(ctx context.Context) {
    s.logger.Info("func A") // it does not contain trace_id, you need to add it manually
	s.logger.With(zax.Get(ctx)...).Info("func A") // it will logged with "trace_id" = "my-trace-id"
}

I would like to hear your thoughts

Nov 09 '23 12:11 yuseferi

zap zap copied to clipboard

Support for logging contextual metadata

Requirement

What's preventing me from doing that in Zap

Proposal

zap
zap copied to clipboard