sdk icon indicating copy to clipboard operation
sdk copied to clipboard

NSM logging system is complicated and buggy

Open denis-tingaikin opened this issue 2 years ago • 13 comments

Description

During fixing issue reported by @richardstone23 I found a few issues in our current NSM logging system

  1. We are using TracingEnabled(true) in each cmd application, when we can check is the tracing enabe by the grpc options.
  2. Even if loglevel != TRACE && TracingEnabled(false) then trace chain element calculates diff and uses CPU for nothing.
  3. TracingEnabled(false)/TracingEnabled(true) is complicated and confuses.
  4. Don’t depend on logrus
  5. https://github.com/networkservicemesh/sdk/issues/1364

Solution

Rework and simplify the logging system.

denis-tingaikin avatar Apr 27 '22 22:04 denis-tingaikin

/cc @edwarnicke , @glazychev-art

denis-tingaikin avatar Apr 27 '22 22:04 denis-tingaikin

And remove any hard-coded reference to logrus in the interface.

It should be possible to implement a custom logger from the Logger interface that isn't logrus.

Now that is possible (it seems) except that you must set the logrus.TraceLevel even if logrus isn't used. It is awkward but no big deal really. It seems that the proposed "simplifications" makes even more hard-coded references to logrus.

Logrus is in maintenace mode for the reason below. I want to use Zap as logger.

I believe Logrus' biggest contribution is to have played a part in today's widespread use of structured logging in Golang. There doesn't seem to be a reason to do a major, breaking iteration into Logrus V2, since the fantastic Go community has built those independently. Many fantastic alternatives have sprung up. Logrus would look like those, had it been re-designed with what we know about structured logging in Go today. Check out, for example, Zerolog, Zap, and Apex.

uablrek avatar Sep 03 '22 08:09 uablrek

Now that is possible (it seems) ...

Nope. NSM doesn't care about the logger set in the context with

nsmlog.WithLog(ctx, log.NSMLogger(logger)) // allow NSM logs

uablrek avatar Sep 03 '22 17:09 uablrek

I guess the problem above is addressed by the first items in the proposed solutio;

1.Use logger from the chain context by trace server chain elements. 2.Remove Default logger

uablrek avatar Sep 05 '22 08:09 uablrek

IMO a good test (necessary actually) to verify that an abstract log interface really is usable for may implementations is to try at least two, e.g. logrus and Zap. I include my Zap based logger below. We use go-logr/logr with Zap as backend.

An NSM Logger with Zap
import (
	"github.com/go-logr/zapr"
	"go.uber.org/zap"
	nsmlog "github.com/networkservicemesh/sdk/pkg/tools/log"
)

// NSMLogger return a logger to use for NSM logging.
func NSMLogger(baseLogger logr.Logger) nsmlog.Logger {
	// Get the zap logger
	if underlier, ok := baseLogger.GetSink().(zapr.Underlier); ok {
		return &nsmLogger{ z: underlier.GetUnderlying() }
	} else {
		panic("NSMLogger: Can't get the Zap logger")
	}
}
type nsmLogger struct {
	z *zap.Logger
}
func (l *nsmLogger) Info(v ...interface{}) {
	l.z.Sugar().Info(v...)
}
func (l *nsmLogger) Infof(format string, v ...interface{}) {
	l.z.Sugar().Infof(format, v...)
}
func (l *nsmLogger) Warn(v ...interface{}) {
	l.z.Sugar().Info(v...)
}
func (l *nsmLogger) Warnf(format string, v ...interface{}) {
	l.z.Sugar().Infof(format, v...)
}
func (l *nsmLogger) Error(v ...interface{}) {
	l.z.Sugar().Error(v...)
}
func (l *nsmLogger) Errorf(format string, v ...interface{}) {
	l.z.Sugar().Errorf(format, v...)
}
func (l *nsmLogger) Fatal(v ...interface{}) {
	l.z.Sugar().Fatal(v...)
}
func (l *nsmLogger) Fatalf(format string, v ...interface{}) {
	l.z.Sugar().Fatalf(format, v...)
}
func (l *nsmLogger) Debug(v ...interface{}) {
	l.z.Sugar().Debug(v...)
}
func (l *nsmLogger) Debugf(format string, v ...interface{}) {
	l.z.Sugar().Debugf(format, v...)
}
func (l *nsmLogger) Trace(v ...interface{}) {
	if l.z.Core().Enabled(-2) {
		l.z.Sugar().Debug(v...)
	}
}
func (l *nsmLogger) Tracef(format string, v ...interface{}) {
	if l.z.Core().Enabled(-2) {
		l.z.Sugar().Debugf(format, v...)
	}
}
func (l *nsmLogger) Object(k, v interface{}) {
	l.z.Sugar().Info(k, v)
}
func (l *nsmLogger) WithField(key, value interface{}) nsmlog.Logger {
	return &nsmLogger{
		z: l.z.With(zap.Any(fmt.Sprintf("%v", key), value)),
	}
}

uablrek avatar Sep 05 '22 08:09 uablrek

Above used as;

	if config.LogLevel == "TRACE" {
		nsmlog.EnableTracing(true)
	        // Work-around for hard-coded logrus dependency in NSM
		logrus.SetLevel(logrus.TraceLevel)
	}
	logger.Info("NSM trace", "enabled", nsmlog.IsTracingEnabled())
	ctx = nsmlog.WithLog(ctx, log.NSMLogger(logger)) // allow NSM logs

uablrek avatar Sep 05 '22 08:09 uablrek

Keep IsTracingEnabled() but move it to the Logger interface. The implementation (whatever logger implementation used) will know this.

uablrek avatar Sep 05 '22 14:09 uablrek

Hello @uablrek!

Good to know that you started look into that. We're super welcome for refactorings/ideas on improvements. I also really like idea to support zap logging for NSM.

Let us know if you want to work with this issue. Otherwise we may schedule this improvement for v1.7.0.

@edwarnicke Thoughts?

denis-tingaikin avatar Sep 07 '22 19:09 denis-tingaikin

Some things that should be done but breaks the API;

Move IsTracingEnabled() to the Logger interface

This is a per-logger thing and moving it removes the need for weird work-arounds with logrus.GetLevel() == logrus.TraceLevel

This would also make EnableTracing(enable bool) obsolete.

Use a single logger as parameter to WithLog()

Current definition;

func WithLog(ctx context.Context, log ...Logger) context.Context

This enforces an extra grouplogger layer every time a context is used and it is un-intuitive and un-necessary.

A user can easily use Combine herself if needed.

  ctx = log.WithLog(ctx, log.Combine(A, B))

The Join() function should be updated accordingly, or ...

Consider ditching the Join() function

This seem to be a rarely used "nice-to-have" function. APIs should be kept a simple as possible to make test and maintenance easier.

uablrek avatar Sep 09 '22 07:09 uablrek

The logging system is going to be more complicated.

I'd suggest drawing a diagram of classes and a diagram of use cases to check the current state of it. After that, we could consider rewriting/simplifying logging to solve all described problems.

denis-tingaikin avatar Oct 25 '22 14:10 denis-tingaikin

I'd suggest drawing a diagram of classes and a diagram of use cases to check the current state of it. After that, we could consider rewriting/simplifying logging to solve all described problems.

Sounds good. When will it be ready?

uablrek avatar Oct 25 '22 16:10 uablrek

Hello, We are trying to test setting different timezones for NSM and the current way of printing timestamps in the logs is confusing. It would be really great if the log messages would be well segmented, so that it could be paresed by other services (i.e. FluentBit) easily.

I'm thinking about something like this:

Current log message: Nov 2 20:44:59.677[36m [INFO] [cmd:/bin/forwarder] [0mexecuting phase 2: run vpp and get a connection to it (time since start: 2.053701ms)

Future log message: {"timestamp": "2022-11-02T20:44:59.677Z", "severity": "info", "binary": "/bin/forwarder", "phase": "2", "message": "run vpp and get a connection to it (time since start: 2.053701ms)"}

richardstone avatar Nov 03 '22 08:11 richardstone

The PR https://github.com/networkservicemesh/sdk/pull/1348 adds a settable NSM-logger. You can configure it with a structured logging format of your preference. Initiate your code with something like;

	if config.LogLevel == "TRACE" {
		nsmlog.EnableTracing(true)
		// Work-around for hard-coded logrus dependency in NSM
		logrus.SetLevel(logrus.TraceLevel)
	}
	logger.Info("NSM trace", "enabled", nsmlog.IsTracingEnabled())
	nsmlogger := log.NSMLogger(logger)
	nsmlog.SetGlobalLogger(nsmlogger)
	ctx = nsmlog.WithLog(ctx, nsmlogger)

The logger should be set as the global logger and added to the context passed to NSM.

We use the zap logger, but you can probably adapt another structured logger easily. Below is our code for an NSMLogger (narrowed) as an example.

The NSM programs themselves can't be configured to use a structured logger yet. We needed to set it in our NSE/NSC code for starters.

NSM-logger example
import "go.uber.org/zap"

// NSM logger;

type nsmLogger struct {
	z *zap.Logger
	s *zap.SugaredLogger
}

func (l *nsmLogger) Info(v ...interface{}) {
	l.s.Info(v...)
}
func (l *nsmLogger) Infof(format string, v ...interface{}) {
	l.s.Infof(format, v...)
}
func (l *nsmLogger) Warn(v ...interface{}) {
	l.s.Info(v...)
}
func (l *nsmLogger) Warnf(format string, v ...interface{}) {
	l.s.Infof(format, v...)
}
func (l *nsmLogger) Error(v ...interface{}) {
	l.s.Error(v...)
}
func (l *nsmLogger) Errorf(format string, v ...interface{}) {
	l.s.Errorf(format, v...)
}
func (l *nsmLogger) Fatal(v ...interface{}) {
	l.s.Fatal(v...)
}
func (l *nsmLogger) Fatalf(format string, v ...interface{}) {
	l.s.Fatalf(format, v...)
}
func (l *nsmLogger) Debug(v ...interface{}) {
	l.s.Debug(v...)
}
func (l *nsmLogger) Debugf(format string, v ...interface{}) {
	l.s.Debugf(format, v...)
}
func (l *nsmLogger) Trace(v ...interface{}) {
	if l.z.Core().Enabled(-2) {
		l.s.Debug(v...)
	}
}
func (l *nsmLogger) Tracef(format string, v ...interface{}) {
	if l.z.Core().Enabled(-2) {
		l.s.Debugf(format, v...)
	}
}
func (l *nsmLogger) Object(k, v interface{}) {
	l.z.Info("Object", zap.Any(fmt.Sprintf("%v", k), v))
}
func (l *nsmLogger) WithField(key, value interface{}) nsmlog.Logger {
	z := l.z.With(zap.Any(fmt.Sprintf("%v", key), value))
	return &nsmLogger{
		z: z,
		s: z.Sugar(),
	}
}

uablrek avatar Nov 03 '22 09:11 uablrek