sdk
sdk copied to clipboard
NSM logging system is complicated and buggy
Description
During fixing issue reported by @richardstone23 I found a few issues in our current NSM logging system
- We are using TracingEnabled(true) in each cmd application, when we can check is the tracing enabe by the grpc options.
- Even if loglevel != TRACE && TracingEnabled(false) then trace chain element calculates diff and uses CPU for nothing.
- TracingEnabled(false)/TracingEnabled(true) is complicated and confuses.
- Don’t depend on logrus
- https://github.com/networkservicemesh/sdk/issues/1364
Solution
Rework and simplify the logging system.
/cc @edwarnicke , @glazychev-art
And remove any hard-coded reference to logrus in the interface.
It should be possible to implement a custom logger from the Logger interface that isn't logrus.
Now that is possible (it seems) except that you must set the logrus.TraceLevel
even if logrus isn't used. It is awkward but no big deal really. It seems that the proposed "simplifications" makes even more hard-coded references to logrus.
Logrus is in maintenace mode for the reason below. I want to use Zap as logger.
I believe Logrus' biggest contribution is to have played a part in today's widespread use of structured logging in Golang. There doesn't seem to be a reason to do a major, breaking iteration into Logrus V2, since the fantastic Go community has built those independently. Many fantastic alternatives have sprung up. Logrus would look like those, had it been re-designed with what we know about structured logging in Go today. Check out, for example, Zerolog, Zap, and Apex.
Now that is possible (it seems) ...
Nope. NSM doesn't care about the logger set in the context with
nsmlog.WithLog(ctx, log.NSMLogger(logger)) // allow NSM logs
I guess the problem above is addressed by the first items in the proposed solutio;
1.Use logger from the chain context by trace server chain elements. 2.Remove Default logger
IMO a good test (necessary actually) to verify that an abstract log interface really is usable for may implementations is to try at least two, e.g. logrus and Zap. I include my Zap based logger below. We use go-logr/logr with Zap as backend.
An NSM Logger with Zap
import (
"github.com/go-logr/zapr"
"go.uber.org/zap"
nsmlog "github.com/networkservicemesh/sdk/pkg/tools/log"
)
// NSMLogger return a logger to use for NSM logging.
func NSMLogger(baseLogger logr.Logger) nsmlog.Logger {
// Get the zap logger
if underlier, ok := baseLogger.GetSink().(zapr.Underlier); ok {
return &nsmLogger{ z: underlier.GetUnderlying() }
} else {
panic("NSMLogger: Can't get the Zap logger")
}
}
type nsmLogger struct {
z *zap.Logger
}
func (l *nsmLogger) Info(v ...interface{}) {
l.z.Sugar().Info(v...)
}
func (l *nsmLogger) Infof(format string, v ...interface{}) {
l.z.Sugar().Infof(format, v...)
}
func (l *nsmLogger) Warn(v ...interface{}) {
l.z.Sugar().Info(v...)
}
func (l *nsmLogger) Warnf(format string, v ...interface{}) {
l.z.Sugar().Infof(format, v...)
}
func (l *nsmLogger) Error(v ...interface{}) {
l.z.Sugar().Error(v...)
}
func (l *nsmLogger) Errorf(format string, v ...interface{}) {
l.z.Sugar().Errorf(format, v...)
}
func (l *nsmLogger) Fatal(v ...interface{}) {
l.z.Sugar().Fatal(v...)
}
func (l *nsmLogger) Fatalf(format string, v ...interface{}) {
l.z.Sugar().Fatalf(format, v...)
}
func (l *nsmLogger) Debug(v ...interface{}) {
l.z.Sugar().Debug(v...)
}
func (l *nsmLogger) Debugf(format string, v ...interface{}) {
l.z.Sugar().Debugf(format, v...)
}
func (l *nsmLogger) Trace(v ...interface{}) {
if l.z.Core().Enabled(-2) {
l.z.Sugar().Debug(v...)
}
}
func (l *nsmLogger) Tracef(format string, v ...interface{}) {
if l.z.Core().Enabled(-2) {
l.z.Sugar().Debugf(format, v...)
}
}
func (l *nsmLogger) Object(k, v interface{}) {
l.z.Sugar().Info(k, v)
}
func (l *nsmLogger) WithField(key, value interface{}) nsmlog.Logger {
return &nsmLogger{
z: l.z.With(zap.Any(fmt.Sprintf("%v", key), value)),
}
}
Above used as;
if config.LogLevel == "TRACE" {
nsmlog.EnableTracing(true)
// Work-around for hard-coded logrus dependency in NSM
logrus.SetLevel(logrus.TraceLevel)
}
logger.Info("NSM trace", "enabled", nsmlog.IsTracingEnabled())
ctx = nsmlog.WithLog(ctx, log.NSMLogger(logger)) // allow NSM logs
Keep IsTracingEnabled()
but move it to the Logger interface. The implementation (whatever logger implementation used) will know this.
Hello @uablrek!
Good to know that you started look into that. We're super welcome for refactorings/ideas on improvements. I also really like idea to support zap logging for NSM.
Let us know if you want to work with this issue. Otherwise we may schedule this improvement for v1.7.0.
@edwarnicke Thoughts?
Some things that should be done but breaks the API;
Move IsTracingEnabled() to the Logger interface
This is a per-logger thing and moving it removes the need for weird work-arounds with logrus.GetLevel() == logrus.TraceLevel
This would also make EnableTracing(enable bool)
obsolete.
Use a single logger as parameter to WithLog()
Current definition;
func WithLog(ctx context.Context, log ...Logger) context.Context
This enforces an extra grouplogger
layer every time a context is used and it is un-intuitive and un-necessary.
A user can easily use Combine
herself if needed.
ctx = log.WithLog(ctx, log.Combine(A, B))
The Join()
function should be updated accordingly, or ...
Consider ditching the Join() function
This seem to be a rarely used "nice-to-have" function. APIs should be kept a simple as possible to make test and maintenance easier.
The logging system is going to be more complicated.
I'd suggest drawing a diagram of classes and a diagram of use cases to check the current state of it. After that, we could consider rewriting/simplifying logging to solve all described problems.
I'd suggest drawing a diagram of classes and a diagram of use cases to check the current state of it. After that, we could consider rewriting/simplifying logging to solve all described problems.
Sounds good. When will it be ready?
Hello, We are trying to test setting different timezones for NSM and the current way of printing timestamps in the logs is confusing. It would be really great if the log messages would be well segmented, so that it could be paresed by other services (i.e. FluentBit) easily.
I'm thinking about something like this:
Current log message:
Nov 2 20:44:59.677[36m [INFO] [cmd:/bin/forwarder] [0mexecuting phase 2: run vpp and get a connection to it (time since start: 2.053701ms)
Future log message:
{"timestamp": "2022-11-02T20:44:59.677Z", "severity": "info", "binary": "/bin/forwarder", "phase": "2", "message": "run vpp and get a connection to it (time since start: 2.053701ms)"}
The PR https://github.com/networkservicemesh/sdk/pull/1348 adds a settable NSM-logger. You can configure it with a structured logging format of your preference. Initiate your code with something like;
if config.LogLevel == "TRACE" {
nsmlog.EnableTracing(true)
// Work-around for hard-coded logrus dependency in NSM
logrus.SetLevel(logrus.TraceLevel)
}
logger.Info("NSM trace", "enabled", nsmlog.IsTracingEnabled())
nsmlogger := log.NSMLogger(logger)
nsmlog.SetGlobalLogger(nsmlogger)
ctx = nsmlog.WithLog(ctx, nsmlogger)
The logger should be set as the global logger and added to the context passed to NSM.
We use the zap logger, but you can probably adapt another structured logger easily. Below is our code for an NSMLogger (narrowed) as an example.
The NSM programs themselves can't be configured to use a structured logger yet. We needed to set it in our NSE/NSC code for starters.
NSM-logger example
import "go.uber.org/zap"
// NSM logger;
type nsmLogger struct {
z *zap.Logger
s *zap.SugaredLogger
}
func (l *nsmLogger) Info(v ...interface{}) {
l.s.Info(v...)
}
func (l *nsmLogger) Infof(format string, v ...interface{}) {
l.s.Infof(format, v...)
}
func (l *nsmLogger) Warn(v ...interface{}) {
l.s.Info(v...)
}
func (l *nsmLogger) Warnf(format string, v ...interface{}) {
l.s.Infof(format, v...)
}
func (l *nsmLogger) Error(v ...interface{}) {
l.s.Error(v...)
}
func (l *nsmLogger) Errorf(format string, v ...interface{}) {
l.s.Errorf(format, v...)
}
func (l *nsmLogger) Fatal(v ...interface{}) {
l.s.Fatal(v...)
}
func (l *nsmLogger) Fatalf(format string, v ...interface{}) {
l.s.Fatalf(format, v...)
}
func (l *nsmLogger) Debug(v ...interface{}) {
l.s.Debug(v...)
}
func (l *nsmLogger) Debugf(format string, v ...interface{}) {
l.s.Debugf(format, v...)
}
func (l *nsmLogger) Trace(v ...interface{}) {
if l.z.Core().Enabled(-2) {
l.s.Debug(v...)
}
}
func (l *nsmLogger) Tracef(format string, v ...interface{}) {
if l.z.Core().Enabled(-2) {
l.s.Debugf(format, v...)
}
}
func (l *nsmLogger) Object(k, v interface{}) {
l.z.Info("Object", zap.Any(fmt.Sprintf("%v", k), v))
}
func (l *nsmLogger) WithField(key, value interface{}) nsmlog.Logger {
z := l.z.With(zap.Any(fmt.Sprintf("%v", key), value))
return &nsmLogger{
z: z,
s: z.Sugar(),
}
}