Orion icon indicating copy to clipboard operation
Orion copied to clipboard

Fix Potential Race on Logger

Open stevenCarousell opened this issue 5 years ago • 6 comments

Issue Description When multiple parallel routine try to log like log.GetLogger().Log(...), there is a potential race condition in logger.GetLogger.

==================
WARNING: DATA RACE
Write at 0x0000014994b0 by goroutine 9:
  github.com/carousell/Orion/utils/log.GetLogger.func1()
      /.../Orion/utils/log/log.go:65 +0xaa
  sync.(*Once).Do()
      /usr/local/go/src/sync/once.go:44 +0xde
  github.com/carousell/Orion/utils/log.GetLogger()
      /.../Orion/utils/log/log.go:64 +0xaf
  github.com/carousell/Orion.main.func1()
      /.../Orion/racecheck.go:22 +0x187

Previous read at 0x0000014994b0 by goroutine 8:
  github.com/carousell/Orion/utils/log.GetLogger()
      /.../Orion/utils/log/log.go:63 +0x3e
  github.com/carousell/Orion.main.func1()
      /.../Orion/racecheck.go:22 +0x187

Goroutine 9 (running) created at:
  github.com/carousell/Orion.main()
      /.../Orion/racecheck.go:18 +0xb5
  github.com/carousell/Orion.Test_main.func1()
      /.../Orion/racecheck_test.go:18 +0x2f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:865 +0x163

Goroutine 8 (running) created at:
  github.com/carousell/Orion.main()
      /.../Orion/racecheck.go:18 +0xb5
  github.com/carousell/Orion.Test_main.func1()
      /.../Orion/racecheck_test.go:18 +0x2f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:865 +0x163
==================

Test Code

go test -v -race -run main

racecheck_test.go a empty test just for go test race check

package main

import "testing"

func Test_main(t *testing.T) {

        // TODO: Init instances and mocks

        testCases := []struct {
                name string
        }{
                {
                        name: "hi",
                },
        }
        for _, tt := range testCases {
                t.Run(tt.name, func(t *testing.T) {
                        main()
                })
        }
}

racecheck.go simulate parallel GetLogger() case

package main

import (
        "context"
        "fmt"
        "sync"

        "github.com/carousell/Orion/utils/errors"
        "github.com/carousell/Orion/utils/log"
        "github.com/carousell/Orion/utils/log/loggers"
)

func main() {
        var wg sync.WaitGroup

        for i := 0; i < 2; i++ {
                wg.Add(1)
                go func(wg *sync.WaitGroup, routineIndex int) {
                        defer wg.Done()
                        errWithStack := errors.WrapWithSkip(fmt.Errorf("error %d", routineIndex), "", 1)
                        log.GetLogger().Log(context.Background(), loggers.ErrorLevel, 1, "err", errWithStack, "stack", errWithStack.StackFrame())
                }(&wg, i)
        }
        wg.Wait()
}

stevenCarousell avatar Dec 20 '19 09:12 stevenCarousell

I think its better if we move if defaultLogger == nil inside once.Do What do you think ?

ankurs avatar Dec 20 '19 09:12 ankurs

I think its better if we move if defaultLogger == nil inside once.Do What do you think ?

Not sure what is the concern to not assign default logger when init(). I think if we move if defaultLogger == nil inside once.Do, logically it works but we may need to check atomic.LoadUint32 every time we call GetLogger() in once.Do method.

func (o *Once) Do(f func()) {
	if atomic.LoadUint32(&o.done) == 1 {
		return
	}
	// Slow-path.
	o.m.Lock()
	defer o.m.Unlock()
	if o.done == 0 {
		defer atomic.StoreUint32(&o.done, 1)
		f()
	}
}

stevenCarousell avatar Dec 20 '19 10:12 stevenCarousell

My concern is that init() is always called, so if we import a package that has an init function, it will be executed. Since Logger is used by a number of small libs in Orion this init gets called in all the libs and ends up initialising logger (and other dependencies) everywhere these lib are imported. atomic.Load* operations are normally implemented as a hardware instruction so the over head should not be much

ankurs avatar Dec 23 '19 08:12 ankurs

Also the proper use of Logger was intended in the way where you call GetLogger() and then use that object everywhere in that scope

ankurs avatar Dec 23 '19 08:12 ankurs

My concern is that init() is always called, so if we import a package that has an init function, it will be executed. Since Logger is used by a number of small libs in Orion this init gets called in all the libs and ends up initialising logger (and other dependencies) everywhere these lib are imported. atomic.Load* operations are normally implemented as a hardware instruction so the over head should not be much

according to some descriptions on package initialization

  • https://golang.org/ref/spec#Package_initialization
  • https://medium.com/golangspec/init-functions-in-go-eac191b3860a

I think golang will only run init() once on package level sequentially before execute main(). So even there are lots of other libs import the log package, it will only initialize once and there will only one defaultLogger globally.

some golang codes use init() to initialize package level vars also: https://github.com/golang/go/blob/master/src/compress/bzip2/bzip2.go#L479 https://github.com/golang/go/blob/master/src/net/http/h2_bundle.go#L3114

stevenCarousell avatar Dec 23 '19 10:12 stevenCarousell

Also the proper use of Logger was intended in the way where you call GetLogger() and then use that object everywhere in that scope

Fixed as comment.

stevenCarousell avatar Jan 06 '20 09:01 stevenCarousell

Codecov Report

Merging #125 (214b2eb) into master (d30c4e5) will not change coverage. The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #125   +/-   ##
=======================================
  Coverage   58.49%   58.49%           
=======================================
  Files          13       13           
  Lines         559      559           
=======================================
  Hits          327      327           
  Misses        225      225           
  Partials        7        7           

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov[bot] avatar Aug 16 '22 04:08 codecov[bot]