loggie icon indicating copy to clipboard operation
loggie copied to clipboard

Concurrent map read and map write in file watcher

Open uzuku opened this issue 2 years ago • 3 comments

What version of Loggie?

main-fd73e56d

Expected Behavior

Running without fatal error.

Actual Behavior

Fatal error occurred while running:

fatal error: concurrent map read and map write

goroutine 6009094535 [running]:
github.com/loggie-io/loggie/pkg/source/file.(*Watcher).reportWatchMetric(0xc00141a180, 0xc0010f90e0, {0xc003f86570?, 0x1, 0x1}, {0xc000475c68, 0x7}, {0xc000475d48, 0x7})
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/source/file/watch.go:948 +0x774
github.com/loggie-io/loggie/pkg/source/file.ExportWatchMetric()
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/source/file/watch.go:985 +0x193
github.com/loggie-io/loggie/pkg/ops/helper.fileInfoMetrics(0xc00d8d6b00)
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/ops/helper/help.go:327 +0xa5
github.com/loggie-io/loggie/pkg/ops/helper.(*Helper).helperLogCollectionHandler(0xc000c2a0d8, {0x3118fa0, 0xc01071c620}, 0xc00d8d6b00)
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/ops/helper/logcollection.go:107 +0x14a
net/http.HandlerFunc.ServeHTTP(0xc010713600?, {0x3118fa0?, 0xc01071c620?}, 0x40ddea?)
        /root/local/go/net/http/server.go:2122 +0x2f
net/http.(*ServeMux).ServeHTTP(0xc0106d7a55?, {0x3118fa0, 0xc01071c620}, 0xc00d8d6b00)
        /root/local/go/net/http/server.go:2500 +0x149
net/http.serverHandler.ServeHTTP({0xc00a6b9bf0?}, {0x3118fa0, 0xc01071c620}, 0xc00d8d6b00)
        /root/local/go/net/http/server.go:2936 +0x316
net/http.(*conn).serve(0xc0106dd7a0, {0x3119e00, 0xc0009f2810})
        /root/local/go/net/http/server.go:1995 +0x612
created by net/http.(*Server).Serve
        /root/local/go/net/http/server.go:3089 +0x5ed

...

goroutine 223 [runnable]:
github.com/rs/zerolog.glob..func3()
        /root/Code/go/src/github.com/loggie-io/loggie/vendor/github.com/rs/zerolog/event.go:14 +0x38
sync.(*Pool).Get(0x4976060)
        /root/local/go/sync/pool.go:151 +0xb9
github.com/rs/zerolog.newEvent({0x3107a88?, 0xc00059ebe8}, 0x1)
        /root/Code/go/src/github.com/loggie-io/loggie/vendor/github.com/rs/zerolog/event.go:58 +0x3a
github.com/rs/zerolog.(*Logger).newEvent(0xc000615b00, 0x1, 0x0)
        /root/Code/go/src/github.com/loggie-io/loggie/vendor/github.com/rs/zerolog/log.go:419 +0x69
github.com/rs/zerolog.(*Logger).Info(...)
        /root/Code/go/src/github.com/loggie-io/loggie/vendor/github.com/rs/zerolog/log.go:303
github.com/loggie-io/loggie/pkg/core/log.(*Logger).Info(0xc01f0484b0?, {0x2cd5757, 0x1e}, {0xc005ccf958?, 0x3, 0x3})
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/core/log/global.go:142 +0x4a
github.com/loggie-io/loggie/pkg/core/log.Info(...)
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/core/log/global.go:218
github.com/loggie-io/loggie/pkg/source/file.(*Watcher).eventBus(0xc00141a180, {0xc003200690?, 0xc01f040c80?, {0x0?, 0xc01f889790?}})
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/source/file/watch.go:327 +0x8c5
github.com/loggie-io/loggie/pkg/source/file.(*Watcher).createOrRename(0xc00141a180, {0xc01f0484b0?, 0x1d?}, 0xc005ccfcd0?, 0x0)
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/source/file/watch.go:447 +0x234
github.com/loggie-io/loggie/pkg/source/file.(*Watcher).scanPaths(0x70000c000796660?, {0xc00108a020, 0xa}, {0xc00108a108, 0x3}, {0xc0011a41f0?, 0x1, 0xc003200630?}, 0xc0007965b0?, 0x0)
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/source/file/watch.go:429 +0x2f5
github.com/loggie-io/loggie/pkg/source/file.(*Watcher).scanTaskNewFiles(0xc005ccfdf8?, 0xc005ccfef0?)
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/source/file/watch.go:373 +0x92
github.com/loggie-io/loggie/pkg/source/file.(*Watcher).scanNewFiles(...)
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/source/file/watch.go:360
github.com/loggie-io/loggie/pkg/source/file.(*Watcher).scan(0xc00141a180)
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/source/file/watch.go:529 +0xa5
github.com/loggie-io/loggie/pkg/source/file.(*Watcher).run(0xc00141a180)
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/source/file/watch.go:722 +0x2be
created by github.com/loggie-io/loggie/pkg/source/file.newWatcher
        /root/Code/go/src/github.com/loggie-io/loggie/pkg/source/file/watch.go:83 +0x2c5

Steps to Reproduce the Problem

  1. Set up a loggie collecting log files from a local path, like /logs/*.log
  2. Set up a prometheus to scrape metrics from loggie.
  3. Randomly create log files under the path so that loggie starts to collect new files, and the fatal error may happen.

I believe that the concurrent map reading and writing happens when ExportWatchMetric is called while a file CREATE event is being handled in file watcher event bus. Both of them access the map Watcher.allJobs.

uzuku avatar Aug 01 '23 06:08 uzuku

Do you calling the /api/v1/help/log API?

github.com/loggie-io/loggie/pkg/ops/helper.(*Helper).helperLogCollectionHandler(0xc000c2a0d8, {0x3118fa0, 0xc01071c620}, 0xc00d8d6b00)

Have you ever calling the /api/v1/help/log API?

ethfoo avatar Aug 01 '23 11:08 ethfoo

Have you ever calling the /api/v1/help/log API?

Yes, I can find the API invoking record just at the time the fatal error occurred.

So the func ExportWatchMetric is called when the /api/v1/help/log API is invoked ?

uzuku avatar Aug 01 '23 11:08 uzuku

So the func ExportWatchMetric is called when the /api/v1/help/log API is invoked ?

This should be the cause of the exception.

ethfoo avatar Aug 01 '23 11:08 ethfoo