crowdsec icon indicating copy to clipboard operation
crowdsec copied to clipboard

Loki integration

Open athoune opened this issue 3 years ago • 10 comments
trafficstars

Grafana's Loki is prometheus for logs.

Loki provides a tail API for subscribing log events. This API is pure REST, and only depends on a websocket library.

  • [x] TestConfiguration
  • [x] TestConfigureDSN
  • [x] TestStreamingAcquisition
  • [x] TestOneShot
  • [x] Loki Auth
  • [ ] tail vs cat
  • [ ] Documentation

athoune avatar Jun 06 '22 14:06 athoune

coverage: 61.2%

athoune avatar Jun 07 '22 14:06 athoune

coverage: 69.0%

athoune avatar Jun 07 '22 17:06 athoune

any comments, @buixor ? the check list is ok? does it worth to continue?

athoune avatar Jun 10 '22 15:06 athoune

Hello @athoune,

First of all, thanks for the PR !

The check list is ok, we didn't have the opportunity to look at the code yet but we'll probably start to review next week (if you feel that the code is ready for a review).

blotus avatar Jun 10 '22 15:06 blotus

ok, the features are here, i'd like a review @blotus , if some parts need more comments or more tests. I tried to mimic style from the docker module, is it readable?

athoune avatar Jun 15 '22 16:06 athoune

coverage: 77.9%

athoune avatar Jun 15 '22 16:06 athoune

Hello @athoune,

I've just had a first look at the PR, and made some comments.

I think the most critical points to address are:

  • Live acquisition does not monitor the acquisition tomb, and thus will never exit on stop or reload
  • StreamingAcquisition is a bit too eager to return when it encounters are error, we should probably just keep retrying as errors coming from loki are most likely transient.

It would also be nice to more Debug and Trace logging to really understand what is going on during the acquisition if we need to debug something.

blotus avatar Jun 16 '22 15:06 blotus

Hello @athoune,

After a more in-depth review, I was not very happy with the way the loki queries were made directly from the datasource.

I've created a loki_client package to handle the queries, it allows for cleaner code inside the datasource itself (and it eases quite a bit the handling of the tomb).

I wanted to open a PR on your PR but I pushed directly to your branch :(.

Let me know what you think about the changes.

The ConfigureByDSN function probably needs to be cleaned up a bit.

blotus avatar Jul 28 '22 10:07 blotus

Codecov Report

Merging #1574 (ee31875) into master (799cc82) will decrease coverage by 2.34%. The diff coverage is 43.16%.

@@            Coverage Diff             @@
##           master    #1574      +/-   ##
==========================================
- Coverage   52.42%   50.07%   -2.35%     
==========================================
  Files         131      119      -12     
  Lines       17496    17213     -283     
==========================================
- Hits         9172     8620     -552     
- Misses       7306     7588     +282     
+ Partials     1018     1005      -13     
Flag Coverage Δ
func-crowdsec 45.84% <ø> (+30.52%) :arrow_up:
func-cscli 44.55% <ø> (-0.17%) :arrow_down:
unit-linux ?
unit-windows 53.18% <43.16%> (+0.55%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/acquisition/acquisition.go 75.75% <0.00%> (-0.47%) :arrow_down:
pkg/acquisition/modules/loki/entry.go 0.00% <0.00%> (ø)
pkg/acquisition/modules/loki/loki.go 42.85% <42.85%> (ø)
pkg/acquisition/modules/loki/timestamp.go 82.35% <82.35%> (ø)
pkg/apiclient/metrics.go 0.00% <0.00%> (-76.93%) :arrow_down:
pkg/acquisition/modules/journalctl/journalctl.go 0.00% <0.00%> (-71.74%) :arrow_down:
pkg/apiclient/signal.go 0.00% <0.00%> (-50.00%) :arrow_down:
pkg/acquisition/modules/kinesis/kinesis.go 0.00% <0.00%> (-34.24%) :arrow_down:
pkg/types/utils.go 0.00% <0.00%> (-32.65%) :arrow_down:
pkg/csplugin/watcher.go 69.30% <0.00%> (-25.75%) :arrow_down:
... and 70 more

Help us with your feedback. Take ten seconds to tell us how you rate us.

codecov-commenter avatar Jul 28 '22 10:07 codecov-commenter

Moving to 1.4.3 milestone

buixor avatar Sep 14 '22 12:09 buixor

I finally had time to deploy crowdsec with this pull request.

Globally it works pretty well, thanks 😀

But the first thing I noticed is that crowdsec refuses to start if loki is unavailable. Also, I had a network problem overnight, crowdsec crashed because loki was unavailable and couldn't restart because of that.

Adphi avatar Oct 11 '22 13:10 Adphi

Nov 08 12:33:45 crowdsec[31337]: panic: runtime error: invalid memory address or nil pointer dereference
Nov 08 12:33:45 crowdsec[31337]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x13383fa]
Nov 08 12:33:45 crowdsec[31337]: goroutine 168 [running]:
Nov 08 12:33:45 crowdsec[31337]: github.com/crowdsecurity/crowdsec/pkg/acquisition/modules/loki.(*LokiSource).StreamingAcquisition.func1()
Nov 08 12:33:45 crowdsec[31337]:         /work/pkg/acquisition/modules/loki/loki.go:285 +0x21a
Nov 08 12:33:45 crowdsec[31337]: gopkg.in/tomb%2ev2.(*Tomb).run(0x2653120, 0xc0005501e0?)
Nov 08 12:33:45 crowdsec[31337]:         /go/pkg/mod/gopkg.in/[email protected]/tomb.go:163 +0x36
Nov 08 12:33:45 crowdsec[31337]: created by gopkg.in/tomb%2ev2.(*Tomb).Go
Nov 08 12:33:45 crowdsec[31337]:         /go/pkg/mod/gopkg.in/[email protected]/tomb.go:159 +0xee

Adphi avatar Nov 08 '22 13:11 Adphi

I did a Pull Request on the pull request here to fix the nil pointer and make the loki connection more resilient (and rebased on master)

Adphi avatar Nov 28 '22 12:11 Adphi

Hello, I'm interested in this feature. Do you have any news about this feature?

lperdereau avatar May 31 '23 14:05 lperdereau

@blotus have you had any news about this feature? Is there a problem to complete this PR?

lperdereau avatar Jun 15 '23 07:06 lperdereau

Closed in favor of https://github.com/crowdsecurity/crowdsec/pull/2306

buixor avatar Oct 03 '23 10:10 buixor