
New source: "loki"

Open afoninsky opened this issue 3 years ago • 11 comments

Vector currently has a "loki" sink, which is responsible for delivering logs to Loki. Loki can also serve live logs, so it should be possible to create a source on top of it: https://grafana.com/docs/loki/latest/api/ -> /loki/api/v1/tail. In theory, this would allow executing queries with live tailing and building much more flexible processing pipelines.
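
For illustration only, here is a minimal Python sketch of how a client could build the WebSocket URL for the tail endpoint mentioned above. The helper name `build_tail_url` and the default values are hypothetical; `query`, `limit` and `delay_for` are the query parameters listed in the Loki HTTP API docs for `/loki/api/v1/tail`. This is not how Vector does (or would) implement it.

```python
from urllib.parse import urlencode


def build_tail_url(base_url: str, query: str, limit: int = 100, delay_for: int = 0) -> str:
    """Build the WebSocket URL for Loki's live-tail endpoint (hypothetical helper)."""
    # The tail endpoint is a WebSocket endpoint, so swap the HTTP scheme for its WS twin.
    if base_url.startswith("https://"):
        ws_base = "wss://" + base_url[len("https://"):]
    elif base_url.startswith("http://"):
        ws_base = "ws://" + base_url[len("http://"):]
    else:
        ws_base = base_url  # assume the caller already passed a ws:// or wss:// URL

    # urlencode percent-escapes the LogQL selector (braces, quotes, '=').
    params = urlencode({"query": query, "limit": limit, "delay_for": delay_for})
    return f"{ws_base.rstrip('/')}/loki/api/v1/tail?{params}"
```

A call such as `build_tail_url("http://localhost:3100", '{job="varlogs"}')` yields a `ws://` URL that a WebSocket client could connect to; the address and the selector are placeholders.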

afoninsky avatar Mar 24 '21 12:03 afoninsky

In case somebody wants to work on that, here's some example code showing how to merge multiple Loki streams into one array of logs: https://github.com/livingdocsIO/loki-log-export/blob/3488c019d5f0c2525cdc844e02c3a1a8a7d3e8d2/merge-logs.js#L3-L13
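
For readers who prefer Python, the general idea of merging streams can be sketched as follows. This is not a port of the linked JavaScript; it only assumes the documented Loki response shape, where each stream carries a `stream` label map and a `values` list of `[timestamp_ns, line]` entries. The function name and the output field names are made up for this example.

```python
from typing import Any


def merge_streams(streams: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Flatten several Loki streams into one list of log entries ordered by timestamp."""
    entries = []
    for stream in streams:
        labels = stream.get("stream", {})
        for value in stream.get("values", []):
            # Index instead of unpacking, so entries with extra fields do not break the loop.
            entries.append({
                "timestamp_ns": int(value[0]),  # Loki sends nanosecond timestamps as strings
                "labels": labels,
                "line": value[1],
            })
    # sorted() is stable, so entries with equal timestamps keep their per-stream order.
    return sorted(entries, key=lambda entry: entry["timestamp_ns"])
```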

marcbachmann avatar Sep 26 '21 11:09 marcbachmann

Vector could also implement a loki source listening on the /loki/api/v1/push API endpoint. Basically acting like the loki receiver in a typical promtail -> loki setup.

One thing that this would allow for is a path to incrementally move from a promtail/loki setup to a full vector pipeline. For example, in a promtail -> loki setup one could easily introduce a vector aggregator into the pipeline: promtail -> vector-aggregator -> loki. Once the aggregator is in place it becomes straightforward to add additional outputs to your log pipeline: gcs, s3, bigquery, etc. Over time, promtail can be replaced with the vector agent. Introducing vector incrementally gives the org time to convert their promtail config (filters, relabels, etc) and test safely before swapping promtail for the vector agent.
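
To make the "acting like the loki receiver" idea concrete, here is a rough Python sketch of a receiver for the push endpoint. It is a toy under stated assumptions, not a proposal for Vector's implementation: it only handles the JSON body format of `/loki/api/v1/push` (clients such as promtail send snappy-compressed protobuf by default, which this sketch does not decode), and the names `parse_push_body` and `PushHandler` are invented for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

PUSH_PATH = "/loki/api/v1/push"


def parse_push_body(body: bytes) -> list[dict]:
    """Turn a JSON push payload into flat events (hypothetical event shape)."""
    payload = json.loads(body)
    events = []
    for stream in payload.get("streams", []):
        labels = stream.get("stream", {})
        for value in stream.get("values", []):
            events.append({
                "timestamp_ns": int(value[0]),  # nanosecond epoch, sent as a string
                "line": value[1],
                "labels": labels,
            })
    return events


class PushHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != PUSH_PATH:
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            events = parse_push_body(self.rfile.read(length))
        except (ValueError, KeyError, IndexError, TypeError, AttributeError):
            self.send_error(400)  # malformed or non-JSON body
            return
        for event in events:
            print(event)  # stand-in for forwarding to one or more sinks
        self.send_response(204)  # Loki answers a successful push with 204 No Content
        self.end_headers()


# To try it locally (address and port are placeholders):
#   HTTPServer(("127.0.0.1", 3100), PushHandler).serve_forever()
```

Once the events are in a flat form like this, sending them to several destinations is just a matter of iterating over the outputs, which is the part Vector's sinks already cover.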

joemiller avatar Sep 07 '22 23:09 joemiller

Good morning,

From what I can see there are two interesting use cases here.

(1) As initially described in this issue, Loki is used as a source to retrieve the data as they are stored. Vector actively reads the data from the system. This is most comparable to the prometheus_scrape source.

Use cases:

  • Export of data from Loki for further processing with Vector and/or transport to other systems
  • Federation of multiple Loki instances

This was also implemented by @zamazan4ik in #15405. Unfortunately, it was closed. Behavior when Vector is not running: a gap is created in the downstream systems, but the data can still be viewed in Loki.

(2) The second suggestion in this issue makes Vector part of the processing chain. The logs are pushed to Vector. This is most comparable to the prometheus_remote_write source, which mimics a receiver endpoint.

Use cases:

  • Ingest buffer: As an intermediate instance, Vector can accept and buffer the messages and thus take the load off the target
  • Duplicate incoming logs to multiple systems. The feature is missing in the Grafana agent/Promtail/Loki stack
  • Normalize the data before ingest. As described, it also makes sense for a step-by-step replacement of Loki.

Behavior when Vector is not running: logs cannot be sent, and the clients have to buffer them.

In my view, both approaches make sense, and something similar has already been implemented for Prometheus. That is consistent, as Loki is like Prometheus, but for logs.

Overview:

Grafana agent
or Promtail ---> Vector (2) ----> Loki <-- (pull)-- Vector(1) --> Sink

@jszwedko what is required to bring these improvements to vector? I can create RFCs for both, or create a second issue describing option 2. I'm not sure if I can help with the implementation (I have never done anything in Rust), but maybe preparing the road will unblock motivated people like @zamazan4ik.

kbudde avatar Jul 22 '23 05:07 kbudde

Hi @kbudde !

Just FYI that there is another contributor working on a websocket source over here: https://github.com/vectordotdev/vector/pull/17856

That seems likely to go in soon, after which I think a specialized loki pull source could be layered on top that pulls logs from Loki via its live tail API.

It also seems sensible to have a loki API source that exposes the same API as Loki for clients like promtail to push to (your option 2). I think we would be amenable to that too. It seems like it could be a fairly light wrapper around the http_server source.

jszwedko avatar Aug 01 '23 19:08 jszwedko

I would love an integration for Promtail instances to send data to Vector.

bryanyork avatar Nov 21 '23 09:11 bryanyork

I would love for Vector to have a Loki API source. My use case: there's a tool (Unpoller) that supports sending log events to Loki only. Would be great to use Vector with it instead to be able to forward logs to other systems.

GreyTeardrop avatar Jan 15 '24 22:01 GreyTeardrop

We have many docker containers in production that are configured to stream to a loki server. I want to put Vector in place of the loki server so that, as a follow-up, I can redirect logs to both elasticsearch (for our coders, for debugging) and Grafana/Loki (for our support engineers). So yes, +1 for Loki as a source.

labmir avatar Feb 08 '24 16:02 labmir

@labmir if I understand your use case correctly, you need to use the already existing Loki sink.

zamazan4ik avatar Feb 08 '24 16:02 zamazan4ik

@zamazan4ik Thank you, but I must not have explained it well. The reason we want Loki as a source is that our docker daemon is configured to log to Loki, and changing the log driver for docker requires restarting the docker service and rebuilding docker compose, which means downtime I want to avoid in production. So we want to stop our Loki server and launch Vector on the same IP:PORT where Loki used to listen, and tell Vector to redirect logs to both Loki (launched on a different IP:PORT) and Elastic.

labmir avatar Feb 09 '24 08:02 labmir

there are two types of loki sources discussed here:

vector acting as loki server would be useful for us for two reasons:

  • send logs from promtail to multiple servers (e.g. loki and victorialogs)
  • use promtail to read docker logs instead of vector's own docker_logs source, to not lose logs #7358

pgassmann avatar Jul 04 '24 22:07 pgassmann