Loki 3.0; breaking changes collection
We've talked internally about quite a few changes we'd like to make that would be considered breaking changes, so I'd like to start collecting them here so that we don't lose them:
- [ ] prefix all metrics within the loki and promtail binaries correctly with `loki` or `promtail`; at least some in loki are prefixed with `cortex` or not prefixed at all
- [ ] flags registration should always include a prefix; for example, ingester ring CLI flags are not prefixed with `ingester.ring.<flag>` like other components, but just `ingester.<flag>`
- [ ] We chatted about deprecating non-boltdb index stores. Do we still want to do that?
- [ ] simplify upstream jsonnet; remove things like non-stateful ingesters and move wal.libsonnet code into ingester.libsonnet (removes the need for extra jsonnet configuration plus potential import order issues)
- [ ] consistently prefix command line arguments with each target they can be used by, or don't prefix them at all and include the targets they are valid for in the help text (ex: `'distributor.excluded-zones': 'zone-default',` can be used by both the distributor and the ruler)
Wanted to bring up the long pending issue of supporting /v2 module path for Loki.
We had countless discussions before and the current state is tracked here.
I built a prototype a while ago and explained it here. If we ever plan to do this (how can we gain confidence?), Loki 3.0 is the way to go; otherwise we have to wait for the next major release.
- [ ] We chatted about deprecating non-boltdb index stores. Do we still want to do that?
+1 for this. We should do this to remove lots of code & dependencies we don't need and to give folks an opinionated, single approach towards index storage. We will have to create a story around migration, though.
I wonder if it's also worth changing the default tenant name from fake to default (or something less weird than fake, at least), as this is a common source of confusion.
- [ ] Multiple object store configurations of the same type (s3, gcs, etc) to enable migrating from one bucket to another within the same provider.
- [ ] Remove the `shared_store` options in the `boltdb-shipper` and `tsdb-shipper` configs. Instead, have them read period configs and use the object store listed there.
- [ ] Remove the `ingestion_burst_size_mb` configurations. Likely set them to some high maximum value instead.
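
As a rough illustration of the `shared_store` item above, the shipper could look up the object store from the period config that covers a given time. This is only a sketch: `PeriodConfig` is a simplified, hypothetical struct loosely modelled on the `from` / `store` / `object_store` fields of a schema period, and `objectStoreFor` and `mustDay` are made-up helpers.

```go
package main

import (
	"fmt"
	"time"
)

// PeriodConfig is a simplified, hypothetical view of one schema period.
type PeriodConfig struct {
	From        time.Time // first day this period applies to
	IndexType   string    // e.g. "boltdb-shipper" or "tsdb"
	ObjectStore string    // e.g. "s3" or "gcs"
}

// objectStoreFor returns the object store of the period active at ts.
// periods must be sorted by From in ascending order.
func objectStoreFor(periods []PeriodConfig, ts time.Time) (string, error) {
	store, found := "", false
	for _, p := range periods {
		if p.From.After(ts) {
			break // this and all later periods start after ts
		}
		store, found = p.ObjectStore, true
	}
	if !found {
		return "", fmt.Errorf("no period config covers %s", ts.Format("2006-01-02"))
	}
	return store, nil
}

// mustDay parses a YYYY-MM-DD date and panics on malformed input.
func mustDay(s string) time.Time {
	t, err := time.Parse("2006-01-02", s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	periods := []PeriodConfig{
		{From: mustDay("2022-01-01"), IndexType: "boltdb-shipper", ObjectStore: "s3"},
		{From: mustDay("2023-01-01"), IndexType: "tsdb", ObjectStore: "gcs"},
	}
	for _, day := range []string{"2021-06-01", "2022-06-01", "2023-06-01"} {
		store, err := objectStoreFor(periods, mustDay(day))
		fmt.Println(day, store, err)
	}
}
```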
- [ ] Multiple object store configurations of the same type (s3, gcs, etc) to enable migrating from one bucket to another within the same provider.
Is this not a feature instead of a v3.0 breaking change? I've got a (hacky) branch that handles this in a backwards-compatible way
Building on @dannykopping's suggestion, which was also previously brought up by @DylanGuedes:
https://github.com/grafana/loki/issues/7076#issuecomment-1250653744
https://github.com/grafana/loki/issues/3363#issuecomment-847930073
I suggest we mimic Mimir in:
- changing `auth_enabled` to `multitenancy_enabled`
- setting `multitenancy_enabled=true` by default
- adding a flag that lets you configure the name used when multi-tenancy is disabled. I propose the default be `anonymous` instead of `fake`. (`fake` is what we have today; `anonymous` is what Mimir uses.)
- [ ] We chatted about deprecating non-boltdb index stores. Do we still want to do that?
+1 for this. We should do this to remove lots of code & dependencies we don't need and to give folks an opinionated, single approach towards index storage. We will have to create a story around migration, though.
+1 This will also allow us to remove the table-manager component, which confuses the hell out of people.
Shall we introduce a basic config and advanced configuration section?
Adding one more entry to the Issue description. /cc @cstyan
Make `-config.expand-env=true` the default in promtail. The rationale is a better default experience: promtail would understand environment variables in its config out of the box. People often miss this flag, and then it is hard to send logs when the config needs secrets from the environment such as `$USERNAME`, `$PASSWORD`, or `$LOKI_URL`.
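
To illustrate the difference the flag makes, here is a small sketch that expands `${VAR}` references in a config snippet with the standard library's `os.Expand`. This is an assumption-laden stand-in, not promtail's actual expansion code; `expandConfig` and the sample values are made up.

```go
package main

import (
	"fmt"
	"os"
)

// expandConfig replaces $VAR and ${VAR} references in raw using lookup.
// When enabled is false the text is returned untouched, which is what an
// opt-in flag left at false amounts to.
func expandConfig(raw string, enabled bool, lookup func(string) string) string {
	if !enabled {
		return raw
	}
	return os.Expand(raw, lookup)
}

func main() {
	raw := "clients:\n" +
		"  - url: ${LOKI_URL}\n" +
		"    basic_auth:\n" +
		"      username: ${USERNAME}\n" +
		"      password: ${PASSWORD}\n"

	// A fixed map stands in for the process environment (os.Getenv).
	env := map[string]string{
		"LOKI_URL": "http://loki:3100/loki/api/v1/push",
		"USERNAME": "promtail",
		"PASSWORD": "s3cr3t",
	}
	lookup := func(name string) string { return env[name] }

	fmt.Print("expansion off:\n", expandConfig(raw, false, lookup))
	fmt.Print("expansion on:\n", expandConfig(raw, true, lookup))
}
```

With expansion off, the literal string `${LOKI_URL}` ends up as the client URL, which is the failure mode described above.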
PR https://github.com/grafana/loki/pull/7937 introduced a deprecated CLI flag -ruler.wal-cleaer.period that should also be removed on the next major Loki release
We should fix this in 3.0 as well, since we're already making breaking changes to metrics.