Supports PromQL natively
What problem does the new feature solve?
Right now GreptimeDB supports only SQL and gRPC for querying data. PromQL is the prevailing query language in cloud-native observability. We want to support PromQL natively in GreptimeDB, so I'm opening this issue to track the work.
- [x] Design draft
- [ ] POC
What does the feature do?
Supports PromQL natively.
Implementation challenges
There are some challenges to supporting PromQL in GreptimeDB:
- The data models of Prometheus and GreptimeDB don't match exactly, so a conversion between them is needed.
- PromQL has its own interpreter for computing results, plus a large number of functions.
- PromQL has a lot of special data processing logic for metrics scenarios.
- Performance. We don't just want to support PromQL; we also want better performance by pushing down predicates, aggregations, etc.
- Keeping up with the community's progress. We don't want a forked language.
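To illustrate the first challenge, here is a minimal sketch of one possible conversion, assuming the metric name (the `__name__` label) becomes the table name and the remaining labels become tag columns. The `PromSample` and `TableRow` types and the `sample_to_row` function are hypothetical names for illustration, not GreptimeDB's actual schema or API.

```rust
use std::collections::BTreeMap;

/// A single Prometheus sample: a label set (including `__name__`),
/// a millisecond timestamp, and a float value.
#[derive(Debug, Clone, PartialEq)]
struct PromSample {
    labels: BTreeMap<String, String>,
    timestamp_ms: i64,
    value: f64,
}

/// Hypothetical row shape for a table-oriented engine (an assumption,
/// not GreptimeDB's real schema): one table per metric, labels as tags.
#[derive(Debug, Clone, PartialEq)]
struct TableRow {
    table: String,
    tags: BTreeMap<String, String>,
    timestamp_ms: i64,
    value: f64,
}

/// Converts one sample into one row.
/// Returns `None` when the sample has no `__name__` label.
fn sample_to_row(sample: &PromSample) -> Option<TableRow> {
    let table = sample.labels.get("__name__")?.clone();
    // Every label except the metric name becomes a tag column.
    let tags = sample
        .labels
        .iter()
        .filter(|(k, _)| k.as_str() != "__name__")
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect();
    Some(TableRow {
        table,
        tags,
        timestamp_ms: sample.timestamp_ms,
        value: sample.value,
    })
}
```

A real conversion would also have to decide how to handle metric names that are not valid table names and label sets that change over time, which this sketch ignores.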
https://github.com/vthriller/promql This crate may be useful, but it seems outdated.
Thanks for the link. It looks great! Prometheus has switched to a yacc (goyacc) based parser, so the purely hand-written one might be outdated, but it's still valuable to learn from.
Motivation
PromQL, the query language of Prometheus, is widely used in cloud-native observability. We want to support it natively in GreptimeDB. GreptimeDB already supports the Prometheus remote read/write protocol, but that is not good enough in terms of performance and DevOps. We want to implement PromQL in pure Rust and have it cooperate well with our query engine and table engine. We want to push computation down into storage, reducing data transfer and providing the best performance, while staying compatible with Prometheus.
Design (draft)
The PromQL implementation in Prometheus includes:
- Parser: parses the PromQL string into an AST. Refer to the parser source code.
- Engine: contains an evaluator that evaluates the AST, querying data from storage and calling functions.
- Storage: time series storage that provides data to the engine.
So our design will focus on these three parts too.
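As a rough illustration of the engine/storage boundary, the sketch below passes label matchers and the time range into the storage call, so filtering can happen inside storage rather than in the engine. This is the push-down idea from the motivation in its simplest form. The `Storage` trait, `EqMatcher`, `Series`, and `MemStorage` are hypothetical names, and only equality matchers are modeled.

```rust
use std::collections::BTreeMap;

/// Label set of one series; `__name__` holds the metric name.
type Labels = BTreeMap<String, String>;

/// An equality matcher such as `job="api"`. Hypothetical and simplified:
/// PromQL also has `!=`, `=~` and `!~` matchers.
#[derive(Debug, Clone)]
struct EqMatcher {
    name: String,
    value: String,
}

/// One time series: labels plus (timestamp_ms, value) samples.
#[derive(Debug, Clone)]
struct Series {
    labels: Labels,
    samples: Vec<(i64, f64)>,
}

/// Storage side of the split. Matchers and the time range are arguments,
/// so an implementation can filter before handing data to the engine.
trait Storage {
    fn select(&self, matchers: &[EqMatcher], start_ms: i64, end_ms: i64) -> Vec<Series>;
}

/// Toy in-memory storage, used only for illustration.
struct MemStorage {
    series: Vec<Series>,
}

impl Storage for MemStorage {
    fn select(&self, matchers: &[EqMatcher], start_ms: i64, end_ms: i64) -> Vec<Series> {
        self.series
            .iter()
            // Keep series whose labels satisfy every matcher.
            .filter(|s| {
                matchers
                    .iter()
                    .all(|m| s.labels.get(&m.name) == Some(&m.value))
            })
            // Trim samples to the requested range [start_ms, end_ms].
            .map(|s| Series {
                labels: s.labels.clone(),
                samples: s
                    .samples
                    .iter()
                    .copied()
                    .filter(|(t, _)| *t >= start_ms && *t <= end_ms)
                    .collect(),
            })
            // Drop series that have no samples left in range.
            .filter(|s| !s.samples.is_empty())
            .collect()
    }
}
```

With this shape, the engine never sees series or samples that the query cannot use. Pushing aggregations down would need a richer interface than this one.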
Parser
It looks like Prometheus uses yacc to generate its parser. I think we can do the same. Hand-writing the parser is another option, but I don't think it's necessary. Reusing the same grammar file is a better way to keep our parser compatible with Prometheus.
Engine (Evaluator)
The evaluator is the core part of the engine:
// An evaluator evaluates given expressions over given fixed timestamps. It
// is attached to an engine through which it connects to a querier and reports
// errors. On timeout or cancellation of its context it terminates.
type evaluator struct {
    ctx context.Context

    startTimestamp int64 // Start time in milliseconds.
    endTimestamp   int64 // End time in milliseconds.
    interval       int64 // Interval in milliseconds.

    maxSamples               int
    currentSamples           int
    logger                   log.Logger
    lookbackDelta            time.Duration
    samplesStats             *stats.QuerySamples
    noStepSubqueryIntervalFn func(rangeMillis int64) int64
}
The core function is eval:
// eval evaluates the given expression as the given AST expression node requires.
func (ev *evaluator) eval(expr parser.Expr) (parser.Value, storage.Warnings) {
    ....
}
It would be the most complex part of our work.
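To give a feel for the shape of that work, here is a minimal Rust sketch of a step-wise evaluator that mirrors a few fields of the Go struct above. Everything in it is an assumption for illustration: the three-variant `Expr` AST, the in-memory `data` map standing in for storage, and the left-open lookback window. It does not describe how Prometheus or GreptimeDB actually implement evaluation.

```rust
use std::collections::BTreeMap;

/// Tiny AST, far smaller than the real PromQL grammar.
#[derive(Debug, Clone)]
enum Expr {
    /// Scalar literal, e.g. `2`.
    Number(f64),
    /// Instant selector by metric name only, e.g. `up`.
    Selector(String),
    /// Product of two sub-expressions, e.g. `up * 2`.
    Mul(Box<Expr>, Box<Expr>),
}

/// Rust counterpart of a few fields of the Go `evaluator` struct.
struct Evaluator<'a> {
    start_ms: i64,
    end_ms: i64,
    interval_ms: i64,
    lookback_delta_ms: i64,
    /// Samples per metric, sorted by timestamp: (timestamp_ms, value).
    data: &'a BTreeMap<String, Vec<(i64, f64)>>,
}

impl<'a> Evaluator<'a> {
    /// Evaluates `expr` at one timestamp; `None` means "no sample".
    fn eval_at(&self, expr: &Expr, ts: i64) -> Option<f64> {
        match expr {
            Expr::Number(n) => Some(*n),
            Expr::Selector(name) => self
                .data
                .get(name)?
                .iter()
                .rev()
                // Latest sample inside the window (ts - lookback, ts].
                .find(|(t, _)| *t <= ts && *t > ts - self.lookback_delta_ms)
                .map(|(_, v)| *v),
            Expr::Mul(lhs, rhs) => Some(self.eval_at(lhs, ts)? * self.eval_at(rhs, ts)?),
        }
    }

    /// Evaluates `expr` at every step from start to end, inclusive.
    fn eval(&self, expr: &Expr) -> Vec<(i64, f64)> {
        let mut out = Vec::new();
        // A non-positive interval would never advance; return nothing.
        if self.interval_ms <= 0 {
            return out;
        }
        let mut ts = self.start_ms;
        while ts <= self.end_ms {
            if let Some(v) = self.eval_at(expr, ts) {
                out.push((ts, v));
            }
            ts += self.interval_ms;
        }
        out
    }
}
```

The real evaluator also has to handle label matching between vectors, range vectors, subqueries, sample limits, and cancellation, none of which appear here.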
There are 70 functions in functions.go. They can be implemented incrementally.
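One way to support that incremental approach is a name-to-implementation table, so that each new function is a single added entry and unimplemented names fail with a clear error. The sketch below is an assumption about how this could look, covering only simple element-wise math functions (`abs`, `ceil`, `floor`, `sqrt`); `builtin_functions` and `call` are hypothetical names.

```rust
use std::collections::HashMap;

/// Signature shared by simple element-wise math functions.
type UnaryFn = fn(f64) -> f64;

/// Builds the table of functions implemented so far.
/// Supporting another function means adding one more entry here.
fn builtin_functions() -> HashMap<&'static str, UnaryFn> {
    let mut table: HashMap<&'static str, UnaryFn> = HashMap::new();
    table.insert("abs", f64::abs);
    table.insert("ceil", f64::ceil);
    table.insert("floor", f64::floor);
    table.insert("sqrt", f64::sqrt);
    table
}

/// Applies the named function to every value of an instant vector.
/// Returns an error for names that are not in the table yet.
fn call(
    table: &HashMap<&'static str, UnaryFn>,
    name: &str,
    values: &[f64],
) -> Result<Vec<f64>, String> {
    let f = table
        .get(name)
        .ok_or_else(|| format!("function not implemented yet: {}", name))?;
    Ok(values.iter().map(|v| f(*v)).collect())
}
```

Functions that take range vectors or extra arguments, such as `rate`, would need a richer signature than `UnaryFn`.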
Storage
TODO
Test
Prometheus provides a compliance test suite for other implementations: https://github.com/promlabs/promql-compliance-tester We can use it to test our implementation.
Milestones
The first milestone is tentatively January 2023: get PromQL running and pass more than 60% of the compliance test cases.
Follow-up work is tracked at https://github.com/GreptimeTeam/greptimedb/issues/1042