quix-streams
Support interactive-style local state lookups (Kafka Streams style) in Quix Streams
In the Java kafka-streams library, interactive queries let developers access local state stores directly. This enables efficient, local key-value lookups (via ReadOnlyKeyValueStore.get(key)), perfect for low-latency dashboards, API endpoints, and RPC layers without needing to produce events or use streaming joins.
💡 Why this matters
These interactive queries allow state to be accumulated incrementally and queried on-demand, enabling distributed key-value stores across application instances, each owning a shard. Lookups differ from streaming joins because they're external, random-access reads—not part of the streaming topology.
In contrast, while Quix Streams supports stateful processing via RocksDB-backed state stores, there’s no public API to query those stores outside of the .apply() or .update() callbacks. That means you can't synchronously fetch state during a REST request—the REST handler can't access State.get(...) outside streaming callbacks.
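To illustrate the limitation, here is a minimal sketch of a stateful callback in Quix Streams. The topic name, broker address, and consumer group are placeholder assumptions, and `count_per_key` is a hypothetical callback. The point is that the `state` object only exists as an argument to the callback, so nothing outside the pipeline holds a handle to it.

```python
def count_per_key(value: dict, state) -> dict:
    """Stateful callback: Quix Streams passes in the State for the current key."""
    total = state.get("total", default=0) + 1
    state.set("total", total)
    return {**value, "total": total}


def build_app():
    # Imported lazily so the callback above can be exercised without Kafka.
    from quixstreams import Application

    # Placeholder connection settings (assumptions, not from the issue).
    app = Application(broker_address="localhost:9092", consumer_group="counter")
    topic = app.topic("events", value_deserializer="json")  # hypothetical topic
    sdf = app.dataframe(topic)
    # The state store is only reachable inside count_per_key.
    sdf = sdf.apply(count_per_key, stateful=True)
    return app, sdf
```

A REST handler running in the same process has no equivalent of `state.get("total")` for an arbitrary key, which is the gap this issue describes.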
❌ An alternative approach that does not work
When you try to simulate a lookup by sending a message to a Kafka topic and then waiting for the streaming pipeline to output the response, you end up building a request-response loop:
- A REST handler publishes a lookup request message.
- A streaming pipeline processes that message.
- The stateful callback, .apply(..., stateful=True), generates a lookup-output record.
- The REST handler consumes the response.
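The loop above can be sketched with in-memory queues standing in for the two Kafka topics. This is only a simulation under stated assumptions: `requests`, `responses`, `state`, `start_pipeline`, and `rest_lookup` are hypothetical names, and a real setup would add network hops, serialization, and consumer polling on top.

```python
import queue
import threading
import uuid

requests = queue.Queue()   # stand-in for the lookup-request topic
responses = queue.Queue()  # stand-in for the lookup-response topic
state = {"user-1": 42}     # stand-in for the pipeline's state store


def pipeline_worker():
    """Plays the role of the stateful streaming callback."""
    while True:
        msg = requests.get()
        if msg is None:  # sentinel used to stop the worker
            break
        responses.put({"id": msg["id"], "value": state.get(msg["key"])})


def start_pipeline() -> threading.Thread:
    worker = threading.Thread(target=pipeline_worker, daemon=True)
    worker.start()
    return worker


def rest_lookup(key: str, timeout: float = 1.0):
    """What the REST handler has to do for a single read."""
    correlation_id = str(uuid.uuid4())
    requests.put({"id": correlation_id, "key": key})
    while True:
        # Blocks until a reply arrives; raises queue.Empty on timeout.
        reply = responses.get(timeout=timeout)
        if reply["id"] == correlation_id:
            return reply["value"]
        # A reply for another request: with concurrent handlers this would
        # have to be routed back to its owner, which this sketch does not do.
```

Even in this toy form, a read needs a correlation ID, a timeout, and a policy for replies that belong to someone else.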
This paradigm combines async streaming with synchronous REST, introducing latency, complexity, and timing issues. It's not equivalent to a Kafka Streams interactive query, where you can call store.get(key) directly within service code. If Quix Streams exposed an API like Kafka Streams' store(...), you could write:
store = app.store("my-kv-store")
value = store.get(key)
Have you ever considered adding a similar feature to Quix Streams?
Hi @lucasimi, thanks for creating the issue.
We are aware of this feature, but it is not prioritized as of now.
Do you have in mind any particular example of the application using interactive queries?
Thanks again!
Thank you for considering this.
Example
In my case I want to develop a distributed key-value store that supports full CRUD operations on any key-value pair. For key lookups, the application will expose a public REST API with an endpoint that takes the key in the request and returns the corresponding value in the response.
- Data ingestion and stream processing: My idea is to use Kafka to ingest data into the application and to handle the partitioning logic automatically, together with all the other benefits that Kafka gives. Instead of running complex queries at request time, I want to turn them into a stream processing pipeline that transforms data from one or more topics into a final streaming dataframe.
- State materialization: At this point the pipeline updates my dataframe in the background as soon as data is available in the source topics, and the dataframe is materialized into some kind of local store (ideally a RocksDB table for each assigned partition). The REST API handler for the lookup retrieves the value from the local store, which contains data that was stored previously. Almost no data processing happens during retrieval, which keeps latency low.
- Smart routing: To make the REST API distributed I need to split each endpoint into an internal one and a public one. The single responsibility of a public endpoint is to route the key to the instance where it lives, which is predictable if consumer metadata can be retrieved from Kafka. In this way the instance hit by the public endpoint acts as an "introducer" for the query.
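The materialization and routing steps above can be sketched as follows. This is a minimal illustration, not a proposed Quix Streams API: `Instance`, `partition_for`, and `forward` are hypothetical, plain dicts stand in for the per-partition RocksDB stores, and `zlib.crc32` is only a stand-in hash. A real router would have to reproduce the producer's partitioner exactly (the default Java partitioner hashes key bytes with murmur2) and read the partition assignment from Kafka consumer metadata.

```python
import zlib
from typing import Callable, Dict, Optional


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash (assumption); must match the producer's partitioner in practice.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class Instance:
    def __init__(self, url: str, assignment: Dict[int, str], num_partitions: int,
                 forward: Callable[[str, str], Optional[str]]):
        self.url = url
        self.assignment = assignment          # partition -> owning instance URL
        self.num_partitions = num_partitions
        self.forward = forward                # stand-in for an HTTP call to a peer
        # One local store per partition owned by this instance.
        self.stores: Dict[int, Dict[str, str]] = {
            p: {} for p, owner in assignment.items() if owner == url
        }

    def materialize(self, key: str, value: str) -> None:
        """Called by the stream processing side for records of owned partitions."""
        self.stores[partition_for(key, self.num_partitions)][key] = value

    def internal_get(self, key: str) -> Optional[str]:
        """Internal endpoint: local read only, no routing."""
        store = self.stores.get(partition_for(key, self.num_partitions), {})
        return store.get(key)

    def public_get(self, key: str) -> Optional[str]:
        """Public endpoint: serve locally or forward to the owning instance."""
        owner = self.assignment[partition_for(key, self.num_partitions)]
        if owner == self.url:
            return self.internal_get(key)
        return self.forward(owner, key)
```

With this split, any instance can accept a request, and at most one extra hop is needed to reach the instance that owns the key.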
Using interactive queries together with stream processing brings data closer to the serving layer both in space (data lives on the application instance) and in time (stream processing runs in the background). This design keeps work out of the request path and enables scalable, low-latency key-based lookups.