pinot
pinot copied to clipboard
Pauseless Consumption
Background
Currently when commit protocol starts for a partition, the realtime server stops consumption and till the new consuming->online helix message for the next consuming segment arrives, messages are not read from the stream. Depending on the segment size, the period in which there's no consumption can take something from few seconds to a several minutes (and in some cases tens of minutes because of hitting concurrent segment build threshold on the servers). This leads to data freshness issues specially for high ingesting use cases.
Proposal
A simple solution to prevent these consumption pauses is to spin off a temporary consuming segment as soon as the segment commit starts. This temporary segment is a supplement to the main segment that is committing. There won't be any changes in Ideal State and External View.
Query execution
To make the ingested data in the temporary segment available for query execution, Server Query Executor sends the incoming query to the main segment as well as the temporary segment if there is one. Both segments execute the query and return the results.
Next Offline to Consuming Transition
After the main segment is committed and the next consuming segment is created during handling of offline->consuming helix transition message, the data in the temporary segment can be used to bootstrap the new consuming segment. Once the mutable segment is created for the next consuming segment, and before the consume loop starts, the data in the temporary segment can be read and inserted into the new mutable segment. After the data is copied over, the temporary segment can be deleted, and the consume loop in the new consuming segment starts from the offset of the last copied message.
Configurations
- We can add a new boolean flag to the Stream Config properties to enable/disable pause-less consumption. We can use something like
pauseless.consumption.enabled
. - For the size of the temporary segment, we can go with a default value of 20% of the main segment size. This ratio can be configured as a stream config property, something like
pauseless.consumption.temporary.segment.size.ratio
. - We also need to set a timeout for temporary consumption in case something is wrong and the helix transition message for the next consuming segment doesn't arrive to the server. We can have a default 10min value and also a new property in stream config to override that, something like
pauseless.consumption.timeout
.