SQL Source plugin to support batch ingestion
Currently, the SQL source plugin fetches table rows and sends them to the sink as individual messages. For example, if a SQL pull returns 100 rows and the sink is a REST sink, those rows are not delivered in one REST call; instead, 100 separate REST calls are made. We need a way to batch rows before they reach the sink, e.g., send 50 rows in one REST call and the next 50 rows in another.
We need this for batch processing.
Why is this needed: to improve throughput and reduce latency.
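For context, a minimal rule for this kind of pipeline might look like the sketch below. The rule id, stream name, and URL are placeholders, not from the original report; `sendSingle` is a standard eKuiper sink property that, when true, makes the sink emit one request per record. Since the SQL source emits each row as its own message, 100 fetched rows end up as 100 POST requests.

```json
{
  "id": "sqlToRest",
  "sql": "SELECT * FROM sqlSourceStream",
  "actions": [
    {
      "rest": {
        "url": "http://localhost:8080/ingest",
        "method": "post",
        "sendSingle": true
      }
    }
  ]
}
```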
@manikandan-harman Thanks for bringing this up. I have some questions about your proposal:
- The SQL source plugin is currently designed to fetch data incrementally, so the amount of data returned on each pull may vary. If we support batch sending, I would prefer to send out all the fetched data at once rather than counting out 50 rows. If you need to control the batch size, I suggest combining the current row-based implementation with a count window (a full rule sketch follows this list):

  ```sql
  SELECT * FROM yourStream GROUP BY countWindow(50)
  ```

  This way, your REST sink will receive 50 records at once.
- Do you have an idea of the batch format the SQL source should send? For example:

  ```json
  {"data":[{"id":"row1"},{"id":"row2"}]}
  ```
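Putting the count-window suggestion together, a complete rule might look roughly like the sketch below. The rule id, stream name, and endpoint are placeholders; `bodyType` and `sendSingle` are standard eKuiper REST sink properties, and with `sendSingle` set to false the records collected by the window should go out as a single JSON array in one POST.

```json
{
  "id": "sqlBatchToRest",
  "sql": "SELECT * FROM yourStream GROUP BY countWindow(50)",
  "actions": [
    {
      "rest": {
        "url": "http://localhost:8080/ingest",
        "method": "post",
        "bodyType": "json",
        "sendSingle": false
      }
    }
  ]
}
```

With this setup, each window emission carries up to 50 rows, so the sink payload resembles the array format sketched above.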
@ngjaying I was able to achieve batching using `SELECT * FROM yourStream GROUP BY countWindow(50)`. I made this feature request per your suggestion in https://github.com/lf-edge/ekuiper/discussions/2234.