bug: error on high ingest rate
Search before asking
- [X] I had searched in the issues and found no similar issues.
Version
0.8.177
What's Wrong?
Hi, I have a Kafka Connect process which ingests around 1000 records/s into a 6-node cluster. Data is not written to the DB and I see these logs on the nodes:
ERROR common_meta_api::schema_api_impl: error: TableVersionMismatched: 24590 expect == 45807 but 45810 while update_table_meta
with the version progressing on every row/node.
How to Reproduce?
Kafka Connect receives NetFlow data at a consistent rate of 1000 records/s, which need to be written to Databend.
I'm using the MySQL interface to ingest the data.
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
This error is by design: when inserts into the same table come from several nodes in the cluster, they race for the snapshot lock in the metaservice. Eventually one wins and the others retry, but the insert still works fine. We plan to change the log level from error to warning. cc @dantengsky
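For illustration only (this is not Databend's actual code), the behavior described above is optimistic concurrency control on the table's snapshot version. A minimal sketch, where read_table_version and try_commit are hypothetical placeholders for the metaservice calls:

import random
import time

class VersionMismatch(Exception):
    """Raised when the table version changed since it was read (the race was lost)."""

def insert_with_retry(new_segments, read_table_version, try_commit, max_retries=10):
    # Sketch of the compare-and-swap retry loop; not Databend's real API.
    for attempt in range(max_retries):
        expected = read_table_version()  # snapshot version the commit is based on
        try:
            # Commit succeeds only if the version is still `expected`; a concurrent
            # insert from another node bumps it first and this attempt loses.
            return try_commit(expected, new_segments)
        except VersionMismatch:
            # This is where a "TableVersionMismatched: expect == X but Y" line
            # would be logged; back off a little and try again.
            time.sleep(random.uniform(0.01, 0.1) * (attempt + 1))
    raise RuntimeError("insert failed after retries")

So the log line itself is expected under concurrent writers; the insert is retried behind the scenes.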
Hi, but at the end the data is not written into the DB, so the insert is not working. Probably the ingest rate is too high. I also found that Databend does not support prepared statements, and batched inserts do not work.
I also found databend does not support prepared statements and batching insert does not work
Hi, did you insert the data one row per SQL statement? It's recommended to insert the data in batches rather than a single row per statement:
- If you are using MySQL, you can concatenate a large SQL statement like insert into table(a,b,c) values (1,2,3), (11,22,33) ... to insert the data (see the sketch right after this list)
- If you are using HTTP, you can use the streaming load API to load CSV/JSON/Parquet data into Databend. See https://databend.rs/doc/load-data/local
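A minimal sketch of the multi-row INSERT approach over the MySQL protocol. The host, port 3307, empty root password, table t1(a, b, c), and batch size are assumptions; adjust them to your deployment, and escape real NetFlow values properly instead of plain string formatting:

import pymysql  # pip install pymysql

# Assumed deployment: databend-query's MySQL handler on localhost:3307,
# user "root" with empty password, and an existing table t1(a INT, b INT, c INT).
conn = pymysql.connect(host="127.0.0.1", port=3307, user="root",
                       password="", database="default", autocommit=True)

def insert_batch(rows, batch_size=10_000):
    """Group rows into large multi-row INSERT statements instead of one row per statement."""
    with conn.cursor() as cur:
        for i in range(0, len(rows), batch_size):
            chunk = rows[i:i + batch_size]
            # Integer-only demo values; real data needs proper quoting/escaping.
            values = ", ".join("(%d, %d, %d)" % r for r in chunk)
            cur.execute("INSERT INTO t1 (a, b, c) VALUES " + values)

insert_batch([(1, 2, 3), (11, 22, 33), (111, 222, 333)])

Each batch becomes one statement, so the metaservice sees far fewer commits for the same number of rows.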
I think Databend needs a real streaming connector like Kafka.
The streaming load API is good if you have "files". MySQL concatenated inserts are not practical inside an integration platform.
What is really missing is a fast ingest endpoint/connector, like Kafka Connect or the InfluxDB wire protocol.
Hi, can you give more details about your ingest case? Let's improve it.
Thanks.
Streaming load API is good if you have "files".
We support the ClickHouse HTTP API: https://databend.rs/doc/integrations/api/clickhouse-handler. You can put the data inside the request body in a supported format, like:
Json:
echo -e '{"a": 1}\n{"a": 2}' | curl 'root:@127.0.0.1:8124/?query=INSERT%20INTO%20t1%20FORMAT%20JSONEachRow' --data-binary @-
CSV:
echo -e '1\n2\n3' | curl 'root:@127.0.0.1:8124/?query=INSERT%20INTO%20t1%20FORMAT%20CSV' --data-binary @-
Streaming load does not require "files"; you can put any data in the HTTP body, as in the examples above.
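For example, a rough sketch of flushing batches from your pipeline (standing in for the Kafka Connect sink) to the ClickHouse-compatible handler shown above. The endpoint, port 8124, table t1, and JSONEachRow format come from the curl examples; credentials and batch size are assumptions to adapt:

import json
import requests  # pip install requests

# Same endpoint as the curl examples above: ClickHouse-compatible HTTP handler
# on port 8124, inserting into t1 in JSONEachRow format.
URL = "http://127.0.0.1:8124/"
PARAMS = {"query": "INSERT INTO t1 FORMAT JSONEachRow"}
AUTH = ("root", "")

def flush_batch(rows):
    """Send one batch of dict rows as newline-delimited JSON in the request body."""
    body = "\n".join(json.dumps(r) for r in rows)
    resp = requests.post(URL, params=PARAMS, auth=AUTH, data=body.encode("utf-8"))
    resp.raise_for_status()

# Flush 1000-row batches instead of one request (or one INSERT) per record.
flush_batch([{"a": i} for i in range(1000)])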