
42 starrocks-connector-for-apache-spark issues

MySQL databases make heavy use of TEXT columns, but the connector lacks support for the TEXT type, which limits data synchronization!

Please provide an example, based on Spark Structured Streaming, of reading from Kafka and writing to StarRocks.
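A minimal sketch of what such a pipeline could look like, assuming the connector's `starrocks` streaming sink and its documented options; the Kafka brokers, topic, FE endpoints, table name, credentials, and checkpoint path below are all placeholders:

```scala
import org.apache.spark.sql.SparkSession

object KafkaToStarRocks {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("kafka-to-starrocks").getOrCreate()

    // Read a Kafka topic as a streaming DataFrame.
    val kafkaDf = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "kafka-host:9092") // placeholder
      .option("subscribe", "events")                        // placeholder topic
      .load()
      .selectExpr("CAST(value AS STRING) AS payload")

    // Write each micro-batch to StarRocks through the connector sink.
    val query = kafkaDf.writeStream
      .format("starrocks")
      .option("starrocks.fe.http.url", "fe-host:8030")              // placeholder
      .option("starrocks.fe.jdbc.url", "jdbc:mysql://fe-host:9030") // placeholder
      .option("starrocks.table.identifier", "db.events")            // placeholder
      .option("starrocks.user", "root")
      .option("starrocks.password", "")
      .option("checkpointLocation", "/tmp/checkpoints/events")      // placeholder
      .start()

    query.awaitTermination()
  }
}
```

This requires a running Kafka cluster and StarRocks FE, so it is a configuration sketch rather than something runnable in isolation.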

I used Spark to compare loading StarRocks data against Delta Lake data. With 1.2 billion rows, Delta Lake handled the query normally in 4 seconds, while StarRocks threw an exception after 3 minutes 29 seconds. My test was as follows. Delta Lake test: %spark val df2 = spark.read.format("delta").load("hdfs://172.16.xx.xxx:9000/users/.d/202309/8709141d4cac4a1a8861b26be198392b_A卷数据12亿测试.d"); df2.filter("`A103_VALUE` == '3'").count() res9: Long = 203581440 Took 4 sec. StarRocks test: %spark val df4 =...

I ran into a data type conversion problem when reading StarRocks data with the Spark connector. Versions I am using: StarRocks 2.5.1, Spark 3.2.0, Starrocks-Spark-Connector 1.1.0. Background: the table has a column `dt` of type DATE. Reading this table with a Spark SQL DataFrame fails with the following error: ```java if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else staticinvoke(class org.apache.spark.sql.catalyst.util.DateTimeUtils$, DateType, fromJavaDate, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]),... ```
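One way to sidestep the DATE conversion while the bug is open might be to cast the column on read. This is a sketch, not the project's recommended fix, assuming the documented read options of connector 1.1.0; the table identifier and FE endpoints are placeholders:

```scala
import org.apache.spark.sql.SparkSession

object ReadStarRocksDate {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("read-starrocks").getOrCreate()

    val df = spark.read
      .format("starrocks")
      .option("starrocks.table.identifier", "db.tbl")               // placeholder
      .option("starrocks.fe.http.url", "fe-host:8030")              // placeholder
      .option("starrocks.fe.jdbc.url", "jdbc:mysql://fe-host:9030") // placeholder
      .option("starrocks.user", "root")
      .option("starrocks.password", "")
      .load()

    // Workaround idea: cast the DATE column to STRING to avoid the
    // fromJavaDate conversion path, then parse it on the Spark side.
    df.selectExpr("CAST(dt AS STRING) AS dt").show()
  }
}
```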

Running `./build.sh 2` produces the following output:

```
[INFO] --- compiler:3.8.1:compile (default-compile) @ starrocks-spark2_2.11 ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 41 source files to /Users/atwong/sandbox/starrocks-connector-for-apache-spark/target/classes
[INFO] /Users/atwong/sandbox/starrocks-connector-for-apache-spark/src/main/java/com/starrocks/connector/spark/sql/schema/AbstractRowStringConverter.java: Some input files use...
```

Support the Spark catalog API

Since StarRocks 2.2, the minimum allowed batch size is 4096; when a batch size smaller than 4096 is set, StarRocks resets it to 4096. To make the default...

Scanning a large batch of data may exceed Thrift's [MaxMessageSize](https://github.com/apache/thrift/blob/master/doc/specs/thrift-tconfiguration.md#maxmessagesize) limit. One workaround is to decrease `starrocks.batch.size`, but that hurts performance, so we should add a configuration...
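For reference, `starrocks.batch.size` is passed as an ordinary read option. A sketch of the workaround mentioned above, with placeholder table and endpoints (note the trade-off: smaller batches keep each Thrift message under the size limit but slow the scan):

```scala
val df = spark.read
  .format("starrocks")
  .option("starrocks.table.identifier", "db.tbl")               // placeholder
  .option("starrocks.fe.http.url", "fe-host:8030")              // placeholder
  .option("starrocks.fe.jdbc.url", "jdbc:mysql://fe-host:9030") // placeholder
  .option("starrocks.user", "root")
  .option("starrocks.password", "")
  // Smaller batches stay under Thrift's MaxMessageSize but reduce throughput.
  .option("starrocks.batch.size", "4096")
  .load()
```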