seatunnel
[Bug] [doris-source-connectors] [2.3.5] DorisConnectorException with datetime field in doris source
Search before asking
- [X] I had searched in the issues and found no similar issues.
What happened
Reading from a Doris source table that contains a DATETIME column appears to crash the job; tables without one are read fine. Table structure (same as test_doris_test):
CREATE TABLE `doris_test` (
`id` LARGEINT NOT NULL,
`create_time` DATETIME,
`s1` VARCHAR(500),
`s2` VARCHAR(500),
`s3` VARCHAR(500),
`s4` VARCHAR(500),
`s5` VARCHAR(500),
`s6` VARCHAR(500),
`s7` VARCHAR(1000),
`s8` VARCHAR(1000),
`s9` VARCHAR(1000),
`s10` VARCHAR(1000),
`s11` VARCHAR(500),
`s12` VARCHAR(500),
`s13` VARCHAR(500),
`s14` VARCHAR(500)
) ENGINE = OLAP UNIQUE KEY(`id`) DISTRIBUTED BY HASH(`id`) BUCKETS 32 PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"enable_unique_key_merge_on_write" = "true"
);
SeaTunnel Version
2.3.5
SeaTunnel Config
env {
execution.parallelism = 1
job.mode = "BATCH"
checkpoint.interval = 10000
}
source {
Doris {
fenodes = "192.168.161.90:8030"
username = doris_l
password = "eewahTi9"
database = "doris"
table = "test_doris_test"
doris.filter.query = "create_time is not null"
}
}
sink {
Doris {
fenodes = "192.168.161.90:8030"
username = doris_l
password = "eewahTi9"
database = "doris"
table = "doris_test"
sink.label-prefix = "ds"
sink.enable-2pc = "true"
sink.enable-delete = "true"
doris.config {
format = "json"
read_json_by_line = "true"
}
}
}
Running Command
./bin/seatunnel.sh --config test.conf -m local
Error Exception
===============================================================================
2024-08-14 20:58:43,214 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Fatal Error,
2024-08-14 20:58:43,214 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Please submit bug report in https://github.com/apache/seatunnel/issues
2024-08-14 20:58:43,214 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Reason:SeaTunnel job executed failed
2024-08-14 20:58:43,217 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: org.apache.seatunnel.connectors.doris.exception.DorisConnectorException: ErrorCode:[Doris-05], ErrorDescription:[arrow read error] - class org.apache.seatunnel.shade.org.apache.arrow.vector.TimeStampMicroVector cannot be cast to class org.apache.seatunnel.shade.org.apache.arrow.vector.VarCharVector (org.apache.seatunnel.shade.org.apache.arrow.vector.TimeStampMicroVector and org.apache.seatunnel.shade.org.apache.arrow.vector.VarCharVector are in unnamed module of loader org.apache.seatunnel.engine.common.loader.SeaTunnelChildFirstClassLoader @1986e9a7)
at org.apache.seatunnel.connectors.doris.source.serialization.RowBatch.readArrow(RowBatch.java:132)
at org.apache.seatunnel.connectors.doris.source.reader.DorisValueReader.hasNext(DorisValueReader.java:231)
at org.apache.seatunnel.connectors.doris.source.reader.DorisSourceReader.pollNext(DorisSourceReader.java:75)
at org.apache.seatunnel.engine.server.task.flow.SourceFlowLifeCycle.collect(SourceFlowLifeCycle.java:156)
at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.collect(SourceSeaTunnelTask.java:116)
at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.call(SourceSeaTunnelTask.java:121)
at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:703)
at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:1004)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
... 2 more
2024-08-14 20:58:43,217 ERROR [o.a.s.c.s.SeaTunnel ] [main] -
===============================================================================
Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: org.apache.seatunnel.connectors.doris.exception.DorisConnectorException: ErrorCode:[Doris-05], ErrorDescription:[arrow read error] - class org.apache.seatunnel.shade.org.apache.arrow.vector.TimeStampMicroVector cannot be cast to class org.apache.seatunnel.shade.org.apache.arrow.vector.VarCharVector (org.apache.seatunnel.shade.org.apache.arrow.vector.TimeStampMicroVector and org.apache.seatunnel.shade.org.apache.arrow.vector.VarCharVector are in unnamed module of loader org.apache.seatunnel.engine.common.loader.SeaTunnelChildFirstClassLoader @1986e9a7)
at org.apache.seatunnel.connectors.doris.source.serialization.RowBatch.readArrow(RowBatch.java:132)
at org.apache.seatunnel.connectors.doris.source.reader.DorisValueReader.hasNext(DorisValueReader.java:231)
at org.apache.seatunnel.connectors.doris.source.reader.DorisSourceReader.pollNext(DorisSourceReader.java:75)
at org.apache.seatunnel.engine.server.task.flow.SourceFlowLifeCycle.collect(SourceFlowLifeCycle.java:156)
at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.collect(SourceSeaTunnelTask.java:116)
at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.call(SourceSeaTunnelTask.java:121)
at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:703)
at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:1004)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
... 2 more
2024-08-14 20:58:52,072 INFO [s.c.s.s.c.ClientExecuteCommand] [ForkJoinPool.commonPool-worker-23] - run shutdown hook because get close signal
Zeta or Flink or Spark Version
No response
Java or Scala Version
jdk11
Screenshots
No response
Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
What version of doris is yours?
2.1.1
+1
Currently, Doris 2.1.x returns time-type data with an 8-hour offset when queried. I will fix this issue once https://github.com/apache/doris/pull/38215 is merged.
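For readers wondering how an offset of exactly 8 hours can appear: one plausible cause (an assumption on my part, not something confirmed in this thread) is that the same epoch-microsecond value gets rendered in UTC on one side and in a UTC+8 zone such as Asia/Shanghai on the other. A minimal sketch using only `java.time`; the class name, helper name, and sample value are hypothetical and not taken from the Doris or SeaTunnel code:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

class TimestampZoneSketch {
    // Converts microseconds since the Unix epoch to a wall-clock value in the given zone.
    static LocalDateTime fromEpochMicros(long micros, ZoneId zone) {
        Instant instant = Instant.EPOCH.plus(micros, ChronoUnit.MICROS);
        return LocalDateTime.ofInstant(instant, zone);
    }

    public static void main(String[] args) {
        // Hypothetical sample value, chosen only for illustration.
        long micros = 1_723_640_323_000_000L;

        LocalDateTime utc = fromEpochMicros(micros, ZoneId.of("UTC"));
        LocalDateTime shanghai = fromEpochMicros(micros, ZoneId.of("Asia/Shanghai"));

        System.out.println("UTC:           " + utc);
        System.out.println("Asia/Shanghai: " + shanghai);
        // The same instant reads 8 hours apart depending on the zone used to render it.
        System.out.println("Hours apart:   " + ChronoUnit.HOURS.between(utc, shanghai));
    }
}
```

If that is what is happening, the fix belongs wherever the zone is chosen, which would explain waiting on the Doris side before changing the connector.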
Sorry, I contacted the maintainers at SelectDB and they don't really know what's going on. Maybe further communication with them is needed? @liugddx
Take a look at this: https://github.com/apache/doris/issues/38174#issuecomment-2245307121
Hmm, I talked with them afterwards and it seems they had misunderstood. That pull request now appears to be merged, so I guess this issue can be fixed now. Thanks in advance :)
This issue will be resolved, please be patient.
I hit the same issue on Doris 3.x: the error is reported when the source column is a timestamp type. On Doris 1.2.1, the LARGEINT primary key also fails with a cast error: VarCharVector cannot be cast to BigIntVector.
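Both failures (TimeStampMicroVector to VarCharVector, and VarCharVector to BigIntVector) look like the same pattern: my reading of the stack trace is that the reader casts each Arrow vector to one fixed class chosen from the column type, while the class Doris actually sends varies by version. Below is a rough sketch of checking the runtime type before casting. It uses hypothetical stand-in classes (`TextVector`, `MicrosVector`) rather than the real Arrow or SeaTunnel types, and it assumes the numeric form is epoch microseconds rendered in UTC, so treat it as an illustration of the idea and not the connector's actual fix:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

class DatetimeColumnSketch {
    // Hypothetical stand-ins for Arrow vectors; NOT the real Arrow or SeaTunnel classes.
    interface ColumnVector {}

    static final class TextVector implements ColumnVector {
        final String[] values;
        TextVector(String... values) { this.values = values; }
    }

    static final class MicrosVector implements ColumnVector {
        final long[] values;
        MicrosVector(long... values) { this.values = values; }
    }

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Reads a DATETIME cell by inspecting the vector's runtime type instead of casting blindly.
    static LocalDateTime readDatetime(ColumnVector vector, int row) {
        if (vector instanceof TextVector) {
            return LocalDateTime.parse(((TextVector) vector).values[row], FORMAT);
        }
        if (vector instanceof MicrosVector) {
            long micros = ((MicrosVector) vector).values[row];
            // Assumption: microseconds since the Unix epoch, rendered in UTC.
            return LocalDateTime.ofInstant(
                    Instant.EPOCH.plus(micros, ChronoUnit.MICROS), ZoneOffset.UTC);
        }
        throw new IllegalArgumentException(
                "Unsupported vector type for DATETIME: " + vector.getClass().getName());
    }

    public static void main(String[] args) {
        ColumnVector asText = new TextVector("2024-08-14 20:58:43");
        long micros = LocalDateTime.of(2024, 8, 14, 20, 58, 43)
                .toEpochSecond(ZoneOffset.UTC) * 1_000_000L;
        ColumnVector asMicros = new MicrosVector(micros);

        // Both representations decode to the same wall-clock value.
        System.out.println(readDatetime(asText, 0));
        System.out.println(readDatetime(asMicros, 0));

        // The unchecked cast mirrors the shape of the failure reported in this issue.
        try {
            TextVector wrong = (TextVector) asMicros;
            System.out.println("unexpected: cast succeeded for " + wrong);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the report");
        }
    }
}
```

The same dispatch idea would apply to the LARGEINT case, with a text branch and a numeric branch.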
Has this been fixed? Half a year has passed since then :p
Please upgrade to version 2.3.9
Which PR is it specifically? I can't upgrade for the time being.
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.
This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.