[Bug] [doris-source-connectors] [2.3.5] DorisConnectorException with datetime field in doris source

Open Darkzoneleet opened this issue 1 year ago • 9 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

The job crashes whenever the Doris source table contains a DATETIME field, and runs fine when it does not. Table structure (the source table test_doris_test has the same structure):

CREATE TABLE `doris_test` (
`id` LARGEINT NOT NULL,
`create_time` DATETIME,
`s1` VARCHAR(500),
`s2` VARCHAR(500),
`s3` VARCHAR(500),
`s4` VARCHAR(500),
`s5` VARCHAR(500),
`s6` VARCHAR(500),
`s7` VARCHAR(1000),
`s8` VARCHAR(1000),
`s9` VARCHAR(1000),
`s10` VARCHAR(1000),
`s11` VARCHAR(500),
`s12` VARCHAR(500),
`s13` VARCHAR(500),
`s14` VARCHAR(500)
) ENGINE = OLAP UNIQUE KEY(`id`) DISTRIBUTED BY HASH(`id`) BUCKETS 32 PROPERTIES (
  "replication_allocation" = "tag.location.default: 1",
  "enable_unique_key_merge_on_write" = "true"
);

SeaTunnel Version

2.3.5

SeaTunnel Config

env {
  execution.parallelism = 1
  job.mode = "BATCH"
  checkpoint.interval = 10000
}

source {
  Doris {
    fenodes = "192.168.161.90:8030"
    username = doris_l
    password = "eewahTi9"
    database = "doris"
    table = "test_doris_test"
    doris.filter.query = "create_time is not null"
  }
}

sink {
  Doris {
    fenodes = "192.168.161.90:8030"
    username = doris_l
    password = "eewahTi9"
    database = "doris"
    table = "doris_test"
    sink.label-prefix = "ds"
    sink.enable-2pc = "true"
    sink.enable-delete = "true"
    doris.config {
      format = "json"
      read_json_by_line = "true"
    }
  }
}

Running Command

./bin/seatunnel.sh --config test.conf  -m local

Error Exception

===============================================================================


2024-08-14 20:58:43,214 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Fatal Error, 

2024-08-14 20:58:43,214 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Please submit bug report in https://github.com/apache/seatunnel/issues

2024-08-14 20:58:43,214 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Reason:SeaTunnel job executed failed 

2024-08-14 20:58:43,217 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
	at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
	at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: org.apache.seatunnel.connectors.doris.exception.DorisConnectorException: ErrorCode:[Doris-05], ErrorDescription:[arrow read error] - class org.apache.seatunnel.shade.org.apache.arrow.vector.TimeStampMicroVector cannot be cast to class org.apache.seatunnel.shade.org.apache.arrow.vector.VarCharVector (org.apache.seatunnel.shade.org.apache.arrow.vector.TimeStampMicroVector and org.apache.seatunnel.shade.org.apache.arrow.vector.VarCharVector are in unnamed module of loader org.apache.seatunnel.engine.common.loader.SeaTunnelChildFirstClassLoader @1986e9a7)
	at org.apache.seatunnel.connectors.doris.source.serialization.RowBatch.readArrow(RowBatch.java:132)
	at org.apache.seatunnel.connectors.doris.source.reader.DorisValueReader.hasNext(DorisValueReader.java:231)
	at org.apache.seatunnel.connectors.doris.source.reader.DorisSourceReader.pollNext(DorisSourceReader.java:75)
	at org.apache.seatunnel.engine.server.task.flow.SourceFlowLifeCycle.collect(SourceFlowLifeCycle.java:156)
	at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.collect(SourceSeaTunnelTask.java:116)
	at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
	at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.call(SourceSeaTunnelTask.java:121)
	at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:703)
	at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:1004)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
	... 2 more
 
2024-08-14 20:58:43,217 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - 
===============================================================================



Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
	at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
	at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: org.apache.seatunnel.connectors.doris.exception.DorisConnectorException: ErrorCode:[Doris-05], ErrorDescription:[arrow read error] - class org.apache.seatunnel.shade.org.apache.arrow.vector.TimeStampMicroVector cannot be cast to class org.apache.seatunnel.shade.org.apache.arrow.vector.VarCharVector (org.apache.seatunnel.shade.org.apache.arrow.vector.TimeStampMicroVector and org.apache.seatunnel.shade.org.apache.arrow.vector.VarCharVector are in unnamed module of loader org.apache.seatunnel.engine.common.loader.SeaTunnelChildFirstClassLoader @1986e9a7)
	at org.apache.seatunnel.connectors.doris.source.serialization.RowBatch.readArrow(RowBatch.java:132)
	at org.apache.seatunnel.connectors.doris.source.reader.DorisValueReader.hasNext(DorisValueReader.java:231)
	at org.apache.seatunnel.connectors.doris.source.reader.DorisSourceReader.pollNext(DorisSourceReader.java:75)
	at org.apache.seatunnel.engine.server.task.flow.SourceFlowLifeCycle.collect(SourceFlowLifeCycle.java:156)
	at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.collect(SourceSeaTunnelTask.java:116)
	at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
	at org.apache.seatunnel.engine.server.task.SourceSeaTunnelTask.call(SourceSeaTunnelTask.java:121)
	at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:703)
	at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:1004)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
	... 2 more
2024-08-14 20:58:52,072 INFO  [s.c.s.s.c.ClientExecuteCommand] [ForkJoinPool.commonPool-worker-23] - run shutdown hook because get close signal
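
The trace above points at an unconditional cast in the reader: the column arrived as a timestamp vector while the code expected a string vector. The following is a minimal sketch of that failure mode and of a runtime type check that avoids it. It does not use the Arrow or SeaTunnel classes; `Column`, `StringColumn`, `TimestampMicroColumn` and both reader methods are hypothetical stand-ins invented for illustration, and treating the microsecond value as UTC is an assumption.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

class DatetimeColumnReader {
    // Hypothetical stand-ins for the two column encodings; NOT the Arrow vector classes.
    interface Column {}

    static final class StringColumn implements Column {
        final String[] values;
        StringColumn(String... values) { this.values = values; }
    }

    static final class TimestampMicroColumn implements Column {
        final long[] epochMicros;
        TimestampMicroColumn(long... epochMicros) { this.epochMicros = epochMicros; }
    }

    /** Mirrors the failing pattern: assumes DATETIME always arrives as text. */
    static String readAssumingString(Column column, int row) {
        // Throws ClassCastException when the column is a TimestampMicroColumn.
        return ((StringColumn) column).values[row];
    }

    /** Checks the runtime type first and handles both encodings. */
    static LocalDateTime readDefensively(Column column, int row) {
        if (column instanceof StringColumn) {
            String text = ((StringColumn) column).values[row];
            // "yyyy-MM-dd HH:mm:ss" becomes ISO-8601 once the space is replaced by 'T'.
            return LocalDateTime.parse(text.replace(' ', 'T'));
        }
        if (column instanceof TimestampMicroColumn) {
            long micros = ((TimestampMicroColumn) column).epochMicros[row];
            Instant instant = Instant.ofEpochSecond(
                    Math.floorDiv(micros, 1_000_000L),
                    Math.floorMod(micros, 1_000_000L) * 1_000L);
            // UTC is an assumption made for this sketch.
            return LocalDateTime.ofInstant(instant, ZoneOffset.UTC);
        }
        throw new IllegalArgumentException("Unsupported column type: " + column.getClass());
    }

    public static void main(String[] args) {
        Column asText = new StringColumn("2024-08-14 20:58:43");
        Column asMicros = new TimestampMicroColumn(1_723_669_123_000_000L);
        System.out.println(readDefensively(asText, 0));
        System.out.println(readDefensively(asMicros, 0));
        try {
            readAssumingString(asMicros, 0);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```

Dispatching on the runtime type keeps the reader working whichever encoding the server chooses, at the cost of one extra branch per column.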

Zeta or Flink or Spark Version

No response

Java or Scala Version

jdk11

Screenshots

No response

Are you willing to submit PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

Darkzoneleet avatar Aug 14 '24 13:08 Darkzoneleet

Which version of Doris are you using?

liugddx avatar Aug 15 '24 02:08 liugddx

2.1.1

Darkzoneleet avatar Aug 15 '24 03:08 Darkzoneleet

+1

luzongzhu avatar Aug 19 '24 08:08 luzongzhu

Currently, Doris 2.1.x has an 8-hour time difference when querying time-type data. I will fix this issue after https://github.com/apache/doris/pull/38215 is merged.
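
As an illustration of how an 8-hour difference can arise, the sketch below renders the same epoch-microsecond value as wall-clock time in two zones. The thread does not say which time zones are involved; Asia/Shanghai (UTC+8) is an assumption chosen only because it matches the 8-hour figure, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

class EightHourOffsetDemo {
    /** Interprets epoch microseconds as wall-clock time in the given zone. */
    static LocalDateTime toLocal(long epochMicros, ZoneId zone) {
        Instant instant = Instant.ofEpochSecond(
                Math.floorDiv(epochMicros, 1_000_000L),
                Math.floorMod(epochMicros, 1_000_000L) * 1_000L);
        return LocalDateTime.ofInstant(instant, zone);
    }

    public static void main(String[] args) {
        long epochMicros = 1_723_669_123_000_000L; // 2024-08-14T20:58:43Z
        LocalDateTime utc = toLocal(epochMicros, ZoneOffset.UTC);
        // Assumed zone for the illustration; not stated anywhere in the thread.
        LocalDateTime shanghai = toLocal(epochMicros, ZoneId.of("Asia/Shanghai"));
        System.out.println("UTC:           " + utc);
        System.out.println("Asia/Shanghai: " + shanghai);
        System.out.println("Difference:    " + Duration.between(utc, shanghai).toHours() + "h");
    }
}
```

If one side writes the value in one zone and the other side reads it in another, the stored instant is unchanged but the displayed DATETIME shifts by the zone difference.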

liugddx avatar Aug 19 '24 08:08 liugddx

Sorry, I contacted the maintainers at SelectDB and they don't really know what's going on. Maybe further communication with them is needed? @liugddx

Darkzoneleet avatar Sep 03 '24 09:09 Darkzoneleet

Take a look at this: https://github.com/apache/doris/issues/38174#issuecomment-2245307121

liugddx avatar Sep 03 '24 11:09 liugddx

Hmm, I talked with them afterwards and it seems they had misunderstood. It also seems this pull request is now merged, so I guess this issue could be fixed now. Thanks in advance :)

Darkzoneleet avatar Oct 11 '24 09:10 Darkzoneleet

Hmm, I talked with them afterwards and it seems they had misunderstood. This pull request is now merged, so I guess this issue could be fixed now. Thanks in advance :)

Darkzoneleet avatar Oct 15 '24 02:10 Darkzoneleet

This issue will be resolved, please be patient.

liugddx avatar Oct 15 '24 02:10 liugddx

I encountered the same issue on Doris 3.x, where this error is reported when the source column is of type timestamp. On Doris 1.2.1, the LARGEINT primary key also fails with a type cast error: VarCharVector cannot be cast to BigIntVector.

W-dragan avatar Nov 18 '24 07:11 W-dragan

Has this been fixed? Half a year has passed since then :p

Darkzoneleet avatar Mar 19 '25 08:03 Darkzoneleet

Please upgrade to version 2.3.9

liugddx avatar Mar 19 '25 11:03 liugddx

Which PR is it specifically? I can't upgrade for the time being.

LeonYoah avatar May 15 '25 03:05 LeonYoah

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Jul 12 '25 00:07 github-actions[bot]

This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.

github-actions[bot] avatar Jul 20 '25 00:07 github-actions[bot]