[Feature] Add Apache Doris Extract Node for Agent

Open dockerzhang opened this issue 3 years ago • 3 comments

Description

For guidance on writing an Agent plugin, refer to: https://inlong.apache.org/docs/next/design_and_concept/how_to_write_plugin_agent

Use case

No response

Are you willing to submit PR?

  • [ ] Yes, I am willing to submit a PR!

Code of Conduct

dockerzhang, Jul 14 '22 08:07

Hi, I want to do this task.

Loveca, Jul 23 '22 04:07

@Loveca assigned to you.

dockerzhang, Jul 26 '22 09:07

Motivation

Add Apache Doris Extract Node for Agent

About Doris Data Export

Data Export is a Doris feature that exports the data in a user-specified table or partition, in text format, to a remote storage system such as HDFS or BOS through the Broker process. Data can also be exported to the local file system.

Export To HDFS

EXPORT TABLE db1.tbl1 
PARTITION (p1,p2)
[WHERE [expr]]
TO "hdfs://host/path/to/export/" 
PROPERTIES
(
    "label" = "mylabel",
    "column_separator"=",",
    "columns" = "col1,col2",
    "exec_mem_limit"="2147483648",
    "timeout" = "3600"
)
WITH BROKER "hdfs"
(
    "username" = "user",
    "password" = "passwd"
);

Export To Local

EXPORT TABLE tablename TO "file:///local_file_path"

Design

image

1. Doris exports the data to an HDFS data file or a local data file through the EXPORT command.
2. The InLong Agent reads the corresponding data file.

Implementation

  • Reader: Implement DorisReader to read the data files exported by Doris
  • Source: Implement DorisSource, which holds the split logic and returns the list of Readers
  • Sink: Use ProxySink
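
To make the Reader/Source split concrete, here is a rough sketch in Python (for brevity only; the real Agent plugin would be written in Java against the InLong plugin interfaces, which are not reproduced here). The class and method names, the one-reader-per-file split rule, and the local export directory are all assumptions made for illustration; the only things taken from the proposal are that Doris writes text files with a configurable column_separator and that the Source returns a list of Readers.

```python
import os
import tempfile
from typing import Iterator, List


class DorisReader:
    """Hypothetical reader: yields the rows of one exported data file."""

    def __init__(self, path: str, column_separator: str = ",") -> None:
        self.path = path
        self.column_separator = column_separator

    def read(self) -> Iterator[List[str]]:
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if line:  # skip blank lines
                    yield line.split(self.column_separator)


class DorisSource:
    """Hypothetical source: splits an export directory into one reader per file."""

    def __init__(self, export_dir: str, column_separator: str = ",") -> None:
        self.export_dir = export_dir
        self.column_separator = column_separator

    def split(self) -> List[DorisReader]:
        readers = []
        for name in sorted(os.listdir(self.export_dir)):
            path = os.path.join(self.export_dir, name)
            if os.path.isfile(path):
                readers.append(DorisReader(path, self.column_separator))
        return readers


def demo() -> List[List[str]]:
    """Write two fake export files, then read them back through the source."""
    with tempfile.TemporaryDirectory() as export_dir:
        for name, body in [("part-0", "1,a\n2,b\n"), ("part-1", "3,c\n")]:
            with open(os.path.join(export_dir, name), "w", encoding="utf-8") as f:
                f.write(body)
        rows: List[List[str]] = []
        for reader in DorisSource(export_dir).split():
            rows.extend(reader.read())
        return rows
```

In this sketch each row returned by a reader would then be handed to ProxySink. Reading from HDFS instead of a local path would need an HDFS client in place of open(), which is left out here.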

Loveca, Jul 28 '22 11:07

In most usage scenarios, Doris is used as a storage target rather than as a collection source, so this issue is temporarily closed.

dockerzhang, Aug 23 '22 07:08