doris-flink-connector icon indicating copy to clipboard operation
doris-flink-connector copied to clipboard

[improve](cdc)improve compatibility with other databases by introducing an option to synchronize default values when inserting null data.

Open vinlee19 opened this issue 1 year ago • 0 comments

  1. In the previous PR https://github.com/apache/doris-flink-connector/pull/152, we introduced a configuration option to determine whether to synchronize default values for MySQL. However, with this setting enabled, if a schema has a default value and the insertion is null, the data synchronized to Doris will use the default value, leading to inconsistent results. By enabling the '--ignore-default-value' option, default values will not be synchronized for MySQL,Oracle, PostgreSQL, and SQL Server. For example , if you create table as follows in Oracle:
(
    ID                           NUMBER(9) not null
        primary key,
    VAL_VARCHAR                  VARCHAR2(1000)                    default 'vc2',
    VAL_VARCHAR2                 VARCHAR2(1000)                    default 'vc2',
    VAL_NVARCHAR2                NVARCHAR2(1000)                   default 'nvc2',
    VAL_CHAR                     CHAR(3)                           default 'c',
    VAL_NCHAR                    NCHAR(3)                          default 'nc',
    VAL_F                        FLOAT                             default 3.33,
    VAL_F_10                     FLOAT(10)                         default 8.888,
    VAL_NUM                      NUMBER(10, 6)                     default 4.444,
    VAL_DP                       FLOAT                             default 5.555,
    VAL_R                        FLOAT(63)                         default 6.66,
    VAL_DECIMAL                  NUMBER(10, 6)                     default 1234.567891,
    VAL_NUMERIC                  NUMBER(10, 6)                     default 1234.567891,
    VAL_NUM_VS                   NUMBER                            default 77.323,
    VAL_INT                      NUMBER                            default 1,
    VAL_INTEGER                  NUMBER                            default 22,
    VAL_SMALLINT                 NUMBER                            default 333,
    VAL_NUMBER_38_NO_SCALE       NUMBER(38)                        default 4444,
    VAL_NUMBER_38_SCALE_0        NUMBER(38)                        default 5555,
    VAL_NUMBER_1                 NUMBER(1)                         default 1,
    VAL_NUMBER_2                 NUMBER(2)                         default 99,
    VAL_NUMBER_4                 NUMBER(4)                         default 9999,
    VAL_NUMBER_9                 NUMBER(9)                         default 999999999,
    VAL_NUMBER_18                NUMBER(18)                        default 999999999999999999,
    VAL_NUMBER_2_NEGATIVE_SCALE  NUMBER(1, -1)                     default 94,
    VAL_NUMBER_4_NEGATIVE_SCALE  NUMBER(2, -2)                     default 9949,
    VAL_NUMBER_9_NEGATIVE_SCALE  NUMBER(8, -1)                     default 999999994,
    VAL_NUMBER_18_NEGATIVE_SCALE NUMBER(16, -2)                    default 999999999999999949,
    VAL_NUMBER_36_NEGATIVE_SCALE NUMBER(36, -2)                    default 999999999999999999999999999999999949,
    VAL_DATE                     DATE                              default TO_DATE('2022-10-30', 'yyyy-mm-dd'),
    VAL_TS                       TIMESTAMP(6)                      default TO_TIMESTAMP('2022-10-30 12:34:56.00789', 'yyyy-mm-dd HH24:MI:SS.FF5'),
    VAL_TS_PRECISION2            TIMESTAMP(2)                      default TO_TIMESTAMP('2022-10-30 12:34:56.12545', 'yyyy-mm-dd HH24:MI:SS.FF5'),
    VAL_TS_PRECISION4            TIMESTAMP(4)                      default TO_TIMESTAMP('2022-10-30 12:34:56.12545', 'yyyy-mm-dd HH24:MI:SS.FF5'),
    VAL_TS_PRECISION9            TIMESTAMP(6)                      default TO_TIMESTAMP('2022-10-30 12:34:56.125456789',
                                                                                        'yyyy-mm-dd HH24:MI:SS.FF9'),
    VAL_TSTZ                     TIMESTAMP(6) WITH TIME ZONE       default TO_TIMESTAMP_TZ(
            '2022-10-30 01:34:56.00789 -11:00', 'yyyy-mm-dd HH24:MI:SS.FF5 TZH:TZM'),
    VAL_TSLTZ                    TIMESTAMP(6) WITH LOCAL TIME ZONE default TO_TIMESTAMP_TZ('2022-10-30 01:34:56.00789',
                                                                                           'yyyy-mm-dd HH24:MI:SS.FF5'),
    VAL_INT_YTM                  INTERVAL YEAR(2) TO MONTH         default INTERVAL '-3-6' YEAR TO MONTH,
    VAL_INT_DTS                  INTERVAL DAY(3) TO SECOND(2)      default INTERVAL '-1 2:3:4.56' DAY TO SECOND
)
/

Insert a record with null values:

INSERT INTO DORISTEST.FULL_TYPES VALUES (
    2, NULL, NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL, NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL
);

If you do not set the '--ignore-default-value' option to true, the data will be displayed as follows:

ID: 2
                 VAL_VARCHAR: vc2
                VAL_VARCHAR2: vc2
               VAL_NVARCHAR2: nvc2
                    VAL_CHAR: c  
                   VAL_NCHAR: nc 
                       VAL_F: 3.33
                    VAL_F_10: 8.888
                     VAL_NUM: 4.444000
                      VAL_DP: 5.555
                       VAL_R: 6.66
                 VAL_DECIMAL: 1234.567891
                 VAL_NUMERIC: 1234.567891
                  VAL_NUM_VS: 77.323
                     VAL_INT: 1
                 VAL_INTEGER: 22
                VAL_SMALLINT: 333
      VAL_NUMBER_38_NO_SCALE: 4444
       VAL_NUMBER_38_SCALE_0: 5555
                VAL_NUMBER_1: 1
                VAL_NUMBER_2: 99
                VAL_NUMBER_4: 9999
                VAL_NUMBER_9: 999999999
               VAL_NUMBER_18: 999999999999999999
 VAL_NUMBER_2_NEGATIVE_SCALE: 94
 VAL_NUMBER_4_NEGATIVE_SCALE: 9949
 VAL_NUMBER_9_NEGATIVE_SCALE: 999999994
VAL_NUMBER_18_NEGATIVE_SCALE: 999999999999999949
VAL_NUMBER_36_NEGATIVE_SCALE: NULL
                    VAL_DATE: NULL
                      VAL_TS: NULL
           VAL_TS_PRECISION2: NULL
           VAL_TS_PRECISION4: NULL
           VAL_TS_PRECISION9: NULL
                    VAL_TSTZ: 2022-10-30 20:34:56.007890
                   VAL_TSLTZ: 2022-10-30 01:34:56.007890
                 VAL_INT_YTM: NULL
                 VAL_INT_DTS: NULL
  1. Support for the SQL Server XML data type.

Proposed changes

Issue Number: close #xxx

Problem Summary:

Describe the overview of changes.

Checklist(Required)

  1. Does it affect the original behavior: (Yes/No/I Don't know)
  2. Has unit tests been added: (Yes/No/No Need)
  3. Has document been added or modified: (Yes/No/No Need)
  4. Does it need to update dependencies: (Yes/No)
  5. Are there any changes that cannot be rolled back: (Yes/No)

Further comments

If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...

vinlee19 avatar Apr 08 '24 12:04 vinlee19