Capturing our Oracle ERP's archive logs causes explosive log growth
Describe the bug:
Environment:
- Flink version: 1.13
- Flink CDC version: 2.1
- Database and version: Oracle 12c
I am capturing change logs from our company's Oracle ERP, and it causes explosive archive log growth. Normally we generate a few hundred MB of logs per day; after capturing just 3 tables, we generate about 200 GB of logs per day.
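To quantify the growth rate before and after enabling CDC, a query like the following sketch can report daily archive log volume (it assumes the monitoring user can read the standard Oracle dynamic view `V$ARCHIVED_LOG`; adjust for your retention window):

```sql
-- Sketch: archive log volume generated per day, in GB.
SELECT TRUNC(completion_time) AS day,
       ROUND(SUM(blocks * block_size) / 1024 / 1024 / 1024, 2) AS gb_generated
FROM   v$archived_log
GROUP  BY TRUNC(completion_time)
ORDER  BY day;
```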
@szgyh I'm not very familiar with Oracle, but could you try configuring a log cleanup strategy to prune stale archive logs periodically?
@szgyh I guess you are using the Debezium config `log.mining.strategy: redo_log_catalog`, or relying on its default value (`redo_log_catalog`). I suggest setting `log.mining.strategy` to `online_catalog`, which should solve this problem.
The root cause is that `log.mining.strategy: redo_log_catalog` calls a procedure that writes a fresh data dictionary into the archive log, which is a known issue. If you want to keep `redo_log_catalog`, you can configure larger redo log files in your Oracle database to reduce archive log generation.
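If you do keep `redo_log_catalog`, enlarging the online redo logs might look like the following sketch (the 2 GB size and group numbers 4–6 are hypothetical values; adapt them to your instance, and drop the old, smaller groups only once they are INACTIVE after a log switch):

```sql
-- Add larger redo log groups (sizes and group numbers are example values).
ALTER DATABASE ADD LOGFILE GROUP 4 SIZE 2G;
ALTER DATABASE ADD LOGFILE GROUP 5 SIZE 2G;
ALTER DATABASE ADD LOGFILE GROUP 6 SIZE 2G;
-- Force a switch, then drop an old group once V$LOG shows it as INACTIVE.
ALTER SYSTEM SWITCH LOGFILE;
ALTER DATABASE DROP LOGFILE GROUP 1;
```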
I ran into the same problem. Where can I set this parameter?
Problem solved. Thanks @molsionmo
I had missed this in the docs: setting 'debezium.log.mining.strategy'='online_catalog' when creating the table fixes it.
Without this option, Oracle keeps writing archive logs and disk space fills up quickly.
Data transfer was also very slow (it took about 2 minutes); with this option it is fast.
CREATE TABLE products (
    ID INT,
    NAME STRING,
    DESCRIPTION STRING,
    PRIMARY KEY (ID) NOT ENFORCED
) WITH (
    'connector' = 'oracle-cdc',
    'hostname' = 'localhost',
    'port' = '1521',
    'username' = 'flinkuser',
    'password' = 'flinkpw',
    'database-name' = 'XE',
    'schema-name' = 'flinkuser',
    'table-name' = 'products',
    'debezium.log.mining.strategy' = 'online_catalog'
);
@RichieZ @molsionmo After this setting, if the CDC job fails, could data be lost in theory?
A Flink CDC job relies on checkpoints to save the SCN and state for recovery, so please enable checkpointing.
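For example, in the Flink SQL client checkpointing can be enabled before submitting the job; the interval below is an illustrative value, not a recommendation:

```sql
-- Enable periodic checkpoints so the Oracle CDC source can persist its SCN
-- and resume from it after a failure (interval is an example value).
SET 'execution.checkpointing.interval' = '60s';
-- Optional: retain the last checkpoint when the job is cancelled.
SET 'execution.checkpointing.externalized-checkpoint-retention' = 'RETAIN_ON_CANCELLATION';
```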
Hello, is it possible to solve this by changing LogMiner's log generation/reading strategy on the database side, rather than by modifying the table's extra properties?
Considering collaboration with developers around the world, please re-create your issue in English on Apache Jira under project Flink with component tag Flink CDC. Thank you!
cc @GOODBOY008