python-mysql-replication

Automatically ignore tables whose names start with _

Open dec1985 opened this issue 7 years ago • 6 comments

Using pt-online-schema-change automatically generates a new table whose name starts with '_'. Ignoring such tables would improve performance a lot!

dec1985 avatar Feb 07 '18 08:02 dec1985

I don't think this is a good idea. Some people may want those tables.

I would prefer that consumers just ignore those tables themselves if they want to.
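A minimal sketch of what consumer-side filtering could look like. It assumes each row event exposes its table name as a `table` attribute and that events without one (for example rotate events) should pass through; `drop_underscore_tables` and the stand-in events are hypothetical names used only for illustration. Note that this filters events after they have been read from the stream, so it probably does not give the performance gain requested above.

```python
from types import SimpleNamespace


def drop_underscore_tables(events, prefix="_"):
    """Yield only events whose table name does not start with `prefix`.

    Events without a `table` attribute are passed through unchanged.
    """
    for event in events:
        table = getattr(event, "table", None)
        if table is not None and table.startswith(prefix):
            continue  # skip temporary tables such as `_users_new`
        yield event


# Stand-in events; a real consumer would iterate over the stream reader instead.
fake_stream = [
    SimpleNamespace(table="users"),
    SimpleNamespace(table="_users_new"),
    SimpleNamespace(),  # an event that carries no table name
]

kept = list(drop_underscore_tables(fake_stream))
```

With a real stream, the consumer would wrap the reader in the same way: `for event in drop_underscore_tables(stream): ...`.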

baloo avatar Feb 07 '18 16:02 baloo

We have many big tables with tens of millions of rows, so we can only use pt-online-schema-change to alter them (this tool creates a new table whose name starts with _ and ends with _new, inserts all rows into that new table, and then renames the new table to the old table name). And we may alter tables many times, for example altering a table every day.

Using ignored_tables means changing the config file every time and restarting the sync process, which is very painful.

Normal table names don't start with "_".

Or maybe I could add a config option to control whether tables whose names start with "_" are skipped?

dec1985 avatar Feb 08 '18 03:02 dec1985

I know pt-online-schema-change and I also run it. But in its current state this is a breaking change for most users, so I'm not merging this.

Maybe a better option would be a PR to have ignore_tables optionally include a user function? Something like:

def ignore_pt_online_schema_change(tablename):
    return tablename.startswith('_')

ignore_tables = [
    'user_passwords',
    ignore_pt_online_schema_change
]
   
BinLogStreamReader(ignore_tables=ignore_tables)

Would that work for you?
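To make the proposal concrete, here is a rough sketch of how a mixed list of names and functions might be evaluated. `is_ignored` is a hypothetical helper, not an existing library function, and exact string comparison for plain names is an assumption.

```python
def is_ignored(table, ignore_tables):
    """Return True if `table` matches any rule in `ignore_tables`.

    A rule is either a plain table name (exact match) or a callable
    that takes the table name and returns a truthy value to ignore it.
    """
    for rule in ignore_tables:
        if callable(rule):
            if rule(table):
                return True
        elif rule == table:
            return True
    return False


def ignore_pt_online_schema_change(tablename):
    return tablename.startswith('_')


rules = ['user_passwords', ignore_pt_online_schema_change]
```

Existing configurations that list only names would keep working under this scheme, since plain strings are still matched as before.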

baloo avatar Feb 08 '18 17:02 baloo

👍 for @baloo's solution: users can still make use of tables starting with _.

But you are right, with pt-online-schema-change you need to filter.

Alternative solution:

def ignore_pt_online_schema_change(tablename):
    return tablename.startswith('_')
   
BinLogStreamReader(ignore_tables_callback=ignore_pt_online_schema_change)

julien-duponchelle avatar Feb 08 '18 21:02 julien-duponchelle

noplay's solution would work for me. Or I could add an option that does not filter tables starting with "_" by default, which is the same as noplay's solution done another way:

BinLogStreamReader(ignore_tables_startswith='_')

The default value of ignore_tables_startswith would be None.
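For comparison, the prefix option can be expressed in terms of the callback idea. This is only a sketch: `ignore_prefix` is a hypothetical helper, and treating None as "filter nothing" is an assumption based on the default described above.

```python
def ignore_prefix(prefix=None):
    """Build a callback that ignores tables starting with `prefix`.

    With prefix=None (the proposed default) nothing is ignored.
    """
    if prefix is None:
        return lambda tablename: False

    def predicate(tablename):
        return tablename.startswith(prefix)

    return predicate


skip_underscore = ignore_prefix('_')
skip_nothing = ignore_prefix()
```

So a single callback parameter would cover the prefix case, while a prefix-only parameter could not cover other rules such as matching only names that end with _new.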

dec1985 avatar Feb 09 '18 10:02 dec1985

@baloo's idea of callback functions sounds good. We are also seeing a bunch of unnecessary events being created by PT-OSC. It would be nice to get this feature in :)

abrarsheikh avatar Feb 13 '18 00:02 abrarsheikh