python-mysql-replication
Critical Bug - Using the skip_to_timestamp option causes the information schema to be queried a huge number of times
Overview
We use the library to periodically sync updated rows to another database, and we use the skip_to_timestamp option to skip the binary log events that were already synced in the previous cycle. With this option set, the library queries the information schema far too many times.
Bug description
According to the logic in row_event.py (line 613), it seems the schema is fetched for every TABLE_MAP_EVENT whose table is not already present in the table_map. But in binlogstream.py (lines 551 to 557), the fetched schema is discarded if the event timestamp is earlier than the skip_to_timestamp option. Because the table_map was never populated, the schema is fetched again on the next TABLE_MAP_EVENT for the same table.
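To make the effect concrete, here is a minimal, self-contained model of the behaviour described above. It is not the library's code: `FakeSchemaSource` and `read_events` are hypothetical names, and the events are reduced to `(timestamp, table_id)` pairs. It only illustrates how skipping before caching turns one schema lookup into one lookup per skipped event.

```python
# Simplified model of the reported behaviour; NOT the library's actual code.
# FakeSchemaSource and read_events are hypothetical names for illustration.

class FakeSchemaSource:
    """Stands in for the information_schema lookup and counts the calls."""

    def __init__(self):
        self.queries = 0

    def fetch(self, table_id):
        self.queries += 1
        return {"table_id": table_id, "columns": ["id", "name"]}


def read_events(events, skip_to_timestamp, schema_source):
    table_map = {}
    for timestamp, table_id in events:
        # The schema is fetched whenever the table is missing from table_map.
        if table_id in table_map:
            schema = table_map[table_id]
        else:
            schema = schema_source.fetch(table_id)

        # Skipped events never reach the line that stores the schema,
        # so the next event for the same table triggers another fetch.
        if timestamp < skip_to_timestamp:
            continue

        table_map[table_id] = schema
    return table_map


source = FakeSchemaSource()
events = [(t, 42) for t in range(1000)]  # 1000 table-map events, one table
read_events(events, skip_to_timestamp=990, schema_source=source)
print(source.queries)  # 991: one fetch per skipped event, plus one that is cached
```

In this model the number of schema queries grows with the number of skipped events rather than with the number of distinct tables, which matches the symptom reported here.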
Resolution
Populating the table_map before continuing the loop in binlogstream.py (lines 551 to 557) should fix the issue.
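A rough sketch of the proposed ordering, again as a simplified model rather than a patch against binlogstream.py: the schema is stored in the table map before the timestamp check, so skipped events reuse the cached entry. `read_events_fixed` and `fetch_schema` are hypothetical names, and events are modelled as `(timestamp, table_id)` pairs.

```python
# Simplified model of the proposed fix; NOT a patch for the library itself.
# read_events_fixed and fetch_schema are hypothetical names for illustration.

def read_events_fixed(events, skip_to_timestamp, fetch_schema):
    table_map = {}
    delivered = []
    for timestamp, table_id in events:
        # Cache the schema first, regardless of whether the event is skipped.
        if table_id not in table_map:
            table_map[table_id] = fetch_schema(table_id)

        # Only now decide whether the event is old enough to skip.
        if timestamp < skip_to_timestamp:
            continue

        delivered.append((timestamp, table_id))
    return table_map, delivered


calls = []

def fetch_schema(table_id):
    calls.append(table_id)  # record each simulated information_schema query
    return {"table_id": table_id, "columns": ["id", "name"]}


events = [(t, 42) for t in range(1000)]  # 1000 table-map events, one table
table_map, delivered = read_events_fixed(events, 990, fetch_schema)
print(len(calls), len(delivered))  # 1 10
```

With this ordering the number of schema lookups depends only on the number of distinct tables seen, while the skipped events are still withheld from the caller.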
Hi @shivamgly, thanks for the report. Would you like to propose a PR?