vtbackup: disable redo log before starting replication

Open maxenglander opened this issue 2 years ago • 1 comments

Description

According to MySQL docs:

As of MySQL 8.0.21, you can disable redo logging using the ALTER INSTANCE DISABLE INNODB REDO_LOG statement. This functionality is intended for loading data into a new MySQL instance. Disabling redo logging speeds up data loading by avoiding redo log writes and doublewrite buffering.

See: https://dev.mysql.com/doc/refman/8.0/en/innodb-redo-log.html#innodb-disable-redo-logging

We can take advantage of this in vtbackup. This change disables the redo log on MySQL >= 8.0.21 before starting replication, and re-enables the redo log after stopping replication.
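The flow described above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `execer` interface, `withRedoLogDisabled`, `supportsDisableRedoLog`, and the `recorder` fake are hypothetical names invented for the example. Only the two `ALTER INSTANCE` statements and the 8.0.21 threshold come from the MySQL docs quoted above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// execer is a hypothetical stand-in for whatever vtbackup uses to run SQL
// against the temporary mysqld; it is not a real Vitess interface.
type execer interface {
	Exec(query string) error
}

// supportsDisableRedoLog reports whether a version string such as "8.0.30"
// or "8.0.21-log" is at least 8.0.21. It only compares numbers and does not
// distinguish MySQL from other flavors (for example MariaDB 10.x).
func supportsDisableRedoLog(version string) bool {
	// Drop build suffixes such as "-log".
	if i := strings.IndexAny(version, "-+ "); i >= 0 {
		version = version[:i]
	}
	parts := strings.Split(version, ".")
	if len(parts) != 3 {
		return false
	}
	var n [3]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil {
			return false
		}
		n[i] = v
	}
	if n[0] != 8 {
		return n[0] > 8
	}
	return n[1] > 0 || n[2] >= 21
}

// withRedoLogDisabled runs replicate with the redo log disabled when the
// server supports it, and always attempts to re-enable the log afterwards.
func withRedoLogDisabled(db execer, version string, replicate func() error) error {
	if !supportsDisableRedoLog(version) {
		return replicate()
	}
	if err := db.Exec("ALTER INSTANCE DISABLE INNODB REDO_LOG"); err != nil {
		return err
	}
	replErr := replicate()
	// Re-enable even if replication failed; report the first error seen.
	if err := db.Exec("ALTER INSTANCE ENABLE INNODB REDO_LOG"); err != nil && replErr == nil {
		return err
	}
	return replErr
}

// recorder is a fake execer that records statements instead of running them.
type recorder struct{ queries []string }

func (r *recorder) Exec(query string) error {
	r.queries = append(r.queries, query)
	return nil
}

func main() {
	rec := &recorder{}
	err := withRedoLogDisabled(rec, "8.0.30", func() error {
		rec.queries = append(rec.queries, "<replication catches up>")
		return nil
	})
	fmt.Println("error:", err)
	for _, q := range rec.queries {
		fmt.Println(q)
	}
}
```

Re-enabling on the error path matters here because a server running with the redo log disabled is not crash safe, so the sketch never returns early once the log has been turned off.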

Related Issue(s)

Checklist

  • [ ] "Backport me!" label has been added if this change should be backported
  • [ ] Tests were added or are not required
  • [ ] Documentation was added or is not required

Deployment Notes

maxenglander avatar Sep 22 '22 22:09 maxenglander

Review Checklist

Hello reviewers! :wave: Please follow this checklist when reviewing this Pull Request.

General

  • [x] Ensure that the Pull Request has a descriptive title.
  • [x] If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • [x] If a new flag is being introduced, review whether it is really needed. The flag names should be clear and intuitive (as far as possible), and the flag's help should be descriptive. Additionally, flag names should use dashes (-) as word separators rather than underscores (_).
  • [x] If a workflow is added or modified, each item in Jobs should be named in order to mark it as required. If the workflow should be required, the GitHub Admin should be notified.

Bug fixes

  • [x] There should be at least one unit or end-to-end test.
  • [x] The Pull Request description should either include a link to an issue that describes the bug OR an actual description of the bug and how to reproduce, along with a description of the fix.

Non-trivial changes

  • [x] There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • [x] Should be documented, either by modifying the existing documentation or creating new documentation.
  • [ ] New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • [x] Protobuf changes should be wire-compatible.
  • [x] Changes to _vt tables and RPCs need to be backward compatible.
  • [x] vtctl command output order should be stable and awk-able.

vitess-bot[bot] avatar Sep 22 '22 22:09 vitess-bot[bot]

vtbackup_transform is failing consistently which is concerning, but I also see this test failing a bunch on main recently, so maybe it's a flake?

maxenglander avatar Sep 23 '22 16:09 maxenglander

> vtbackup_transform is failing consistently which is concerning, but I also see this test failing a bunch on main recently, so maybe it's a flake?

vtbackup_transform has been VERY flaky so not likely related. @rsajwani has been looking into it but we haven't found the cause yet (only seems to happen in the CI).

mattlord avatar Sep 23 '22 18:09 mattlord

Hey @mattlord, I was able to address your suggestions. I tried adding some checks to the E2E test, but ran into a challenge: the E2E test executes the vtbackup binary, which starts and destroys a fresh mysqld instance, so the test code has no way to inspect MySQL. I reworked the code a bit so that vtbackup optionally retains its temporary files. That lets me inspect them (in particular the error.log) from the test after the backup completes, and then delete them from the test. Not ideal, definitely open to better ideas.

maxenglander avatar Sep 26 '22 19:09 maxenglander

@mattlord I was able to do what you suggested:

  • Connect to mysqld directly from the E2E test.
  • Verify redo log changes with performance schema error sink.
  • Remove those flags and the CNF cruft.

A bit worried about flakiness, since mysqld could shut down before the test is able to verify, but the test is passing consistently locally 👍
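For readers following along, the verification step could look something like the sketch below. It is an illustration rather than the test in this PR: `sawDisableThenEnable` and the sample messages are hypothetical, and the marker substrings are assumptions that should be checked against what the server actually logs. The `performance_schema.error_log` table it queries is available from MySQL 8.0.22.

```go
package main

import (
	"fmt"
	"strings"
)

// errorLogQuery reads error log messages through the performance schema
// (MySQL 8.0.22 and later), oldest entry first.
const errorLogQuery = "SELECT DATA FROM performance_schema.error_log ORDER BY LOGGED"

// sawDisableThenEnable reports whether a message containing disabledMarker
// is later followed by one containing enabledMarker. The markers are passed
// in because the exact log wording is an assumption of this sketch.
func sawDisableThenEnable(messages []string, disabledMarker, enabledMarker string) bool {
	disabled := false
	for _, m := range messages {
		switch {
		case !disabled && strings.Contains(m, disabledMarker):
			disabled = true
		case disabled && strings.Contains(m, enabledMarker):
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical messages standing in for rows returned by errorLogQuery.
	messages := []string{
		"server started",
		"redo logging is disabled",
		"redo logging is enabled",
	}
	fmt.Println(errorLogQuery)
	fmt.Println(sawDisableThenEnable(messages, "redo logging is disabled", "redo logging is enabled"))
}
```

In a real test the messages would come from running `errorLogQuery` against the mysqld that vtbackup started, before it shuts down. MySQL also exposes the `Innodb_redo_log_enabled` status variable, but that only reflects the current state, whereas the error log records that both transitions happened.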

maxenglander avatar Oct 20 '22 02:10 maxenglander

@maxenglander your last commit needed to include changes to unit tests. It's good to run make unit_test locally before pushing changes. CI is a very expensive way to detect unit test failures.

deepthi avatar Oct 21 '22 22:10 deepthi

Understood @deepthi, will do that going forward

maxenglander avatar Oct 21 '22 22:10 maxenglander