wp-rocket icon indicating copy to clipboard operation
wp-rocket copied to clipboard

Maybe optimize lookups of tables in DB to stop using SHOW TABLES LIKE

Open MathieuLamiot opened this issue 11 months ago • 3 comments

Context Reported by a user through the support: https://wp-media.slack.com/archives/C43T1AYMQ/p1709561232039849

Expected behavior Investigate the possibility to rework the exists() (and related) methods in the codebase, that are using SHOW TABLES LIKE, to something using SELECT table_name FROM information_schema.tables WHERE table_name like '%s' LIMIT 1 to reduce the compute time on the DB when there are many tables to parse from.

Acceptance Criteria Non-regression tests on the impacted features.

Additional information The issue comes from BerlinDB, not WP Rocket code itself: https://github.com/berlindb/core/blob/9c4d1fe588b4bff7b9f45091d7890bbbef9c0bc4/src/Database/Table.php#L351 We received 2 suggestions:

  • SELECT table_name FROM information_schema.tables WHERE table_name = '%s' LIMIT 1
  • SHOW TABLES = The main risk here are regressions from changing from a LIKE statement to an = (maybe with prefixes, etc. ?). There is probably a reason why LIKE is used in BerlinDB, so it's much safer to stick with it for now. That being said, I think the SHOW TABLES is a high-level instruction that will first list all tables, and then apply filters if any. Going with an explicit SQL statement should be much faster already, hence my suggestion:

SELECT table_name FROM information_schema.tables WHERE table_name like '%s' LIMIT 1

MathieuLamiot avatar Mar 04 '24 15:03 MathieuLamiot

This is still an issue. If you have a database with 150k+ tables, it destroys performance. It goes from 2-3 seconds per query to .02.

ScottE-BME avatar May 31 '24 21:05 ScottE-BME

What is happening with this? The 3.16 update to your plugin completely crippled our sites as you added even more of these statements with the Optimize Critical Images feature.

We are stuck on 3.15.10 until this is handled.

ScottE-BME avatar Jul 24 '24 02:07 ScottE-BME

related https://secure.helpscout.net/conversation/2685036291/508480/

alfonso100 avatar Aug 21 '24 13:08 alfonso100

Context:

The objective is to replace the current usage of SHOW TABLES LIKE with a more efficient SELECT table_name FROM information_schema.tables WHERE table_name LIKE '%s' LIMIT 1 query. This change aims to reduce the database load and improve performance when there are many tables in the database.

Reproduce the problem:

  1. Simulate a database with a large number of tables.
  2. Use the existing BerlinDB code that checks for table existence using SHOW TABLES LIKE %s.
  3. Measure the time it takes to execute this query.

Identify the root cause:

The SHOW TABLES LIKE method lists all tables and applies a filter afterward, making it less efficient for databases with many tables. The issue is caused by the need to parse through all the tables in the database before filtering for the specific one.

Scope a solution:

Replace the SHOW TABLES LIKE query with:

SELECT table_name FROM information_schema.tables WHERE table_name = '%s' LIMIT 1

This query directly checks the information_schema.tables and is more efficient since it only retrieves the necessary data.

Development Steps:

  1. Modify `table_exists() from AbstractQuery.php Updated code:
protected function table_exists(): bool {
    if ( self::$table_exists ) {
        return true;
    }

    // Get the database interface.
    $db = $this->get_db();

    // Bail if no database interface is available.
    if ( ! $db ) {
        return false;
    }

    // Query statement to check the information schema.
    $query    = 'SELECT table_name FROM information_schema.tables WHERE table_name = %s LIMIT 1';
    $prepared = $db->prepare( $query, $this->table_name );
    $result   = $db->get_var( $prepared );

    // Does the table exist?
    $exists = $this->is_success( $result );

    if ( $exists ) {
        self::$table_exists = $exists;
    }

    return $exists;
}
  1. Modify table_exists() from AbstractQueries.php Updated code:
protected function table_exists(): bool {
    if ( self::$table_exists ) {
        return true;
    }

    // Get the database interface.
    $db = $this->get_db();

    // Bail if no database interface is available.
    if ( ! $db ) {
        return false;
    }

    // Query statement to check the information schema.
    $query    = 'SELECT table_name FROM information_schema.tables WHERE table_name = %s LIMIT 1';
    $prepared = $db->prepare( $query, $this->table_name );
    $result   = $db->get_var( $prepared );

    // Does the table exist?
    $exists = $this->is_success( $result );

    if ( $exists ) {
        self::$table_exists = $exists;
    }

    return $exists;
}

Effort Estimation

  • Size: Small (S) — Only three files need modifications, and the changes are straightforward.Size: Small (S) — Only three files need modifications, and the changes are straightforward.

Non-Regression Testing:

  • Ensure tests cover scenarios where tables:
    • Exist and are queried using prefixes.
    • Do not exist, to confirm the correct return values.
  • Test with databases containing many tables to validate performance improvements.

Miraeld avatar Sep 09 '24 09:09 Miraeld

@Miraeld It's worth mentioning that the 1st point is from a dependency

MathieuLamiot avatar Sep 09 '24 10:09 MathieuLamiot

Yes true, I didn't pay attention to the full path of that file, I edited the grooming accordingly.

Miraeld avatar Sep 09 '24 14:09 Miraeld

As @MathieuLamiot mentioned, this is inside dependency, so we may need to fix it from berlindb side or at least in its fork, what do u think?

wordpressfan avatar Sep 11 '24 05:09 wordpressfan