revolution icon indicating copy to clipboard operation
revolution copied to clipboard

Potential drawbacks with InnoDB clustered indexes and MODx session structure

Open travisbotello opened this issue 8 years ago • 7 comments

Summary

Couple of weeks ago we saw in our status monitor (pinging each application in 5 minute intervals and measuring response time) that 3 applications are performing very slowly from time to time. (Response times from 2s-10s while the average is around ~0.1s). The slow log of MariaDB is full of slow queries (most of them easy insert and update queries for session storage) which sometimes take up to 15 seconds, but only from these three applications. (not MODx)

What these three applications have in common compared to MODx applications, they are all using InnoDB instead of MyISAM. So for further debugging we converted 3 MODx applications from MyISAM to InnoDB and these also started to suffer from the same delays.

Step to reproduce

Install MODx >= 2.6 which is using InnoDB as its default engine or convert your MyISAM tables to InnoDB. Then set the following in your my.cnf:

slow_query_log 				= 1
log_slow_verbosity				= query_plan,explain
long_query_time				= 1

It is very likely (especially if you run a very busy MODx site or multiple low-mid-sized MODx installations) that you start seeing your slowlog being triggered with queries like this:

# Time: 171128 15:11:48
# User@Host: modx[modx] @ localhost []
# Thread_id: 594  Schema: modx  QC_hit: No
# Query_time: 10.139805  Lock_time: 0.000081  Rows_sent: 0  Rows_examined: 0
# Rows_affected: 1
#
# explain: id   select_type table   type    possible_keys   key key_len ref rows    r_rows  filtered    r_filtered  Extra
# explain: 1    INSERT  modx_session    ALL NULL    NULL    NULL    NULL    NULL    NULL    100.00  100.00  NULL
#
use  modx;
SET timestamp=1511878308;
INSERT INTO `modx_session` (`id`, `access`, `data`) VALUES ('pk82r297e9p01tqc8cqqku55e0', 1511878298, 'modx.user.contextTokens|a:0:{}');

As you can see, this simple INSERT query took over 10s. When we profile one of those slow queries, you will see something like this:

MariaDB [modx]> SHOW PROFILE FOR QUERY 216;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000086 |
| checking permissions | 0.000010 |
| Opening tables       | 0.000027 |
| After opening tables | 0.000013 |
| System lock          | 0.000006 |
| Table lock           | 0.000006 |
| init                 | 0.000071 |
| updating             | 0.000099 |
| end                  | 0.000008 |
| query end            | 4.061019 |
| closing tables       | 0.000032 |
| Unlocking tables     | 0.000022 |
| freeing items        | 0.000012 |
| updating status      | 0.000028 |
| logging slow query   | 0.000155 |
| cleaning up          | 0.000026 |
+----------------------+----------+

The queries get stuck in query_end state as you can see in this example. Digging in deeper this happens when the db server stalls/flushes/re-organizes the index. When we take a look at the MODx sessions table with SHOW CREATE TABLE modx_sessions:

CREATE TABLE `modx_session` (
  `id` varchar(191) NOT NULL DEFAULT '',
  `access` int(20) unsigned NOT NULL,
  `data` mediumtext DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `access` (`access`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

We can see that PK id is not using auto_increment and DEFAULT '' which brings me to InnoDB clustered indexes issues which is summed up very well in this SO.

Summing up what is happening here:

From MySQL docs:

Every InnoDB table has a special index called the clustered index where the data for the rows is stored. Typically, the clustered index is synonymous with the primary key. To get the best performance from queries, inserts, and other database operations, you must understand how InnoDB uses the clustered index to optimize the most common lookup and DML operations for each table.

So the perfect PK structure for InnoDB tables can be described as: Unique, Narrow, Static, Ever-increasing pattern.

Source

When check MODx session table for this PK structure:

  • Unique => not really, hashing yes, but not using auto-increment
  • Narrow => not really
  • Static => yes
  • Ever-increasing pattern => no

So here we might have the potential source of the slow queries. What can happen when a non-ever-increasing clustered index is used is described here:

If the clustering key is ever-increasing then new rows have a specific location where they can be placed. If that location is at the end of the table then the new row needs space allocated to it but it doesn't have to make space in the middle of the table. If a row is inserted to a location that doesn't have any room then room needs to be made (e.g. you insert based on last name then as rows come in space will need to be made where that name should be placed). If room needs to be made, it's made by SQL Server doing something called a split. Splits in SQL Server are 50/50 splits - simply put - 50% of the data stays and 50% of the data is moved. This keeps the index logically intact (the lowest level of an index - called the leaf level - is a douly-linked list) but not physically intact. When an index has a lot of splits then the index is said to be fragmented. Good examples of an index that is ever-increasing are IDENTITY columns (and they're also naturally unique, natural static and naturally narrow) or something that follows as many of these things as possible - like a datetime column (or since that's NOT very likely to be unique by itself datetime, identity).

So, MODx sessions PK structure seems to be keeping InnoDB busy, especially when a lot of INSERTS and UPDATES are going on, either with a high-traffic installation or multiple low to mid-sized MODx installations. Obviously MyISAM was more solid/faster for this workflow...

When a new row is inserted (and this happens a lot under heavy load) at a random location of the index (because of the id/PK being a random hash) InnoDB sometimes won't find any space available and therefore has to rearrange the index and that takes time. Additionally this will cause fragmentation of the index and then slows down other queries not directly related to session.

From my dba exchange on this issue:

For every row added to modx_session table, you can expect it to take more time for any INSERT, UPDATE or DELETE as long as the ID is not UNIQUE (which it isn't because auto-increment is not used). Give this a try on just one table that is listed in slow log and then use EXPLAIN to confirm you went from RANGE type to EQ_REF type and you will get away from 100% filtering [..]

Expected behavior

Switching from MyISAM to InnoDB shouldn't have such a huge performance slowdown. From my point of view there are a couple of potential solutions:

  • Keep session related tables in MyISAM (quick fix only)
  • Switch default session driver from database to file
  • Re-Design session table to use unique, auto-increment PKs instead of hashing

Also MariaDB's conversion guide from MyISAM to InnoDB suggests that especially Indexes/PKs re-arrengment should be considered and auto-increment dummy PKs should be used: MariaDB converting tables from MyISAM to InnoDB

Recommendation. Look for tables without a PRIMARY KEY. Explicitly specify a PRIMARY KEY, even if it's an artificial AUTO_INCREMENT. This is not an absolute requirement, but it is a stronger admonishment for InnoDB than for MyISAM. Some day you may need to walk through the table; without an explicit PK, you can't do it.

Environment

Reproduced in:

  • CentOS 7
  • MODx 2.5.8, 2.6, 2.6.1
  • MariaDB 10.1.19
  • Nginx with PHP-FPM 7.1

travisbotello avatar Dec 19 '17 09:12 travisbotello

Thank you for a thorough detective job!

I can not say that I have seen this myself, because I have not played that much with InnoDB installs. Can you estimate the number of rows in the modx_session table? Are we talking a few thousand, tens of thousands or millions of records? I would like to be able to replicate this locally too.

I am not that familiar with MyISAM and InnoDB, and there are many years since I took any courses that dealt with the underlying implementation of such systems. But would the following solve the problem:

  1. Change the database layout to having a primary key that is an auto incremented integer.
  2. Add a index to the session key (today the id attribute).
  3. Make the index unique.

OptimusCrime avatar Dec 19 '17 09:12 OptimusCrime

For what it's worth - we run all our sites as INNODB (around 40-50 at the moment) and a few have over 800,000 rows in the modx_session table, our reporting doesn't show these slow queries or similar issues.

That being said, Adding an auto increment primary key and making "id" a unique index should help resolve the issue (as the compound key is then always incrementing). It should be simple to benchmark too.

markwillis82 avatar Dec 19 '17 09:12 markwillis82

I can't beat @markwillis82 numbers: We are running about 10 MODx installations with about 10k-15k rows in our modx_session tables..

@markwillis82 What version of MySQL/MariaDB are you running?

travisbotello avatar Dec 19 '17 10:12 travisbotello

AWS RDS - db.m4.xlarge, which is AWS' custom build of MySQL 5.6 (a bunch of changes have been backported from newer versions)

markwillis82 avatar Dec 19 '17 10:12 markwillis82

Yes, thank you for the detailed report. This is a fairly trivial change for new deployments, but I'm looking into how we can modify the existing "session_id" column and add the auto_increment primary key during an upgrade as it is not clear what the best way to approach that would be for existing tables.

opengeek avatar Dec 19 '17 22:12 opengeek

I am having the same issue on ModX v2.8.3. I've just added a dummy autoincrement column as primary key to the modx_session table. It has only been a day, but it looks like the issue is gone ever since. Before slow queries were logged every few hours.

An interesting fact that the table only had <1000 rows, still inserts sometimes took 20s. It is okay ever since the addition of the autoincrement column.

BenceSzalai avatar Sep 04 '21 18:09 BenceSzalai

Yes, thank you for the detailed report. This is a fairly trivial change for new deployments, but I'm looking into how we can modify the existing "session_id" column and add the auto_increment primary key during an upgrade as it is not clear what the best way to approach that would be for existing tables.

Maybe I don't understand the complexity involved, but what I did was this:

  • add new dummy_id column
  • create a new index for the existing id column
  • ensure the new dummy_id column has valid value for all existing rows, which can be done by numbering all existing rows from 1 to N, or deleting all existing session rows. I've chose the later as it was easier and we had no need to persist existing sessions, but an UPDATE query can be written that numbers each row incrementally as well for those who need to keep the sessions intact.
  • make the dummy_id column auto increment and make it primary key at the same time (i think auto increment requires the column to be primary key). Also if existing rows were numbered set the autoincrement value to the next available integer.

That's it. Table is "fixed". So after all it can be achieved by running few SQL statements. Am I missing a point?

BenceSzalai avatar Sep 05 '21 09:09 BenceSzalai