Mastering MySQL Indexing

Indexes are tricky things. In my experience, indexes are added whenever SQL queries are too slow. This makes sense. However, sometimes these indexes were added without being thought through enough (I am guilty of this). Sometimes they were thought through, but the table itself now has so many indexes that the optimizer doesn't know which one to choose from.
The overhead of indexes greatly effects tables that need to be written to or altered often (even indexes on NoSQL database greatly effect writes).
Over indexed tables inflate the database size dramatically which adds to people's concerns that the database becomes unwieldy.

I have written in the past about "why are your indexes larger then your data" which contains methods to find unused indexes. This post contains a lot more research and testing since then and I believe offers a more holistic approach to indexing.

Note: the examples in this post are taken from a system running Drupal.

Tools

Lets get started. To find how to index your tables, you need to gather information about how your database server is being used. The tools I use are as follows:

pt-query-digest

Since the days of mysqlsla, this tool has come far. It is the bread and butter of any performance audit. If you want to know which queries run on your server, you will need this tool to process those queries and order them nicely by worst/highest response time. There are many ways to actually use pt-query-digest and I will show a few of them in this post.

However, pt-query-digest has its issues. It is extremely resource intensive. So much so, that it is not recommended to run it on your main database server, but a redundant slave or an unused machine. I personally had it OOM quite a few times and caused a 'kernal panic' on one occasion. Also, the larger the log file you use with pt-query-digest the more exponentially longer it takes to finish processing it. I personally, do not recommend taking a log for longer than a 24 hour period or 30Gb of log file size, whichever is smaller.

pt-index-usage

When this tool came out, I could not wait to get my hands on it. It is a tool that runs the same slow log you use for pt-query-digest. It queries a database server with EXPLAIN statements from what it finds in the log. It then calculates which indexes were not used and prints out an ALTER TABLE statement to drop those indexes.

The problem with this tool is that it is largely unstable and unsafe to use. By its nature, it will flood the database with EXPLAIN calls and force the database to open tables. If your my.cnf variables are not suited for this, you can basically max out the tables that can be open on your server. When an app sends an SQL statement to the server at that point, it will give back an error. If it sends an SQL statement to a table that's already open, it will give a regular result.

Apart from opening too many tables, this tool can also corrupt MyISAM tables in the same respect that when a client doesn't close a MyISAM table properly, it will be marked as crashed.

One time, I saw it managed to corrupt some performance_schema tables.

The way I use this tool, is that I limit the work it should do DRASTICALLY:

I choose a specific table I want to review
I filter the slow log to use only the statements which query that table
I limit pt-index-usage to only look for that table.

If I miss the log filtering part, it causes me problems, so I do all the steps to play it safe.

user_statistics

This is a google patch for MySQL that can be found in Percona Server and MariaDB. Although you can patch it yourself as well.
It keeps track of all the tables that were read from or written to as well as the indexes that were read from. It also keeps track of users, but for this post, I will refer to the tables and indexes tracking.

What is nice about this patch/plugin is that it is very lightweight - completely unlike the two tools mentioned above. You can get information about tables and indexes over the course of a few days to a few weeks which is very difficult with both of the previous tools. You can then for example, safely remove an index after noticing that it hasn't been used for 3 weeks. In a short period of time, you may miss indexes that are used for reports or daily aggregation statements.

The problem with this patch/plugin is that you only get the number of rows that were read or changed. Those numbers can confuse you when one million queries could have read one row very efficiently and another query read one million rows very inefficiently. All that data is bunched together and you lose the granularity to know how it happened.

Timing is Everything

As I have eluded to before, there is a difference between the tools and the ranges of time each can offer.

A drupal table dp_users. Taken with mycheckpoint.

The above graph is taken with mycheckpoint and monitors the rows that are read on one of the most heavily read table on this system.

mysql> select table_name, rows_read from information_schema.table_statistics order by rows_read desc limit 5;

table_name rows_read

dp_users 2302477894

dp_node 1231318439

dp_comments 1071462211

dp_userpoints 1033073070

dp_search_index 260154684

5 rows in set (0.00 sec)

After I found that out, I set up mycheckpoint to custom monitor this table, just so I can have a better appreciation for how this table is used.

Note: When I refer to "slow log" in this post, I am referring to a slow log taken on a server that has microslow patch (available in Percona Server and MariaDB) and that the long_query_time was set to 0 for the period the slow log was taken.

You can see that there are two spikes (around 16:00 and 7:00) for that day.
If you take a slow log between 16:30 and 22:30, you will miss the queries that caused those spikes.
If you take a slow log for a short period of time around 15:00 and 18:00, that peak will skew your data and you will believe that the worst queries in your report have a much greater effect on your server capacity than they might do.

Alternatively, if you take a slow log for a 24 hour period, you may miss entirely the queries that cause the spikes as their impact has been diluted across a long period of time AND those slow queries may be the queries that have the most negative impact on your users.

There is no real ideal time period to measure. It depends on what you are setting out to do. If there is an issue with the system being slow at a certain peak time, then you can take a slow log for that period.
If you want a broad over view, you can try for a 24 hour period.
But be aware that this information can be imperfect and that you might need to do some more digging and a couple of tries to get a better idea of what is happening on your server.

So what I am saying is, that there is more than one way to view how your queries run on your server.

How I would gather data

Here is a list of data gathering techniques that are order by hierarchy

Ideally, a slow log from a database server with microslow patch and the user_statistics patch.
Slow log with microslow patch (no user_statistics).
TCPdump + 'pt-query-digest --type tcpdump' with additional filtering (with no microslow patch)
General log and a slave server to run 'pt-query-digest --execute' on (but you should only execute the SELECT statements).
Regular slow log with 1 sec or above (I would at least get slow queries)
Polling a server 'pt-query-digest --processlist'.

For my examples, I have information from option 1.

Processing the data

There are in fact, many ways to slice and dice your gathered data. I will demonstrate the methods I use and the reasons for them. The end result should be to:

Find the worst performing queries to improve. Either by changing the SQL or changing the table indexes to suit them better.
Find out which tables need index changes AND get data about which changes that will give the best overall results.
Verify that the changes you made have actually made an improvement or not.

With that in mind, what I try to find out from my data is:

Worst queries running on the server by response time

This is really what you are interested in. Response time is measured by calculating the number of times a certain query has run and the time each query took.
There is distinction between a "call" which is the count of that query versus response time. You can have a query that ran many times and had very low response time and you can have a query that ran a few times but had a very high response time.
You are interested in the response time, but keep an eye on calls and 'R/call' which is response time divided by calls.

pt-query-digest slow.log > slow.txt

Time range: 2012-03-17		00:00:01 to	05:16:20
Rank	Response time	Calls	R/Call	Item
====	===============	========	======	=====
1	480.9273 6.3%	600	0.8015	SELECT dp_node dp_usernode dp_buddylist dp_users dp_node_access
2	322.4220 4.3%	129258	0.0025	ADMIN INIT DB
3	314.8719 4.2%	30220	0.0104	UPDATE dp_users
4	287.7109 3.8%	51606	0.0056	SET
5	269.3434 3.6%	600	0.4489	SELECT dp_node dp_usernode dp_buddylist dp_users dp_node_access
6	238.8571 3.1%	2902141	0.0001	SELECT dp_url_alias
...

Interpretation - Our queries are quite evenly using the server. What you would like to see is a nice 10%-20% or more type of query to be able to get 'low hanging fruit'-returns when you optimize those queries.
Queries 1 and 5 are many-join'd table queries.
Query 3 is an update statement.
Query 6 is a select statement which is used almost 3 million times during the period the log was taken, but has low response time.
Queries 2 and 3 are possibly some sort of bugs with the tool. To filter those issues out, you can run:

pt-query-digest slow.log --filter '$event->{fingerprint} =~ m/^(!?select|update|delete|create|call|insert|replace)/' > slow.txt

Worst queries running on the server that process a lot of rows

I call these 'bulky' queries. From my research, these kind of queries give me information about many-joins queries and sometimes slow queries. What these queries do is make the server work harder and I usually like to know what those queries are.

pt-query-digest slow.log --filter '($event->{Rows_examined} > 1000)'

Time range: 2012-03-17 18:44:23 to 2012-03-20 05:17:32
Rank	Response time	Calls	Item
====	==========	===	================
1	2461.6309 8.2%	1322	SELECT dp_node dp_usernode dp_buddylist dp_users dp_node_access
2	1788.6931 6.0%	1043	SELECT dp_node dp_usernode dp_buddylist dp_users dp_node_access
3	1620.4995 5.4%	4056	SELECT dp_node dp_comments dp_node_comment_statistics
4	978.7295 3.3%	217645	SELECT dp_users
5	935.3744 3.1%	715	SELECT dp_node dp_users dp_node_access
6	870.0052 2.9%	1247	SELECT dp_users
7	845.3837 2.8%	708	SELECT dp_node dp_users dp_node_access

....

Interpretation - The command filters by rows_examined > 1000. This is open to change, but for me, it works for now.
Rows_examined means that the server had to read more then 1000 rows from disk, process them in some way such as filtering or sorting and then display the result.
Queries 1 and 2 look like they access the same tables. If it is a matter of changing indexes, I would need to find the root table in those queries and make changes to it.
Queries 5 and 7, same as above.
Queries 4 and 6 access the dp_users table. One is faster to respond and one is much slower. There is clearly room for improvement with those queries and that table.
One important thing to notice is that there seems to be more or less the same tables in all of the top 'bulky' queries.

Most Updated/Affected Tables

The tool pt-query-digest is quite a versatile tool and you can use it to get more information from your slow log than just worst queries. What I like to do is use it to get a list of top tables that are the most updated, written to or deleted from. I do this to help me know which tables I should be more 'aggressive' when it comes to removing indexes.

pt-query-digest slow.log --filter '($event->{fingerprint} =~ m/^(!?insert|replace|update|delete)/) && ($event->{Rows_affected} > 0)' --group-by >affected_tables.txt

Time range: 2012-03-17 18:44:21 to 2012-03-20 05:16:02
Rank	Response time	Calls	R/Call	Item
'====	===============	======	======	=================
1	1880.1428 25.5%	530420	0.0035	dp_users
2	543.0294 7.4%	505	1.0753	dp_xmlsitemap_node
3	411.2947 5.6%	240708	0.0017	dp_cache_page
4	284.5491 3.9%	697	0.4082	dp_node_comment_statistics
5	276.1430 3.7%	2007	0.1376	dp_node
6	270.5004 3.7%	217	1.2465	dp_comments
7	256.5459 3.5%	545248	0.0005	dp_sessions
...

Interpretation - It appears that the UPDATE dp_users statement from the first report shows up here as well. It looks like dp_users is a 'hot' table that has many changes applied to it and that optimizing this table would help implement those changes faster.
It is also worth noting that dp_xmlsidemap_node, dp_node and dp_comments may run queries that are quite difficult to apply to the tables or that the tables are quite large and it takes a long time to apply to it. This could actually be a number of reasons, but it is good to get an understanding about those tables. We can start thinking about partitioning, index changes, changing server variables or changes to the underlying file system. Once we made any changes, we can focus on those tables to see if it had an effect on them.

Most Referenced Tables

This is a report that allows me to see which tables have the most queries run against them that are or could use an index to get their results. This is helpful with indexing, especially if making a change to one table can have a useful benefit to many queries.

Note, while we are keeping an eye on response times, calls are also important. In my experience, the tables with high numbers in both usually show up in this report.

pt-query-digest slow.log --filter '($event->{fingerprint} =~ m/^(!?select|update|delete)/) && ($event->{Rows_read} > 0)' --group-by tables >read_tables.txt

Time range: 2012-03-17 18:44:21 to 2012-03-20 05:17:32
Rank	Response time	Calls	Item
'====	================	=========	=============
1	17712.6402 15.6%	19311364	dp_users
2	5787.0542 5.1%	62323064	dp_signup
3	5154.7679 4.5%	83492650	dp_event
4	4712.0806 4.2%	42915664	dp_users_roles
5	2139.8459 1.9%	42192873	dp_url_alias
…

Interpretation - You may notice that I have duplicated the filters for UPDATE and DELETE. This is because they can potentially use an index to process their statements. It is important to note that badly indexed UPDATE and DELETE statements can greatly effect innodb. So there is some overlap between the affected tables report and this one.
Baring that in mind, dp_users maybe a heavily used table and we may need to investigate the queries that are run against this table.
You can run this report again with only SELECT chosen if you want to get a different point of view on the data.

What do we have so far?

For the most affected tables, I should avoid from over indexing or reduce indexes if I can.
For the most referenced tables, I should make sure the current indexes are as effective as they can be or add more indexes.
Please notice per my example, there can be situations when some of those lists may have overlapping tables - which means you have to be especially careful.

We have the worst queries by response time and the worst bulky queries.

For good measure, lets do one more thing and get the tables that are connected with the worst bulky tables report:

pt-query-digest slow.log --filter '($event->{Rows_examined} > 1000)'  -group-by tables >bulky_tables.txt

Time range: 2012-03-17 18:44:23 to 2012-03-20 05:17:32
Rank	Response time	Calls	Item
====	================	======	================
1	13745.2407 18.3%	73055	dp_node
2	13678.9905 18.2%	416139	dp_users
3	11212.2637 14.9%	39703	dp_node_access
4	4308.1558 5.7%	2630	dp_buddylist
5	4303.9987 5.7%	2404	dp_usernode
6	3172.2970 4.2%	6630	dp_comments

Interpretation - You are probably not surprised by this list of table if you had seen the bulky queries report. However, dp_node has now pop'd up on my radar and dp_node_access as well.
I had seen some of these before on this list:

mysql> select table_name, rows_read from information_schema.table_statistics order by rows_read desc limit 5;

table_name rows_read

dp_users 2302477894

dp_node 1231318439

dp_comments 1071462211

dp_userpoints 1033073070

dp_search_index 260154684

5 rows in set (0.00 sec)

So it can mean that the bulky queries do read a considerable number of rows from some of these tables

Improving the queries that use these tables can have a very positive effect on the slowest and bulkiest queries. It might also possibly lighten the load on the server with regards to fetching a lot of rows which will improve IO.

On this system, reducing those bulky queries or optimizing them can help quite a bit.

Let's Focus

We have a lot of information about the system already, although some information me be more accurate then other.
What we now need to do is focus. In my assessment, I have initially chosen to focus on one table to start of with: dp_users.

mysql> show create table dp_users\G
*************************** 1. row ***************************
Table: dp_users
Create Table: CREATE TABLE `dp_users` (
`uid` int(10) unsigned NOT NULL auto_increment,
`name` varchar(60) NOT NULL default '',
`pass` varchar(32) NOT NULL default '',
...
PRIMARY KEY (`uid`),
UNIQUE KEY `name` (`name`),
KEY `created` (`created`),
KEY `access` (`access`),
KEY `status` (`status`),
) ENGINE=InnoDB CHARSET=utf8
1 row in set (0.00 sec)

What I can do now is get all the queries that reference this table:

pt-query-digest slow.log --filter '($event->{fingerprint} =~ m/^(!?select|update|delete)/) && ($event->{arg} =~ m/dp_users /) ' --no-report --print >dp_users.log
(filters the main log by queries that reference dp_users and use an index)
pt-query-digest dp_users.log --limit 100% >tbl_dp_users.txt

Time range: 2012-03-17 18:47:03 to 23:25:22
Rank Response time Calls   Item
==== ============== ======= ===========
   1 209.2863 10.7%   88850 UPDATE dp_users SET access = 133******3 WHERE uid = 23****01\G
-2 201.7256 10.3%      91 SELECT dp_node dp_node_comment_statistics dp_users dp_comments dp_node_access
   3 162.2711 8.3% 1309010 SELECT access FROM dp_users WHERE uid = 21***4\G
   4 139.9009 7.1%     197 SELECT uid, name FROM dp_users WHERE status = 1 ORDER BY created DESC\G
   5 133.8691 6.8%     327 SELECT * FROM dp_users u WHERE LOWER(name) = LOWER('s******s')\G
   6 109.6903 5.6%   29152 SELECT du.name, du.created, du.picture FROM dp_users du WHERE du.picture!='' AND du.status = '1' AND du.created BETWEEN '133*****0' AND '133*****60'\G
-7 92.9095 4.7% 360642 SELECT dp_node dp_users dp_node_revisions
   8 74.2426 3.8%     106 SELECT * FROM dp_users u WHERE LOWER(name) = LOWER('hoa****rio') AND pass = '3837********5f9b' AND status = 1\G
-9 70.3827 3.6%     436 SELECT dp_node dp_usernode dp_buddylist dp_users dp_node_access
-10 67.7836 3.4%     436 SELECT dp_node dp_usernode dp_buddylist dp_users dp_node_access
...

For the sake of clarity, I have marked in bold the queries that I want to focus on and that is because the other queries are join'd queries where the dp_users table is not the main underlying table.
Queries 1 and 3 use uid column and query 3 gets it's results from the access column. There maybe an opportunity for a covering index.
Query 5 and 8 need their SQL to change. If you wrap a column with a function, you cannot make use of the index very well and there is already an index on the name column. Plus, MySQL is case insensitive anyway.
Query 6 will probably bring up quite a few rows back. It uses status column to search on, created column to sort on and uses uid and name for its results.

Time for pt-index-usage.
pt-index-usage dp_users.log --host 127.0.0.1 --tables dp_users >idx_dp_users.txt
ALTER TABLE `dp_users` DROP KEY `access`;

Hmmm... so the index on access isn't being used at all or sometimes being used when another index is used in a higher number of situations.
After testing I have done personally in the past, I do tend to trust the output that pt-index-usage gives me.

Indexing our Chosen Table

Our indexes are these:

PRIMARY KEY (`uid`),
UNIQUE KEY `name` (`name`),
KEY `created` (`created`),
KEY `access` (`access`),
KEY `status` (`status`)

Let us remove the index on access, but I would like there to be a covering index for the majority of the queries (queries 1 and 3 considering calls and response time) on this table. I would like to try and make the primary key (uid,access) - which would make the primary key a covering index.
I also notice that the index on status would benefit from being a composite index of (status,created) - from queries 4 and 6.

So I plan to have:
PRIMARY KEY (`uid`,`access'),
UNIQUE KEY `name` (`name`),
KEY `created` (`created`),
KEY `status` (`status`,`created`)

I cannot remove the index on name because its a unique constraint. I am not sure to remove the index on created as it may be referenced by other queries.

Testing My Theory

I apply the table change to a slave database using ALTER TABLE or pt-online-schema-change (which I like a lot, but hate the one unique index limitation). It is important to test the indexes on a server that has data in the amounts that the master has. The optimizer makes its choices based on which index gives the best results for the query it is running, from the data it has.

You can test your results by running the following tool:
pt-query-digest dp_users.log --filter '($event->{fingerprint} =~ m/^(select)/)' --limit 100% --execute 127.0.0.1 --explain 127.0.0.1 >test_dp_users.txt

But in this case, the tool is running against a slave that it's cache is not as warm as the master's is and the results will not be good and you need to keep that in mind.
In my report, it showed a lot of results in EXPLAIN as USING INDEX which is good.
You may also notice that I am only running SELECTs. This is because I do not want UPDATE/DELETE/INSERT statements to be applied to my slave server. So I do not have the full view of all the other queries.

I also ran the same pt-index-usage tool against the slave server and got this result:
ALTER TABLE `dp_users` DROP KEY `created`;

Ok, so the index on created isn't used and I can remove it.

What have I done so far?

I have identified a table that I think is a "hot table" or a table that I think is important to check the queries against.
I have filtered a slow log to show the queries running against that table.
I have made some observations
I made some decisions to change 5 indexes down to 3. It is important to mention that I also enlarged the primary key, which isn't something you should do lightly.
I tested them to some degree
I ran pt-index-usage again for good measure
Applied the changes to the master server and monitored it for a while

Applying and Testing

Since this table has a unique index I can use pt-online-schema-change and I ran this command on the master server:

pt-online-schema-change h=127.0.0.1,D=prod,t=dp_users -P3306 --alter "DROP KEY access, drop primary key, add primary key (uid, access),drop key status, add key idx_status_created(status,created), drop key created;" --drop-old-table --execute

I kept an eye on the process, everything went well.

I kept an eye on the server and noticed if there was no serious performance problem. Now, you would do your best to prevent any serious problems that may happen, but keep your eye on it non the less.

A couple of days later, I took another slow log and compared the results.

Rank	Query ID	Response time	Calls	R/Call	Item
16	0x9E5D9252DA7F38C8	70.3198 0.9%	24788	0.0028	SELECT dp_users
14	0x9E5D9252DA7F38C8	184.6790 1.2%	63140	0.0029	SELECT dp_users

total	min	max	avg	95%	stddev	median
70s	18us	671ms	3ms	6ms	5ms	4ms
185s	16us	212ms	3ms	6ms	3ms	3ms

Interpretation - The first/top query is from the first slow log and the second/bottom query is from the second slow log.
This query stayed more or less the same. The maximum went down quite a lot and the standard deviation and median went down by a bit. This means that they are more clumped together around the same time periods.
All in all, not a massive change apart from the maximum.

Lets look at another query:

Rank	Query ID	Response time	Calls	R/Call	Item
4	0x6882EF28DAF29509	314.8719 4.2%	30220	0.0104	UPDATE dp_users
4	0x6882EF28DAF29509	460.8619 3.0%	103126	0.0045	UPDATE dp_users

total	min	max	avg	95%	stddev	median
315s	69us	20s	10ms	925us	306ms	287us
461s	91us	25s	4ms	1ms	220ms	384us

This is a bit confusing to me. The response time is much better with the change. R/Call has gone down from 0.0104 to 0.0045 and that is a 231% improvement.
However, the minimum, maximum and median went up. The average and standard deviation went down which means the are much more tightly clumped around certain time periods.
It appears that while I have reduced the overall time, I have added a little overhead in the process.

To remove the overhead, I can change the primary key back to (uid) and see if improves it. I cannot however, partition the table as there are two unique indexes in that table.

The End

Congratulations on reading the entire (and extremely long) article!

I hope that I have shown you new ways to holistically review a table's indexes and that these methods will benefit you.

Now go forth and experiment!

6 comments:

Anonymous6 April 2012 at 03:25
You should try this tool: https://github.com/box/Anemometer/wiki

It makes the kind of analysis you're doing a breeze, because you don't have to run pt-query-digest over and over again. Just process it once, and then query the results in different ways.
Jonathan Levin6 April 2012 at 14:21
Thanks, I'll take a look at that.
Juan15 April 2012 at 10:39
I'm trying to use this great tool in Windows, with perl 5.10.1.2

It works fine, but when I try to apply anyone of your "--filter" examples msdos command line says:

perl ..\..\percona-too
lkit-2.1.1\bin\pt-query-digest pcclonico-slow.log --filter '($event->{Rows_exami
ned} > 1000)' > h.txt
Error compiling --filter code: sub { my ( $args ) = @_; my $event = $args->{even
t}; ( '($event- ) && return $args; };
Can't find string terminator "'" anywhere before EOF at (eval 27) line 1.

Does anybody know what I'm doing worng ?

Thanks a lot
Jonathan Levin15 April 2012 at 15:49
try:

lkit-2.1.1\bin\pt-query-digest pcclonico-slow.log --filter "($event->{Rows_examined} > 1000)" > h.txt
Anonymous15 April 2012 at 22:44
Jonathan, thanks a lot.

I seems to be a bit stupid. (je, je)

Bye,
Jonathan Levin16 April 2012 at 15:53
Don't worry about it.

Pages

5 Apr 2012