8 Dec 2017

Data Modelling: Counter Table

A counter table is a table that keeps counts of particular items or for certain keys. This can range from page count on your blog to keep track of a limit the user is allowed to have from a particular item or service.


Usually, a counter table would be better kept in something like Memcached or Redis as frequent increment updates would be better suited to those in-memory systems.

MySQL and Innodb in particular has many stringent systems to make sure that your data has been reliably written to disk. Just going through those systems alone, can make having a counter table, not suitable, not even considering the speed it takes to update the actual table.

However, sometimes there is a need for certain assurances from failure scenarios where in-memory systems may not be suitable for - as when they crash, the data kept in memory is cleared out.

In those cases, may I recommend that you do what I consider a 'aggregate counter table'. The idea here is to replace doing lots of increment updates and simply count the original base table you are interested in having counts for.

In short, instead of:

INSERT INTO base_table;
UPDATE counter_table set value=value+1 where key=key1;


You would do

INSERT INTO base_table;
On interval (like 1 to 5 seconds):
- INSERT INTO counter_table 
- SELECT key1, count(1), max(primarykey) FROM base_table 
- WHERE last_count_position
- GROUP BY key1
- ON DUPLICATE KEY UPDATE value=value+recent_count


In order to be able to aggregate the base_table more correctly, you need to keep some sort of record of what was the last time or position you read for the base table. What I recommend you consider, is either the primary key, assuming its an integer as well as having a last_updated timestamp column.

Below is an example of a counter table that keeps the last id of the primary key it counted from the base table:


CREATE TABLE counter_table (
  key_id int(10) unsigned NOT NULL,
  counts int(10) unsigned DEFAULT '0',
  lastprimary_id int(10) unsigned DEFAULT '0',
  PRIMARY KEY (key_id),
  KEY idx_camp (lastprimary_id)
) ENGINE=InnoDB;


In order to run your 'refresh' query, you would first need to query the counter_table like this:
SELECT max(lastprimary_id) from counter_table;

Then populate the counter table by including in your above INSERT INTO SELECT statement a:
WHERE base_table.primarykey > lastprimary_id


This should be very fast and will prevent the many 'database-attacking update queries' that can become a serious bottleneck to your performance in the long run.


Downsides

This method doesn't factor in if the rows in the base table were UPDATE'd or DELETE'd. It just counts the row number. If this is a requirement, you can revert to using UPDATE statements for:
UPDATE counter_table SET value=value-1
with the understanding that this will happen infrequently.

You also, now need to maintain a procedure and monitor that it is running on the set intervals that you need it. Fortunately, MySQL has scheduled Events to help with that.

No comments:

Post a Comment