19 Apr 2012

MariaDB's Virtual Columns

I wanted to evaluate MariaDB's virtual column and see if it could store business rules next to table data. With virtual columns this could be done and if you specify as 'virtual', would not even take up space on disk.

The imaginary use case is that there is a sales team in a company and you would like to evaluate if a particular salesperson is eligible for a bonus. In order to get a bonus, you need to sell above the average for the day and be in the top 5 amongst the salespeople.

So here is what I did:

MariaDB [test]> create table salespeople (id int unsigned not null auto_increment primary key, salesperson_id int unsigned not null , `date` datetime not null default 0, sold decimal(15,2) not null default 0, day_avg decimal(15,2) not null default 0,  above_avg char(1) as (if(sold>day_avg,'Y','N')) virtual);
Query OK, 0 rows affected (0.02 sec)
So far so good.
 MariaDB [test]> insert into salespeople(salesperson_id, `date`, sold, day_avg) values (1,now(),300,150);
Query OK, 1 row affected (0.01 sec)
MariaDB [test]> select * from salespeople;
+----+----------------+---------------------+--------+---------+-----------+
| id | salesperson_id | date                | sold   | day_avg | above_avg |
+----+----------------+---------------------+--------+---------+-----------+
|  1 |              1 | 2012-04-19 13:16:32 | 300.00 |  150.00 | Y         |
+----+----------------+---------------------+--------+---------+-----------+
1 row in set (0.00 sec)
Cool, just like mashing a view into a table.

MariaDB [test]> update salespeople set sold = 149 where id = 1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
MariaDB [test]> select * from salespeople;
+----+----------------+---------------------+--------+---------+-----------+
| id | salesperson_id | date                | sold   | day_avg | above_avg |
+----+----------------+---------------------+--------+---------+-----------+
|  1 |              1 | 2012-04-19 13:59:57 | 149.00 |  150.00 | N         |
+----+----------------+---------------------+--------+---------+-----------+
1 row in set (0.00 sec)

Works again. No triggers for UPDATE needed.
Now lets try to add another business rule:

MariaDB [test]> drop table salespeople;
Query OK, 0 rows affected (0.03 sec)

MariaDB [test]> create table salespeople (
    -> id int unsigned not null auto_increment primary key,
    -> salesperson_id int unsigned not null,
    -> `date` datetime not null default 0,
    -> sold decimal(15,2) not null default 0,
    -> day_avg decimal(15,2) not null default 0,
    -> above_avg char(1) as (if(sold>day_avg,'Y','N')) virtual,
    -> day_position int not null default 0,
    -> give_bonus char(1) as (if(above_avg='Y' and day_position between 1 and 5,'Y','N') )virtual
    -> );
ERROR 1900 (HY000): A computed column cannot be based on a computed column
MariaDB [test]>


DOH! >.<

ADDITION:

I have found a way to store more business rules with the limitation of not being able to use another virtual column to "build" a more complex rule set:

MariaDB [test]> create table salespeople (
    -> id int unsigned not null auto_increment primary key,
    -> salesperson_id int unsigned not null,
    -> `date` datetime not null default 0,
    -> sold decimal(15,2) not null default 0,
    -> day_avg decimal(15,2) not null default 0,
    -> day_position int not null default 0,
    -> bonus_amount decimal(15,2) as (
    -> case when sold>day_avg and day_position between 1 and 5
    ->  then 0.2 * (sold - day_avg) #20% bonus
    -> when sold>day_avg and day_position >5
    ->  then 0.05 * (sold - day_avg) #5% bonus
    -> else 0
    -> end )virtual
    -> );
Query OK, 0 rows affected (0.75 sec) 
MariaDB [test]> insert into salespeople(salesperson_id, `date`, sold, day_avg, day_position) values (1,curdate(),300,150,3);
Query OK, 1 row affected (0.01 sec) 
MariaDB [test]> select * from salespeople;
+----+----------------+---------------------+--------+---------+--------------+--------------+
| id | salesperson_id | date                | sold   | day_avg | day_position | bonus_amount |
+----+----------------+---------------------+--------+---------+--------------+--------------+
|  1 |              1 | 2012-04-20 00:00:00 | 300.00 |  150.00 |            3 |        30.00 |
+----+----------------+---------------------+--------+---------+--------------+--------------+
1 row in set (0.00 sec)
In the above example, the actual bonus amount was calculated.



9 Apr 2012

How to get a query-digest report from TCPdump on a set time

Note to self

Getting a query-digest report for 20 seconds worth of tcp traffic on linux
(no special slow log/microslow patch needed):
tcpdump -s 65535 -x -nn -q -tttt -i any -c 99999 port 3306 | pt-query-digest --type tcpdump --run-time 20s

thanks to aaron brown

Connect to Anemometer instead of using slow log
#crontab entry
*/5 * * * * tcpdump -s 65535 -x -nn -q -tttt -i any -c 99999 port 3306 | pt-query-digest --type tcpdump --run-time 20s --user=anemometer --password=superSecurePass --review h=db.example.com,D=slow_query_log,t=global_query_review --review-history h=db.example.com,D=slow_query_log,t=global_query_review_history --no-report --limit=0% --filter=" \$event->{Bytes} = length(\$event->{arg})"  

5 Apr 2012

Mastering MySQL Indexing

Indexes are tricky things. In my experience, indexes are added whenever SQL queries are too slow. This makes sense. However, sometimes these indexes were added without being thought through enough (I am guilty of this). Sometimes they were thought through, but the table itself now has so many indexes that the optimizer doesn't know which one to choose from. 
The overhead of indexes greatly effects tables that need to be written to or altered often (even indexes on NoSQL database greatly effect writes).
Over indexed tables inflate the database size dramatically which adds to people's concerns that the database becomes unwieldy.