30 Dec 2016

MariaDB's Columnar Store

I have been keeping an eye on MariaDB's Columnar store progress for a bit longer then half a year.

MariaDB chose to take the infinidb code after Calpoint closed shop about two years ago and implemented it into their product. I was a bit wary about infinidb as well as it was a columnar store without compression that had mixed reviews on news hacker.

However, it seems like MariaDB have pulled it off. They have added the infinidb engine to MariaDB with all its quirks and extra commands and they have added snappy compression as well. This is truely a huge win for them and their users, specifically in the area of reporting and analytics.

Here are two real life examples for getting data ready for reporting currently happening in the wild:
1) MySQL -> Sqoop -> Hadoop - where you would need a) 5-6 additional servers, b) someone to set those servers up in a hadoop cluster and then c) monitor the daily data transfer.
2) MySQL -> CDC -> Kafka -> Hadoop - a) more technologies to master, b) a few more servers and some c) more monitoring. But this time, its streaming.

To set all of this up could take from a couple of months to a year.

Now with MariaDB, you have:
1) MariaDB + some SQL scripts -  such as INSERT INTO datawarehouse.facttable SELECT ... FROM site.table1 WHERE date >= curdate() - interval 1 day;
2) MariaDB -> Pentaho/Talend -> MariaDB - Could be a bit slower, but with a GUI and really a lot of monitoring out of the box.

As you can see, there are a lot fewer technologies, a lot fewer complexities and it is a lot more straight forward to develop.

It is also very important to add that no one other than MariaDB is doing this. The closest you have is Tokudb which is great and can also last you a while, but a Columnar store is definitely more suited for this type of task.

So once again, congratulations to MariaDB for offering such a huge benefit to its users.

1 comment:

  1. Compression was added by InfiniDB before they closed shop, not by MariaDB.