I recently read an article about how mobile apps will probably not get the hardware boost that people are expecting.
This is partly because CPU performance has hit a sort of (heat) wall, and clock speeds can no longer improve. As Linley Gwennap said, “we’ve been falling behind Moore’s Law ever since Intel hit the power wall back in 2005”.
I have noticed on a few occasions that when a company bought an expensive new machine for its main database server, it turned out to run queries slower than the soon-to-be-retired machine it replaced.
In a recent example, a three-year-old server with 2.66 GHz CPUs was almost twice as fast as a brand new machine with 2.3 GHz CPUs. I'm not exactly sure, but the new machine probably has several times more CPU cache and a newer instruction set, and the hosting company swore it was several times faster than the old machine. However, our results, specifically for MySQL, have been discouraging.
After reading the article, I would like to suggest a thought exercise:
As DBAs, what would we do if CPUs never improved? As in, if their clock speed never got any faster?
Manufacturers could probably add more cores, fit in more cache, maybe even double the size of the CPU on the motherboard. However, basic per-core performance for single-threaded applications would not improve.
What would you do?
How would you solve your current company's needs?
How would you solve your future company's needs in the face of issues such as Big Data?
In my opinion, MySQL will need to break up anything that is currently single-threaded as much as possible. This will probably not be easy. Adding a Map/Reduce layer to MySQL may help; it works for other commercial database vendors: Infobright, Greenplum, and (I think also) Oracle.
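To make the idea concrete, here is a toy sketch of the Map/Reduce pattern in Python, not anything resembling MySQL internals: a GROUP BY-style aggregation where each partition is scanned by a separate worker process (the "map") and the partial results are merged at the end (the "reduce"). The data and function names are invented for illustration.

```python
from multiprocessing import Pool

# Toy data: rows of (customer_id, amount), split into partitions
# that independent cores can scan in parallel.
PARTITIONS = [
    [(1, 10.0), (2, 5.0), (1, 7.5)],
    [(2, 2.5), (3, 20.0)],
    [(1, 1.0), (3, 4.0)],
]

def map_partition(rows):
    """Map step: aggregate one partition locally, on one core."""
    totals = {}
    for customer_id, amount in rows:
        totals[customer_id] = totals.get(customer_id, 0.0) + amount
    return totals

def reduce_results(partials):
    """Reduce step: merge the per-partition totals into the final answer."""
    merged = {}
    for partial in partials:
        for customer_id, subtotal in partial.items():
            merged[customer_id] = merged.get(customer_id, 0.0) + subtotal
    return merged

if __name__ == "__main__":
    with Pool(processes=3) as pool:
        partials = pool.map(map_partition, PARTITIONS)
    # The moral equivalent of SELECT customer_id, SUM(amount) ... GROUP BY
    print(reduce_results(partials))
```

The point is that no single thread ever touches the whole table; the only serial part is the (much cheaper) merge at the end.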
(I am not sure if Oracle may be inclined to improve MySQL's processing of large amounts of data as it may hurt profitable parts of their business.)
Sharding can and has helped companies solve this problem. It breaks up the work by having each single thread process less data per shard. I am not sure how many mature solutions are available, though, if you need to group data across several shards.
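The cross-shard grouping problem usually boils down to scatter-gather: run the same aggregate query on every shard, then merge the partial groups at the application layer. Here is a minimal sketch, where `query_shard()` is a hypothetical stand-in for sending the query to one shard's MySQL server; the shard data is invented.

```python
# Each "shard" returns partial GROUP BY rows of (region, count),
# as if it had run: SELECT region, COUNT(*) FROM t GROUP BY region
SHARDS = {
    "shard_a": [("us", 3), ("eu", 1)],
    "shard_b": [("us", 2), ("asia", 5)],
}

def query_shard(name):
    """Hypothetical helper: pretend this hits one shard's MySQL server."""
    return SHARDS[name]

def scatter_gather(shard_names):
    """Fan the query out to every shard, then merge the partial groups."""
    merged = {}
    for name in shard_names:
        for region, count in query_shard(name):
            merged[region] = merged.get(region, 0) + count
    return merged

print(scatter_gather(["shard_a", "shard_b"]))
```

This works for distributive aggregates like COUNT and SUM; things like exact medians or DISTINCT counts across shards are much harder, which is part of why I say the mature solutions are unclear.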
Regarding hardware, there is certainly room for "SQL" chips (think Kickfire) and other FPGAs.
Hardware compression could help, especially compression that can be spread across cores, but the actual processing of the data after decompression would still be single-threaded.
Summary tables could very well help for certain workloads, as they pre-process large amounts of data for you into more manageable sizes. Combined with Hadoop, and if you have a person who can model data properly, they can be a very long-term solution.
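The summary-table idea is simple enough to sketch in a few lines. Assume a big fact table of (day, amount) rows and a hypothetical nightly job that rebuilds a daily summary; the table names and data are made up for illustration.

```python
# Raw fact table: one row per sale. In production this would be millions
# of rows; reports should not have to scan it every time.
raw_sales = [
    ("2010-01-01", 10.0),
    ("2010-01-01", 5.0),
    ("2010-01-02", 7.0),
]

def refresh_daily_summary(rows):
    """Pre-aggregate the fact table into one row per day, the way a
    nightly batch job would rebuild a daily_sales summary table."""
    summary = {}
    for day, amount in rows:
        summary[day] = summary.get(day, 0.0) + amount
    return summary

daily_sales = refresh_daily_summary(raw_sales)
# Reports now read the tiny summary instead of scanning raw_sales.
print(daily_sales["2010-01-01"])
```

The single-threaded query cost is paid once, off-peak, instead of on every report.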
Perhaps pre-processing will be a much bigger thing in the future. As in, you speed up your queries by preparing the answers ahead of time and caching them.
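In its simplest form this is just memoization in front of an expensive query. A minimal sketch, where `expensive_query()` is a hypothetical stand-in for a slow, CPU-bound, single-threaded scan:

```python
cache = {}

def expensive_query(n):
    """Stand-in for a long, CPU-bound aggregation."""
    return sum(range(n))

def cached_answer(n):
    """Serve the precomputed answer if we already prepared it."""
    if n not in cache:
        cache[n] = expensive_query(n)  # pay the cost once, ideally ahead of time
    return cache[n]

cached_answer(1000)          # first call (or a pre-warming job) pays the cost
print(cached_answer(1000))   # later calls are just a dictionary lookup
```

A pre-warming job can populate the cache before users ask, which is exactly the "prepare the answers ahead of time" idea.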
I would like to hear more approaches to this problem, but I would prefer solutions that lean toward the 'tried and tested'.