Best Way to Process Data in the Database

Here is the best way to process data in the database, or to load data into it.
On the one hand, you have concurrency: splitting the work up across many INSERT statements or threads.
On the other hand, you have a single large batch process, like one large INSERT statement.
This is based on my experience, based on hard facts, based on even harder science which is based on my opinions and patterns that I have noticed.


The 'Sweet Spot' would be mostly batch processing with a sprinkle of concurrency.

Another way to look at it is "chunking". You need to break the work down into chunks that are big enough to make good use of the database and what it is good at, and small enough that they do not bog the database down.
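
As a rough illustration of chunking, here is a minimal Python sketch using the standard sqlite3 module. The `items` table, the `insert_in_chunks` helper and the chunk size of 1000 are all made up for the example; the right chunk size depends on your database and workload, so treat the number as a starting point to measure against, not a recommendation.

```python
import sqlite3
from itertools import islice


def chunks(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_in_chunks(conn, rows, chunk_size=1000):
    """Insert rows one chunk per transaction; return the number of chunks."""
    count = 0
    for chunk in chunks(rows, chunk_size):
        # `with conn` commits on success and rolls back on error, so each
        # chunk is one transaction: bigger than a single-row INSERT, but
        # far smaller than one giant statement.
        with conn:
            conn.executemany(
                "INSERT INTO items (id, name) VALUES (?, ?)", chunk
            )
        count += 1
    return count


# Hypothetical table and data, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
rows = ((i, f"item-{i}") for i in range(2500))
num_chunks = insert_in_chunks(conn, rows, chunk_size=1000)
total = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

With 2500 rows and a chunk size of 1000, this runs three transactions (1000, 1000 and 500 rows). The "sprinkle of concurrency" would then mean handing those chunks to a small number of workers, which this sketch leaves out.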

Here is why:


There you go. Hard facts, based on a graph I made in MS Paint.

How to Speed up Database APIs

From a DBA's point of view, putting an abstraction layer over the database can cause performance issues.
There are, of course, advantages to having your database interactions in one place and also being able to expose it to other resources.

I would argue that making the API database vendor-agnostic is very hard, if not impossible.
I would also argue that things like load balancing, and even sharding, could be better accomplished with HAProxy and similar existing products.

However, if you are implementing a database API, you MUST allow for batching in your API or you are probably going to kill your performance.
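
To make "allow for batching" concrete, here is a small Python sketch of a data-access layer that treats the batch call as the primary entry point and the single-row call as a thin wrapper around it. The `UserApi` class, its method names and the `users` table are hypothetical, and sqlite3 is used only so the example is self-contained.

```python
import sqlite3


class UserApi:
    """Hypothetical data-access layer over a `users` table."""

    def __init__(self, conn):
        self.conn = conn

    def add_users(self, users):
        """Insert many (name, email) pairs in one transaction."""
        users = list(users)
        with self.conn:  # one commit for the whole batch
            self.conn.executemany(
                "INSERT INTO users (name, email) VALUES (?, ?)", users
            )
        return len(users)

    def add_user(self, name, email):
        """Single-row convenience wrapper around the batch method."""
        return self.add_users([(name, email)])


# Demonstration with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
api = UserApi(conn)
inserted = api.add_users(
    [("ann", "ann@example.com"), ("bob", "bob@example.com")]
)
api.add_user("cat", "cat@example.com")
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The point of the design is that callers with many rows never have to loop over the single-row method, which is where the one-round-trip-per-row cost usually creeps in.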

Some information about batching in your API:
http://williamedwardscoder.tumblr.com/post/16516763725/how-i-got-massively-faster-db-with-async-batching
https://www.drupal.org/node/180528

P.S. It would also be a good idea to add some thresholds to monitor for slow database queries in the API.
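
One possible shape for such a threshold, sketched in Python: time each query in the API layer and log a warning when it runs longer than a limit. The `timed_query` helper, the `db_api` logger name and the 0.5 second threshold are assumptions for illustration, not values from any particular product.

```python
import logging
import sqlite3
import time

log = logging.getLogger("db_api")

# Assumed threshold; tune it to what "slow" means for your workload.
SLOW_QUERY_SECONDS = 0.5


def timed_query(conn, sql, params=(), threshold=SLOW_QUERY_SECONDS):
    """Run a query, return (rows, elapsed_seconds), and warn if it is slow."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    if elapsed >= threshold:
        log.warning("slow query (%.3fs): %s", elapsed, sql)
    return rows, elapsed


# Demonstration with an in-memory database.
conn = sqlite3.connect(":memory:")
rows, elapsed = timed_query(conn, "SELECT 1")
```

Many databases also ship their own slow query logging, so a wrapper like this is a complement to that, useful mainly because it sees the time spent from the API's side.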