Not Only SQL
No, or Not Only
One of the most sensible things to emerge from the recent no:sql(eu) event (which sadly I didn’t attend) was a statement that NOSQL should be expanded to Not Only SQL rather than No SQL. This is an interesting development, as there’s been lots of good stuff going on in the NOSQL world, but the debate has been polarised and driven off centre.
I could be wrong about this, but it seems to me that the movement would be better labelled as NORDBMS, as it’s with relational databases that the problem lies, not the language used to query them.
The central point: SQL != RDBMS
Now I don’t have any particular complaint against the RDBMS per se; it’s just that the world has moved on from a place where an RDBMS can be used as the default mechanism for persistence. The RDBMS has its place, and that place isn’t everywhere for everything. In fact I’d go so far as to say that, for all but a few projects starting from scratch today, the rational architect wouldn’t choose an RDBMS from the plethora of available choices. That is, if it wasn’t for organisational (and ecosystem) biases, if it wasn’t for…
The cult of the DBA
DBAs are the self-appointed guardians of the modern firm’s ‘books and records’. They’re a conservative, safety-first sort of bunch, which is why they don’t really do new technology. And why should they, when the vendors that keep them supplied with T-shirts, track days and conferences to slack off to assure them that they’re on top of all this newfangled stuff like XML and object caching and data warehousing?
The cult of the DBA is also powerful, with its leaders firmly emplaced in senior positions within the firm and across the industry. There are rules around here, and we know who set them.
Leaving practicalities aside, when you ascend to layers 8 & 9 of the OSI model (politics and religion), then the essence of the NOSQL movement is ‘screw you DBAs, I’m going to do this myself, I don’t need you and your time consuming processes’.
The cult is not alone however, as the RDBMS is just a veneer on top of the real persistence and…
The priesthood of storage
These are the guys that add the extra zeroes to the end of your cost per TB. The guys that have to run their own private networks (with their own esoteric protocols) to make up for the fact that storage inherently has no security.
Of course the priesthood make out like they’re saints – these are the guys that work weekends, just in case that slip of a finger takes out the whole trade floor.
Together, the cult and the priesthood have established themselves in such a way that the typical enterprise developer has no choice – if you want your data to be secure (and who doesn’t) then you put it in here (the RDBMS) and we look after it there (an EMC DMX or similar).
The impedance cost
I have written before about impedance mismatches in data, so I’ll try not to repeat myself too much.
The trouble with RDBMS is that relational form (and the set theory it’s based on) and yes, the SQL query language itself aren’t often the optimum way of representing data. This is especially true if you manipulate that data in an object oriented language (where you almost certainly represent it as objects) or exchange that data with others (using perhaps a text-based representation like XML). There are conversion costs in time, compute effort and fidelity when moving between representations. Common sense therefore suggests that if data can be stored in the same form that it’s manipulated then that’s a big win.
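To make the conversion cost concrete, here is a minimal Python sketch. Everything in it (the Order and Line classes, the table names, the dict standing in for a document store) is invented for illustration and isn't taken from any particular product. It saves the same object twice: once shredded into two relational tables and reassembled on the way out, and once as a single document whose shape mirrors the object.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass, field


@dataclass
class Line:
    sku: str
    qty: int


@dataclass
class Order:
    order_id: int
    customer: str
    lines: list = field(default_factory=list)


def make_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute("CREATE TABLE order_lines (order_id INTEGER, sku TEXT, qty INTEGER)")
    return conn


def save_relational(conn, order):
    # The object has to be split across tables: one row for the order,
    # one row per line item.
    conn.execute(
        "INSERT INTO orders (order_id, customer) VALUES (?, ?)",
        (order.order_id, order.customer),
    )
    conn.executemany(
        "INSERT INTO order_lines (order_id, sku, qty) VALUES (?, ?, ?)",
        [(order.order_id, line.sku, line.qty) for line in order.lines],
    )


def load_relational(conn, order_id):
    # Two queries and some hand-written glue to rebuild the object.
    row = conn.execute(
        "SELECT order_id, customer FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    lines = [
        Line(sku, qty)
        for sku, qty in conn.execute(
            "SELECT sku, qty FROM order_lines WHERE order_id = ? ORDER BY rowid",
            (order_id,),
        )
    ]
    return Order(row[0], row[1], lines)


def save_document(store, order):
    # The stored form has the same nesting as the object itself.
    store[order.order_id] = json.dumps(asdict(order))


def load_document(store, order_id):
    data = json.loads(store[order_id])
    return Order(data["order_id"], data["customer"], [Line(**l) for l in data["lines"]])
```

The document path still involves a conversion (to JSON text), so this isn't free either; the point is only that there is one mechanical step rather than a mapping layer that has to be written and maintained for every object shape.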
This problem isn’t confined to the code, as it extends to the modelling domain too. Entity Relationship (ER) tools are great at helping you organise how you put things into an RDBMS, but I feel that’s where the story ends.
The training cost
A (perhaps specious) argument I’ve heard against new data management techniques and technologies is that ‘my developers know SQL’. Setting aside for a moment my suspicion that SELECTs and JOINs are blunt instruments compared to the other tools in the bag, I do accept that fully exploring and understanding the tool bag is too much for the average developer – so if SQL is what they know then let’s take advantage of that. This is where things get interesting…
Bolt on (or bolt in) a SQL parser
Just because you have an unconventional data storage paradigm (e.g. anything that’s not RDBMS) doesn’t mean that you can’t understand SQL. Mike Stonebraker clearly understood this when he chose StreamSQL as the way to go for complex event processing (CEP) queries with StreamBase. Others are starting to follow the lead. Sean Park recently pointed me at GenieDB, which looks like a very interesting hybrid of memcached and MySQL. My first reaction was that it’s a system that will let you have your cake and eat it (and that provides a reasonable path to eating LOTS of cake really quick when that need arises). I hope to kick the tyres soon (and report back here what I find).
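As a toy illustration of the bolt-on idea, the Python sketch below puts a very small SQL-ish front end on a plain in-memory key-value store. It is purely hypothetical: the query function, the regular expression and the store layout are made up for this example, it handles only one shape of SELECT, and it says nothing about how StreamBase or GenieDB actually work.

```python
import re

# Recognises only: SELECT <cols> FROM <table> [WHERE <key> = '<value>']
_QUERY = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<key>\w+)\s*=\s*'(?P<value>[^']*)')?\s*;?\s*$",
    re.IGNORECASE,
)


def query(store, sql):
    """Run a tiny subset of SELECT against a dict-of-dicts store.

    `store` maps a table name to {record_key: record_dict}; this layout is
    an assumption made for the sketch, not a real product's data model.
    """
    match = _QUERY.match(sql)
    if match is None:
        raise ValueError(f"unsupported query: {sql!r}")

    records = list(store.get(match.group("table"), {}).values())

    # Optional equality filter; values are compared as strings.
    if match.group("key") is not None:
        key, value = match.group("key"), match.group("value")
        records = [r for r in records if str(r.get(key)) == value]

    # Projection: either every field, or just the named columns.
    cols = [c.strip() for c in match.group("cols").split(",")]
    if cols == ["*"]:
        return [dict(r) for r in records]
    return [{c: r.get(c) for c in cols} for r in records]
```

A developer who only knows SQL could then write query(store, "SELECT name FROM users WHERE city = 'London'") without caring that there is no relational engine underneath. A real implementation would need a proper parser and a query planner, which is exactly the work the vendors mentioned above have taken on.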
And in the cloud
This post wouldn’t be complete without a comment on the big news of the day – VMware (SpringSource) buying GemStone, which is clearly a big bet in the NOSQL space. This will get interesting, as VMware now have all of the pieces to make a complete Platform as a Service (PaaS) that can sit on top of vSphere (and other?) Infrastructure as a Service (IaaS). Since they can abstract and automate a lot of the traditional admin overhead out of the way, I think this torpedoes the cult of the DBA and the priesthood of storage – there simply isn’t a place for these people in a cloudy world (public, private, hybrid or whatever). And just to be clear, I’m not arguing that systems administration evaporates into the cloud (I’m with the very wise Simon Wardley on this one); but the discipline of system administration changes – progress routes around obstacles, and the cult/priesthood certainly were obstacles. NOSQL has implications for developer productivity, systems administration workload and risk. None of these are trivial, but new and better choices are now on the table, and I sense that there’s more to come.