Making Databases Stupid
By cam (Fri Nov 18, 2005 at 06:49:55 AM EST) (all tags)
to make applications smarter? Interesting article by Alex Bunardzic on The Myth of Data Integrity. Found from one my favourites aggregators on the internets, Rusty Elliot-Harold. Alex targets stored procedures, normalisation and orphanage prevention.

From the article;

David Heinemeier Hansson mentions how he had to “slaughter the holy cows” of computer science. He then goes on to explain how he and his partners in crime proceeded to do it: “We took a pretty radical stand: Stored procedures and all things that make your database clever are evil.” And he added: “If you tell a lot of IT shops that, they’ll be majorly offended, because that’s just the way they do things.”

My view? I dislike stored procedures. It puts application level problems and descriptions in the database. IMO database are supposed to be dumb storage, they are intended to manage the storing of data, not business logic.

Another from the article;

Firstly, every relational jockey will insist that data integrity basically means that each row in the table is uniquely identified. This integrity, then, ensures that there are no duplicate rows in a table. ... Typically, the RDBMS will enforce this rule of uniqueness by implementing a bailout concept known as the surrogate key. What that means is that, indeed, each row in the Patient table will be unique, thanks to the unique surrogate key. But all the other information in the rows of that table could be absolutely identical, while giving the false impression of ‘uniqueness’.

My view; But I love normalisation! It does have its limitations however, though this particular instance can be handled at the application or database level to enforce uniqueness on a field in a table.

Another from the article;

Another myth of data integrity is the so-called ‘orphanage prevention’. This special case of data integrity, known as referential integrity, has much more applicability in the non-electronic storage systems than in the electronic ones. The reason for that is the fact that electronic orphaned rows are fairly innocuous, since there is no simple and obvious way to traverse the collection in order to reach them. In real life, non-electronic systems, such records are much more likely to stumble upon.

This is an application level problem, but this just proves his point. Let database be dumb, and application smart, after all the application is a model of the business process, the database is just storage to support and expedite the data being pulled into business objects.

Interesting article. Usually the database stuff doesn't get much navel gazing.

cam

Making Databases Stupid | 21 comments (21 topical, 0 hidden) | Trackback
Stored Procs by jump the ladder (4.00 / 2) #1 Fri Nov 18, 2005 at 07:12:49 AM EST
The thing is stored procs are useful even if they break some data integrity rules or encapsulate business rules because generally SQL database servers are better at datacentric tasks than your client or middleware.

They have their place by cam (4.00 / 1) #10 Fri Nov 18, 2005 at 10:26:27 AM EST
I prefer to write the queries through an OM layer or critera that can be compiled against the database schema. But that too has its issues. Like most things there arent any absolute rules, but I favour the no stored procedures style.
Dunno by R Mutt (4.00 / 3) #2 Fri Nov 18, 2005 at 07:17:59 AM EST
On the one hand, I've twice worked for places that have policies of "only touch the database via views and stored procedures". In theory, it's good to have the extra layer so you can change the database without having to change the application. In practice, that's been useful exactly once in my career, whereas there have been approximately eleventeen trillion occasions where I've had to redundantly change/deploy the code and the stored procedure to add/remove/change one little field.

On the other hand, I've worked with seriously fucked up databases with no relational integrity, and that's proven to be a real problem. It's not just "orphaned" records that are a problem: for instance, you can end up with two addresses attached to a customer, one of which is wrong, and no way to delete the wrong one, only both. Also we've had parent-child structures in the database with a redundant parent-child row and child-parent row... except that bugs have led to some having only one row, so a child has a parent but the parent doesn't have a child.

Having worked without integrity, I'd say it generally causes huge problems years down the line. Especially when the people who built the application have moved on and no-one left understands more than a few bits of it (the usual status quo). Code rot means you can't just "rely on the code" forever.

In general I'd say stored procs should only be used if there's an advantage to them (performance, complexity, commonly used tasks). Having business logic there is a bad idea. But that's not really a problem these days so much as business logic being embedded into XSL which is another rant entirely...

Relational integrity is a Good Thing though.

Since the Object Mapping libraries have by cam (4.00 / 1) #11 Fri Nov 18, 2005 at 10:29:16 AM EST
improved out of sight in the last six years, I dont think that stored procedures are as valuable as they used to be. Compiling an OM against a database schema/representation is more efficient, and IMO, better able to handle database schema volatility.

Agree on relational integrity, I tend to be anal in that area, and go into hyper-ventilation when I see data structures that ignore it.

Stored procs have an overly bad reputation by lm (4.00 / 2) #14 Fri Nov 18, 2005 at 11:01:12 AM EST
Mostly because MS SQL requires stored procs just about everywhere to get good performance which leads to everyone who learns SQL on MS to over use them and take their bad habit elsewhere.

There is no more degenerate kind of state than that in which the richest are supposed to be the best.
Cicero, The Republic
[ Parent ]

bizarre by lm (4.00 / 4) #3 Fri Nov 18, 2005 at 07:52:49 AM EST
every relational jockey will insist that data integrity basically means that each row in the table is uniquely identified

Am I the only one that instantly went WTF?

There is no more degenerate kind of state than that in which the richest are supposed to be the best.
Cicero, The Republic

More than just one person, though. by ambrosen (4.00 / 2) #5 Fri Nov 18, 2005 at 08:47:42 AM EST

[ Parent ]

very true by lm (4.00 / 2) #6 Fri Nov 18, 2005 at 09:03:06 AM EST
For the most part there is a very large chasm between those who take IT work as science and those who take IT work as skilled labor. While I tend to be in the second camp, I think that the world would be a much better place if more IT workers were in the first camp.

There is no more degenerate kind of state than that in which the richest are supposed to be the best.
Cicero, The Republic
[ Parent ]

I have no idea which camp I am in. by ambrosen (4.00 / 2) #7 Fri Nov 18, 2005 at 09:05:08 AM EST
However, I am sure that I wish to post a diary on Monday in which everyone can tell me.

[ Parent ]

It's an easy distinction to make by lm (4.00 / 1) #19 Sat Nov 19, 2005 at 05:17:48 AM EST
Think of the last bit of programming that you did. Do you have calculations that prove the results are correct? If not, you fall into the skilled labor camp rather than the scientist camp.

There is no more degenerate kind of state than that in which the richest are supposed to be the best.
Cicero, The Republic
[ Parent ]

In that sense, by ambrosen (2.00 / 0) #20 Sat Nov 19, 2005 at 08:21:01 AM EST
definitely skilled labour.

[ Parent ]

I think I sit between the two camps by Canthros (4.00 / 1) #16 Fri Nov 18, 2005 at 04:54:47 PM EST
And most of my employer's IT staff in the latter. You're 100% correct.

I don't think I'd want all of them in the former group, but it'd be damn nice if at least some of them seemed to be.

--
I'm not here, man.

[ Parent ]

Let me modify that a bit. by Canthros (4.00 / 1) #17 Fri Nov 18, 2005 at 05:16:26 PM EST
I sit between the two camps, but clearly more in the skilled labor than the science.

--
I'm not here, man.

[ Parent ]

so what you're saying is by martingale (4.00 / 1) #18 Fri Nov 18, 2005 at 06:37:10 PM EST
You are gray, you sit between the darkness and the light?
--
$E(X_t|F_s) = X_s,\quad t > s$
[ Parent ]

Inasmuch as that makes any actual sense, by Canthros (4.00 / 1) #21 Sat Nov 19, 2005 at 07:11:44 PM EST
sure. I was trained as a software engineer. I am employed as a code monkey. I am a computer scientist by (college) degree, if not by disposition or aptitude.

In short, I do not like my job.

I am not a Mimbari.

--
I'm not here, man.

[ Parent ]

IAWTP (nt) by DesiredUsername (4.00 / 2) #4 Fri Nov 18, 2005 at 08:38:09 AM EST

---
Now accepting suggestions for a new sigline

I don't agree with most of that by theboz (4.00 / 3) #8 Fri Nov 18, 2005 at 09:13:59 AM EST
The thing is, a database itself is an application, and rather than just being "dumb storage", the purpose is to store data and make it easy to work with for the applications. Where you delineate this depends on a lot of things, but the vast majority of people want the database to be capable of handling itself.

I can see the point on stored procedures. They are something that applications should do most of the time (I still use the hell out of them, but that's beside the point.) There may be cases where they're needed for things outside the app that could justify them though. It's a bad idea to have a stored procedure for a situation like when a company name changes on the company table to update the company name on the contacts (I wouldn't store the field twice, but I couldn't think of a better example.) However, if say, the DBA uses a stored procedure to give himself a count of how may records are in 50 different tables (again, probably not a great example) then it can be useful. So in my mind, stored procedures should be used by people who work on the database that are not app developers, even though I am guilty of this myself sometimes.

On the other hand, complaining about things like surrogate keys sounds stupid to me. There's no reason that a database should not have protections built in to make things unique in case a programmer fucks up on something. It's not the same as forcing a unique key on a table, but just a thing to make life easier. I bet the guy also thinks that people shouldn't use ADO/ODBC/JDBC connections and instead make their apps smart enough to make a connection to the DB from scratch, or that he thinks indexes are bad because the applications should be finding ways to access the data in a quicker fashion, or whatever. He doesn't seem to grasp the purpose that you're not building an app and a database and everything in seperate worlds but that they all are supposed to work together as efficiently as possible, and that 99.9% of the time the lines as to what component is responsible is blurred. Network, server, database, app, user, etc. all have to be in sync in the most efficient way as possible.
- - - - -
That's what I always say about you, boz, you have a good memory for random facts about pussy. -- joh3n

Hi, I am an applications developer by cam (4.00 / 2) #13 Fri Nov 18, 2005 at 10:38:21 AM EST
and I only use hammers on nails - lucky I see everything as a nail. Which is probably my software development philosophy :)

Yeh the database is excellent for storage, retrieval, hot backup, replication etc. It is definately not a dumb application. But I think a lot of people try to force business rules into the database, when in reality, the business rules are the problem domain of the application developer. Which probably leads to database centric developers repeating my opening paragraph.

Constraints on the data need enforcing everywhere by herbert (4.00 / 3) #9 Fri Nov 18, 2005 at 09:40:23 AM EST
So if the database doesn't do it, everywhere else that writes to the database will need to do it.

So for the sake of keeping one copy of the rules in one place, they might as well belong to the database.  An access layer shared by all the applications that write to the database could do it too, but the database can probably do it more efficiently, especially if it's at the other end of a network connection.

Yeh that rule ends up polluted by cam (4.00 / 1) #12 Fri Nov 18, 2005 at 10:33:33 AM EST
It ends up being done everywhere, html forms, javascript checking, action checking, and finally database checking. I dont have a problem with redundancy in that area.
I still can't get over the idiocy of that article by lm (4.00 / 3) #15 Fri Nov 18, 2005 at 12:05:54 PM EST
In summary, correctness, accuracy and consistency of the data cannot be enforced by the database.

As a matter of fact, yes it can as either an academic excercise or a commercial concern.

If the databases were completely bullet-proof, that wouldn’t mean a thing

Except that if a program knows that the data is correct, if the manipulation is also known to be correct, the results are guaranteed to be correct. But if the data is not known to be correct (by putting it into a relational model as but one example) then no matter how the data is manipulated, the results can never be known to be correct.

Aside from which, if the data is in a known structure, then one is not restricted to a single solution. Given understanding of a data model, any tool capable of querying an SQL database can be used to manipulate the data with comprehensible results.

There is no more degenerate kind of state than that in which the richest are supposed to be the best.
Cicero, The Republic

Making Databases Stupid | 21 comments (21 topical, 0 hidden) | Trackback