SQL Server MVP Deep Dives 2, Chapter 6
Chapter six is the shortest chapter in the book so far, but Mr. Nielson uses that space very well to put forth the idea of generalization within a database design. Generalization isn’t the opposite of normalization, but it can help a database that seems to be over-normalized. He argues that “over-normalized” is the wrong term and you can keep normalization while reducing complexity in design by generalizing.
The first half of the chapter is building a case for a usable database vs. a hyper-normalized database and he draws for parallels in the User Interface world. One of the favorite things I came away from in this chapter was a principle from UIX design that states, “There is no such thing as intuitive–there is only familiar.” With any deployment of a new program we wouldn’t dream of not testing the usability, but when a new database is designed it is not typically put before the end users to see if they understand it. He proposes that you put all of your report writers and developers in a room and asked them to write queries against the database without giving them any documentation. Of course based on some of the documentation with third party applications I feel like I have done this at most places where I have worked. It’s not fun and when working with a database that has been in place for years it does no good to ask, “Why did this table get added like this?”
For the purists out there it may seem that usability isn’t the end goal and Mr. Nielson has an answer for that as well. If you believe the database design only needs to take into account normalization then:
As someone who’s had to write queries against your database, I respectfully disagree. If the database isn’t readily understandable and easy to consume by those with reasonable professional skills, then the design is less than it could be. I’ll put it plainly: your baby is ugly. (Chapter Six, SQL Server MVP Deep Dives 2)
He describes normalization as the grammar of database design and just like a good novel you need more than grammar to make it complete. After the chapter builds this argument for usability he presents us with the solution, generalization.
First a definition is provided of what he means by generalization. Simply put it is an entity’s scope. A good example would be having a table for animals vs separate tables for cats and dogs. Depending on what you need to track the general animal table may be all you need. It will certainly be easier to work with and create a less complex database which will probably help with performance and make it easier to extend the database. If you need to add a mouse entity you can just add a record to the animal table. No need to create its own table.
It seems like a pretty straight forward concept and I think the real skill lies in figuring out where the breakdown should be. Paul points out that you want to balance scope to keep things from becoming overly complex, but you shouldn’t take it to the absurd extreme of one big table for everything. My logical mind would like to see some concrete steps to follow for how to decide where those breaks should go, but I’m not sure what that would look like. They would most likely be different depending on the data you are working with. In fact the author compares generalization to a data-driven design.
Chapter Six SQL Server MVP:
Paul Nielsen (B|T) is the founder of Ministry Weaver, a software startup offering an object database for ministries. He focuses on extreme data modeling, database development, and the occasional XMAL and C#, and he was the author of the SQL Server Bible series and the lead editor of the first volume of the SQL Server MVP Deep Dives. Paul offers consulting and seminars on data modeling and schema optimization. He lives in Colorado with his family, and he’s passionate about marriage, adoption, theology, music, writing, and their next family trip to Maui.