database normalization, are you normal?


It seems that the question of the hour is always what my opinion is on Normalization. Or maybe they are asking me if I’m normal… hmm, I think I’ll not go there. ;)
What is normalization? Well, its the method in which you reduce redundancy in a database.

There are several industry standards on this, they have fancy schmancy names (not really). The standards I’m covering here were created by an Englishman named Edgar F. Codd.

His concepts go from the most basic form of Normalization 1st Normal Form(1NF), to the stricter form, 3rd normal form(3NF). (There are more however I’m not going to cover them here)

To get the gist of how this all works lets define our database:
Your database stores information on Mickey Mouse and his Balloon Race competition.
Mickey and his friends need three mouskatools to get Donald Duck through the race.
Our table displays which character used which tool.
Mickey and Donald used Stickers to fix the holes in the balloon.
Mickey used the baby elephant to weight down the balloon.
Pete used the rope swing to get unstuck from a tree.
Pluto tags along but doesn’t really do anything.

So you might be tempted to have a table that looks like this.

Character_ID Name AnimalType Tool
1 Mickey mouse stickers, baby elephant
2 Pete dog rope ladder
3 Donald duck stickers
4 Pluto dog  

Lets talk about 1st Normal Form(1NF):
The rule of 1st normal form says that you shouldn’t have any data that repeats within one cell (above). Nor should you, in this scenario, define multiple columns that hold the same type of data(Tool1,Tool2). Basically we are removing redundancy from left to right (horozontally)

Character_ID Name AnimalType Tool1 Tool2
1 Mickey mouse stickers baby elephant
2 Pete dog rope ladder  
3 Donald duck stickers  
4 Pluto dog    

In order for us to follow this Form we need to break the above data up into multiple tables with primary keys.

character_id Name animaltype
1 Mickey Mouse mouse
2 Pete dog
3 Donald Duck duck
4 Pluto dog
character_id tool
2 rope ladder
1 stickers
1 baby elephant
3 stickers

Lets talk about 2nd Normal Form (2NF):
The rule of 2NF says that you must follow 1NF AND seperate out groups of data into their own tables.

character_id Name
1 Mickey Mouse
2 Pete
3 Donald Duck
4 Pluto
tool_id tool
a rope ladder
b stickers
c baby elephant
character_id tool_id
1 b
1 c
2 a
3 b

3rd normal form (3NF):
The rule of 3NF says that you must follow 2NF, AND remove redundancy from top to bottom (vertically).

character_id Name animal_id
1 Mickey Mouse 1
2 Pete 2
3 Donald Duck 3
4 Pluto 2
character_id tool
2 rope ladder
1 stickers
1 baby elephant
3 stickers
animal_id animaltype
1 mouse
2 dog
3 duck
   

So there you go, a basic basic idea of 1-3 Normal Form.


One Comment, Comment or Ping

  1. Excellent explanation.

    At times, database normalization is a handy way to give your design flexibility and the ability to have increased capacity. On the other hand, many programmers take normalization too far.

    Some times a nice, slightly abnormal database is easier to work with than a database where every normalization opportunity has been taken.

    The overriding principle should always be KISS, what is the simplest answer to solve the problem.

    November 29th, 2008

Reply to “database normalization, are you normal?”