Hi, I was studying the article on Database Normalization 2NF in Wikipedia at Wikipedia, when I came across the last example, on tournament winners. There it says that because there is a transitive dependency in the table, it is not in 2NF, and further optimization is needed (like splitting it into further tables) to restore that to 3NF and eliminating chances of data corruption. Can anybody tell me what kinds of 'corruption' can creep in to "show the same person from being shown with different dates of birth on different records"?
2 Replies - 988 Views - Last Post: 29 June 2012 - 12:22 PM
Replies To: What is the problem with transitive dependency in RDBMS?
Re: What is the problem with transitive dependency in RDBMS?
Posted 29 June 2012 - 07:04 AM
Can anybody tell me what kinds of 'corruption' can creep in to "show the same person from being shown with different dates of birth on different records"?
Well, that is the corruption - the data is internally inconsistent. Obviously it's impossible for one person to have two different dates of birth. But in that schema, there is nothing to prevent you from entering different dates of birth for the same person. So if you have two rows that contain the same winner, but with different dates of birth, that's "corruption" in the sense that the database is giving you two different answers to the question of when a particular person was born.
Part of the point of normalization is to eliminate this problem by simply not storing the same information in multiple places. In that example, the key for that table was the combination of tournament and year. That means that those two columns will be unique for each row, but if the same person wins a tournament several years in a row, then you'd have redundant storage of the date of birth. It's redundant because that is determined by the person and does not change - it has nothing to do with the tournament. That redundancy makes it possible to store inconsistent data because you rely on the person entering the data to put in the right date every time. By putting the winners in their own table, you only have to store the date of birth once. This solves the problem because there's nothing for the data to be inconsistent with - it's always stored in just one place.
Page 1 of 1