DNAunion2000
01-26-2004, 12:15 AM
In a different thread, I slapped together a quick and dirty, simple database consisting of only 2 related tables. Here are some relation schemas for them.
BOOKS(ISBN, bkTitle, bkPrimaryAuthor, bkYear, bkPubID)
PUBLISHERS(pubID, pubName, pubCity, pubState)
where bold indicates the primary keys.
jemfinch asserted that I needed to buy a book and read up on normalization because of the poor design.
Go to the bookstore and read about normalization. You're lacking in your knowledge of databases (no doubt influenced by VFP). This is why "easy" languages oftentimes aren't the best ones to learn first: their users tend to forget the fundamentals (like you have with database normalization).
This is where the human side of me wants to insult you, but I'm restraining myself. Your arguments do a good enough job already.
Well, that gives rise to the question as to whether or not these tables are normalized.
Yes, they are. In fact, not only are they in 3NF, but they are also in the higher BCNF. In fact, not only are they in BCNF, but they are also in the higher 4NF. Here's the check.
1) A table is in 1NF (first normal form) if it meets all of the criteria for being a relation (most often this means that each attribute can contain only one, atomic value for each tuple).
Since both of the above tables meet all of the required conditions for being a relation, they are both in 1NF.
2) A table is in 2NF (second normal form) if it is in 1NF and no non-key column is functionally dependent upon only part of the primary key.
Since any table in 1NF that does not have a composite primary key is automatically in 2NF, both tables are in 2NF.
3) A table is in 3NF (third normal form) if it is in 2NF and it contains no transitive dependencies.
One could try to argue that there are transitive dependencies. For example, if we assume pubName is unique, then pubID functionally determines pubName and pubName functionally determines pubCity: thus, pubID-->pubName-->pubCity. However, this fails because the intermediary link in the chain of attributes is itself a candidate key, and:
"A functional dependency X --> Y in a relation R is a transitive dependency if there is a set of attributes Z that is neither a candidate key nor a subset of any key of R, and both X --> Z and Z --> Y hold." (Ramez Elmasri & Shamkant B Navathe, Fundamentals of Database Systems: Fourth Edition, Addison Wesley, 2003, p320)
Since both tables are in 2NF and neither has any transitive dependencies that would prevent it from being in 3NF, both tables are in 3NF.
4) A table is in BCNF (Boyce-Codd normal form) if it is in 3NF and the only determinants it contains are candidate keys.
The only determinants known in the tables are the primary keys (which are, by definition, candidate keys).
But what about pubName? Doesn't a given valud for it yield only one unambigious value for pubCity and pubState? We could probably safely assume so, but if we do, pubName would be unique and thus a candidate key (the alternative is just about nonsensical: for pubName to functionally determine pubCity and pubState and NOT be a candidate key, pubID would not functionally determine pubName, yet the pubName would have to functionally determine pubCity and pubState. In other words, you could have two different publishers with the same name, but if you did, they would have to be located in the same city and state). Since it would be a candidate key, its being a determinant would not prevent the table from being in BCNF.
If we go out of our way to assume that pubName is not unique (to prevent it from being a candidate key), then multiple publishers could have the same name: if that were the case, there would be no valid means of enforcing a rule that any such multiple publishers would have to be located in the same city or the same state. Thus, pubName would no longer functionally determine city or state and so wouldn't be a determinant: again, not preventing the table from being in BCNF.
The same type of logic that showed pubName does not prevent PUBLISHERS from being in BCNF works for bkTitle in TITLES too (also, I later stated explicitly that there was no constraint for bkTitle to be unique, and thus there could be multiple, different books with the same title: there's no valid way to create a constraint that demands that each of the different books in such a set have the same year, publisher, or primary author, and so bkTitle would no longer be a determinant. Thus, it could not be used to exclude the BOOKS table from being in BCNF).
The only remaining possible determinant would be pubCity, as in the functional dependency pubCity-->pubState. But that cannot just be assumed: one would need to know the business rules. Furthermore, since a single value for city -- such as "Philadelphia" -- can be in more than one state, city does NOT functionally determine state. Thus, pubCity is not a determinant and cannot possibly disqualify the PUBLISHER table from being in BCNF.
Thus, both tables are in BCNF.
5) A table is in 4NF (fourth normal form) if it is in BCNF and has no multivalued dependencies.
"...a relation R(A, B, C), a multi-value dependency exists if A determines multiple values of B, A determines multiple values of C, and B and C are independent of each other." (David M Kroenke, Database Processing: Fundamentals, Design, and Implementation, Prentice Hall, 2004, p132)
This is not the case for either of the tables being discussed.
Therefore, both tables are in 4NF.
And thus another of jemfinch's assertions is shown to be wrong
BOOKS(ISBN, bkTitle, bkPrimaryAuthor, bkYear, bkPubID)
PUBLISHERS(pubID, pubName, pubCity, pubState)
where bold indicates the primary keys.
jemfinch asserted that I needed to buy a book and read up on normalization because of the poor design.
Go to the bookstore and read about normalization. You're lacking in your knowledge of databases (no doubt influenced by VFP). This is why "easy" languages oftentimes aren't the best ones to learn first: their users tend to forget the fundamentals (like you have with database normalization).
This is where the human side of me wants to insult you, but I'm restraining myself. Your arguments do a good enough job already.
Well, that gives rise to the question as to whether or not these tables are normalized.
Yes, they are. In fact, not only are they in 3NF, but they are also in the higher BCNF. In fact, not only are they in BCNF, but they are also in the higher 4NF. Here's the check.
1) A table is in 1NF (first normal form) if it meets all of the criteria for being a relation (most often this means that each attribute can contain only one, atomic value for each tuple).
Since both of the above tables meet all of the required conditions for being a relation, they are both in 1NF.
2) A table is in 2NF (second normal form) if it is in 1NF and no non-key column is functionally dependent upon only part of the primary key.
Since any table in 1NF that does not have a composite primary key is automatically in 2NF, both tables are in 2NF.
3) A table is in 3NF (third normal form) if it is in 2NF and it contains no transitive dependencies.
One could try to argue that there are transitive dependencies. For example, if we assume pubName is unique, then pubID functionally determines pubName and pubName functionally determines pubCity: thus, pubID-->pubName-->pubCity. However, this fails because the intermediary link in the chain of attributes is itself a candidate key, and:
"A functional dependency X --> Y in a relation R is a transitive dependency if there is a set of attributes Z that is neither a candidate key nor a subset of any key of R, and both X --> Z and Z --> Y hold." (Ramez Elmasri & Shamkant B Navathe, Fundamentals of Database Systems: Fourth Edition, Addison Wesley, 2003, p320)
Since both tables are in 2NF and neither has any transitive dependencies that would prevent it from being in 3NF, both tables are in 3NF.
4) A table is in BCNF (Boyce-Codd normal form) if it is in 3NF and the only determinants it contains are candidate keys.
The only determinants known in the tables are the primary keys (which are, by definition, candidate keys).
But what about pubName? Doesn't a given valud for it yield only one unambigious value for pubCity and pubState? We could probably safely assume so, but if we do, pubName would be unique and thus a candidate key (the alternative is just about nonsensical: for pubName to functionally determine pubCity and pubState and NOT be a candidate key, pubID would not functionally determine pubName, yet the pubName would have to functionally determine pubCity and pubState. In other words, you could have two different publishers with the same name, but if you did, they would have to be located in the same city and state). Since it would be a candidate key, its being a determinant would not prevent the table from being in BCNF.
If we go out of our way to assume that pubName is not unique (to prevent it from being a candidate key), then multiple publishers could have the same name: if that were the case, there would be no valid means of enforcing a rule that any such multiple publishers would have to be located in the same city or the same state. Thus, pubName would no longer functionally determine city or state and so wouldn't be a determinant: again, not preventing the table from being in BCNF.
The same type of logic that showed pubName does not prevent PUBLISHERS from being in BCNF works for bkTitle in TITLES too (also, I later stated explicitly that there was no constraint for bkTitle to be unique, and thus there could be multiple, different books with the same title: there's no valid way to create a constraint that demands that each of the different books in such a set have the same year, publisher, or primary author, and so bkTitle would no longer be a determinant. Thus, it could not be used to exclude the BOOKS table from being in BCNF).
The only remaining possible determinant would be pubCity, as in the functional dependency pubCity-->pubState. But that cannot just be assumed: one would need to know the business rules. Furthermore, since a single value for city -- such as "Philadelphia" -- can be in more than one state, city does NOT functionally determine state. Thus, pubCity is not a determinant and cannot possibly disqualify the PUBLISHER table from being in BCNF.
Thus, both tables are in BCNF.
5) A table is in 4NF (fourth normal form) if it is in BCNF and has no multivalued dependencies.
"...a relation R(A, B, C), a multi-value dependency exists if A determines multiple values of B, A determines multiple values of C, and B and C are independent of each other." (David M Kroenke, Database Processing: Fundamentals, Design, and Implementation, Prentice Hall, 2004, p132)
This is not the case for either of the tables being discussed.
Therefore, both tables are in 4NF.
And thus another of jemfinch's assertions is shown to be wrong