Keys
is a key if uniquely determines all of and no subset of does
K is a superkey for relation if contains a key of
see also: keys
functional dependency
Think of it as ” holds in ”
convention: no braces used for set of attributes, just instead of
properties
- splitting/combining
- trival FDs
- Armstrong’s Axioms
FD are generalisation of keys
superkey: , must include all the attributes of the relation on RHS
trivial
always hold (right side is a subset)
splitting/combining right side of FDs
when each of , , …, holds for
ex: is equiv to and
ex: and can be written as
Armstrong’s Axioms
Given are sets of attributes
rules
Rule | Description |
---|---|
Reflexivity | If , then |
Augmentation | If , then for any |
Transitivity | If and , then |
Union | If and , then |
Decomposition | If , then and |
dependency inference
is implied by
transitivity
example: Key
List all the keys of with the following FDs:
sol:
closure test
Given attribute set and FD set , we have is the closure of relative to
Or set of all FDs given/implied by
Steps:
- Start:
- While there exists a s.t :
- End:
minimal basis
The idea is to remove redundant FDs.
for minimal cover for FDs
- Right sides are single attributes
- No FDs can be removed, otherwise is no longer a minimal basis
- No attribute can be removed from a LEFT SIDE
construction:
- decompose RHS to single attributes
- repeatedly try to remove a FD to see if the remaining FDs are equivalent to the original set
- or is equivalent to
- repeatedly try to remove an attribute from LHS to see if the removed attribute can be derived from the remaining FDs.
Schema decomposition
goal: avoid redundancy and minimize anomalies (update and deletion) w/o losing information
One can also think of projecting FDs as geometric projections within a given FDs space
good properties to have
- lossless join decomposition (should be able to reconstruct after decomposed)
- avoid anomalies (redundant data)
- preservation:
information loss with decomposition
- Decompose into and
- consider FD with and
- FD loss
- Attribute and not in the same relation (thus must join and to enforce , which is expensive)
- Join loss
- neither nor is in
A lossy decomposition results in the reconstruction of components to include additional information that is not in the original constructions
how can we test for losslessness?
A binary decomposition of into and is lossless iff:
- is the
- is the
if form a superkey of either or , then decomposition of is lossless
Projection
- Starts with
- For each subset
- Compute
- For each attribute
- If in : add to
- Compute minimal basis of
Normal forms
Normal Form | Definition | Key Requirements | Example of Violation | How to Fix |
---|---|---|---|---|
First Normal Form (1NF) | All columns contain atomic values and there are no repeating groups. | - Each cell holds a single value (atomicity) - No repeating groups or arrays | A column storing multiple phone numbers in a single cell (e.g., “123-4567, 234-5678”). | Split the values into separate rows or columns so each cell is atomic. |
Second Normal Form (2NF) | A 1NF table where every non-key attribute depends on the whole of a composite key. | - Already in 1NF - No partial dependencies on a composite primary key | A table with a composite primary key (e.g., StudentID, CourseID) where a non-key attribute (e.g., StudentName) depends only on StudentID. | Move attributes that depend on only part of the key into a separate table. |
Third Normal Form (3NF) | A 2NF table with no transitive dependencies. | - Already in 2NF - No transitive dependencies (non-key attributes depend only on the key, not on other non-key attributes) | A table where AdvisorOffice depends on AdvisorName, which in turn depends on StudentID. | Put attributes like AdvisorName and AdvisorOffice in a separate Advisor table keyed by AdvisorID. |
Boyce-Codd Normal Form (BCNF) | A stronger version of 3NF where every determinant is a candidate key. | - For every functional dependency X → Y, X must be a candidate key | A table where Professor → Course but Professor is not a candidate key. | Decompose the table so that every functional dependency has a candidate key as its determinant. |
1NF
no multi-valued attributes allowed
idea: think of storing a list of a values in an attributes
counter: Course(name, instructor, [student, email]*)
2NF
non-key attributes depend on candidate keys
idea: consider non-key attribute , then there exists an FD s.t. and is a candidate key
3NF
non-prime attribute depend only on candidate keys
idea: consider FD , then either is a superkey, or is prime (part of a key)
counter: , where studioAddr
depends on studio
which is not a candidate key
theorem
It is always possible to convert a schema to lossless join, dependency-preserving 3NF
what you get from 3NF
- Lossless join
- dependency preservation
- anomalies (doesn’t guarantee)
Boyce-Codd normal form (BCNF)
on additional restriction over 3NF where all non-trivial FDs have superkey LHS
theorem
We say a relation is in BCNF if is a non-trivial FD that holds in , and is a superkey 1
what you get from BCNF
- no dependency preservation (all original FDs should be satisfied)
- no anomalies
- Lossless join
decomposition into BCNF
relation with FDs , look for a BCNF violation ( is not a superkey)
- Compute
- find ( is a superkey)
- Replace by relations with
- Continue to recursively decompose the two new relations
- Project given FDs onto the two new relations.
Remarque
-
means is not contained in ↩