Practice
Q1.
- a. F
- b. F (wrong: Must be T)
- A relation R(A,B,C) may have at most three minimal keys (not superkey)
- c. T
- d. T
- e. T (any ops involving a null is a null)
- f. F (DML: data manipulation, not management)
- g. F (a weak entity set has one or more many-many relationship)
- h. F
Q3.
- model is PK for all relations
type
are “laser” and “ink-jet”- every PC model and every printer model is a Product model (every PC/printer must be referenced in relation to Product)
- price of a product should not be more than 10% higher than the average price of all product (average price of all product is given value avgPrice)
- model and price are int, all other attributes of type char(20)
Q4.
a. (1,3) b. cartesian products
Foreign Keys and Relational Models
See also slides
A relation is a table
Relations are unordered ⇒ relations are sets
tuple and domain constraints
- tuple: expresses conditions on the values of each tuple
- domain constraint: tuple constrain that involves a single attributes
unique identifier
A superkey is a set of attributes for a relation if cannot contain two distinct tuples and such that
A (candidate) key for if is a minimal superkey
ex: superkey of
RegNum
primary value
handles
null
valuePresence of nulls in keys
definition
Each relation must have a primary key on which nulls are not allowed.
notation: the attributes of the primary keys are underlined
⇒ references between relations are realised through primary keys
Remark
A set of fields is a key for a relation if:
- No two distinct tuples can have same values in all key fields
- This is not true for any subset of the key (minimal)
If #2 is false, then a superkey
If there’s > 1 key for a relation, one of the keys is chosen to be primary key
Example:
requirements:
- For a given student and course, there is a single grade.
- Students can take only one course, and received a single grade for that courses; further, no two students in a course receive the grade
IC are validated when data is updated
interpolation constraints (foreign keys)
Referential integrity constraints are imposed in order to guarantee values refer to existing tuples
Definition
A foreign key requires that the values on a set of attributes of a relation must appear as values for the primary key of another relation
Ex: sid is a foreign key referring to Students
If al foreign key constraints are enforced ⇒ referential integrity is enforced
enforcing referential integrity
See also source
Lien vers l'original
A weak entity doesn’t have enough information to have its own PK and relies on supporting entity for unique identification
Weak Entity
weak identity we need one (or more) many-to-one (supporting) relationship(s) to other (supporting) entity sets
Role
- entity set may appear more than once in a relationship (label the edge between relationship)
sql.
values
any values can be
NULL
, unless specified otherwise
PRIMARY KEYS vs. UNIQUE.
- 1 PK for a relation, but several UNIQUE
- No attributes of PK can be NULL
- Attributes declared UNIQUE may have NULL
DATE and TIME
constraints.
-
keys
-
foreign keys
-
domain
-
tuple-based
-
assertions
-
REFERENCES
attribute ==must be==PRIMARY KEY
orUNIQUE
enforcing constraints from relation to relation , the following violation are possible:
- insert/update introduces values not found in
- deletion/update to causes tuple of to “dangle”
ex: suppose
delete or update to that removes a beer value found in some tuples of
actions:
- Default: reject modification
CASCADE
: make the same changes in Sells- Delete beer: delete Sells tuple
- Update beer: change value in Sells
SET NULL
: change beer toNULL
Can choose either
CASCADE
orSET NULL
as policy, otherwise reject as default
attributed-based check
CHECK(<cond>)
: cond may use name of attribute, but any other relation/attribute name MUST BE IN subquery
CHECK
only runs when a value for that attribute is inserted or updated.
Tuple-based checks
added as a relation-schema element
check on insert or update only
queries
patterns
%
is any string, and_
is any character
In sql, logics are 3-valued: TRUE, FALSE, UNKNOWN
- comparing any value with
NULL
yieldsUNKNOWN
- A tuple in a query answer iff the
WHERE
isTRUE
ANY(<queries>)
and ALL(<queries>)
ensures anyof or allof relations.
IN
operator
IN
is concise
IN is a predicate about R
tuples
NOT EQUAL operator in SQL is
<>
Difference between
ANY
andNOT IN
ANY
means not = a, or not = b, or not = cNOT IN
means not = a, and not = b, and not = c. (analogous toALL
)
EXISTS
operator
EXISTS(<subquery>)
is true iff subquery result is not empty.
UNION
,INTERSECT
,EXCEPT
structure:
(<subquery>)<predicate>(<subquery>)
bag
or a multiset, is like a set, but an element may appear more than once.
- Force results to be a set with
SELECT DISTINCT
- Force results to be a bag with
UNION ALL
ORDER BY
ops followed with desc
insert, update, delete
add DEFAULT
value during CREATE TABLE
(DEFAULT
value will be used if inserted tuple has no value for given attributes)
Those drinkers who frequent at least one bar that Sally also frequents
DELETE FROM
:
UPDATE
schema:
aggregations
SUM
, AVG
, COUNT
, MIN
, MAX
can be applied toa column in SELECT
clause
COUNT(*)
counts the number of tuples
NULL
never contributes to a sum, average, or counthowever, if all values in a column are
NULL
, then aggregation isNULL
exception:
COUNT
of an empty set is 0
GROUP BY
: according to the values of all those attributes, and any aggregation is applied only within each group:
restriction on
SELECT
with aggregationeach element of
SELECT
must be either:
Aggregated
An attribute on
GROUP BY
listillegal example
HAVING(<condition>)
may followed by GROUP_BY
If so, the condition applies to each group, and groups not satisfying the condition are eliminated.
requirements on HAVING
:
- Anything goes in a subquery
- Outside subqueries they may refer to attributes only if they are either:
- A grouping attribute
- aggregated
cross product (cartesian product)
Or known as join operations ⇒ all join operations are considered cartesian products.
Outer join preserves dangling tuples by padding with NULL
A tuple of that has no tuple of which it joins is said to be
dangling
Left outer join
Right outer join
full outer join
inner join
- natural: check equality on all common attributes && no two attributes with same name
- left: padding dangling tuples of R only
- right: padding dangling tuples of S only
- full: padding both (default)
views
- many views (how users see data), single logical schema (logical structure) and physical schema (files and indexes used)
virtual views does not stored in database (think of query for constructing relations)
materialized views are constructed and stored in DB.
Usually one shouldn’t update view, as it simply doesn’t exists.
index
idea: think of DS to speed access tuple of relations, organize records via tree or hashing
DS: B+ Tree Index or Hash-based Index
B+ Tree
note: each node are at least 50% full
cost
tree is “height-balanced”
insert/delete at cost
min 50% occupancy, each node contains entries, where is the order or the tree
insert a data entry
- find correct leaf
- put data entry onto
- if has enough space ⇒ done!
split
- redistribute entries evenly,
copy up
middle key - insert index entry point to in parent of
- redistribute entries evenly,
split grow trees, root split increase heights
delete a data entry
- find leaf where entry belongs
- remove entry
- if L is at least half-full ⇒ done!
- if not
- redistribute, borrowing from sibling (adjacent node with same parent of )
- if fails, merge and sibling
- merge occured then delete entry (point to or sibling) from parent of
merge propagate root, decrease heights
Hash-based Index
- index is a collection of buckets
Insert: if bucket is full ⇒ split
Alternatives for data entries
How | |
---|---|
By Value | record contents are stored in index file (no need to follow pointers) |
By Reference | <k, rid of matching data record> |
By List of References | <k, [rid of matching data record, …]> |