DBMS - (6) Relational Algebra - select, project, join
Query(expression) on set of relations produces relation as a result.
Example
College(cName, state, enrollment)
Student(sID, sName, GPA, size of HS)
Apply(sID, cName, major, decision)
Select operator : picks certain rows
(1) Students with GPA>3.7
- σ(GPA>3.7)(Student)
(2) Students with GPA>3.7 and HS<1000
- σ(GPA>3.7^HS<1000)(Student)
(3) Applications to Stanford CS major
- σ(cName='stanford'^major='cs')(Apply)
=> select operator is σ(condition)(Relation name)
Project operator : picks certain columns
(1) ID and decision of all applications
- π(sID, dec)(Apply)
=> project operator is π(A1, A2, ..., An)(Relation name)
Select & Project : To pick both rows and column
(1) ID and name of students with GPA>3.7
- π(sID, sName)σ(GPA>3.7)(Student)
Duplicate
(1) List of application majors and decision
- π(major, dec)(Apply)
- Relation algebra는 중복된 값을 가진 tuple을 자동으로 제거
- SQL에서는 중복된 값을 가진 tuple을 제거하지 않음( 중복을 제거하기 위해선 distinct 키워드를 사용)
- σ(condition)(Expression)
- π(A1, A2, ..., An)(Expression)
Cross-Product : combine two relations(also known as cartesian product)
(1) Names and GPAs of students with HS>1000 who applied to CS and were rejected.
- π(sName, GPA)(σ(Student.sID=Apply.sID ^ HS>1000 ^ major = 'cs' ^ dec = 'R')(Student X Apply))
- Cross product 연산은 두개의 relation의 attribute를 풀칠한다.
- degree = a + b
- Cross product 연산으로 생성되는 relation의 cardinality는 피연산자 relation의 cardinality간 곱과 같다.
- cardinality = n * m
Natural join
- Enforce equality on all attributes with same name
cross-projuct에서 condition으로 동일한 attiribute을 필터링하는 것이 필요없어진다(=natural join에 내장됨)
- Eliminate one copy of duplicate attributes
서로 다른 relation이 서로 동일한 attribute name을 갖는다면 natural join 연산 결과로 생성되는 relation에는 중복된 attribute가 단 1개의 attribute로 표현된다.
(1) Names and GPAs of students with HS>1000 who applied to CS and were rejected.
- π(sName, GPA)(σ(HS>1000 ^ major='cs' ^ dec='R')(Student⋈Apply))
(2) Names and GPAs of students with HS>1000 who applied to CS at college with enr>20,000 and were rejected.
- π(sName, GPA)(σ(HS>1000 ^ major='cs' ^ dec='R' ^ enr>20,000)(Student⋈(College⋈Apply)))
Natural join doesn't provide additional expressive power.
Exp1⋈Exp2 ==
π(schema(Exp1) U schema(Exp2))σ(E1.A1=E2.A1^ E1.A2=E2.A2 ^ ... ^E1.An=E2.An)(Exp1 X Exp2)
Theta Join
- Exp1 ⋈θ Exp2 == σ(θ)(Exp1 X Exp2)
- θ is selection operator(=condition)
- Basic operation is implemented in DBMS.
- Tern 'join' often means theta join
conclusion
- Query(expression) on set of relations produces relation as a result
- simplest query : relation name
- use operator to filter, slice, combine.
- Operators so far : select, project, cross-product(=cartesian product), natural join, theta join
Reference
https://www.youtube.com/watch?v=tii7xcFilOA&list=PL6hGtHedy2Z4EkgY76QOcueU8lAC4o6c3&index=9
'데이터베이스' 카테고리의 다른 글
DBMS - (8) Introduction to SQL (0) | 2019.07.07 |
---|---|
DBMS - (7) Relational Algebra - set operator, renaming (0) | 2019.06.27 |
DBMS - (5) JSON(JavaScript Object Notation) (0) | 2019.06.27 |
DBMS - (4) XML(Extensible Markup Language) (0) | 2019.06.27 |
DBMS - (3) Querying Relational Databases (0) | 2019.06.27 |