본문 바로가기

데이터베이스

DBMS - (6) Relational Algebra - select, project, join

DBMS - (6) Relational Algebra - select, project, join

Query(expression) on set of relations produces relation as a result.

Example
College(cName, state, enrollment)
Student(sID, sName, GPA, size of HS)
Apply(sID, cName, major, decision)

<그림 1: Example relations>

Select operator : picks certain rows
(1) Students with GPA>3.7
σ(GPA>3.7)(Student)
(2) Students with GPA>3.7 and HS<1000
σ(GPA>3.7^HS<1000)(Student)
(3) Applications to Stanford CS major
σ(cName='stanford'^major='cs')(Apply)
=> select operator is σ(condition)(Relation name)

Project operator : picks certain columns
(1) ID and decision of all applications
π(sID, dec)(Apply)
=> project operator is π(A1, A2, ..., An)(Relation name)

Select & Project : To pick both rows and column
(1) ID and name of students with GPA>3.7
π(sID, sName)σ(GPA>3.7)(Student)

Duplicate
(1) List of application majors and decision
π(major, dec)(Apply)
- Relation algebra는 중복된 값을 가진 tuple을 자동으로 제거
- SQL에서는 중복된 값을 가진 tuple을 제거하지 않음( 중복을 제거하기 위해선 distinct 키워드를 사용)

- σ(condition)(Expression)
- π(A1, A2, ..., An)(Expression)

Cross-Product : combine two relations(also known as cartesian product)
(1) Names and GPAs of students with HS>1000 who applied to CS and were rejected.
- π(sName, GPA)(σ(Student.sID=Apply.sID ^ HS>1000 ^ major = 'cs' ^ dec = 'R')(Student X Apply))
- Cross product 연산은 두개의 relation의 attribute를 풀칠한다.
- degree = a + b
- Cross product 연산으로 생성되는 relation의 cardinality는 피연산자 relation의 cardinality간 곱과 같다.
- cardinality = n * m

Natural join
- Enforce equality on all attributes with same name
cross-projuct에서 condition으로 동일한 attiribute을 필터링하는 것이 필요없어진다(=natural join에 내장됨)
- Eliminate one copy of duplicate attributes
서로 다른 relation이 서로 동일한 attribute name을 갖는다면 natural join 연산 결과로 생성되는 relation에는 중복된 attribute가 단 1개의 attribute로 표현된다.

(1) Names and GPAs of students with HS>1000 who applied to CS and were rejected.
- π(sName, GPA)(σ(HS>1000 ^ major='cs' ^ dec='R')(Student⋈Apply))
(2) Names and GPAs of students with HS>1000 who applied to CS at college with enr>20,000 and were rejected.
- π(sName, GPA)(σ(HS>1000 ^ major='cs' ^ dec='R' ^ enr>20,000)(Student⋈(College⋈Apply)))

Natural join doesn't provide additional expressive power.
Exp1⋈Exp2 ==
π(schema(Exp1) U schema(Exp2))σ(E1.A1=E2.A1^ E1.A2=E2.A2 ^ ... ^E1.An=E2.An)(Exp1 X Exp2)

Theta Join
- Exp1 θ Exp2 == σ(θ)(Exp1 X Exp2)
- θ is selection operator(=condition)
- Basic operation is implemented in DBMS.
- Tern 'join' often means theta join

conclusion
- Query(expression) on set of relations produces relation as a result
- simplest query : relation name
- use operator to filter, slice, combine.
- Operators so far : select, project, cross-product(=cartesian product), natural join, theta join

Reference
https://www.youtube.com/watch?v=tii7xcFilOA&list=PL6hGtHedy2Z4EkgY76QOcueU8lAC4o6c3&index=9