Given:
Relation EMPLOYEE(EID, Name, Dept, Salary) with two sites:
Question: Define a horizontal fragmentation schema.
Solution:
Horizontal fragmentation partitions a relation into subsets of tuples based on a predicate.
Fragment 1: EMPLOYEE_Sales = σ_Dept=‘Sales’(EMPLOYEE)
Fragment 2: EMPLOYEE_HR = σ_Dept=‘HR’(EMPLOYEE)
To keep the fragmentation complete, all other tuples (e.g., Dept=‘IT’) must also be placed somewhere: in a default fragment at a chosen site, or replicated at both sites.
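The correctness of such a schema can be checked mechanically. The following is a minimal sketch, assuming a handful of made-up sample tuples and a hypothetical default fragment named EMPLOYEE_Other for departments other than Sales and HR; neither comes from the exercise itself. It tests completeness, disjointness, and reconstruction by union.

```python
# Hypothetical sample data; these tuples are illustrative, not from the exercise.
EMPLOYEE = [
    (1, "Ana", "Sales", 5000),
    (2, "Ben", "HR", 4200),
    (3, "Cho", "IT", 6100),
    (4, "Dee", "Sales", 4800),
]

DEPT = 2  # position of Dept in (EID, Name, Dept, Salary)


def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


fragments = {
    "EMPLOYEE_Sales": select(EMPLOYEE, lambda t: t[DEPT] == "Sales"),
    "EMPLOYEE_HR": select(EMPLOYEE, lambda t: t[DEPT] == "HR"),
    # Assumed default fragment that catches every other department.
    "EMPLOYEE_Other": select(EMPLOYEE, lambda t: t[DEPT] not in ("Sales", "HR")),
}

all_fragment_tuples = [t for frag in fragments.values() for t in frag]

# Completeness: every tuple of EMPLOYEE appears in some fragment.
complete = set(all_fragment_tuples) == set(EMPLOYEE)
# Disjointness: no tuple appears in more than one fragment.
disjoint = len(all_fragment_tuples) == len(set(all_fragment_tuples))
# Reconstruction: the union of the fragments gives back the relation.
reconstructed = sorted(all_fragment_tuples) == sorted(EMPLOYEE)

print(complete, disjoint, reconstructed)
```

Dropping the EMPLOYEE_Other entry makes the completeness check fail for the IT tuple, which is the situation the default fragment is meant to prevent.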
Given:
R = 10,000 tuples, S = 50,000 tuples. Hash function partitions data into 10 buckets. Each site sends its bucket to a single join site. Network cost = 1 per tuple. Local join cost negligible.
Question: Compute total network cost.
Solution:
If the hash partitioning is already aligned with the join site:
Each tuple of R and S is sent exactly once to the join site → cost = 10,000 + 50,000 = 60,000.
If data is initially distributed across sites and must be repartitioned:
A better approach is a parallel hash join: the sites exchange only the buckets each one needs, and each site then joins its own bucket locally. The total data volume shipped is the same, so the answer remains 60,000.
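The arithmetic can be reproduced with a short sketch. The per-bucket breakdown assumes the hash function spreads tuples uniformly over the 10 buckets, which the exercise does not state; the total does not depend on that assumption.

```python
R_TUPLES = 10_000
S_TUPLES = 50_000
BUCKETS = 10
COST_PER_TUPLE = 1  # network cost per shipped tuple, as given


def ship_cost(tuples, cost_per_tuple=COST_PER_TUPLE):
    """Network cost of sending `tuples` tuples once."""
    return tuples * cost_per_tuple


# Every tuple of R and S is shipped exactly once.
total_cost = ship_cost(R_TUPLES) + ship_cost(S_TUPLES)

# Assumption: uniform hashing, so each bucket holds 1/BUCKETS of each relation.
per_bucket_cost = [
    ship_cost(R_TUPLES // BUCKETS) + ship_cost(S_TUPLES // BUCKETS)
    for _ in range(BUCKETS)
]

print(total_cost)                                # 60000
print(per_bucket_cost[0], sum(per_bucket_cost))  # 6000 60000
```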
Given read and write operations from transactions T1, T2, T3 on data items X, Y, Z stored at different sites. Determine if the schedule is conflict-serializable and if the protocol would allow it.
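Conflict-serializability is usually tested with a precedence graph: add an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj on the same item, then check for cycles. The sketch below assumes a schedule is a list of (transaction, operation, item) triples; the two example schedules are hypothetical and not taken from any particular exercise.

```python
from itertools import combinations


def precedence_edges(schedule):
    """Edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    # combinations() preserves list order, so the first op always comes earlier.
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "W" in (op1, op2):
            edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def dfs(node):
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: a cycle exists
        visiting.add(node)
        if any(dfs(nxt) for nxt in graph[node]):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(dfs(node) for node in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


# Hypothetical schedules over items X and Y.
acyclic = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"),
           ("T2", "W", "Y"), ("T3", "R", "Y")]
cyclic = [("T1", "R", "X"), ("T2", "W", "X"),
          ("T2", "R", "Y"), ("T1", "W", "Y")]

print(is_conflict_serializable(acyclic))  # True
print(is_conflict_serializable(cyclic))   # False
```

In the second schedule, T1 reads X before T2 writes it (T1 → T2) and T2 reads Y before T1 writes it (T2 → T1), so the graph has a cycle.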
Ensuring atomicity (all nodes commit or all nodes abort) is critical.
Exercises often present a schedule of operations across sites and ask: Is this schedule serializable under 2PL (Two-Phase Locking) or T/O (Timestamp Ordering)?
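For the T/O half of that question, a schedule can be replayed against the basic timestamp-ordering rules: a read is rejected if a younger transaction has already written the item, and a write is rejected if a younger transaction has already read or written it. The sketch below is a simplified illustration, assuming positive integer timestamps and (transaction, operation, item) triples; the example schedules and timestamps are hypothetical. It ignores refinements such as the Thomas write rule and restarts.

```python
def basic_to_accepts(schedule, timestamps):
    """Replay a schedule under basic timestamp ordering.

    Returns (True, None) if every operation is accepted, otherwise
    (False, op) where op is the first rejected operation.
    """
    read_ts = {}   # largest timestamp that has read each item
    write_ts = {}  # timestamp of the last accepted write on each item
    for txn, op, item in schedule:
        ts = timestamps[txn]
        if op == "R":
            if ts < write_ts.get(item, 0):
                return False, (txn, op, item)
            read_ts[item] = max(read_ts.get(item, 0), ts)
        else:  # "W"
            if ts < read_ts.get(item, 0) or ts < write_ts.get(item, 0):
                return False, (txn, op, item)
            write_ts[item] = ts
    return True, None


# Hypothetical timestamps: T1 is the oldest transaction.
timestamps = {"T1": 1, "T2": 2, "T3": 3}

in_order = [("T1", "R", "X"), ("T2", "W", "X"), ("T3", "R", "X")]
late_read = [("T2", "W", "X"), ("T1", "R", "X")]

print(basic_to_accepts(in_order, timestamps))   # (True, None)
print(basic_to_accepts(late_read, timestamps))  # (False, ('T1', 'R', 'X'))
```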
One of the first exercises students encounter involves designing correct and complete fragmentation schemas.