Principles Of Distributed Database Systems Exercise Solutions

Given:
Relation EMPLOYEE(EID, Name, Dept, Salary), stored across two sites.

Question: Define a horizontal fragmentation schema.

Solution:
Horizontal fragmentation partitions a relation into subsets of tuples based on a predicate.

Fragment 1: EMPLOYEE_Sales = σ_Dept=‘Sales’(EMPLOYEE)
Fragment 2: EMPLOYEE_HR = σ_Dept=‘HR’(EMPLOYEE)
All other tuples (e.g., Dept=‘IT’) could go to a default fragment at a chosen site or be replicated.
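The fragmentation above can be checked mechanically. The following is a minimal sketch, assuming a handful of made-up sample tuples and a hypothetical default fragment for departments other than Sales and HR; it tests that the fragments are complete, disjoint, and reconstruct the relation by union.

```python
# Hypothetical sample tuples: (EID, Name, Dept, Salary). Not from the exercise.
EMPLOYEE = [
    (1, "Ana", "Sales", 5000),
    (2, "Bo", "HR", 4200),
    (3, "Cy", "IT", 6100),
    (4, "Di", "Sales", 4800),
]


def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


# Fragments defined by selection predicates on Dept (index 2).
employee_sales = select(EMPLOYEE, lambda t: t[2] == "Sales")
employee_hr = select(EMPLOYEE, lambda t: t[2] == "HR")
# Assumed default fragment for every remaining department.
employee_other = select(EMPLOYEE, lambda t: t[2] not in ("Sales", "HR"))

fragments = [employee_sales, employee_hr, employee_other]
union = [t for fragment in fragments for t in fragment]

# Completeness: every tuple of EMPLOYEE appears in some fragment.
is_complete = set(EMPLOYEE) <= set(union)
# Disjointness: no tuple appears in more than one fragment.
is_disjoint = len(union) == len(set(union))
# Reconstruction: the union of the fragments gives back EMPLOYEE.
is_reconstructible = set(union) == set(EMPLOYEE)

print(is_complete, is_disjoint, is_reconstructible)
```

Without the default fragment, the IT tuple would belong to no fragment and the completeness check would fail.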

Given:
R = 10,000 tuples, S = 50,000 tuples. Hash function partitions data into 10 buckets. Each site sends its bucket to a single join site. Network cost = 1 per tuple. Local join cost negligible.
Question: Compute total network cost.

Solution:
If the hash partitioning is already aligned with the join site:
Each tuple of R and S is sent exactly once to the join site → cost = 10,000 + 50,000 = 60,000.

If data is initially distributed across sites and must be repartitioned, a better plan is a parallel hash join: the sites exchange only the buckets they need, and each site then joins its own bucket locally. Each tuple still crosses the network once, so the total data volume shipped is the same and 60,000 is correct.
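The arithmetic can be reproduced bucket by bucket. This is a small sketch, assuming the hash function spreads tuples evenly over the 10 buckets (the exercise does not say so); the helper name `bucket_sizes` is illustrative only.

```python
R_TUPLES = 10_000
S_TUPLES = 50_000
NUM_BUCKETS = 10
COST_PER_TUPLE = 1  # network cost given in the exercise


def bucket_sizes(total, num_buckets):
    """Split `total` tuples as evenly as possible (assumes a uniform hash)."""
    base, extra = divmod(total, num_buckets)
    return [base + (1 if i < extra else 0) for i in range(num_buckets)]


r_buckets = bucket_sizes(R_TUPLES, NUM_BUCKETS)
s_buckets = bucket_sizes(S_TUPLES, NUM_BUCKETS)

# Each bucket of R and the matching bucket of S is shipped once.
per_bucket_cost = [(r + s) * COST_PER_TUPLE for r, s in zip(r_buckets, s_buckets)]
total_cost = sum(per_bucket_cost)

print(per_bucket_cost[0], total_cost)  # 6000 60000
```

The total does not depend on the number of buckets or on how evenly they are filled, because every tuple is counted exactly once.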


Given read and write operations from transactions T1, T2, T3 on data items X, Y, Z stored at different sites. Determine if the schedule is conflict-serializable and if the protocol would allow it.
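A standard way to test conflict-serializability is to build a precedence graph and look for a cycle. The sketch below assumes a schedule is written as a list of (transaction, operation, item) triples with "r" for read and "w" for write; the two example schedules are invented for illustration and are not taken from a specific exercise.

```python
def precedence_edges(schedule):
    """Return edges Ti -> Tj for each pair of conflicting operations.

    Two operations conflict when they come from different transactions,
    touch the same item, and at least one of them is a write.
    """
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY or (color[nxt] == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


def is_conflict_serializable(schedule):
    """A schedule is conflict-serializable iff its precedence graph is acyclic."""
    return not has_cycle(precedence_edges(schedule))


# Hypothetical schedules over items X, Y, Z.
acyclic = [("T1", "r", "X"), ("T1", "w", "X"), ("T2", "r", "X"),
           ("T2", "w", "Y"), ("T3", "r", "Y"), ("T3", "w", "Z")]
cyclic = [("T1", "r", "X"), ("T2", "w", "X"),
          ("T2", "r", "Y"), ("T1", "w", "Y")]

print(is_conflict_serializable(acyclic))  # True
print(is_conflict_serializable(cyclic))   # False
```

In the second schedule the conflict on X forces T1 before T2, while the conflict on Y forces T2 before T1, so no equivalent serial order exists.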

Ensuring atomicity (all nodes commit or all nodes abort) is critical.
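The all-or-nothing rule can be illustrated with the decision step of a two-phase commit coordinator. The text does not name a protocol, so two-phase commit is an assumption here, and the function and site names are hypothetical; failures and timeouts are ignored in this sketch.

```python
def two_phase_commit(votes):
    """Decide a global outcome from the participants' votes.

    votes maps a site name to True (ready to commit) or False (must abort).
    Phase 1 is the collection of votes; phase 2 sends the same decision
    to every site, so either all commit or all abort.
    """
    decision = "COMMIT" if all(votes.values()) else "ABORT"
    return {site: decision for site in votes}


# Hypothetical sites: one "no" vote forces every site to abort.
print(two_phase_commit({"site1": True, "site2": True}))
print(two_phase_commit({"site1": True, "site2": False}))
```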

Exercises often present a schedule of operations across sites and ask: Is this schedule serializable under 2PL (Two-Phase Locking) or T/O (Timestamp Ordering)?
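For the timestamp-ordering case, the basic T/O rules can be replayed over a schedule. The sketch below assumes the same (transaction, operation, item) triples as input, positive integer timestamps assigned per transaction, and no Thomas write rule; the schedules and timestamps are invented for illustration.

```python
def basic_timestamp_ordering(schedule, timestamps):
    """Replay a schedule under basic timestamp ordering.

    Returns the first transaction that would be rejected (and restarted),
    or None if every operation is accepted.
    """
    read_ts, write_ts = {}, {}  # largest timestamp that read / wrote each item
    for txn, op, item in schedule:
        ts = timestamps[txn]
        if op == "r":
            # A read is too late if a younger transaction already wrote the item.
            if ts < write_ts.get(item, 0):
                return txn
            read_ts[item] = max(read_ts.get(item, 0), ts)
        else:
            # A write is too late if a younger transaction already read or wrote it.
            if ts < read_ts.get(item, 0) or ts < write_ts.get(item, 0):
                return txn
            write_ts[item] = ts
    return None


ts = {"T1": 1, "T2": 2}  # hypothetical timestamps: T1 is older than T2
accepted = [("T1", "r", "X"), ("T1", "w", "X"), ("T2", "r", "X")]
rejected = [("T1", "r", "X"), ("T2", "w", "X"), ("T1", "w", "X")]

print(basic_timestamp_ordering(accepted, ts))  # None
print(basic_timestamp_ordering(rejected, ts))  # T1
```

In the second schedule T1 tries to write X after the younger T2 has already written it, so T1's write arrives out of timestamp order and is rejected.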

One of the first exercises students encounter involves designing correct and complete fragmentation schemas.