In a distributed DBMS, tables are often split into smaller pieces and stored at different sites. This splitting is called fragmentation, and it helps balance performance, storage, and availability.
The key idea is that while the logical table looks whole to users, the physical data may be divided and located on several machines.
What Is Fragmentation?
Fragmentation means breaking a table (relation) into disjoint fragments such that:
Every fragment is a subset of the original table (rows or columns).
All fragments together contain all the data of the original table.
The DBMS can reassemble the table for queries when needed.
For example, a CUSTOMER table might be fragmented by:
Region (e.g., North, South, East, West).
City (e.g., Mumbai‑only rows, Delhi‑only rows).
Or by sensitive vs. non‑sensitive columns.
Types of Fragmentation
1. Horizontal Fragmentation
Horizontal fragmentation splits a table by rows:
Each fragment contains a subset of the rows of the table, but all columns.
The split is usually based on a condition such as
WHERE region = 'North'.
Example:
CUSTOMER_NORTH→ rows whereregion = 'North'CUSTOMER_SOUTH→ rows whereregion = 'South'
Advantages:
Natural for location‑based or range‑based queries.
Easy to distribute data to sites that serve those regions.
Disadvantage:
Queries that need all customers must access multiple fragments, which increases network cost if sites are far apart.
2. Vertical Fragmentation
Vertical fragmentation splits a table by columns:
Each fragment contains all rows but only a subset of the columns.
The primary key is usually included in at least one fragment to allow joining or reassembly.
Example:
CUSTOMER_BASIC→cust_id, name, phoneCUSTOMER_FINANCE→cust_id, credit_limit, balance
Advantages:
Helps security (sensitive columns stored separately).
Improves local query performance when only a few columns are needed.
Disadvantage:
Any query needing many columns must join multiple fragments, which can be slow.
3. Mixed (Hybrid) Fragmentation
Mixed fragmentation combines horizontal and vertical fragmentation:
First, split by rows (horizontal), then some of those fragments are further split by columns (vertical), or vice versa.
Example:
First, split
CUSTOMERby region:CUSTOMER_NORTH,CUSTOMER_SOUTH.Then, split
CUSTOMER_NORTHvertically intoCUSTOMER_NORTH_BASICandCUSTOMER_NORTH_FINANCE.
This gives very fine‑grained control over where data is stored and how it is accessed.
Why Fragmentation Matters in Distributed DBMS
Performance:
Frequently accessed data can be placed close to the users who need it.
Availability and locality:
If one site manages only a fragment, failures at other sites affect only part of the data.
Security and privacy:
Sensitive columns can be kept at a secure, restricted site.
Storage and load balancing:
Large tables can be spread across sites to avoid overloading one machine.
For beginners, think of fragmentation as cutting a big table into smaller pieces and then placing those pieces on different machines. The DBMS hides this splitting so that queries still look as if they are running on the original, whole table.
Summary
Fragmentation in distributed DBMS is the technique of splitting a logical table into smaller fragments (horizontal, vertical, or mixed) and storing them on different sites. Horizontal fragmentation divides by rows, vertical fragmentation by columns, and mixed fragmentation uses both. Fragmentation helps improve performance, locality, security, and load balancing, but it also introduces complexity in query processing and joins, which the distributed DBMS must handle transparently.