In a distributed DBMS, a single transaction may touch data at multiple sites. To ensure that such a distributed transaction is atomic (all of its effects are applied, or none are), the system often uses the Two‑Phase Commit Protocol (2PC).
2PC is a distributed algorithm that coordinates all participating sites so that either every site commits the transaction or every site aborts it. This holds even if some sites crash or messages are lost, provided that failed sites eventually recover and can read their logs.
What Is the Two‑Phase Commit Protocol?
The Two‑Phase Commit Protocol (2PC) is a synchronous, blocking protocol used to decide whether a distributed transaction should commit or abort.
It involves:
A coordinator site that starts and controls the protocol.
Participant sites that execute the local part of the transaction and answer the coordinator’s requests.
The protocol is called “two‑phase” because it has two distinct stages: the prepare phase and the commit/abort phase.
Phase 1: Prepare Phase
In the prepare phase, the coordinator asks every participant: “Can you commit this transaction?”
Steps:
The coordinator sends a prepare message to all participants.
Each participant:
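Checks whether it can commit its local part of the transaction.
If it can, force‑writes the transaction’s updates and a prepared (ready) record to its log on stable storage.
Keeps holding its locks; after voting “Yes”, it may release them only once the coordinator’s final decision arrives.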
Replies:
“Yes” (ready to commit), or
“No” (vote to abort).
If any participant votes “No”, or fails to reply within the coordinator’s timeout, the coordinator decides to abort the global transaction.
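The prepare phase can be sketched in a few lines of Python. This is a minimal in‑memory illustration, not a real implementation: the Participant class, its can_commit flag (standing in for whatever local checks a site performs), and the collect_votes helper are all hypothetical names, and network messages are replaced by direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical participant site; all names here are illustrative."""
    name: str
    can_commit: bool = True   # stands in for local checks (constraints, space, ...)
    log: list = field(default_factory=list)
    holds_locks: bool = True  # locks acquired while executing the transaction

    def on_prepare(self, txn_id: str) -> str:
        if self.can_commit:
            # Record the prepared state before voting, so the vote survives a crash.
            self.log.append(("PREPARED", txn_id))
            return "YES"      # locks stay held until the final decision arrives
        # A "No" voter already knows the outcome and can abort locally.
        self.log.append(("ABORT", txn_id))
        self.holds_locks = False
        return "NO"


def collect_votes(participants, txn_id):
    """Phase 1 at the coordinator: send prepare to everyone and gather the votes."""
    votes = {p.name: p.on_prepare(txn_id) for p in participants}
    decision = "COMMIT" if all(v == "YES" for v in votes.values()) else "ABORT"
    return votes, decision


sites = [Participant("A"), Participant("B", can_commit=False)]
votes, decision = collect_votes(sites, "T1")
print(votes, decision)  # {'A': 'YES', 'B': 'NO'} ABORT
```

Note that site A, having voted “Yes”, still holds its locks: it has given up the right to decide on its own and must wait for the coordinator.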
Phase 2: Commit or Abort Phase
In the second phase, the coordinator records its decision in its own log and then broadcasts that decision to all participants.
If all participants voted “Yes”:
The coordinator sends a commit message to all sites.
Each site writes a commit record to its log, makes its changes permanent, releases its locks, and acknowledges the coordinator.
If any participant voted “No” (or did not respond):
The coordinator sends an abort message to all sites.
Each site undoes the transaction using its log, releases its locks, and acknowledges the coordinator.
Once every site has received and applied the decision, the distributed transaction is either fully committed at all sites or fully aborted at all sites.
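The decision rule and the broadcast can be sketched as follows. This is a simplified, single‑process illustration under stated assumptions: decide and run_phase_two are hypothetical helper names, a vote of None stands for a participant that did not reply in time, and message delivery is modelled as appending directly to each site’s log.

```python
def decide(votes):
    """Commit only if every participant explicitly voted YES."""
    return "COMMIT" if votes and all(v == "YES" for v in votes.values()) else "ABORT"


def run_phase_two(votes, site_logs, txn_id):
    """Phase 2 at the coordinator (simplified: delivery never fails here).

    votes:     site name -> "YES", "NO", or None (no reply before the timeout)
    site_logs: site name -> list of log records for that site
    """
    decision = decide(votes)
    # The coordinator logs the decision first, so it can re-send it after a crash.
    coordinator_log = [(decision, txn_id)]
    acks = []
    for site, log in site_logs.items():
        # Each participant logs the decision, then applies or undoes its changes.
        log.append((decision, txn_id))
        acks.append(site)
    return decision, coordinator_log, acks


votes = {"A": "YES", "B": "YES", "C": None}   # C never answered
site_logs = {"A": [], "B": [], "C": []}
decision, coordinator_log, acks = run_phase_two(votes, site_logs, "T1")
print(decision)  # ABORT
```

A missing vote is treated exactly like a “No”: the coordinator may commit only when it holds an explicit “Yes” from every participant.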
Why Two‑Phase Commit Is Important
Atomicity in distributed transactions:
Ensures that the transaction commits at all sites or at none.
Consistency across sites:
Prevents partial updates where some sites see changes and others do not.
Recovery support:
Logs at each site allow the system to recover correctly after failures.
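As an illustration of the recovery point, the sketch below shows how a restarting participant might inspect its log to decide what to do with a transaction. The record names (PREPARED, COMMIT, ABORT) and the recover function are assumptions made for this example; real systems use their own log formats.

```python
def recover(log, txn_id):
    """Return the action a restarting participant takes for one transaction.

    log is a list of (record_kind, txn_id) tuples read from stable storage.
    """
    kinds = {kind for kind, t in log if t == txn_id}
    if "COMMIT" in kinds:
        return "redo"             # decision known: make sure the updates are applied
    if "ABORT" in kinds:
        return "undo"             # decision known: roll the updates back
    if "PREPARED" in kinds:
        return "ask coordinator"  # voted Yes but never saw the outcome (in doubt)
    return "undo"                 # never voted, so it is safe to abort unilaterally


print(recover([("PREPARED", "T1"), ("COMMIT", "T1")], "T1"))  # redo
print(recover([("PREPARED", "T2")], "T2"))                    # ask coordinator
print(recover([], "T3"))                                      # undo
```

The third case is the important one: a site that logged a prepared record but no decision cannot choose by itself and must learn the outcome from the coordinator.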
However, 2PC has some drawbacks:
It is blocking: if the coordinator fails after participants vote “Yes” but before the second phase completes, those participants cannot decide on their own and may wait indefinitely, still holding their locks, until the coordinator recovers.
It requires stable storage and logging at every site.
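The blocking behaviour can be made concrete with a small sketch. Here a prepared participant waits on a message queue for the coordinator’s decision; the await_decision function and the IN_DOUBT marker are hypothetical names, and Python’s standard queue.Queue stands in for the network.

```python
import queue


def await_decision(inbox: queue.Queue, timeout_s: float) -> str:
    """Wait for the coordinator's decision after having voted YES."""
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        # The participant may NOT pick commit or abort by itself here:
        # other sites might already have received either decision.
        return "IN_DOUBT"


inbox = queue.Queue()
print(await_decision(inbox, 0.05))  # IN_DOUBT (coordinator is silent)

inbox.put("COMMIT")                 # the coordinator recovers and re-sends
print(await_decision(inbox, 0.05))  # COMMIT
```

A timeout does not resolve anything for a prepared participant: it can only keep its locks and retry until the decision arrives.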
For beginners, think of 2PC as a group decision‑making process: first, every member votes “yes” or “no”; then, the leader announces the final decision, and everyone must obey that decision consistently.
Summary
The Two‑Phase Commit Protocol (2PC) is a coordination protocol used in distributed DBMS to ensure that a distributed transaction either commits at all participating sites or aborts at all. It consists of a prepare phase, where participants vote, and a commit/abort phase, where the coordinator broadcasts the final decision. 2PC is essential for maintaining atomicity and consistency in distributed transactions, though it requires careful handling of failures and logging.