1. Introduction
Deadlock detection allows the operating system to identify when a deadlock has occurred. However, simply detecting a deadlock does not solve the problem.
Once a deadlock is detected, the operating system must take corrective action to break the deadlock and restore system progress.
This process is known as Deadlock Recovery.
Deadlock Recovery consists of the techniques used by the operating system to eliminate deadlocks after they have occurred by releasing resources and allowing blocked processes to continue execution.
The primary objective is:
To break the circular dependency among processes and return the system to a state where progress becomes possible.
Unlike prevention and avoidance, recovery is a reactive strategy because it acts only after a deadlock has already occurred.
2. Concept of Deadlock Recovery
When deadlock is detected:
Deadlock Detected
↓
Recovery Procedure Starts
↓
Resources Released
↓
Processes Resume
↓
System Progress Restored
The operating system must decide:
Which process should be affected?
Which resources should be released?
How much work should be sacrificed?
Recovery therefore becomes an optimization problem where the OS attempts to minimize the cost of breaking the deadlock.
3. Major Recovery Strategies
Two primary recovery approaches are used:
1. Process Termination
2. Resource Preemption
Both methods aim to free resources and eliminate circular waiting, but they differ significantly in cost and complexity.
4. Strategy 1 – Process Termination
Process termination resolves deadlock by forcibly ending one or more processes involved in the deadlock.
When a process is terminated:
Process Ends
↓
Resources Released
↓
Other Processes Proceed
This is the simplest recovery technique.
There are two common approaches.
4.1 Terminate All Deadlocked Processes
Concept
The operating system immediately terminates every process involved in the deadlock.
Example:
Deadlocked Processes:
P1
P2
P3
Action:
Kill P1
Kill P2
Kill P3
All resources are released instantly.
Deadlock disappears immediately.
Advantages
Simple Implementation
No complex decision-making required.
Fast Recovery
Deadlock is removed immediately.
Guaranteed Solution
All circular dependencies are destroyed.
Disadvantages
Maximum Loss of Work
All computation performed by the terminated processes is lost.
Resource Waste
CPU time already consumed becomes wasted.
Possible Data Loss
Partially completed operations may be lost.
Consistency Problems
Databases and files may require recovery procedures.
Key Insight
Fastest Recovery
But
Highest Cost
4.2 Terminate One Process at a Time
Concept
Instead of terminating every process, the OS selects one process as a victim.
Procedure:
Select Victim
↓
Terminate Victim
↓
Run Detection Again
↓
Deadlock Removed ?
If deadlock still exists:
Choose Another Victim
and repeat.
Example
Deadlock involves:
P1
P2
P3
Terminate:
P2
Resources are released.
If deadlock disappears:
Recovery Complete
Otherwise:
Terminate Another Process
Advantages
Lower Loss of Work
Only selected processes are terminated.
Better Resource Utilization
Less wasted computation.
More Controlled Recovery
The OS can minimize damage.
Disadvantages
Multiple Detection Cycles
The OS may need to repeatedly run detection.
Complex Victim Selection
Choosing the wrong victim can increase recovery cost.
4.3 Victim Selection Criteria (Very Important)
Choosing the victim process is one of the most important aspects of deadlock recovery.
The objective is:
Minimize Recovery Cost
The operating system may consider:
Process Priority
Lower-priority processes are preferred victims.
Amount of Work Completed
Processes that have done little work are cheaper to terminate.
Resources Held
Processes holding many resources may be attractive victims.
Resources Still Needed
Processes requiring many additional resources may be terminated first.
User Impact
Critical system processes should generally not be terminated.
Restart Cost
Processes that are easy to restart are preferred victims.
Example
Suppose:
P1 → 95% Complete
P2 → 10% Complete
Terminating:
P2
is usually cheaper.
5. Strategy 2 – Resource Preemption
Instead of terminating processes, the operating system may forcibly take resources away from a process.
This approach is called:
Resource Preemption
Concept
Resources are temporarily reclaimed and reassigned to other processes.
Example:
P1 Holds Resource A
P2 Holds Resource B
Both processes wait for each other's resources.
The OS may:
Take Resource A From P1
and give it to:
P2
allowing P2 to complete.
After P2 finishes:
Resource B Released
Deadlock breaks.
6. Steps in Resource Preemption
A typical preemption-based recovery procedure follows these steps.
Step 1 – Select Victim
Choose the process from which resources will be taken.
Step 2 – Preempt Resources
Remove one or more resources from the victim.
Step 3 – Rollback Process
Return the victim process to a previous safe state.
Step 4 – Restart Later
The process may resume execution later.
Conceptually:
Select Victim
↓
Take Resources
↓
Rollback
↓
Restart Later
7. Example of Resource Preemption
Suppose:
P1 Holds A
Needs B
P2 Holds B
Needs A
Deadlock exists.
The OS decides:
Preempt A From P1
Then:
Give A To P2
P2 completes.
P2 releases:
A and B
P1 then obtains B and continues.
Deadlock disappears without terminating either process.
8. Key Issues in Resource Preemption
Resource preemption introduces several challenges.
8.1 Victim Selection
The OS must determine:
Which Process
Should Lose Resources?
Poor choices can significantly increase recovery cost.
8.2 Rollback
After resources are removed:
Process State May Become Invalid
The process may need to restart from an earlier checkpoint.
This operation is known as:
Rollback
8.3 Starvation
A process repeatedly chosen as victim may never finish.
Example:
P1 Chosen Again
P1 Chosen Again
P1 Chosen Again
Result:
Starvation
9. Rollback Mechanism
Rollback is essential in many preemption systems.
Checkpointing
The OS periodically saves process state.
Example:
Checkpoint 1
Checkpoint 2
Checkpoint 3
If recovery becomes necessary:
Restore Checkpoint
instead of restarting from the beginning.
Advantages
Less Work Lost
Only recent computation is discarded.
Faster Recovery
Processes resume quickly.
Disadvantages
Storage Overhead
Checkpoints consume memory.
Additional Complexity
Checkpoint management is expensive.
10. Avoiding Starvation During Recovery
The OS must ensure fairness.
Several techniques are used.
Aging
Processes repeatedly selected as victims receive increasing protection.
Recovery Counters
Track:
Number Of Rollbacks
for each process.
Victim Rotation
Choose different victims over time.
These mechanisms ensure:
Every Process
Eventually Completes
11. Comparison of Recovery Methods
| Feature | Process Termination | Resource Preemption |
|---|---|---|
| Complexity | Low | High |
| Implementation | Simple | Difficult |
| Data Loss | High | Low |
| Flexibility | Low | High |
| Overhead | Low | High |
| Recovery Speed | Fast | Moderate |
| Rollback Required | No | Usually Yes |
| Starvation Risk | Lower | Higher |
12. Key Insight
Deadlock recovery is fundamentally a trade-off:
System Progress
vs
Cost Of Lost Work
Process termination sacrifices work for simplicity.
Resource preemption preserves work but increases complexity.
There is no universally optimal solution.
The best approach depends on system requirements.
13. Real-World Analogy
Consider a major traffic jam.
Option 1
Remove some vehicles completely.
Terminate Processes
Traffic clears immediately.
Option 2
Redirect vehicles onto alternate roads.
Preempt Resources
More complex but less disruptive.
Deadlock recovery uses the same ideas.
14. When is Recovery Used?
Recovery is typically used in systems that employ:
Deadlock Detection
rather than prevention or avoidance.
Common scenarios:
Large database systems
Transaction processing systems
Distributed systems
High-performance servers
where:
Deadlocks Are Rare
and flexibility is important.
15. Practical Challenges
Real operating systems face several difficulties.
Safe Rollback
Processes must be restored correctly.
Data Consistency
Files and databases must remain valid.
Victim Optimization
Selecting the lowest-cost victim is difficult.
Scalability
Large systems may contain thousands of processes.
User Impact
Critical applications must be protected.
These challenges make recovery one of the most complex aspects of deadlock management.
16. Comparison of All Deadlock Handling Approaches
| Method | Strategy | Deadlock Allowed? | Overhead | Flexibility |
|---|---|---|---|---|
| Prevention | Break conditions | No | Low | Low |
| Avoidance | Stay safe | No | High | Medium |
| Detection | Find deadlock | Yes | Medium | High |
| Recovery | Remove deadlock | Yes | High | High |