1. Introduction

Deadlock detection allows the operating system to identify when a deadlock has occurred. However, simply detecting a deadlock does not solve the problem.

Once a deadlock is detected, the operating system must take corrective action to break the deadlock and restore system progress.

This process is known as Deadlock Recovery.

Deadlock Recovery consists of the techniques used by the operating system to eliminate deadlocks after they have occurred by releasing resources and allowing blocked processes to continue execution.

The primary objective is:

To break the circular dependency among processes and return the system to a state where progress becomes possible.

Unlike prevention and avoidance, recovery is a reactive strategy because it acts only after a deadlock has already occurred.

2. Concept of Deadlock Recovery

When deadlock is detected:

Deadlock Detected
        ↓
Recovery Procedure Starts
        ↓
Resources Released
        ↓
Processes Resume
        ↓
System Progress Restored

The operating system must decide:

  • Which process should be affected?

  • Which resources should be released?

  • How much work should be sacrificed?

Recovery therefore becomes an optimization problem where the OS attempts to minimize the cost of breaking the deadlock.

3. Major Recovery Strategies

Two primary recovery approaches are used:

1. Process Termination
2. Resource Preemption

Both methods aim to free resources and eliminate circular waiting, but they differ significantly in cost and complexity.

4. Strategy 1 – Process Termination

Process termination resolves deadlock by forcibly ending one or more processes involved in the deadlock.

When a process is terminated:

Process Ends
        ↓
Resources Released
        ↓
Other Processes Proceed

This is the simplest recovery technique.

There are two common approaches.

4.1 Terminate All Deadlocked Processes

Concept

The operating system immediately terminates every process involved in the deadlock.

Example:

Deadlocked Processes:
P1
P2
P3

Action:

Kill P1
Kill P2
Kill P3

All resources are released instantly.

Deadlock disappears immediately.

Advantages

Simple Implementation

No complex decision-making required.

Fast Recovery

Deadlock is removed immediately.

Guaranteed Solution

All circular dependencies are destroyed.

Disadvantages

Maximum Loss of Work

All computation performed by the terminated processes is lost.

Resource Waste

CPU time already consumed becomes wasted.

Possible Data Loss

Partially completed operations may be lost.

Consistency Problems

Databases and files may require recovery procedures.

Key Insight

Fastest Recovery
But
Highest Cost

4.2 Terminate One Process at a Time

Concept

Instead of terminating every process, the OS selects one process as a victim.

Procedure:

Select Victim
      ↓
Terminate Victim
      ↓
Run Detection Again
      ↓
Deadlock Removed ?

If deadlock still exists:

Choose Another Victim

and repeat.

Example

Deadlock involves:

P1
P2
P3

Terminate:

P2

Resources are released.

If deadlock disappears:

Recovery Complete

Otherwise:

Terminate Another Process

Advantages

Lower Loss of Work

Only selected processes are terminated.

Better Resource Utilization

Less wasted computation.

More Controlled Recovery

The OS can minimize damage.

Disadvantages

Multiple Detection Cycles

The OS may need to repeatedly run detection.

Complex Victim Selection

Choosing the wrong victim can increase recovery cost.

4.3 Victim Selection Criteria (Very Important)

Choosing the victim process is one of the most important aspects of deadlock recovery.

The objective is:

Minimize Recovery Cost

The operating system may consider:

Process Priority

Lower-priority processes are preferred victims.

Amount of Work Completed

Processes that have done little work are cheaper to terminate.

Resources Held

Processes holding many resources may be attractive victims.

Resources Still Needed

Processes requiring many additional resources may be terminated first.

User Impact

Critical system processes should generally not be terminated.

Restart Cost

Processes that are easy to restart are preferred victims.

Example

Suppose:

P1 → 95% Complete
P2 → 10% Complete

Terminating:

P2

is usually cheaper.

5. Strategy 2 – Resource Preemption

Instead of terminating processes, the operating system may forcibly take resources away from a process.

This approach is called:

Resource Preemption

Concept

Resources are temporarily reclaimed and reassigned to other processes.

Example:

P1 Holds Resource A
P2 Holds Resource B

Both processes wait for each other's resources.

The OS may:

Take Resource A From P1

and give it to:

P2

allowing P2 to complete.

After P2 finishes:

Resource B Released

Deadlock breaks.

6. Steps in Resource Preemption

A typical preemption-based recovery procedure follows these steps.

Step 1 – Select Victim

Choose the process from which resources will be taken.

Step 2 – Preempt Resources

Remove one or more resources from the victim.

Step 3 – Rollback Process

Return the victim process to a previous safe state.

Step 4 – Restart Later

The process may resume execution later.

Conceptually:

Select Victim
      ↓
Take Resources
      ↓
Rollback
      ↓
Restart Later

7. Example of Resource Preemption

Suppose:

P1 Holds A
Needs B
P2 Holds B
Needs A

Deadlock exists.

The OS decides:

Preempt A From P1

Then:

Give A To P2

P2 completes.

P2 releases:

A and B

P1 then obtains B and continues.

Deadlock disappears without terminating either process.

8. Key Issues in Resource Preemption

Resource preemption introduces several challenges.

8.1 Victim Selection

The OS must determine:

Which Process
Should Lose Resources?

Poor choices can significantly increase recovery cost.

8.2 Rollback

After resources are removed:

Process State May Become Invalid

The process may need to restart from an earlier checkpoint.

This operation is known as:

Rollback

8.3 Starvation

A process repeatedly chosen as victim may never finish.

Example:

P1 Chosen Again
P1 Chosen Again
P1 Chosen Again

Result:

Starvation

9. Rollback Mechanism

Rollback is essential in many preemption systems.

Checkpointing

The OS periodically saves process state.

Example:

Checkpoint 1
Checkpoint 2
Checkpoint 3

If recovery becomes necessary:

Restore Checkpoint

instead of restarting from the beginning.

Advantages

Less Work Lost

Only recent computation is discarded.

Faster Recovery

Processes resume quickly.

Disadvantages

Storage Overhead

Checkpoints consume memory.

Additional Complexity

Checkpoint management is expensive.

10. Avoiding Starvation During Recovery

The OS must ensure fairness.

Several techniques are used.

Aging

Processes repeatedly selected as victims receive increasing protection.

Recovery Counters

Track:

Number Of Rollbacks

for each process.

Victim Rotation

Choose different victims over time.

These mechanisms ensure:

Every Process
Eventually Completes

11. Comparison of Recovery Methods

FeatureProcess TerminationResource Preemption
ComplexityLowHigh
ImplementationSimpleDifficult
Data LossHighLow
FlexibilityLowHigh
OverheadLowHigh
Recovery SpeedFastModerate
Rollback RequiredNoUsually Yes
Starvation RiskLowerHigher

12. Key Insight

Deadlock recovery is fundamentally a trade-off:

System Progress
        vs
Cost Of Lost Work

Process termination sacrifices work for simplicity.

Resource preemption preserves work but increases complexity.

There is no universally optimal solution.

The best approach depends on system requirements.

13. Real-World Analogy

Consider a major traffic jam.

Option 1

Remove some vehicles completely.

Terminate Processes

Traffic clears immediately.

Option 2

Redirect vehicles onto alternate roads.

Preempt Resources

More complex but less disruptive.

Deadlock recovery uses the same ideas.

14. When is Recovery Used?

Recovery is typically used in systems that employ:

Deadlock Detection

rather than prevention or avoidance.

Common scenarios:

  • Large database systems

  • Transaction processing systems

  • Distributed systems

  • High-performance servers

where:

Deadlocks Are Rare

and flexibility is important.

15. Practical Challenges

Real operating systems face several difficulties.

Safe Rollback

Processes must be restored correctly.

Data Consistency

Files and databases must remain valid.

Victim Optimization

Selecting the lowest-cost victim is difficult.

Scalability

Large systems may contain thousands of processes.

User Impact

Critical applications must be protected.

These challenges make recovery one of the most complex aspects of deadlock management.

16. Comparison of All Deadlock Handling Approaches

MethodStrategyDeadlock Allowed?OverheadFlexibility
PreventionBreak conditionsNoLowLow
AvoidanceStay safeNoHighMedium
DetectionFind deadlockYesMediumHigh
RecoveryRemove deadlockYesHighHigh