1. Why Traditional File Systems Became a Bottleneck
To understand LFS, we first need to understand a fundamental limitation of traditional file systems.
Most conventional file systems, including journaling file systems, update data in place.
Consider modifying a file:
Old Data Block → Overwritten
Old Metadata → Updated
Directory Entry → Updated
This causes multiple writes to different disk locations.
What Happens Internally?
Suppose you edit a file.
The operating system may need to update:
Data block
inode
Directory entry
Free space bitmap
Journal (if journaling is enabled)
These updates are often scattered across the disk.
Write #1 → Block 120
Write #2 → Block 8400
Write #3 → Block 35
Write #4 → Block 7001
Why Is This a Problem?
Traditional hard disk drives (HDDs) perform poorly when writes are:
Small
Frequent
Randomly scattered
Each random write may require:
Seek Time + Rotational Delay + Transfer Time
Result
Even a small modification can trigger expensive disk operations.
Key Insight
Traditional file systems optimize storage organization but often generate inefficient write patterns.
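To make the cost of scattered writes concrete, here is a minimal sketch of the seek + rotational delay + transfer model described above. The timing constants and the function names are illustrative assumptions, not measurements of any particular drive.

```python
# Illustrative HDD timing model; all numbers are assumed, not measured.
AVG_SEEK_MS = 8.0            # assumed average seek time
AVG_ROTATION_MS = 4.0        # assumed average rotational delay
TRANSFER_MB_PER_S = 100.0    # assumed sustained transfer rate


def transfer_ms(num_bytes: int) -> float:
    """Time to move num_bytes once the head is already in position."""
    return num_bytes / (TRANSFER_MB_PER_S * 1_000_000) * 1000


def random_write_ms(num_writes: int, bytes_per_write: int) -> float:
    """Every scattered write pays seek + rotational delay + transfer."""
    per_write = AVG_SEEK_MS + AVG_ROTATION_MS + transfer_ms(bytes_per_write)
    return num_writes * per_write


def sequential_write_ms(num_writes: int, bytes_per_write: int) -> float:
    """One seek and one rotational delay, then a single long transfer."""
    total_bytes = num_writes * bytes_per_write
    return AVG_SEEK_MS + AVG_ROTATION_MS + transfer_ms(total_bytes)


if __name__ == "__main__":
    scattered = random_write_ms(4, 4096)
    contiguous = sequential_write_ms(4, 4096)
    print(f"4 scattered 4 KiB writes: {scattered:.2f} ms")
    print(f"same data, one sequential write: {contiguous:.2f} ms")
```

With these assumed numbers, the four scattered writes take roughly four times as long as the single sequential write, because positioning time dominates transfer time for small blocks.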
2. What is a Log-Structured File System (LFS)?
A Log-Structured File System (LFS) is a file system design in which the entire disk is treated as a continuous append-only log.
Instead of updating existing blocks, all new data and metadata are written sequentially at the end of the log.
Definition
A Log-Structured File System stores all modifications by appending them sequentially to a log rather than overwriting existing disk blocks.
Key Idea
Old Data Remains
New Data Appended
Traditional File System
Block 50
Old Data
↓
Overwrite
↓
New Data
Log-Structured File System
Block 50 → Old Data
Block 900 → New Data
Important Observation
Existing blocks are not overwritten in place.
Every update is appended.
Core Insight
The disk behaves like a giant log where every update creates a new version rather than modifying the old one.
3. The Fundamental Philosophy of LFS
Traditional file systems ask:
Where should I update this data?
LFS asks:
Where is the end of the log?
Every write operation follows the same principle:
Append
Append
Append
Append
Why This Is Powerful
On hard disks, sequential writes are far faster than random writes because the head does not have to seek between them.
Sequential Write
↓
Minimal Disk Movement
↓
Higher Throughput
Key Insight
LFS accepts extra bookkeeping (indirection to find data, and later cleaning) in exchange for very efficient writes.
4. How LFS Works (Step-by-Step)
Consider a file:
report.txt
Currently stored as:
Block 100 → Data Version 1
Now the user modifies the file.
Step 1: File Modification Occurs
Application performs:
write(fd, buffer, size);
The operating system receives new data.
Step 2: New Data is Written to End of Log
Instead of overwriting block 100:
Block 100 → Old Version
Block 850 → New Version
Important Observation
Old block remains untouched.
Step 3: New Metadata is Generated
The inode must now point to:
Block 850
Instead of:
Block 100
The updated inode is also appended.
Block 851 → New inode
Step 4: Mapping Structures Updated
The file system updates the inode map, the structure that records:
Latest inode location
These updates are also appended.
Block 852 → Updated Inode Map
Final Result
Old Data
↓
Still Exists
New Data
↓
Stored at End of Log
Key Insight
A modification creates new versions rather than altering old versions.
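The four steps above can be sketched as a toy in-memory log. This is a simplified illustration, not how any real LFS lays out blocks: the class `MiniLog`, its method names, and the use of list indices as block addresses are all assumptions made for the example.

```python
class MiniLog:
    """Toy append-only log; block addresses are simply list indices."""

    def __init__(self):
        self.blocks = []   # the log: entries are appended, never modified
        self.imap = {}     # inode number -> address of the newest inode block

    def _append(self, kind, payload):
        self.blocks.append((kind, payload))
        return len(self.blocks) - 1   # address of the block just written

    def write_file(self, inode_no, data):
        # Step 2: new data goes to the end of the log.
        data_addr = self._append("data", data)
        # Step 3: a new inode pointing at the new data is appended.
        inode = {"inode_no": inode_no, "data_addr": data_addr}
        inode_addr = self._append("inode", inode)
        # Step 4: the mapping is updated and a copy of it is appended too.
        self.imap[inode_no] = inode_addr
        self._append("imap", dict(self.imap))
        return data_addr
```

Writing the same file twice, for example `log.write_file(7, b"Data Version 1")` followed by `log.write_file(7, b"Data Version 2")`, leaves the first data block in place at its original address while `log.imap[7]` points at the newer inode.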
5. Visualization of LFS
Traditional File System
Disk
[Data A]
↓
Overwrite
↓
[Updated Data A]
LFS
Disk Log
[Data A]
[Data B]
[Data C]
[Updated Data A]
Sequential Append
Beginning ---------------- End
[Data] [Metadata] [inode] [Directory Update] [New Data] [New Metadata]
Log Growth
Write 1 → Append
Write 2 → Append
Write 3 → Append
Write 4 → Append
Important Observation
All writes occur at the end of the log.
6. Major Components of LFS
Several specialized structures are required to make this design practical.
6.1 The Log
The entire disk is treated as a giant append-only log.
Contents
The log stores:
Data blocks
inodes
Directory updates
Metadata
Allocation information
Structure
Log
├── Data
├── Metadata
├── inode
├── Data
├── Metadata
└── Data
Key Insight
Everything is written sequentially.
6.2 Segments
Writing one block at a time would be inefficient.
Therefore, LFS buffers writes in memory and groups them into large chunks called segments.
Definition
A segment is a large contiguous region of disk used as the basic unit of writing.
Example
Segment 1
Segment 2
Segment 3
Segment 4
Benefit
Large sequential writes improve disk throughput.
Key Insight
LFS writes segments rather than individual blocks.
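A minimal sketch of segment buffering follows. The segment size of four blocks and the names `SegmentWriter`, `write_block`, and `flush` are assumptions chosen to keep the example small; real segments are far larger.

```python
SEGMENT_BLOCKS = 4   # assumed segment size in blocks, for illustration only


class SegmentWriter:
    """Buffers blocks in memory and writes them out one segment at a time."""

    def __init__(self):
        self.buffer = []   # blocks waiting in memory
        self.disk = []     # segments already written, in log order

    def write_block(self, block):
        self.buffer.append(block)
        if len(self.buffer) == SEGMENT_BLOCKS:
            self.flush()

    def flush(self):
        """Write the buffered blocks as one contiguous segment."""
        if self.buffer:
            self.disk.append(tuple(self.buffer))
            self.buffer = []
```

Ten calls to `write_block` produce two full segments on disk and leave two blocks in the buffer until the next `flush`. Writing out a partially filled segment on an explicit `flush` is a simplification made in this sketch.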
6.3 Inode Map (imap)
Because every update appends a new copy of the inode, an inode's location changes each time its file is modified.
Problem
Traditional inode:
inode 50
always remains in one place.
In LFS:
inode 50
Old Location → Block 200
New Location → Block 900
The system needs a way to find the newest version.
Solution
Use an inode map.
Structure
inode 50 → Block 900
inode 51 → Block 910
inode 52 → Block 940
Purpose
Maps inode numbers to their current locations.
Key Insight
The inode map is the navigation system of LFS.
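The lookup path through the inode map can be sketched as follows, reusing the inode number and block addresses from the example above. The dictionary layout, the data block addresses 199 and 899, and the function name `read_file` are hypothetical.

```python
# Toy log: address -> block contents (made-up layout for illustration).
log = {
    199: {"type": "data", "payload": b"old contents"},
    200: {"type": "inode", "inode_no": 50, "data_addr": 199},  # stale inode 50
    899: {"type": "data", "payload": b"new contents"},
    900: {"type": "inode", "inode_no": 50, "data_addr": 899},  # current inode 50
}

# The inode map points at the newest copy of each inode.
imap = {50: 900}


def read_file(inode_no):
    """Resolve inode number -> inode location -> data block."""
    inode_addr = imap[inode_no]                  # 1. find the inode's current address
    inode = log[inode_addr]                      # 2. read that inode from the log
    return log[inode["data_addr"]]["payload"]    # 3. follow its pointer to the data
```

The stale inode at block 200 is still in the log, but nothing reaches it because the map no longer refers to it.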
7. The Biggest Challenge: Garbage Accumulation
LFS never overwrites data.
This creates a serious problem.
Example
Initial Version
Block 100 → File A
Modified Version
Block 850 → Updated File A
Now:
Block 100
is no longer useful.
What Happens Over Time?
Version 1 ❌
Version 2 ❌
Version 3 ❌
Version 4 ✔
Many obsolete copies accumulate.
Result
Disk space becomes filled with invalid data.
Key Insight
LFS gains write speed by creating garbage that must eventually be cleaned.
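One way to see the garbage build up is to write several versions of a file and then ask which blocks the current metadata still reaches. In this sketch each data block records its owning inode number, standing in for the per-segment summary information a real implementation would keep; that field, the block layout, and the function names are assumptions.

```python
def is_live(addr, block, imap, log):
    """A block is live only if the current metadata still points at it."""
    inode_addr = imap.get(block["inode_no"])
    if inode_addr is None:
        return False
    if block["type"] == "inode":
        return inode_addr == addr
    if block["type"] == "data":
        return log[inode_addr]["data_addr"] == addr
    return False


def build_example():
    """Write four versions of one file; each appends a data block and an inode."""
    log, imap = [], {}
    for version in range(1, 5):
        log.append({"type": "data", "inode_no": 1,
                    "payload": f"Version {version}"})
        data_addr = len(log) - 1
        log.append({"type": "inode", "inode_no": 1, "data_addr": data_addr})
        imap[1] = len(log) - 1
    return log, imap


def live_addresses(log, imap):
    """Addresses of all blocks that are still reachable."""
    return [a for a, b in enumerate(log) if is_live(a, b, imap, log)]
```

After four versions the log holds eight blocks, and only the last data block and the last inode are live; the other six are garbage.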
8. Segment Cleaning (Most Important Concept)
Segment Cleaning is the mechanism used to reclaim disk space.
Purpose
Remove obsolete data and free disk space.
Process
Step 1: Select a Segment
Cleaner identifies:
Segment 10
containing:
Valid Data
Invalid (Obsolete) Data
Step 2: Copy Valid Blocks
Valid blocks are moved elsewhere.
Valid Block A
↓
New Segment
Example
Segment 10
[Valid]
[Invalid]
[Invalid]
[Valid]
Only the two valid blocks are copied, and the metadata pointing to them is updated to their new locations.
Step 3: Free Entire Segment
After copying:
Segment 10
becomes empty.
Result
Reusable Space
Step 4: Reuse Segment
New writes can now use:
Segment 10
again.
Key Insight
Cleaning converts fragmented log space into reusable free space.
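The cleaning steps can be sketched as below. The sketch assumes block liveness is already known and omits the pointer updates a real cleaner must perform after moving a block; `clean_segment` and its parameters are hypothetical names.

```python
def clean_segment(segments, victim_id, free_segments, open_segment):
    """Copy live blocks out of the victim segment, then mark it free.

    segments:      dict of segment id -> list of (block_name, is_live) pairs
    free_segments: set of segment ids available for new writes
    open_segment:  list receiving copied blocks (the segment being filled)
    Returns the number of blocks copied.
    """
    copied = 0
    for name, live in segments[victim_id]:
        if live:
            open_segment.append((name, True))   # Step 2: copy valid blocks
            copied += 1
    segments[victim_id] = []                    # Step 3: whole segment is empty
    free_segments.add(victim_id)                # Step 4: available for reuse
    return copied
```

For a segment holding `[Valid] [Invalid] [Invalid] [Valid]`, the function copies two blocks and frees the whole segment, so four blocks of space are regained at the cost of rewriting two.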
9. Advantages of LFS
9.1 Extremely Fast Write Performance
All writes are sequential.
Append
Append
Append
Minimal disk seeks.
Benefit
Very high write throughput.
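9.2 Efficient Use of Disk Bandwidth
Large sequential writes keep the disk transferring data instead of repositioning the head.
Benefit
Write throughput can approach the disk's raw sequential bandwidth, far above what random writes achieve.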
9.3 Improved Crash Recovery
Because everything is written as a log:
Recent Operations
are at the tail of the log, so recovery can examine that region instead of scanning the whole disk.
Benefit
Recovery is often simpler.
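Log-structured designs commonly pair the log with periodic checkpoints and, after a crash, replay only the part of the log written since the last checkpoint (often called roll-forward). The sketch below illustrates that idea under assumed data structures; the checkpoint format and the function name `recover` are hypothetical.

```python
def recover(log, checkpoint):
    """Rebuild the inode map after a crash.

    log:        list of blocks in write order; inode blocks look like
                {"type": "inode", "inode_no": int}
    checkpoint: {"imap": dict, "log_end": int}, saved at an earlier point
    """
    imap = dict(checkpoint["imap"])   # start from the last saved state
    # Replay only the blocks appended after the checkpoint was taken.
    for addr in range(checkpoint["log_end"], len(log)):
        block = log[addr]
        if block["type"] == "inode":
            imap[block["inode_no"]] = addr   # newer inode supersedes older one
    return imap
```

Only the tail of the log is scanned, which is why recovery time depends on how much was written since the last checkpoint rather than on the size of the disk.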
9.4 Excellent for Write-Heavy Workloads
Examples:
Logging systems
Databases
Flash storage
Embedded devices
Key Insight
LFS shines when writes dominate system activity.
10. Disadvantages of LFS
10.1 Cleaning Overhead
Segment cleaning (garbage collection) consumes disk bandwidth.
Problem
Read Segment
Copy Valid Data
Rewrite Valid Data
Result
Additional I/O.
10.2 Read Performance Issues
After repeated updates, logically adjacent blocks of a file may become scattered throughout the log.
Example:
File Block 0 → Segment 5
File Block 1 → Segment 90
File Block 2 → Segment 250
Reading the file sequentially may then require multiple seeks.
10.3 Increased Complexity
LFS requires:
Inode maps
Segment cleaners
Log management
Version tracking
Result
Much more complex implementation.
10.4 Cleaning Can Become Expensive
If utilization becomes high:
Little Free Space
Cleaning must run more frequently, and each cleaned segment yields less free space because more of its blocks are still valid.
Result
Performance degradation.
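A commonly used simplified model makes this degradation explicit. If a fraction u of each cleaned segment is still valid, the cleaner reads the whole segment (1), rewrites the valid part (u), and frees room for (1 - u) of new data, giving a cost of 2 / (1 - u) units of I/O per unit of new data. The function name `write_cost` and the uniform-utilization assumption are part of this sketch.

```python
def write_cost(u: float) -> float:
    """Approximate I/O per unit of new data when cleaned segments are u full.

    Derivation: read segment (1) + rewrite valid data (u) + write new data
    (1 - u), divided by the new data written (1 - u), gives 2 / (1 - u).
    """
    if not 0.0 <= u < 1.0:
        raise ValueError("u must be in the range [0, 1)")
    if u == 0.0:
        return 1.0   # an entirely empty segment need not be read at all
    return 2.0 / (1.0 - u)
```

Under this model, `write_cost(0.5)` is 4.0 and `write_cost(0.75)` is 8.0, and the cost keeps rising steeply as u approaches 1.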
Key Insight
The biggest challenge in LFS is not writing; it is cleaning.
11. LFS vs Journaling File Systems
| Feature | Journaling FS | LFS |
|---|---|---|
| Main Idea | Log Changes | Entire Disk is a Log |
| Write Method | Journal Write, then In-Place Update | Sequential Append |
| Overwrites Data | Yes | No |
| Recovery | Journal Replay | Checkpoint plus Log Roll-Forward |
| Complexity | Medium | High |
| Write Performance | Good | Excellent |
| Cleaning Required | No | Yes |
| Design Philosophy | Safety Layer | Complete Storage Model |
Important Observation
Journaling adds a log to an existing file system.
LFS transforms the entire file system into a log.
Key Insight
Journaling is a reliability feature.
LFS is a complete file system architecture.
12. Real-World Analogy
Imagine maintaining a notebook.
Traditional Method
Whenever something changes:
Erase Old Entry
Write New Entry
This takes effort.
LFS Method
Simply:
Write New Entry
on the next empty page.
Old entries remain.
Problem
Notebook eventually fills with outdated entries.
Solution
Periodically rewrite useful notes into a fresh notebook.
This is exactly what segment cleaning does.
13. Real Systems Using LFS Concepts
F2FS (Flash-Friendly File System)
Designed specifically for:
SSDs
eMMC
SD cards and other NAND flash storage
Uses many LFS principles.
NILFS
A log-structured file system for Linux.
Features:
Continuous snapshots
Log-based storage
Flash Storage Systems
Many SSD-oriented systems adopt:
Append-only writing
Log structures
Cleaning mechanisms
Key Insight
Modern flash storage naturally benefits from log-structured designs.
14. LFS at a Glance
| Aspect | Log-Structured File System |
|---|---|
| Main Idea | Entire disk treated as append-only log |
| Writes | Sequential |
| Overwrites | Never |
| File Updates | New versions appended |
| Mapping Structure | Inode Map |
| Space Reclamation | Segment Cleaning |
| Write Performance | Excellent |
| Read Performance | Moderate |
| Complexity | High |
| Best For | Write-intensive and flash-based systems |
Final Insight
A Log-Structured File System improves write performance by treating the entire disk as a sequential log. Instead of overwriting existing blocks, every modification is appended to the end of the log, converting expensive random writes into fast sequential writes. This design provides excellent write throughput and simpler crash recovery, but introduces the challenge of garbage accumulation, which must be managed through segment cleaning. File systems such as NILFS and the flash-oriented F2FS adopt these principles, and sequential append operations align well with the characteristics of SSDs and flash memory.