1. Why Traditional File Systems Became a Bottleneck

To understand the Log-Structured File System (LFS), we first need to understand a fundamental limitation of traditional file systems.

Most conventional file systems, including journaling file systems, update data in place.

Consider modifying a file:

Old Data Block → Overwritten

Old Metadata → Updated

Directory Entry → Updated

This causes multiple writes to different disk locations.

What Happens Internally?

Suppose you edit a file.

The operating system may need to update:

  • Data block

  • inode

  • Directory entry

  • Free space bitmap

  • Journal (if journaling is enabled)

These updates are often scattered across the disk.

Write #1 → Block 120

Write #2 → Block 8400

Write #3 → Block 35

Write #4 → Block 7001

Why Is This a Problem?

Hard disk drives, with their moving heads and spinning platters, perform poorly when writes are:

  • Small

  • Frequent

  • Randomly scattered

Each random write may require:

Seek Time
     +
Rotational Delay
     +
Transfer Time

Result

Even a small modification can trigger expensive disk operations.
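
The cost formula above can be turned into a rough calculation. The sketch below is illustrative only: the seek time, spindle speed, transfer rate, and block size are assumed values chosen for the example, not measurements of any particular drive.

```python
# Illustrative hard-disk parameters (assumed values, not measurements).
SEEK_MS = 8.0                # average seek time
RPM = 7200                   # spindle speed
TRANSFER_MB_PER_S = 100.0    # sustained transfer rate
BLOCK_BYTES = 4096           # size of one block


def rotational_delay_ms(rpm):
    """Average rotational delay: half of one full revolution."""
    return 0.5 * 60_000.0 / rpm


def transfer_ms(num_bytes, mb_per_s):
    """Time to move num_bytes at the given transfer rate."""
    return num_bytes / (mb_per_s * 1_000_000) * 1000.0


def random_write_ms(blocks):
    """Every block pays seek + rotational delay + transfer."""
    per_block = (SEEK_MS + rotational_delay_ms(RPM)
                 + transfer_ms(BLOCK_BYTES, TRANSFER_MB_PER_S))
    return blocks * per_block


def sequential_write_ms(blocks):
    """One seek and one rotational delay, then a single long transfer."""
    return (SEEK_MS + rotational_delay_ms(RPM)
            + transfer_ms(blocks * BLOCK_BYTES, TRANSFER_MB_PER_S))
```

With these assumed figures, `random_write_ms(4)` comes to roughly 49 ms, while `sequential_write_ms(4)` comes to roughly 12 ms. Almost all of the difference is positioning time, not data transfer.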

Key Insight

Traditional file systems optimize storage organization but often generate inefficient write patterns.


2. What is a Log-Structured File System (LFS)?

A Log-Structured File System (LFS) is a file system design in which the entire disk is treated as a continuous append-only log.

Instead of updating existing blocks, all new data and metadata are written sequentially at the end of the log.

Definition

A Log-Structured File System stores all modifications by appending them sequentially to a log rather than overwriting existing disk blocks.

Key Idea

Old Data Remains

New Data Appended

Traditional File System

Block 50

Old Data
    ↓
Overwrite
    ↓
New Data

Log-Structured File System

Block 50 → Old Data

Block 900 → New Data

Important Observation

No live block is overwritten in place.

Every update is appended.

Core Insight

The disk behaves like a giant log where every update creates a new version rather than modifying the old one.
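
The contrast between the two approaches can be sketched with two toy classes. Both class names are hypothetical, and the "disk" is just an in-memory dictionary or list, so this shows only the update policy, not real I/O.

```python
class InPlaceStore:
    """Update-in-place: each logical block has one fixed slot."""

    def __init__(self):
        self.slots = {}

    def write(self, block_no, data):
        self.slots[block_no] = data    # the old contents are lost


class AppendOnlyLog:
    """Log-structured: every write lands at the current end of the log."""

    def __init__(self):
        self.log = []

    def write(self, data):
        self.log.append(data)          # older versions stay where they are
        return len(self.log) - 1       # address of the new version
```

Writing "old" and then "new" to block 50 of an `InPlaceStore` leaves a single copy. Writing the same two values to an `AppendOnlyLog` leaves both, at consecutive addresses.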


3. The Fundamental Philosophy of LFS

Traditional file systems ask:

Where should I update this data?

LFS asks:

Where is the end of the log?

Every write operation follows the same principle:

Append
Append
Append
Append

Why This Is Powerful

On a hard disk, sequential writes are far faster than random writes because the head pays the seek and rotational delay once rather than once per block.

Sequential Write
        ↓
Minimal Disk Movement
        ↓
Higher Throughput

Key Insight

LFS accepts a more complex storage layout in exchange for highly efficient writes.


4. How LFS Works (Step-by-Step)

Consider a file:

report.txt

Currently stored as:

Block 100 → Data Version 1

Now the user modifies the file.


Step 1: File Modification Occurs

Application performs:

write(fd, buffer, size);

The operating system receives new data.


Step 2: New Data is Written to End of Log

Instead of overwriting block 100:

Block 100 → Old Version

Block 850 → New Version

Important Observation

Old block remains untouched.


Step 3: New Metadata is Generated

The inode must now point to:

Block 850

Instead of:

Block 100

The updated inode is also appended.

Block 851 → New inode

Step 4: Mapping Structures Updated

The file system updates the inode map (Section 6.3), the structure that records:

Latest inode location

These updates are also appended.

Block 852 → Updated Mapping

Final Result

Old Data
      ↓
Still Exists

New Data
      ↓
Stored at End of Log

Key Insight

A modification creates new versions rather than altering old versions.
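
Steps 2 to 4 can be sketched as three consecutive appends. This is a toy model under stated assumptions: `TinyLFS` and its record layout are hypothetical, the log is a Python list, and a "block address" is simply a list index (so addresses start at 0 rather than at 850).

```python
class TinyLFS:
    """Toy model: the log is a list and a block address is a list index."""

    def __init__(self):
        self.log = []           # append-only sequence of records
        self.imap_addr = None   # address of the newest inode-map record

    def _append(self, record):
        self.log.append(record)
        return len(self.log) - 1

    def _current_imap(self):
        """Return a copy of the newest inode map (empty if none exists)."""
        if self.imap_addr is None:
            return {}
        return dict(self.log[self.imap_addr]["map"])

    def write_file(self, inode_no, data):
        # Step 2: the new data goes to the end of the log.
        data_addr = self._append({"type": "data", "payload": data})
        # Step 3: a new inode pointing at the new data is appended.
        inode_addr = self._append(
            {"type": "inode", "no": inode_no, "block": data_addr})
        # Step 4: an updated mapping (inode number -> inode address) is appended.
        imap = self._current_imap()
        imap[inode_no] = inode_addr
        self.imap_addr = self._append({"type": "imap", "map": imap})
        return data_addr, inode_addr, self.imap_addr
```

Calling `write_file` twice for the same inode produces six records. The first three are never modified, which mirrors the old version remaining on disk.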


5. Visualization of LFS

Traditional File System

Disk

[Data A]
   ↓
Overwrite
   ↓
[Updated Data A]

LFS

Disk Log

[Data A]

[Data B]

[Data C]

[Updated Data A]

Sequential Append

Beginning ---------------- End

Data

Metadata

inode

Directory Update

New Data

New Metadata

Log Growth

Write 1 → Append

Write 2 → Append

Write 3 → Append

Write 4 → Append

Important Observation

All writes occur at the end of the log.


6. Major Components of LFS

Several specialized structures are required to make this design practical.


6.1 The Log

The entire disk is treated as a giant append-only log.

Contents

The log stores:

  • Data blocks

  • inodes

  • Directory updates

  • Metadata

  • Allocation information

Structure

Log

├── Data
├── Metadata
├── inode
├── Data
├── Metadata
└── Data

Key Insight

Everything is written sequentially.


6.2 Segments

Writing one block at a time would be inefficient.

Therefore, LFS groups writes into large chunks called segments.

Definition

A segment is a large contiguous region of disk used as the basic unit of writing.

Example

Segment 1

Segment 2

Segment 3

Segment 4

Benefit

Large sequential writes improve disk throughput.

Key Insight

LFS writes segments rather than individual blocks.
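
A minimal sketch of segment buffering follows. The class name `SegmentWriter` and the tiny segment size of four blocks are assumptions made for readability; real segments are far larger.

```python
class SegmentWriter:
    """Buffers blocks in memory and flushes them one segment at a time."""

    def __init__(self, blocks_per_segment=4):
        self.blocks_per_segment = blocks_per_segment
        self.buffer = []        # blocks waiting in memory
        self.disk = []          # segments already written, in log order
        self.disk_writes = 0    # number of large sequential writes issued

    def write_block(self, block):
        self.buffer.append(block)
        if len(self.buffer) == self.blocks_per_segment:
            self.flush()

    def flush(self):
        """Write whatever is buffered as one segment (possibly partial)."""
        if self.buffer:
            self.disk.append(list(self.buffer))
            self.disk_writes += 1
            self.buffer.clear()
```

Ten calls to `write_block` with four-block segments result in two disk writes and two blocks still buffered. An explicit `flush()` writes the remainder as a partial segment.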


6.3 Inode Map (imap)

Because every update appends a new copy of the inode, an inode's location on disk keeps changing.

Problem

Traditional inode:

inode 50

always stays at a fixed disk location, so it can be found from its number alone.

In LFS:

inode 50

Old Location → Block 200

New Location → Block 900

The system needs a way to find the newest version.

Solution

Use an inode map.

Structure

inode 50 → Block 900

inode 51 → Block 910

inode 52 → Block 940

Purpose

Maps inode numbers to their current locations.

Key Insight

The inode map is the navigation system of LFS.
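
The read path through the inode map can be sketched as two lookups. The dictionary-based `disk`, the `data_block` field, and the function name `read_file` are hypothetical; the block numbers echo the examples used earlier in this text.

```python
def read_file(disk, imap, inode_no):
    """Follow inode number -> newest inode address -> data block."""
    inode_addr = imap[inode_no]       # where the current inode copy lives
    inode = disk[inode_addr]
    return disk[inode["data_block"]]


# Hypothetical disk contents: one stale and one current version of inode 50.
disk = {
    100: b"report v1",                # stale data block
    200: {"data_block": 100},         # stale copy of inode 50
    850: b"report v2",                # current data block
    900: {"data_block": 850},         # current copy of inode 50
}
imap = {50: 900}
```

Here `read_file(disk, imap, 50)` returns the version 2 contents. The stale inode at block 200 is still on disk but is never consulted, because the map no longer points to it.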


7. The Biggest Challenge: Garbage Accumulation

LFS never overwrites data.

This creates a serious problem.

Example

Initial Version

Block 100 → File A

Modified Version

Block 850 → Updated File A

Now:

Block 100

is no longer useful.

What Happens Over Time?

Version 1 ❌

Version 2 ❌

Version 3 ❌

Version 4 ✔

Many obsolete copies accumulate.

Result

Disk space becomes filled with invalid data.

Key Insight

LFS gains write speed by creating garbage that must eventually be cleaned.
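
How quickly garbage builds up can be estimated with a small simulation. It assumes, for simplicity, that every file occupies exactly one block and that each write replaces the whole file; the function name `simulate` is hypothetical.

```python
def simulate(updates):
    """Count live and dead blocks after a sequence of whole-file writes.

    updates: iterable of file names, one entry per write.
    Returns (live_blocks, dead_blocks).
    """
    log = []       # one entry per block ever written, in log order
    latest = {}    # file name -> address of its current block
    for name in updates:
        log.append(name)
        latest[name] = len(log) - 1   # any earlier block for this file is now dead
    live = len(latest)
    dead = len(log) - live
    return live, dead
```

Four successive versions of one file, as in the example above, give one live block and three dead ones.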


8. Segment Cleaning (Most Important Concept)

Segment Cleaning is the mechanism used to reclaim disk space.

Purpose

Remove obsolete data and free disk space.

Process


Step 1: Select a Segment

Cleaner identifies:

Segment 10

containing:

Valid (live) Data

Invalid (obsolete) Data

Step 2: Copy Valid Blocks

Valid blocks are copied to a segment at the end of the log, and the metadata that points to them is updated to the new locations.

Valid Block A
      ↓
New Segment

Example

Segment 10

[Valid]
[Invalid]
[Invalid]
[Valid]

Only the two valid blocks are copied; the invalid ones are discarded along with the segment.


Step 3: Free Entire Segment

After copying:

Segment 10

becomes empty.

Result

Reusable Space

Step 4: Reuse Segment

New writes can now use:

Segment 10

again.

Key Insight

Cleaning converts fragmented log space into reusable free space.
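
The four steps can be sketched as follows. This is a simplified model, not a real cleaner: segments are Python lists, liveness is given as a set of (segment, offset) pairs, and the greedy victim selection in `pick_victim` is an assumed policy that the text above does not specify. All function and parameter names are hypothetical.

```python
def pick_victim(segments, live_addrs):
    """Step 1 (assumed greedy policy): non-empty segment with the fewest live blocks."""
    def live_count(seg_no):
        return sum(1 for s, _ in live_addrs if s == seg_no)
    candidates = [s for s, blocks in segments.items() if blocks]
    return min(candidates, key=live_count)


def clean_segment(segments, victim, live_addrs, free_segments):
    """Steps 2-4: copy live blocks out of `victim`, then mark it free.

    segments:      dict segment_no -> list of blocks
    live_addrs:    set of (segment_no, offset) pairs still referenced
    free_segments: list of segment numbers available for reuse
    Returns {old_address: new_address} so that pointers can be fixed up.
    """
    target = free_segments.pop()          # an empty segment to copy into
    segments[target] = []
    moved = {}
    for offset, block in enumerate(segments[victim]):
        old = (victim, offset)
        if old in live_addrs:             # Step 2: copy only valid blocks
            segments[target].append(block)
            new = (target, len(segments[target]) - 1)
            moved[old] = new
            live_addrs.discard(old)
            live_addrs.add(new)
    segments[victim] = []                 # Step 3: the whole segment is free
    free_segments.append(victim)          # Step 4: available for new writes
    return moved
```

For a segment laid out as [Valid, Invalid, Invalid, Valid], `clean_segment` copies two blocks and returns the segment to the free list. A real cleaner would also update the inodes and inode map using the returned address mapping.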


9. Advantages of LFS

9.1 Extremely Fast Write Performance

All writes are sequential.

Append

Append

Append

Minimal disk seeks.

Benefit

Very high write throughput.


9.2 Efficient Use of Disk Bandwidth

Large sequential writes keep the disk transferring data instead of seeking.

Benefit

A larger share of the disk's raw bandwidth goes to useful writes than with random writes.


9.3 Improved Crash Recovery

Because everything is written as a log:

Recent Operations

are easy to identify, since they sit near the end of the log.

Benefit

Recovery is often simpler.
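
One way to see why is to rebuild the inode map by scanning the log in order. The sketch below assumes that each record carries a `type` tag and that the whole log is scanned from the start; real implementations limit how much of the log must be read, which this text does not cover. The function name and record layout are hypothetical.

```python
def rebuild_inode_map(log):
    """Scan the log oldest to newest; the last inode record for each number wins."""
    imap = {}
    for addr, record in enumerate(log):
        if record["type"] == "inode":
            imap[record["no"]] = addr   # a later copy supersedes an earlier one
    return imap
```

Because newer records always sit later in the log, no comparison of timestamps or versions is needed: position alone decides which copy is current.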


9.4 Excellent for Write-Heavy Workloads

Examples:

  • Logging systems

  • Databases

  • Flash storage

  • Embedded devices

Key Insight

LFS shines when writes dominate system activity.


10. Disadvantages of LFS

10.1 Cleaning Overhead

Garbage collection consumes resources.

Problem

Read Segments

Copy Live Data

Rewrite Live Data

Result

Additional I/O.


10.2 Read Performance Issues

Blocks of the same file may become scattered throughout the log as they are updated at different times.

Example:

File Block 0 → Segment 5

File Block 1 → Segment 90

File Block 2 → Segment 250

Reading the file sequentially may then require a separate disk access for each block.


10.3 Increased Complexity

LFS requires:

  • Inode maps

  • Segment cleaners

  • Log management

  • Version tracking

Result

Much more complex implementation.


10.4 Cleaning Can Become Expensive

If utilization becomes high:

Little Free Space

Cleaning must run more frequently, and each cleaned segment yields less free space because most of its blocks are still valid.

Result

Performance degradation.
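
A simple steady-state model, following the write-cost analysis in the original LFS paper by Rosenblum and Ousterhout, makes the effect concrete. It assumes that the cleaner reads each victim segment in full and that a fraction u of the segment's blocks is still valid, with 0 < u < 1. The function name is hypothetical.

```python
def write_cost(u):
    """Total disk I/O per unit of new data when cleaned segments are u full.

    To free one segment the cleaner reads it (1), rewrites its valid
    blocks (u), and the reclaimed space then holds new data (1 - u):
        cost = (1 + u + (1 - u)) / (1 - u) = 2 / (1 - u)
    """
    if not 0.0 < u < 1.0:
        raise ValueError("u must satisfy 0 < u < 1")
    return 2.0 / (1.0 - u)
```

At u = 0.5 the model gives a cost of 4, at u = 0.8 a cost of 10, and at u = 0.9 a cost of 20. The cost grows without bound as segments approach full utilization.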

Key Insight

The biggest challenge in LFS is not writing; it is cleaning.


11. LFS vs Journaling File Systems

Feature            | Journaling FS  | LFS
Main Idea          | Log Changes    | Entire Disk is a Log
Write Method       | Mixed Updates  | Sequential Append
Overwrites Data    | Yes            | No
Recovery           | Journal Replay | Log Recovery
Complexity         | Medium         | High
Write Performance  | Good           | Excellent
Cleaning Required  | No             | Yes
Design Philosophy  | Safety Layer   | Complete Storage Model

Important Observation

Journaling adds a log to an existing file system.

LFS transforms the entire file system into a log.

Key Insight

Journaling is a reliability feature.

LFS is a complete file system architecture.


12. Real-World Analogy

Imagine maintaining a notebook.

Traditional Method

Whenever something changes:

Erase Old Entry

Write New Entry

This takes effort.

LFS Method

Simply:

Write New Entry

on the next empty page.

Old entries remain.

Problem

Notebook eventually fills with outdated entries.

Solution

Periodically rewrite useful notes into a fresh notebook.

This is exactly what segment cleaning does.


13. Real Systems Using LFS Concepts

F2FS (Flash-Friendly File System)

Designed specifically for:

  • SSDs

  • eMMC

  • Flash storage

Uses many LFS principles.


NILFS

A true log-structured file system.

Features:

  • Continuous snapshots

  • Log-based storage


Flash Storage Systems

Many SSD-oriented systems adopt:

  • Append-only writing

  • Log structures

  • Cleaning mechanisms

Key Insight

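Modern flash storage naturally benefits from log-structured designs, because flash pages cannot be overwritten in place and must be erased in larger blocks before they are rewritten.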


14. LFS at a Glance

Aspect             | Log-Structured File System
Main Idea          | Entire disk treated as append-only log
Writes             | Sequential
Overwrites         | Never
File Updates       | New versions appended
Mapping Structure  | Inode Map
Space Reclamation  | Segment Cleaning
Write Performance  | Excellent
Read Performance   | Moderate
Complexity         | High
Best For           | Write-intensive and flash-based systems

Final Insight

A Log-Structured File System improves write performance by treating the entire disk as a sequential log. Instead of overwriting existing blocks, it appends every modification to the end of the log, converting expensive random writes into fast sequential writes. This design provides excellent write throughput and simpler crash recovery, but it introduces garbage accumulation, which must be managed through segment cleaning. File systems such as F2FS and NILFS adopt these principles. F2FS in particular targets SSDs and other flash storage, where sequential appends align well with the characteristics of the device.