Introduction

Traditional file access in operating systems usually involves:

  1. Opening a file

  2. Reading data into buffers

  3. Processing data

  4. Writing results back

This approach works, but it introduces:

  • Multiple data copies

  • System-call overhead

  • Buffer management complexity

  • Additional CPU usage

Modern operating systems therefore introduced:

Memory-Mapped Files

Memory-mapped files allow files to be accessed directly as part of a process’s virtual memory address space.

Instead of:

  • Explicit read/write operations

applications can:

  • Access file contents like normal memory

This technique is one of the most important concepts in:

  • Operating systems

  • Virtual memory

  • Database systems

  • High-performance I/O

  • Shared memory systems

  • File systems

Memory mapping is heavily used in:

  • Databases

  • Web browsers

  • Compilers

  • Multimedia systems

  • Shared libraries

  • Large-scale servers

because it provides:

  • Faster I/O

  • Efficient memory usage

  • Simplified programming

  • Shared access capabilities

What are Memory-Mapped Files?

A memory-mapped file is a file whose contents are mapped directly into a process’s virtual address space.

The operating system allows the process to:

  • Access file contents using normal memory operations

Core Idea

Files are accessed as memory instead of explicit read/write operations

Important Insight

Memory mapping integrates file I/O directly with the virtual memory system

Why Memory-Mapped Files are Necessary

Traditional file I/O requires:

  • Multiple kernel-user copies

  • Buffer management

  • Explicit read/write calls

Memory mapping eliminates much of this overhead.

Advantages:

  • Faster access

  • Fewer copies

  • Better performance

Traditional File I/O vs Memory Mapping

Traditional File I/O

Disk → Kernel Buffer → User Buffer

Memory-Mapped File I/O

Disk ↔ Page Cache Pages (mapped directly into the process’s virtual memory)
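The two data paths above can be compared side by side. The sketch below is illustrative only: the function names are made up for this example, and errors are reported as a sum of 0 to keep it short. Both functions compute the same byte sum, one through a user buffer filled by read(), the other through a mapping.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Traditional path: every read() copies data from the kernel into buf,
 * and the loop costs one system call per chunk. */
uint64_t sum_with_read(const char *path)
{
    unsigned char buf[4096];
    uint64_t sum = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return 0;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < n; i++)
            sum += buf[i];
    close(fd);
    return sum;
}

/* Mapped path: one mmap() call, then plain loads from memory. */
uint64_t sum_with_mmap(const char *path)
{
    uint64_t sum = 0;
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return 0;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        unsigned char *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                                MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            for (off_t i = 0; i < st.st_size; i++)
                sum += p[i];
            munmap(p, (size_t)st.st_size);
        }
    }
    close(fd);
    return sum;
}
```

Which version is faster depends on file size and access pattern; the point here is only the difference in how the data reaches the program.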

Relationship with Virtual Memory

Memory mapping relies heavily on:

  • Paging

  • Virtual memory

  • Page tables

The operating system maps:

  • File blocks

to:

  • Virtual memory pages

Important Insight

Memory-mapped files work by treating file contents as virtual memory pages

mmap() System Call

In Linux and UNIX systems, memory mapping commonly uses:

mmap()

Simplified Syntax

mmap(address, length, protection, flags, fd, offset)

Parameters

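address

Preferred virtual address. This is only a hint; passing NULL lets the kernel choose.

length

Mapping size in bytes.

protection

Read/write/execute permissions (PROT_READ, PROT_WRITE, PROT_EXEC).

flags

Mapping behavior, such as MAP_SHARED or MAP_PRIVATE.

fd

File descriptor of the open file.

offset

Starting position in the file. Must be a multiple of the page size.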
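The offset parameter has to be a multiple of the page size, so mapping from an arbitrary byte position takes a little arithmetic. The sketch below shows one way to do it; struct window, map_window and unmap_window are hypothetical names introduced for this example.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: a read-only view that may start at any byte. */
struct window {
    void  *base;     /* page-aligned address returned by mmap() */
    size_t map_len;  /* length passed to mmap(), needed for munmap() */
    char  *data;     /* first byte the caller actually asked for */
};

/* Maps `len` bytes of `path` starting at byte `offset`.
 * Returns 0 on success, -1 on failure. */
int map_window(const char *path, off_t offset, size_t len, struct window *w)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t page = (off_t)sysconf(_SC_PAGESIZE);
    off_t aligned = offset - (offset % page);   /* round down to a page boundary */
    size_t slack = (size_t)(offset - aligned);  /* extra bytes mapped in front */

    void *base = mmap(NULL, len + slack, PROT_READ, MAP_PRIVATE, fd, aligned);
    close(fd);                                  /* the mapping outlives the fd */
    if (base == MAP_FAILED)
        return -1;

    w->base = base;
    w->map_len = len + slack;
    w->data = (char *)base + slack;
    return 0;
}

void unmap_window(struct window *w)
{
    munmap(w->base, w->map_len);
}
```

The caller reads through w->data and never sees the extra bytes mapped in front of the requested range.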

Example

char *ptr = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);  /* returns MAP_FAILED on error */

Result

File contents accessible through:

  • ptr memory pointer

Accessing Memory-Mapped Files

After mapping:

char x = ptr[0];

behaves like:

  • Normal memory access

OS transparently loads pages from disk.
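Putting the call and the access together, a minimal sketch of mapping a whole file and reading it like an array might look as follows. The helper names map_file and count_newlines are invented for this example, and the error handling is deliberately small.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Maps the whole file read-only. On success returns the mapping and stores
 * its length in *len_out; on failure returns NULL. */
const char *map_file(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return NULL;
    }

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {  /* mmap rejects length 0 */
        close(fd);
        return NULL;
    }

    void *ptr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                     /* the mapping stays valid after close() */
    if (ptr == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }

    *len_out = (size_t)st.st_size;
    return ptr;
}

/* Counts lines by indexing the mapping like an ordinary array. */
size_t count_newlines(const char *ptr, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (ptr[i] == '\n')
            n++;
    return n;
}
```

When the caller is done, the mapping is released with munmap(), passing the same pointer and length.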

Demand Paging in Memory Mapping

Very important concept.

Mapped files generally use:

Demand Paging

Pages loaded:

  • Only when accessed

Advantages:

  • Efficient memory usage

Example

1 GB file mapped:

  • Entire file not loaded immediately

Only accessed portions:

  • Loaded into RAM
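Lazy loading can be observed from user space. The sketch below assumes Linux or a BSD, because mincore() is not part of POSIX; resident_pages and map_sparse_file are hypothetical helpers written for this example. It maps a large sparse file and reports how many of its pages are in RAM.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* exposes mincore() on glibc */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Counts how many pages of [addr, addr + len) are resident in RAM.
 * addr must be page-aligned. Returns -1 on error. */
long resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;

    long count = -1;
    if (mincore(addr, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;   /* low bit set means the page is resident */
    }
    free(vec);
    return count;
}

/* Creates a sparse file of `size` bytes at `path` and maps it read-only.
 * Returns NULL on failure. */
char *map_sparse_file(const char *path, size_t size)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}
```

Calling resident_pages() before and after reading a single byte from the middle of the mapping typically shows the count rising by a small amount rather than jumping to the full file size. The exact numbers depend on the kernel's read-ahead policy.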

Page Faults in Memory Mapping

When a mapped page is first accessed:

  • Page fault occurs

OS:

  1. Loads required file page

  2. Updates page tables

  3. Resumes execution

Important Insight

Memory-mapped files rely on page faults to load file data lazily
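The fault-driven loading described above can be made visible with getrusage(), which reports the page faults a process has taken. This is a rough sketch, not a benchmark; fault_count and faults_while_touching are names chosen for this example.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Total page faults (minor + major) taken by this process so far. */
long fault_count(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt + ru.ru_majflt;
}

/* Maps the first `len` bytes of `path` (the file must be at least that
 * long), reads one byte from every page, and returns how many page faults
 * that took. Returns -1 on error. */
long faults_while_touching(const char *path, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    volatile const char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    long before = fault_count();
    for (size_t off = 0; off < len; off += page)
        (void)p[off];              /* first touch of a page may fault */
    long after = fault_count();

    munmap((void *)p, len);
    return after - before;
}
```

The result is usually smaller than the number of pages touched, because the kernel may map several neighbouring pages while handling a single fault.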

Shared vs Private Mapping

Two major mapping types.

MAP_SHARED

Changes visible to:

  • Other processes that map the same file

  • Underlying file (once the modified pages are written back)

MAP_PRIVATE

Private copy-on-write mapping.

Changes:

  • Not written back to original file

  • Not visible to other processes
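The difference between the two flags shows up when a store made through the mapping is compared with what an ordinary read of the file returns. The function below is a sketch written for this example (write_via_mapping is not a standard call).

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps the first `len` bytes of `path` with `map_flag` (MAP_SHARED or
 * MAP_PRIVATE), overwrites byte 0 with `value`, unmaps, and returns byte 0
 * as seen by an ordinary pread() of the file. Returns -1 on error. */
int write_via_mapping(const char *path, size_t len, int map_flag, char value)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, map_flag, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = value;                    /* a plain store, no write() call */
    if (map_flag == MAP_SHARED)
        msync(p, len, MS_SYNC);      /* push the change to the file now */
    munmap(p, len);

    char in_file;
    ssize_t n = pread(fd, &in_file, 1, 0);
    close(fd);
    return n == 1 ? (unsigned char)in_file : -1;
}
```

With MAP_PRIVATE the function returns the file's original first byte; with MAP_SHARED it returns the value that was stored.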

Shared Memory Using mmap

Memory mapping also supports:

Interprocess Communication (IPC)

Multiple processes may map:

  • Same file/shared region

Advantages:

  • Fast communication

  • Zero-copy sharing
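As a small illustration of IPC through a shared region, the sketch below uses an anonymous shared mapping inherited across fork(). This is an assumption made to keep the example short: unrelated processes would instead map the same file with MAP_SHARED. The function name is invented for this example, and MAP_ANONYMOUS is widely supported but not part of older POSIX editions.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that stores `value` into a shared mapping, waits for it,
 * and returns what the parent then reads. Returns -1 on error. */
int pass_int_through_shared_memory(int value)
{
    int *slot = mmap(NULL, sizeof *slot, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (slot == MAP_FAILED)
        return -1;
    *slot = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(slot, sizeof *slot);
        return -1;
    }
    if (pid == 0) {
        *slot = value;             /* child writes into the shared page */
        _exit(0);
    }

    int status;
    waitpid(pid, &status, 0);      /* waiting doubles as synchronization here */
    int seen = *slot;              /* parent reads the child's store, no copy */
    munmap(slot, sizeof *slot);
    return seen;
}
```

Real programs that keep both processes running need explicit synchronization (for example a semaphore or an atomic flag) instead of waitpid().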

Copy-on-Write Mechanism

Very important optimization.

Private Mapping

Processes initially share pages.

When a process modifies a page:

  • OS creates a private copy of that page only

Advantages:

  • Memory efficient
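Copy-on-write can be demonstrated inside a single process by mapping the same file twice with MAP_PRIVATE. In the sketch below (cow_isolation_demo is a name made up for this example), a store through the first mapping does not change what the second mapping sees.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps `path` twice with MAP_PRIVATE, writes `value` to byte 0 through the
 * first mapping, and returns byte 0 as seen through the second one.
 * Returns -1 on error. */
int cow_isolation_demo(const char *path, size_t len, char value)
{
    int fd = open(path, O_RDONLY);  /* a read-only fd is enough for MAP_PRIVATE */
    if (fd < 0)
        return -1;

    char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    char *b = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED)
            munmap(a, len);
        if (b != MAP_FAILED)
            munmap(b, len);
        return -1;
    }

    a[0] = value;                   /* write fault: the page is copied for `a` */
    int seen = (unsigned char)b[0]; /* `b` still reads the original page */

    munmap(a, len);
    munmap(b, len);
    return seen;
}
```

The file itself is also left unchanged, since private copies are never written back.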

File Caching Through Memory Mapping

Mapped pages naturally become part of:

  • Page cache

Advantages:

  • Unified caching system

  • Reduced duplication

Advantages of Memory-Mapped Files

1. Faster File Access

Avoids repeated read/write system calls.

2. Reduced Copying

Fewer buffer transfers.

3. Simplified Programming

Files accessed like arrays.

4. Efficient Sharing

Multiple processes share mapped regions.

5. Lazy Loading

Only needed pages loaded.

Important Insight

Memory mapping improves performance by reducing copying and leveraging virtual memory mechanisms

Disadvantages of Memory-Mapped Files

1. Page Fault Overhead

Initial accesses may trigger faults.

2. Address Space Usage

Large mappings consume virtual address space.

3. Complex Debugging

Page-fault timing is hard to predict, and I/O errors surface as signals (such as SIGBUS) rather than return codes.

4. Synchronization Challenges

Shared mappings require coordination.

Memory-Mapped Files vs Traditional I/O

Feature                 Traditional I/O   Memory Mapping
Access method           read/write        Memory access
Copy overhead           Higher            Lower
System calls            Frequent          Fewer
Shared access           Limited           Efficient
Programming complexity  Higher            Lower

Memory-Mapped Executables

Very important OS concept.

Executable programs are often loaded using:

  • Memory mapping

OS maps:

  • Executable file sections

into:

  • Virtual memory

Advantages:

  • Faster loading

  • Shared code pages
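On Linux, a process can inspect its own mappings through /proc/self/maps, where the executable and its libraries appear as file-backed regions. The sketch below is Linux-specific, and the function name is invented for this example; it counts the regions that are both executable and backed by a file.

```c
#include <stdio.h>
#include <string.h>

/* Counts file-backed, executable mappings of the calling process by
 * scanning /proc/self/maps. Returns -1 if the file cannot be opened. */
int count_executable_file_mappings(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[4096 + 256];
    int count = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        char perms[5] = "";
        /* Each line looks like: start-end perms offset dev inode [path] */
        if (sscanf(line, "%*s %4s", perms) != 1)
            continue;
        int executable = (perms[2] == 'x');
        int file_backed = (strchr(line, '/') != NULL);
        if (executable && file_backed)
            count++;
    }
    fclose(f);
    return count;
}
```

Running cat /proc/self/maps in a shell shows the same information in raw form.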

Shared Libraries

Shared libraries heavily use:

  • Memory mapping

Multiple processes share:

  • Same physical library pages

Advantages:

  • Reduced RAM usage

Example

Many processes use:

  • libc.so

Only one physical copy of its read-only code pages is needed.

Database Systems and mmap

Some database engines use:

  • Memory-mapped files

Advantages:

  • Fast random access

  • Efficient caching

  • Reduced I/O overhead

Example Systems

  • SQLite (optional memory-mapped I/O)

  • LMDB (built around mmap)

  • Some NoSQL engines

Web Browsers and Memory Mapping

Browsers use memory mapping for:

  • Caching

  • Shared resources

  • Multimedia files

Multimedia Applications

Large media files accessed efficiently through:

  • Memory mapping

Advantages:

  • Streaming efficiency

  • Reduced buffering overhead

Memory Protection in mmap

Mapped pages may have permissions:

  • Read-only

  • Writable

  • Executable

OS enforces:

  • Protection bits

Example

PROT_READ
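Two aspects of protection are easy to try out: the kernel refuses a writable shared mapping of a file that was opened read-only, and mprotect() can change the permissions of an existing mapping. Both functions below are sketches with names chosen for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Tries to create a writable MAP_SHARED mapping of a file opened read-only.
 * Returns the errno from mmap(), 0 if the mapping unexpectedly succeeded,
 * or -1 if the file could not be opened. */
int try_writable_mapping_of_readonly_fd(const char *path, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int err = (p == MAP_FAILED) ? errno : 0;
    if (p != MAP_FAILED)
        munmap(p, len);
    close(fd);
    return err;
}

/* Starts with a read-only private mapping, then uses mprotect() to make it
 * writable before storing to it. Returns 0 on success, -1 on error. */
int upgrade_private_mapping(const char *path, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    /* A store to p[0] at this point would raise SIGSEGV. */
    int rc = mprotect(p, len, PROT_READ | PROT_WRITE);
    if (rc == 0)
        p[0] = '!';                /* allowed now; the copy stays private */
    munmap(p, len);
    return rc;
}
```

The first function is expected to report EACCES, which is the error mmap() documents for this combination.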

File Synchronization

Modified pages of a shared mapping are eventually:

  • Written back to disk

using:

  • Lazy write-back by the kernel

or an explicit call to:

msync()
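One practical detail of msync() is that its start address must be page-aligned. The sketch below flushes an arbitrary byte range by widening it down to a page boundary; flush_range and update_byte are hypothetical helpers, and base is assumed to be the address returned by mmap().

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Flushes bytes [offset, offset + len) of a MAP_SHARED mapping that starts
 * at `base`. Returns 0 on success, -1 on error (errno set by msync). */
int flush_range(void *base, size_t offset, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t start = offset - (offset % page);   /* round down to a page */
    size_t span = (offset - start) + len;

    /* MS_SYNC blocks until write-back has completed;
     * MS_ASYNC would only schedule it. */
    return msync((char *)base + start, span, MS_SYNC);
}

/* Overwrites one byte of the file through a shared mapping and flushes it.
 * `file_len` must not exceed the file's size. Returns 0 on success. */
int update_byte(const char *path, size_t file_len, size_t offset, char value)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    char *p = mmap(NULL, file_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    p[offset] = value;             /* the page is now dirty */
    int rc = flush_range(p, offset, 1);
    munmap(p, file_len);
    return rc;
}
```

Without the explicit flush the change would still reach the file, but only when the kernel decides to write the dirty page back.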

Dirty Pages in Memory Mapping

Modified mapped pages are called:

Dirty pages

OS must eventually:

  • Flush them to storage

Memory Mapping and NUMA

Modern systems optimize:

  • NUMA-aware page placement

for mapped files.

Advantages:

  • Better locality

  • Reduced memory latency

Security Considerations

Memory mapping must enforce:

  • Access permissions

  • Isolation

  • Protection boundaries

Improper mappings may cause:

  • Security vulnerabilities

Real-World Example

Suppose a video editor opens:

  • 4 GB video file

Without mmap:

  • Large buffer copies required

With mmap:

  1. File mapped into virtual memory

  2. Needed portions loaded lazily

  3. OS handles paging automatically

  4. Application accesses frames like memory

Result:

  • Faster processing

  • Reduced overhead
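The frame access in this scenario can be sketched as follows. The file layout is an assumption made purely for illustration (raw frames of equal size stored back to back), and struct frame_file with its three functions is hypothetical, not part of any real video library.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Assumed layout: the file is nothing but raw frames of equal size. */
struct frame_file {
    const uint8_t *base;        /* start of the mapping */
    size_t         len;         /* mapped length in bytes */
    size_t         frame_size;  /* bytes per frame */
};

/* Maps the whole file read-only. Returns 0 on success, -1 on error. */
int frame_file_open(struct frame_file *ff, const char *path, size_t frame_size)
{
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size == 0 || frame_size == 0) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    ff->base = p;
    ff->len = (size_t)st.st_size;
    ff->frame_size = frame_size;
    return 0;
}

/* Returns a pointer to frame `i`, or NULL if it is out of range. Nothing is
 * copied; pages are faulted in only when the caller reads the frame. */
const uint8_t *frame_at(const struct frame_file *ff, size_t i)
{
    if (i >= ff->len / ff->frame_size)
        return NULL;
    return ff->base + i * ff->frame_size;
}

void frame_file_close(struct frame_file *ff)
{
    munmap((void *)ff->base, ff->len);
}
```

On a 32-bit system a 4 GB file would not fit into the address space in one piece, so the same idea would have to be applied to a sliding window of the file.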

Memory-Mapped Files vs Memory-Mapped I/O

Students commonly confuse these.

Memory-Mapped Files

Map:

  • File contents into memory

Memory-Mapped I/O

Map:

  • Hardware device registers into memory

Different concepts.