1. Introduction

At the hardware level, storage devices such as hard disks and SSDs do not understand the concept of files. They only store information as:

  • Blocks

  • Sectors

  • Raw binary data

If users had to manage storage directly:

  • Block numbers would need to be tracked manually

  • Data organization would become difficult

  • Retrieving information would be extremely complex

To solve this problem, the operating system provides an abstraction called a file.

A file is a logical abstraction that hides the complexity of physical storage and provides a structured way to store and retrieve information.

This is one of the most important abstractions provided by an operating system.

2. What is a File?

A file is a named collection of related data stored on secondary storage, together with metadata that describes the file.

Every file consists of three essential components:

  • Name

  • Data

  • Metadata

Important Insight

A file is not merely data.

File = Data + Metadata + Access Interface

The access interface is the set of operations (open, read, write, close, and so on) through which users and programs reach the stored data.

3. Logical View vs Physical View

3.1 Logical View (User Perspective)

When users access files:

  • A text file appears as characters

  • A video file appears as frames and audio

  • An executable appears as a program

This is the high-level interpretation of a file.

3.2 Physical View (Operating System Perspective)

The operating system sees:

  • Files divided into blocks

  • Blocks distributed across storage

  • Metadata structures tracking block locations

Key Insight

Files are not necessarily stored contiguously on disk; the blocks of a single file may be scattered across the device.

The operating system uses:

  • Allocation structures

  • Indexing mechanisms

  • Metadata records

to locate file contents.
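
The scattered layout can be sketched with a toy model. The 4-byte block size, the `disk` dictionary, and the function names below are illustrative assumptions, not the layout of any real file system.

```python
BLOCK_SIZE = 4  # toy block size in bytes, chosen small so the split is visible

def store(disk, free_blocks, data):
    """Split data into blocks, place them in the given free blocks,
    and return the list of block numbers (the file's allocation record)."""
    block_list = []
    for start in range(0, len(data), BLOCK_SIZE):
        block_no = free_blocks.pop(0)
        disk[block_no] = data[start:start + BLOCK_SIZE]
        block_list.append(block_no)
    return block_list

def load(disk, block_list):
    """Reassemble file contents by following the allocation record."""
    return b"".join(disk[n] for n in block_list)

disk = {}                      # block number -> bytes stored in that block
free_blocks = [7, 2, 9, 4]     # deliberately scattered free blocks
blocks = store(disk, free_blocks, b"Hello World")
print(blocks)                  # [7, 2, 9]
print(load(disk, blocks))      # b'Hello World'
```

The user sees one 11-byte file; the toy "operating system" sees three blocks at unrelated positions plus the list that ties them together.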

4. File Structure

Different operating systems and applications may interpret file contents differently.

4.1 Byte Stream Model

Used in:

  • UNIX

  • Linux

A file is treated as a sequence of bytes.

Example:

Hello World

is stored simply as bytes.

Characteristics

  • OS does not enforce structure

  • Applications interpret content

  • Highly flexible

Key Insight

Modern operating systems primarily use the byte-stream model.
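
A minimal sketch of the byte-stream view, assuming ASCII text and a temporary file (the file name `hello.txt` is only for illustration): the file holds nothing but the byte values, and any meaning is supplied by the program that reads them.

```python
import os
import tempfile

def stored_bytes(text):
    """Write text to a temporary file and return the raw bytes it contains."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "hello.txt")
        with open(path, "wb") as f:
            f.write(text.encode("ascii"))
        with open(path, "rb") as f:
            return f.read()

raw = stored_bytes("Hello World")
print(list(raw))   # [72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100]
print(len(raw))    # 11
```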

4.2 Record-Based Files

A file consists of records.

Example:

[Name, Roll Number, Marks]

Each record has a predefined structure.
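
One possible way to picture fixed-size records is with Python's `struct` module. The field widths below (20-byte name, 32-bit roll number, 32-bit float marks) and the sample students are assumptions made for the sketch.

```python
import struct

# Assumed layout: name (20 bytes), roll number (uint32), marks (float32), no padding
RECORD = struct.Struct("<20sIf")

def pack_record(name, roll, marks):
    """Encode one [Name, Roll Number, Marks] record as fixed-size bytes."""
    return RECORD.pack(name.encode("utf-8"), roll, marks)

def read_record(data, index):
    """Return record number `index` from a buffer of fixed-size records."""
    name, roll, marks = RECORD.unpack_from(data, index * RECORD.size)
    return name.rstrip(b"\x00").decode("utf-8"), roll, marks

data = pack_record("Asha", 17, 91.5) + pack_record("Ravi", 42, 78.0)
print(RECORD.size)            # 28
print(read_record(data, 1))   # ('Ravi', 42, 78.0)
```

Because every record has the same size, record number i starts at byte i × 28, so any record can be reached without reading the ones before it.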

4.3 Indexed Files

Files maintain indexes for faster retrieval.

Commonly used in:

  • Databases

  • Large storage systems
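
A simplified sketch of the idea, assuming a file of `key,value` lines and using `io.BytesIO` as an in-memory stand-in for a real file: the index records where each key's line begins, so a lookup can seek directly to it.

```python
import io

def build_index(f):
    """Scan the file once and record the byte offset of each key's line."""
    index = {}
    f.seek(0)
    while True:
        offset = f.tell()
        line = f.readline()
        if not line:
            break
        key = line.split(b",", 1)[0]
        index[key] = offset
    return index

def lookup(f, index, key):
    """Jump straight to the record for `key` instead of scanning the file."""
    f.seek(index[key])
    return f.readline().rstrip(b"\n")

f = io.BytesIO(b"101,Asha\n205,Ravi\n309,Meera\n")
index = build_index(f)
print(index)                      # {b'101': 0, b'205': 9, b'309': 18}
print(lookup(f, index, b"205"))   # b'205,Ravi'
```

Real database indexes use structures such as B-trees rather than a dictionary, but the principle of mapping a key to a position is the same.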

5. File Attributes (Metadata)

Metadata is information about the file rather than the file content itself.

The operating system stores metadata in structures such as:

  • File Control Blocks (FCB)

  • Inodes (UNIX/Linux)

Common File Attributes

1. Name

Human-readable file identifier.

Example:

report.txt

2. Identifier

Unique internal identifier used by the OS.

Example:

inode number

3. Location

Information about where file blocks are stored on disk.

4. Size

Length of the file measured in bytes.

5. Permissions

Defines allowed operations:

  • Read

  • Write

  • Execute

6. Timestamps

Stores:

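  • Creation time (not recorded by every file system; the classic UNIX inode records the time of the last status change instead)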

  • Last modification time

  • Last access time

7. Ownership Information

Stores:

  • User ID

  • Group ID

Important Insight

The operating system does not search entire disks to find files.

Instead, it uses metadata structures for efficient lookup.
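
Most of the attributes listed above can be inspected from a program. The sketch below uses Python's `os.stat`; the `describe` helper and the sample file contents are assumptions for illustration, and the exact permission string depends on the system's default settings.

```python
import os
import stat
import tempfile

def describe(path):
    """Collect the common attributes of a file from its metadata record."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "identifier": st.st_ino,                    # inode number on UNIX-like systems
        "size": st.st_size,                         # length in bytes
        "permissions": stat.filemode(st.st_mode),   # for example '-rw-r--r--'
        "modified": st.st_mtime,                    # seconds since the epoch
        "accessed": st.st_atime,
        "owner": (st.st_uid, st.st_gid),            # user ID, group ID
    }

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "report.txt")
    with open(path, "w") as f:
        f.write("quarterly numbers")
    info = describe(path)

print(info["name"], info["size"])   # report.txt 17
```

Note that none of this information comes from the file's contents; it is all read from the metadata structure.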

6. File Operations

The operating system provides operations that allow programs to interact with files.

Common operations include:

  • Create

  • Open

  • Read

  • Write

  • Append

  • Seek

  • Close

  • Delete
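
The operations above map closely onto the low-level calls in Python's `os` module. The following is a sketch only; the file name `notes.txt` and the helper name `demo_operations` are assumptions.

```python
import os
import tempfile

def demo_operations(path):
    """Exercise create, write, append, open, seek, read, close, and delete."""
    # Create and write
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    os.write(fd, b"first line\n")
    os.close(fd)

    # Append: O_APPEND makes every write go to the end of the file
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    os.write(fd, b"second line\n")
    os.close(fd)

    # Open, seek past the first line, read the rest
    fd = os.open(path, os.O_RDONLY)
    os.lseek(fd, len(b"first line\n"), os.SEEK_SET)
    tail = os.read(fd, 100)
    os.close(fd)

    # Delete
    os.remove(path)
    return tail

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "notes.txt")
    result = demo_operations(target)
    still_exists = os.path.exists(target)

print(result)         # b'second line\n'
print(still_exists)   # False
```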

Internal Execution Flow

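When a program that has already opened a file executes:

read(file, buffer, size);

the operating system performs:

  1. Use the file handle to find the entry in the open file table (the directory lookup was done once, when the file was opened)

  2. Use the file's metadata (inode/FCB), loading it from disk if it is not already in memory

  3. Use current file pointer

  4. Map logical offset to physical block

  5. Read data from storage (or from an in-memory cache if the block is already there)

  6. Copy data into user buffer

  7. Advance the file pointer by the number of bytes read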
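
The step that maps a logical offset to a physical block is simple arithmetic. Here is a minimal sketch, assuming 4096-byte blocks and a hypothetical per-file `block_table` that lists the physical block holding each logical block.

```python
BLOCK_SIZE = 4096   # assumed block size in bytes

def map_offset(offset, block_table):
    """Translate a logical byte offset into (physical block, offset within block)."""
    logical_block = offset // BLOCK_SIZE
    within = offset % BLOCK_SIZE
    return block_table[logical_block], within

block_table = [120, 37, 512]   # hypothetical: where logical blocks 0, 1, 2 live
print(map_offset(0, block_table))      # (120, 0)
print(map_offset(5000, block_table))   # (37, 904)
print(map_offset(8192, block_table))   # (512, 0)
```

Byte 5000 falls in logical block 1 (5000 // 4096), 904 bytes in, and the table says logical block 1 is stored in physical block 37.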

Key Insight

File operations involve several layers:

User Program
      ↓
System Call
      ↓
Kernel
      ↓
File System
      ↓
Storage Device

7. File Types

Although the operating system often treats files uniformly, different file types help applications interpret contents correctly.

7.1 Text Files

Contain human-readable characters.

Examples:

.txt
.csv
.log

7.2 Binary Files

Contain data in an application-specific binary encoding that is meant to be interpreted by programs rather than read as text.

Examples:

.exe
.class
.bin
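
The difference between text and binary content can be seen by storing the same number both ways. This is a sketch; the value and the 4-byte little-endian encoding are arbitrary choices for illustration.

```python
import struct

value = 1234567

as_text = str(value).encode("ascii")    # what a text file would hold: one byte per digit
as_binary = struct.pack("<I", value)    # 4-byte little-endian unsigned integer

print(as_text, len(as_text))              # b'1234567' 7
print(as_binary.hex(), len(as_binary))    # 87d61200 4
print(struct.unpack("<I", as_binary)[0])  # 1234567
```

To the operating system both are just bytes; only the application knows which interpretation applies.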

7.3 Executable Files

Contain machine instructions that can be loaded and executed.

7.4 Special Files

Represent devices and communication channels rather than stored data. Used extensively in UNIX systems.

Examples:

  • Device files

  • Pipes

  • Sockets

Key Insight

UNIX follows the philosophy:

Everything is a File
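
One way to see this philosophy in practice is a pipe: it has no file on disk behind it, yet it is read and written with the same calls used for regular files. The helper name `through_pipe` is an assumption for this sketch.

```python
import os

def through_pipe(message):
    """Send bytes through a pipe using the same calls used for regular files."""
    read_fd, write_fd = os.pipe()   # two file descriptors, no disk file behind them
    try:
        os.write(write_fd, message)
        os.close(write_fd)
        write_fd = None             # mark as closed so it is not closed twice
        return os.read(read_fd, len(message))
    finally:
        if write_fd is not None:
            os.close(write_fd)
        os.close(read_fd)

print(through_pipe(b"ping"))   # b'ping'
```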

8. File Control Block (FCB) / inode

The operating system maintains a data structure for every file.

This structure is called:

  • File Control Block (FCB)

  • inode (in UNIX/Linux)

Contents of FCB/inode

  • File identifier

  • Size

  • Permissions

  • Disk block locations

  • Owner information

  • Timestamps

Important Insight

Directory → Stores File Names

inode → Stores Metadata

Directories map file names to inode numbers.
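
The separation between names and metadata becomes visible when one file is given two names. The sketch below assumes a file system that supports hard links (typical on UNIX/Linux); the file names are illustrative.

```python
import os
import tempfile

def link_demo(directory):
    """Give one file two names and report the inode each name maps to."""
    original = os.path.join(directory, "report.txt")
    alias = os.path.join(directory, "report-alias.txt")
    with open(original, "w") as f:
        f.write("data")
    os.link(original, alias)        # add a second directory entry (hard link)
    names_to_inodes = {
        name: os.stat(os.path.join(directory, name)).st_ino
        for name in sorted(os.listdir(directory))
    }
    return names_to_inodes, os.stat(original).st_nlink

with tempfile.TemporaryDirectory() as d:
    names_to_inodes, link_count = link_demo(d)

# Two names, one distinct inode number, link count of 2
print(len(names_to_inodes), len(set(names_to_inodes.values())), link_count)   # 2 1 2
```

Both directory entries point at the same inode, so the size, permissions, and data blocks are shared; only the names differ.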

9. File Pointer and Open File Table

When a file is opened:

  • The OS creates an entry in the open file table

  • A file pointer is maintained

The file pointer indicates the current position within the file.

Example

Read 100 bytes

After reading:

File Pointer = File Pointer + 100

Because the pointer advances automatically, successive reads return consecutive parts of the file. This allows sequential file access.
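
The movement of the file pointer can be observed directly. This is a sketch assuming a 250-byte regular file; `os.lseek(fd, 0, os.SEEK_CUR)` reports the current position without moving it.

```python
import os
import tempfile

def pointer_after_reads(path):
    """Return the file pointer before and after two sequential 100-byte reads."""
    fd = os.open(path, os.O_RDONLY)
    try:
        positions = [os.lseek(fd, 0, os.SEEK_CUR)]       # position right after open
        os.read(fd, 100)
        positions.append(os.lseek(fd, 0, os.SEEK_CUR))   # after the first read
        os.read(fd, 100)
        positions.append(os.lseek(fd, 0, os.SEEK_CUR))   # after the second read
        return positions
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "data.bin")
    with open(path, "wb") as f:
        f.write(bytes(250))          # 250 zero bytes
    positions = pointer_after_reads(path)

print(positions)   # [0, 100, 200]
```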

10. Why Files Are Important

The file abstraction provides several critical benefits.

10.1 Persistence

Data remains available even after program termination.

10.2 Sharing

Multiple users or processes can access the same file.

10.3 Protection

Access permissions prevent unauthorized usage.
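
A small sketch of setting permissions, assuming a POSIX system (other platforms may map the permission bits differently); the file name and helper are illustrative.

```python
import os
import stat
import tempfile

def make_read_only(path):
    """Restrict a file to owner-read-only and return the resulting mode string."""
    os.chmod(path, stat.S_IRUSR)    # 0o400: the owner may read, nobody may write
    return stat.filemode(os.stat(path).st_mode)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "secret.txt")
    with open(path, "w") as f:
        f.write("classified")
    mode = make_read_only(path)

print(mode)   # -r-------- on POSIX systems
```

The permission bits are part of the file's metadata, and the operating system checks them each time a process tries to open the file.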

10.4 Abstraction

Users do not need to know:

  • Physical disk layout

  • Block locations

  • Allocation details

The operating system manages these complexities automatically.

Files provide a logical abstraction over raw storage, enabling persistence, protection, sharing, and efficient data management without requiring users to understand physical disk organization.