In a database, a phantom read is a specific kind of isolation anomaly that occurs when a transaction executes the same query twice and sees different rows because new rows have been inserted (or deleted) by another concurrent transaction in between.

In simple terms, the “phantom” is a new row that appears out of nowhere between two similar reads, even though the transaction did not itself insert it.

What Is a Phantom Read?

A phantom read happens under these conditions:

  • Transaction T1 runs a query (for example, “find all employees in department HR”).

  • Transaction T2 inserts a new row that matches the same condition (for example, a new HR employee).

  • Transaction T1 runs the same query again and now sees one more row than before.

From T1’s perspective, the extra row is a phantom because it was not there in the first read but appears in the second.

Example

Suppose:

Table EMPLOYEE(dept, salary)

Transaction T1:

→ finds 5 rows.

Transaction T2:

Transaction T1:

→ now finds 6 rows.

The 6th row is the phantom read.

Why Phantom Read Is a Problem?

Phantom reads violate the consistency of result sets for a transaction that expects a stable view of the data.

For example, if T1 is:

  • Calculating an average salary for HR, or

  • Checking whether there are exactly N HR employees,

then seeing extra rows due to a concurrent insert can make the result incorrect or inconsistent.

Phantom reads are one of the three main read anomalies in isolation levels, along with dirty read and non-repeatable (fuzzy) read.

How Databases Handle Phantom Read

To avoid phantom reads, DBMS can:

  • Use stronger isolation levels like Repeatable Read or Serializable, which prevent new rows from appearing in the same query.

  • Use predicate locks or range locks that lock not just existing rows but also the gaps where new rows could be inserted.

  • Some systems fall back to table-level locking or special phantom-prevention mechanisms in high-isolation modes.

Why It Matters for Beginners

  • Phantom read shows that isolation is not only about existing rows; it also involves newly inserted rows.

  • It helps explain why higher isolation levels are needed when queries use ranges or conditions (for example, WHERE dept = 'HR').

Summary

A phantom read in DBMS occurs when a transaction reads a set of rows matching some condition, another transaction inserts or deletes rows that also match the condition, and the first transaction, when repeating the same read, sees a different set of rows. This anomaly breaks the assumption of a stable result set and is why stronger isolation levels and special locking techniques are needed to guarantee consistent, repeatable query results in concurrent environments.