Identifying entities

Last updated: Jun 9, 2026

Author: Nakshatra Verma

Entity identification is the first step in Low-Level Design (LLD). Before you draw an arrow in a class diagram, pick a design pattern, or write any execution logic, you need to identify the foundational components of your system. It is the process of translating ambiguous, real-world requirement descriptions into concrete, structured software building blocks.

Key ideas:

It defines what exists in your system domain before you decide how those pieces interact.
It helps you avoid messy, unorganized code by grouping related properties and behaviors together.
It forms the direct foundation for your UML Class Diagrams.

Real-world analogies:

In a School System, you cannot schedule a class without first identifying the Teacher, the Student, and the Room.
In an E-Commerce System, you cannot process a payment without establishing the Customer, the Order, and the Item.

The Noun-Extraction Technique

The most structured way to find your entities is by analyzing your requirements text like a detective, specifically hunting down the nouns. When an interviewer gives you a problem statement, or when a business analyst hands you a feature specification, it will be written in plain natural language.

When you read a problem statement, highlight every noun. However, not every noun becomes a full software entity. You must filter them into three distinct buckets:

One: Core Entities

Nouns that have a distinct lifecycle, change states over time, and require a unique identity. For example, a User or an Account. A user is born when they register; they can change status from active to suspended, and they must be distinct from every other user in the system.

Two: Attributes

Nouns that simply describe or belong to a core entity. They cannot stand alone. Examples include Age, EmailAddress, or Price. A price cannot hang suspended in the air; it must belong to a Product or a Ticket.

Three: Irrelevant Context

Nouns mentioned in the description that your software does not need to manage or track. For instance, if you are designing a digital bookstore, the physical "shipping box" or the "delivery truck" might be mentioned in the requirement narrative, but unless you are building the actual fleet logistics module, that truck is completely out of scope.
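As a rough illustration of the three buckets, the sketch below models a hypothetical bookstore requirement. The names Product, productId, priceInCents, and discontinue are assumptions chosen for the example, not part of any specific problem statement. The product is a core entity, its price is an attribute, and the delivery truck from the narrative is deliberately left unmodeled.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Core entity: has a unique identity and a lifecycle (Active -> Discontinued).
class Product {
public:
    enum class Status { Active, Discontinued };

    Product(std::string productId, long priceInCents)
        : productId_(std::move(productId)), priceInCents_(priceInCents) {
        if (productId_.empty()) {
            throw std::invalid_argument("productId must not be empty");
        }
        if (priceInCents_ < 0) {
            throw std::invalid_argument("price must not be negative");
        }
    }

    const std::string& productId() const { return productId_; }

    // Attribute: the price has no life of its own, it lives inside the entity.
    long priceInCents() const { return priceInCents_; }

    Status status() const { return status_; }

    // Lifecycle transition owned by the entity itself.
    void discontinue() { status_ = Status::Discontinued; }

private:
    std::string productId_;
    long priceInCents_;
    Status status_ = Status::Active;
};

// Irrelevant context: there is intentionally no DeliveryTruck or ShippingBox
// class here, because this hypothetical module does not manage logistics.
```

If the scope later grew to include fleet logistics, the truck would move from the third bucket into the first and get its own class.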

Categorizing Structural Blocks

Once you filter your list of nouns and throw away the noise, you must categorize them based on how they behave inside your system architecture. Not all objects are created equal. In a rich object-oriented ecosystem, objects play distinct roles.

Core Entities

An entity is an object defined primarily by its unique identity rather than its data values. Even if two entities share identical fields across every single property, they remain completely separate if their unique identifiers differ. They are mutable; their values change over time, but their underlying identity stays completely constant.

Example: Two different Customer accounts named "John Doe" who happen to live in the same city are unique because they hold different customerId values. If John Doe updates his phone number, he is still the same entity.
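A minimal sketch of identity-based equality, assuming a Customer class with hypothetical name and phone fields and a changePhone method. Only customerId comes from the example above; everything else is illustrative.

```cpp
#include <string>
#include <utility>

class Customer {
public:
    Customer(std::string customerId, std::string name, std::string phone)
        : customerId_(std::move(customerId)),
          name_(std::move(name)),
          phone_(std::move(phone)) {}

    const std::string& customerId() const { return customerId_; }
    const std::string& name() const { return name_; }
    const std::string& phone() const { return phone_; }

    // Mutable data: the phone number can change without changing who this is.
    void changePhone(std::string newPhone) { phone_ = std::move(newPhone); }

    // Equality compares identity only, never the descriptive fields.
    friend bool operator==(const Customer& a, const Customer& b) {
        return a.customerId_ == b.customerId_;
    }
    friend bool operator!=(const Customer& a, const Customer& b) {
        return !(a == b);
    }

private:
    std::string customerId_;  // no setter: identity stays constant
    std::string name_;
    std::string phone_;
};
```

With this definition, two customers named "John Doe" with different IDs compare as unequal, while a customer compared against an older copy of itself with a different phone number still compares as equal.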

Value Objects

A value object is defined entirely by its internal values. It does not possess a unique system ID. If two value objects contain identical data, they are considered completely equal. Value objects should always be immutable (read-only) to prevent accidental side effects across your system.

Example: An Address containing "123 Main St, New York" or a Money amount containing "100 USD". If you have two different order invoices that total "100 USD", you do not care if it is the "same" hundred dollars; you only care about the value. If you want to change the address of a user, you do not modify the string inside the old address; you create a brand-new Address object and swap it in.
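The replace-instead-of-modify idea can be sketched as follows. The street and city fields, the User class, and the relocate method are assumptions made for illustration.

```cpp
#include <string>
#include <utility>

// Value object: no ID, no setters, equality by value.
class Address {
public:
    Address(std::string street, std::string city)
        : street_(std::move(street)), city_(std::move(city)) {}

    const std::string& street() const { return street_; }
    const std::string& city() const { return city_; }

    friend bool operator==(const Address& a, const Address& b) {
        return a.street_ == b.street_ && a.city_ == b.city_;
    }
    friend bool operator!=(const Address& a, const Address& b) {
        return !(a == b);
    }

private:
    std::string street_;
    std::string city_;
};

// Entity that holds a value object.
class User {
public:
    User(std::string userId, Address address)
        : userId_(std::move(userId)), address_(std::move(address)) {}

    const std::string& userId() const { return userId_; }
    const Address& address() const { return address_; }

    // The old Address is never edited; a new one is swapped in.
    void relocate(Address newAddress) { address_ = std::move(newAddress); }

private:
    std::string userId_;
    Address address_;
};
```

Because Address exposes no mutators, any code holding a copy of the old address keeps seeing the old values after the user relocates.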

Actors

Actors are entities that represent active roles interacting with the system. They are the triggers of behavior. Examples include Admin, Customer, Guest, or SupportAgent. Identifying actors helps you design role-based access control and partition system capabilities appropriately.
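One way actors can feed into role-based access control is a simple permission check. The sketch below is hypothetical: the Action values and the specific rules in isAllowed are invented for the example, and a real system would derive them from its own requirements.

```cpp
// Roles taken from the actor examples above.
enum class Role { Guest, Customer, SupportAgent, Admin };

// Hypothetical capabilities of an e-commerce style system.
enum class Action { BrowseCatalog, PlaceOrder, IssueRefund, ManageUsers };

// Returns true when the given role may perform the given action.
// The rules are illustrative assumptions, not a recommended policy.
inline bool isAllowed(Role role, Action action) {
    switch (action) {
        case Action::BrowseCatalog:
            return true;  // every actor may browse
        case Action::PlaceOrder:
            return role == Role::Customer;
        case Action::IssueRefund:
            return role == Role::SupportAgent || role == Role::Admin;
        case Action::ManageUsers:
            return role == Role::Admin;
    }
    return false;  // unknown action: deny by default
}
```

Keeping the rules in one function makes it easy to see at a glance which actor can trigger which behavior.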

Utilities and Services

These are operational objects that do not represent physical things but execute vital business logic or bridge system boundaries. They are usually stateless or manage system resources rather than domain data. Examples include PaymentProcessor, NotificationEngine, DatabaseConnectionPool, or CryptographyUtility.
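A service of this kind might look like the sketch below. PaymentProcessor is named in the text above, but its charge signature, the CheckoutService class, and the AlwaysApprovePaymentProcessor stand-in are assumptions for illustration only.

```cpp
#include <string>

// Boundary to an external payment system. It holds no domain data.
class PaymentProcessor {
public:
    virtual ~PaymentProcessor() = default;
    virtual bool charge(const std::string& accountId, long amountInCents) = 0;
};

// Stand-in implementation: approves any positive amount for a non-empty
// account. A real implementation would call a payment gateway.
class AlwaysApprovePaymentProcessor : public PaymentProcessor {
public:
    bool charge(const std::string& accountId, long amountInCents) override {
        return !accountId.empty() && amountInCents > 0;
    }
};

// Stateless service: it coordinates work but stores no order or customer data.
class CheckoutService {
public:
    explicit CheckoutService(PaymentProcessor& processor)
        : processor_(processor) {}

    bool payForOrder(const std::string& accountId, long orderTotalInCents) {
        return processor_.charge(accountId, orderTotalInCents);
    }

private:
    PaymentProcessor& processor_;  // collaborator, not domain state
};
```

Depending on the abstract PaymentProcessor rather than a concrete class lets tests swap in a stand-in without touching the service.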

Step-by-Step Modeling Example

Let us break down a practical example to see exactly how an expert approaches a raw problem description. Consider this simple requirement snippet for a Library System:

"A Library allows Members to search for Books. Each Book has a title, an author, and a unique ISBN number. The system tracks whether a book is available or loaned out. When a book is checked out, a Loan Record is generated with a due date."

Let us execute the entire workflow step-by-step:

Step 1: Extract Nouns

We read the text and underline the structural words: Library, Members, Books, title, author, ISBN, system, Loan Record, due date.

Step 2: Filter and Evaluate

Library/system: These represent the global architectural boundary or the system's overarching context. They do not represent a small, repeatable data entity inside the application memory, but rather the orchestrator.
Members: A member is a human actor who interacts with our application. Each member has a distinct lifecycle (activation, suspension, termination) and a unique identifier (for example, a memberCardId). This is a Core Entity.
Books: A book has an identity, moves through various distinct states, and needs to be tracked individually. This is a Core Entity.
title/author: These are descriptions. They cannot exist without a book. Therefore, they are Attributes belonging inside the Book class.
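ISBN: This is a special noun. It is an attribute, but in this simplified model it also serves as the unique identifier for the Book entity. This assumes the library holds one copy of each title; an ISBN identifies an edition rather than a physical copy, so a library with several copies of the same book would need a separate copy identifier.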
Loan Record: This tracks an agreement over time. It has a state (Active, Overdue, Returned), a creation timestamp, and a specific identifier. This is a Core Entity.
Due date: This is a characteristic of the loan period. It belongs to the loan transaction, making it an Attribute inside the Loan Record.

Step 3: Map the Conceptual Structure

Before moving to code, visualize how these pieces anchor together. A Member entity will interact with a Book entity, and that interaction will generate a LoanRecord entity. The values and attributes sit safely tucked inside these objects.

Example Design in C++

The identified structures translate into C++ classes that apply strict encapsulation to protect the integrity of the domain state.
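A minimal sketch of what such a design might look like for the Library example. Only markAsLoaned and the private isbn, title, and status members are named in this article; the BookStatus and LoanState enums, the methods markAsReturned, markOverdue, and close, and the plain-string due date are assumptions made to keep the example short.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

enum class BookStatus { Available, Loaned };
enum class LoanState { Active, Overdue, Returned };

class Book {
public:
    Book(std::string isbnValue, std::string titleValue, std::string authorValue)
        : isbn(std::move(isbnValue)),
          title(std::move(titleValue)),
          author(std::move(authorValue)) {
        // Invariant: a Book can never exist with missing core data.
        if (isbn.empty() || title.empty() || author.empty()) {
            throw std::invalid_argument("Book requires a non-empty ISBN, title, and author");
        }
    }

    const std::string& getIsbn() const { return isbn; }
    const std::string& getTitle() const { return title; }
    const std::string& getAuthor() const { return author; }
    bool isAvailable() const { return status == BookStatus::Available; }

    // Domain methods instead of a generic setStatus().
    void markAsLoaned() {
        if (status == BookStatus::Loaned) throw std::logic_error("Book is already loaned out");
        status = BookStatus::Loaned;
    }
    void markAsReturned() {
        if (status == BookStatus::Available) throw std::logic_error("Book is not on loan");
        status = BookStatus::Available;
    }

private:
    std::string isbn;
    std::string title;
    std::string author;
    BookStatus status = BookStatus::Available;
};

class LoanRecord {
public:
    // dueDateValue is assumed to be an ISO-8601 date string such as "2026-07-01".
    LoanRecord(std::string loanIdValue, std::string isbnValue,
               std::string memberCardIdValue, std::string dueDateValue)
        : loanId(std::move(loanIdValue)),
          isbn(std::move(isbnValue)),
          memberCardId(std::move(memberCardIdValue)),
          dueDate(std::move(dueDateValue)) {
        if (loanId.empty() || isbn.empty() || memberCardId.empty() || dueDate.empty()) {
            throw std::invalid_argument("LoanRecord fields must not be empty");
        }
    }

    const std::string& getLoanId() const { return loanId; }
    const std::string& getIsbn() const { return isbn; }
    const std::string& getMemberCardId() const { return memberCardId; }
    const std::string& getDueDate() const { return dueDate; }
    LoanState getState() const { return state; }

    void markOverdue() {
        if (state != LoanState::Active) throw std::logic_error("Only an active loan can become overdue");
        state = LoanState::Overdue;
    }
    void close() {
        if (state == LoanState::Returned) throw std::logic_error("Loan is already closed");
        state = LoanState::Returned;
    }

private:
    std::string loanId;
    std::string isbn;
    std::string memberCardId;
    std::string dueDate;
    LoanState state = LoanState::Active;
};
```

The LoanRecord stores the book's ISBN and the member's card ID rather than pointers to the objects, which keeps the three entities loosely coupled in this sketch.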

System Usage Demonstration

Key Highlights of the Code

Encapsulation: The variables isbn, title, and status are strictly private. No external class can bypass the rules and modify them directly.
Invariant Protection: The constructor explicitly rejects empty values. An entity must never exist in an invalid state.
Domain-Driven Behavior: We do not write a generic setStatus() method. Instead, we write explicit, meaningful domain methods like markAsLoaned(). This mirrors real-world library operations.

Common Pitfalls in Entity Modeling

When beginning software architecture design, it is incredibly easy to fall into bad habits that corrupt the domain model. Keep an eye out for these three dangerous anti-patterns:

Pitfall One: Anemic Domain Models

This happens when your entities are designed as simple, brainless data buckets containing nothing but private variables and public getters and setters. If all your real validation and computational logic lives in giant, separate service classes (like a BookService containing endless lines of sequential code), you completely miss out on object-oriented encapsulation. Keep state validation and immediate behaviors localized inside the entity they belong to.
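The contrast can be sketched with a hypothetical Member entity. The loan limit of three, the field names, and the methods recordLoan, recordReturn, suspend, and canBorrow are all assumptions made for the example.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Anemic style, shown for contrast: every field is open, so any caller can
// put the object into a state that breaks the library's rules.
struct AnemicMember {
    std::string memberCardId;
    bool suspended = false;
    int activeLoans = 0;
};

// Richer style: the entity validates and guards its own state.
class Member {
public:
    static constexpr int kMaxActiveLoans = 3;  // hypothetical borrowing limit

    explicit Member(std::string memberCardId)
        : memberCardId_(std::move(memberCardId)) {
        if (memberCardId_.empty()) {
            throw std::invalid_argument("memberCardId must not be empty");
        }
    }

    const std::string& memberCardId() const { return memberCardId_; }

    bool canBorrow() const {
        return !suspended_ && activeLoans_ < kMaxActiveLoans;
    }

    void recordLoan() {
        if (!canBorrow()) throw std::logic_error("Member may not borrow right now");
        ++activeLoans_;
    }

    void recordReturn() {
        if (activeLoans_ == 0) throw std::logic_error("Member has no active loans");
        --activeLoans_;
    }

    void suspend() { suspended_ = true; }

private:
    std::string memberCardId_;
    bool suspended_ = false;
    int activeLoans_ = 0;
};
```

With AnemicMember, the borrowing rule would have to be re-checked by every service that touches the struct; with Member, it is enforced in one place.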

Pitfall Two: God Classes

Avoid creating a single, massive class that tries to know everything and execute everything. For example, a Library class that tracks books, authenticates members, processes late-fee credit card payments, runs search queries, and sends email notifications. A massive class like this is incredibly fragile, hard to test, and breaks the Single Responsibility Principle. Break it down into highly focused, isolated entities and utility engines.
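As a rough sketch of the split, two of the Library responsibilities listed above can be pulled into separate focused classes. The names BookCatalog and LateFeePolicy, their methods, and the flat per-day fee are assumptions for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Tracks which titles exist and supports search.
// Knows nothing about fees, payments, or notifications.
class BookCatalog {
public:
    void addTitle(const std::string& isbn, const std::string& title) {
        titlesByIsbn_[isbn] = title;
    }

    // Returns the ISBNs of all titles containing the given fragment.
    std::vector<std::string> searchByTitle(const std::string& fragment) const {
        std::vector<std::string> matches;
        for (const auto& entry : titlesByIsbn_) {
            if (entry.second.find(fragment) != std::string::npos) {
                matches.push_back(entry.first);
            }
        }
        return matches;
    }

private:
    std::map<std::string, std::string> titlesByIsbn_;
};

// Computes late fees. Knows nothing about the catalog or members.
class LateFeePolicy {
public:
    explicit LateFeePolicy(long feePerDayInCents)
        : feePerDayInCents_(feePerDayInCents) {}

    long feeFor(int daysOverdue) const {
        return daysOverdue > 0 ? daysOverdue * feePerDayInCents_ : 0;
    }

private:
    long feePerDayInCents_;
};
```

Each class can now be tested and changed on its own; a change to the fee rules never requires touching search code.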

Pitfall Three: Over-Identifying Everything

Do not give a unique system ID to minor data properties. If an object can be completely replaced by another object with identical values without the system caring about tracking the difference, it does not deserve an ID. If you assign database IDs to every address string, coordinate point, or price tag, you add unnecessary storage, lookup, and bookkeeping overhead along with code clutter. Keep those as simple, immutable Value Objects.

Summary

Entity identification forms the structural foundation of a low-level design:

It ensures you focus on defining the core objects, boundaries, and internal rules before writing active code workflows.
The Noun-Extraction process is a highly systematic way to isolate domain objects from minor attributes and out-of-scope context.
Real entities always maintain a distinct, unique identity and control their own state changes through highly encapsulated business methods.