1. What is an AVL Tree?
Definition
An AVL Tree is a self-balancing Binary Search Tree (BST) in which the difference between the heights of the left and right subtrees (called the balance factor) of any node is at most 1.
It automatically maintains balance after every insertion and deletion operation.
Key Properties
It is a Binary Search Tree (BST)
Maintains height balance
Balance factor of each node is -1, 0, or +1
Ensures O(log n) time complexity for operations
Balance Factor
The balance factor of a node is:
Balance Factor = Height of Left Subtree − Height of Right Subtree
+1 → Left heavy
0 → Balanced
-1 → Right heavy
If the balance factor goes beyond this range, the tree becomes unbalanced and requires rebalancing.
Rotations in AVL Tree
To restore balance, AVL Trees use rotations:
Left Rotation (RR Case)
Right Rotation (LL Case)
Left-Right Rotation (LR Case)
Right-Left Rotation (RL Case)
These rotations help maintain the height-balanced property.
Time Complexity
Search, insertion, and deletion: O(log n) in both the average and worst case
Example
During insertion, if a node becomes unbalanced (balance factor < -1 or > +1), appropriate rotations are applied to rebalance the tree.
2. What are AVL Tree Rotations?
Definition
AVL Tree rotations are operations used to rebalance an AVL Tree when it becomes unbalanced after insertion or deletion.
They restructure the tree while maintaining the Binary Search Tree (BST) property.
Types of Rotations
There are four types of AVL rotations:
1. Left Rotation (RR Case)
Applied when the tree is right-heavy (imbalance occurs in the right subtree of the right child).
Condition
Balance factor < -1 and new node inserted in right subtree of right child.
Operation
The right child becomes the new root, and the original root becomes its left child.
2. Right Rotation (LL Case)
Applied when the tree is left-heavy (imbalance occurs in the left subtree of the left child).
Condition
Balance factor > +1 and new node inserted in left subtree of left child.
Operation
The left child becomes the new root, and the original root becomes its right child.
3. Left-Right Rotation (LR Case)
Applied when the tree is left-heavy, but insertion occurs in the right subtree of the left child.
Condition
Balance factor > +1 and new node inserted in right subtree of left child.
Operation
First perform left rotation on left child
Then perform right rotation on the root
4. Right-Left Rotation (RL Case)
Applied when the tree is right-heavy, but insertion occurs in the left subtree of the right child.
Condition
Balance factor < -1 and new node inserted in left subtree of right child.
Operation
First perform right rotation on right child
Then perform left rotation on the root
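The four cases above can be sketched in code. This is a minimal AVL insertion sketch (the helper names are illustrative, not from any particular library): single rotations handle the LL and RR cases, and the LR/RL cases compose two single rotations exactly as described.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1

def height(n):
    return n.height if n else 0

def balance(n):
    return height(n.left) - height(n.right) if n else 0

def update(n):
    n.height = 1 + max(height(n.left), height(n.right))

def rotate_right(y):          # LL case: left child becomes the new root
    x = y.left
    y.left = x.right
    x.right = y
    update(y); update(x)
    return x

def rotate_left(x):           # RR case: right child becomes the new root
    y = x.right
    x.right = y.left
    y.left = x
    update(x); update(y)
    return y

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    update(root)
    b = balance(root)
    if b > 1 and key < root.left.key:       # LL: single right rotation
        return rotate_right(root)
    if b < -1 and key > root.right.key:     # RR: single left rotation
        return rotate_left(root)
    if b > 1 and key > root.left.key:       # LR: left on child, right on root
        root.left = rotate_left(root.left)
        return rotate_right(root)
    if b < -1 and key < root.right.key:     # RL: right on child, left on root
        root.right = rotate_right(root.right)
        return rotate_left(root)
    return root
```

Inserting 10, 20, 30, 40, 50, 25 in order exercises all four cases and leaves 30 at the root with every balance factor in {-1, 0, +1}.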
3. What is a Red-Black Tree?
Definition
A Red-Black Tree is a self-balancing Binary Search Tree (BST) in which each node has an extra attribute called color (Red or Black).
The tree maintains balance using a set of rules, ensuring that operations like search, insertion, and deletion take O(log n) time.
Properties of Red-Black Tree
A Red-Black Tree follows these rules:
Every node is either Red or Black
The root node is always Black
No two consecutive red nodes are allowed (no red node has a red parent or child)
Every path from a node to its descendant NULL nodes has the same number of black nodes (black height)
All leaf nodes (NULL nodes) are considered Black
Key Characteristics
It is a Binary Search Tree (BST)
Maintains approximate balance (not strictly balanced like AVL)
Uses color properties instead of strict height balancing
Guarantees O(log n) time complexity
Time Complexity
Search, insertion, and deletion: O(log n) in the worst case
4. What are the Properties of Red-Black Trees?
Definition
A Red-Black Tree follows a set of properties that ensure the tree remains balanced, allowing operations to run in O(log n) time.
Properties
Node Color Property
Every node is either Red or Black.
Root Property
The root node is always Black.
Red Property
A Red node cannot have a Red parent or child (no two consecutive Red nodes).
Black Height Property
Every path from a node to its descendant NULL nodes contains the same number of Black nodes.
Leaf Property
All leaf nodes (NULL nodes) are considered Black.
Key Characteristics
These properties ensure the tree remains balanced
Prevent the tree from becoming skewed
Maintain logarithmic height
Enable efficient search, insertion, and deletion operations
6. What is a B-Tree?
Definition
A B-Tree is a self-balancing multi-way search tree in which each node can have more than two children.
It is designed to efficiently store and manage large amounts of data, especially on disk, by keeping the tree height small.
Key Properties
A node can have multiple keys and multiple children
Keys within a node are sorted
All leaf nodes are at the same level
The tree remains balanced after insertion and deletion
Each node (except root) has a minimum and maximum number of keys
Order of B-Tree
A B-Tree of order m has:
Maximum m children
Maximum m − 1 keys
Minimum ⌈m/2⌉ children (except root)
Minimum ⌈m/2⌉ − 1 keys
Key Characteristics
It is a balanced tree
Has low height, reducing disk access
Suitable for large datasets and external storage
Supports efficient search, insertion, and deletion
Time Complexity
Search, insertion, and deletion: O(log n)
Applications
Used in databases and file systems
Helps in indexing large data efficiently
Minimizes disk reads and writes
7. Where are B-Trees Used in Databases?
Definition
In databases, B-Trees are primarily used for indexing, which allows data to be searched, inserted, and deleted efficiently without scanning the entire table.
Main Uses of B-Trees in Databases
1. Indexing
B-Trees are used to create indexes on database tables.
They store keys in sorted order
Enable fast searching using logarithmic time
Avoid full table scans
Example: Searching a record by ID or username
2. Primary Index (Clustered Index)
Used to store data in sorted order based on primary key
The actual table data is organized using a B-Tree
Improves range queries and sequential access
3. Secondary Index (Non-Clustered Index)
Stores index keys separately from actual data
Points to the location of data rows
Used for faster lookup on non-primary attributes
Example: Searching by email or name
4. Range Queries
B-Trees are efficient for queries like:
WHERE age BETWEEN 20 AND 30
WHERE salary > 50000
Because data is sorted, range-based retrieval is fast.
5. Disk-Based Storage Optimization
B-Trees are designed for minimizing disk I/O
Each node can store multiple keys, reducing tree height
Fewer disk accesses compared to binary trees
Use Cases
Used for database indexing
Supports fast search, insertion, and deletion
Efficient for range queries
Optimized for disk-based storage systems
Used in most database systems like MySQL, PostgreSQL, and Oracle
8. What is a B+ Tree?
Definition
A B+ Tree is a self-balancing multi-way search tree, similar to a B-Tree, in which all actual data records are stored only at the leaf nodes, while internal nodes store only keys for indexing.
It is widely used in databases and file systems for efficient data retrieval.
Key Properties
It is a balanced tree
Internal nodes store only keys (no actual data)
All data is stored in the leaf nodes
Leaf nodes are connected using linked pointers
All leaf nodes are at the same level
Structure
Internal Nodes → Act as index nodes to guide search
Leaf Nodes → Store actual data records and are linked for sequential access
Key Characteristics
Supports efficient searching, insertion, and deletion
Enables fast range queries due to linked leaf nodes
Has lower height, reducing disk access
More efficient than B-Trees for database indexing
Time Complexity
Search, insertion, and deletion: O(log n); range scans are faster thanks to the linked leaves
Applications
Used in database indexing
Used in file systems
Efficient for range queries and ordered data retrieval
9. Difference Between B-Tree and B+ Tree
Definition
Both B-Tree and B+ Tree are self-balancing multi-way search trees used in databases and file systems, but they differ in how they store and access data.
Key Differences
Data storage: B-Tree stores keys and data in both internal and leaf nodes; B+ Tree stores data only in leaf nodes
Leaf linking: B+ Tree leaves are linked for sequential traversal; B-Tree leaves are not
Searching: a B-Tree search can stop at an internal node; a B+ Tree search always descends to a leaf
Key redundancy: B+ Tree internal keys are repeated in the leaves; B-Tree keys appear only once
Key Points
B-Tree stores data at multiple levels, making it slightly faster for single record access
B+ Tree stores all data at leaf level, making it better for range queries and sequential access
B+ Trees are preferred in databases due to better disk I/O performance
Use Case Difference
B-Tree: Suitable for general-purpose searching
B+ Tree: Widely used in database indexing and file systems due to efficient range queries and sequential traversal
10. What is a Segment Tree?
Definition
A Segment Tree is a tree-based data structure used to efficiently perform range queries and updates on an array.
It divides the array into segments (intervals) and stores information about these segments in a tree structure.
Key Properties
It is a binary tree
Each node represents a range (segment) of the array
The root represents the entire array
Leaf nodes represent individual elements
Used for range queries and point/range updates
Operations Supported
Range Query – Find sum, minimum, maximum, etc., over a given range
Update – Modify a value in the array and update the tree accordingly
Time Complexity
Build: O(n); range query and update: O(log n)
Space Complexity
Requires approximately 4 × n space for implementation
Example Use Cases
Range sum queries
Range minimum/maximum queries
Frequency counting
Interval-based problems
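The structure above can be sketched compactly. This is a minimal iterative range-sum segment tree (a common array-based layout of size 2n; the class name is illustrative): leaves hold the array, each internal node holds the sum of its two children.

```python
class SegmentTree:
    def __init__(self, data):
        self.n = len(data)
        self.tree = [0] * (2 * self.n)
        self.tree[self.n:] = data                  # leaves hold the array
        for i in range(self.n - 1, 0, -1):         # internal nodes hold sums
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def update(self, i, value):                    # point update in O(log n)
        i += self.n
        self.tree[i] = value
        while i > 1:                               # climb to the root
            i //= 2
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def query(self, l, r):                         # sum over half-open [l, r)
        s = 0
        l += self.n
        r += self.n
        while l < r:
            if l % 2:                              # l is a right child
                s += self.tree[l]
                l += 1
            if r % 2:                              # r - 1 is a left child
                r -= 1
                s += self.tree[r]
            l //= 2
            r //= 2
        return s
```

For `[1, 3, 5, 7, 9, 11]`, `query(1, 4)` sums indices 1..3 and returns 15; after `update(2, 10)` the same query returns 20.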
11. What are the Applications of Segment Trees?
Definition
Segment Trees are used in scenarios where we need to perform efficient range queries and updates on an array.
Applications
1. Range Sum Queries
Find the sum of elements in a given range
Example: Sum of elements from index l to r
2. Range Minimum/Maximum Queries
Find the minimum or maximum element in a range
Commonly used in optimization problems
3. Dynamic Updates
Efficiently handle frequent updates in an array
Example: Updating a value and recalculating range queries
4. Interval Queries
Solve problems involving intervals and overlapping ranges
Example: Counting elements in a range
5. Frequency Counting
Count occurrences of elements in a given range
Useful in query-based problems
6. Lazy Propagation (Advanced Use)
Used when performing range updates efficiently
Avoids updating each element individually
Key Points
Ideal for problems with multiple range queries and updates
Reduces time complexity from O(n) to O(log n) per query
Widely used in competitive programming and real-time data processing
12. What is a Fenwick Tree (Binary Indexed Tree)?
Definition
A Fenwick Tree, also known as a Binary Indexed Tree (BIT), is a data structure used to efficiently perform prefix sum queries and updates on an array.
It stores cumulative frequency information in a way that allows both operations to be performed in O(log n) time.
Key Properties
Used for prefix sum and cumulative frequency queries
Supports point updates efficiently
Based on binary representation of indices
Uses an array-based implementation (no explicit tree structure)
Operations Supported
Prefix Sum Query – Compute sum from index 1 to i
Update – Add a value to a specific index
Time Complexity
Prefix sum query and point update: O(log n)
Space Complexity
Requires O(n) space
Key Characteristics
More space-efficient than Segment Trees
Easier to implement
Best suited for prefix-based queries
Example Use Cases
Prefix sum calculations
Frequency counting
Inversion count in arrays
Cumulative data processing
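The binary-index idea above fits in a few lines. This is a minimal 1-indexed Fenwick Tree sketch (class name illustrative): moving by the lowest set bit, `i & -i`, walks the implicit tree in both directions.

```python
class Fenwick:
    def __init__(self, n):
        self.n = n
        self.bit = [0] * (n + 1)       # 1-indexed internal array

    def update(self, i, delta):        # add delta at index i (1-based)
        while i <= self.n:
            self.bit[i] += delta
            i += i & -i                # jump to the next responsible node

    def prefix_sum(self, i):           # sum of elements 1..i
        s = 0
        while i > 0:
            s += self.bit[i]
            i -= i & -i                # drop the lowest set bit
        return s
```

After adding values 1..5 at indices 1..5, `prefix_sum(3)` returns 6 and `prefix_sum(5)` returns 15.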
13. What is a Trie Data Structure?
Definition
A Trie (also called a Prefix Tree) is a tree-based data structure used to store and retrieve strings efficiently, especially for prefix-based operations.
Each node represents a character, and paths from the root to nodes represent words or prefixes.
Key Properties
It is a tree structure for storing strings
Each node represents a character
Words are formed by traversing from root to leaf
Common prefixes are shared among words
Uses a special marker to indicate the end of a word
Operations Supported
Insertion – Add a word to the Trie
Search – Check if a word exists
Prefix Search – Check if any word starts with a given prefix
Time Complexity
Insertion, search, and prefix search: O(L)
(L = length of the word)
Key Characteristics
Efficient for string and prefix-based queries
Avoids repeated storage of common prefixes
Faster than hashing in some prefix problems
Example Use Cases
Autocomplete systems
Spell checking
Dictionary implementations
Search engines (prefix matching)
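The three operations above can be sketched as a minimal Trie (names illustrative); the `end` flag is the end-of-word marker mentioned earlier.

```python
class TrieNode:
    def __init__(self):
        self.children = {}        # character -> child node
        self.end = False          # marks the end of a complete word

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.end = True

    def search(self, word):                 # exact word lookup
        node = self._walk(word)
        return node is not None and node.end

    def starts_with(self, prefix):          # prefix lookup
        return self._walk(prefix) is not None

    def _walk(self, s):                     # follow the path for s, or None
        node = self.root
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node
```

After inserting "apple", `search("app")` is False (no end marker there) while `starts_with("app")` is True, showing the difference between the two queries.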
14. What are the Applications of Trie?
Definition
A Trie is mainly used in applications that involve efficient storage and retrieval of strings, especially when dealing with prefix-based operations.
Applications
1. Autocomplete Systems
Suggest words based on a given prefix
Example: Search engines, mobile keyboards
2. Spell Checking
Check if a word exists in a dictionary
Suggest corrections for misspelled words
3. Dictionary Implementation
Store a large set of words efficiently
Support fast search and prefix queries
4. Prefix Search
Find all words starting with a given prefix
Used in search bars and filtering systems
5. IP Routing (Longest Prefix Matching)
Used in networking to find the longest matching prefix
Helps in routing packets efficiently
6. Word Games
Used in games like Scrabble or Boggle
Helps in checking valid words quickly
7. Pattern Matching
Used in problems involving string matching and searching
Can be extended for advanced structures like suffix tries
15. What is a Suffix Tree?
Definition
A Suffix Tree is a compressed trie (prefix tree) that stores all suffixes of a given string.
It allows efficient solving of various string processing problems such as pattern matching and substring queries.
Key Properties
Represents all suffixes of a string
Each edge stores a substring (not just a single character)
No two edges from a node start with the same character
It is a compressed structure, reducing space compared to a normal trie
Built for a string of length n
Structure
Root → Represents empty string
Internal Nodes → Represent shared prefixes
Leaf Nodes → Represent suffixes of the string
Time Complexity
Construction: O(n) (e.g., with Ukkonen's algorithm); pattern search: O(m)
(n = length of text, m = length of pattern)
Key Characteristics
Very efficient for substring search problems
Stores all suffixes in a compact form
Faster than naive string matching approaches
Example
For string "banana", suffixes are:
banana
anana
nana
ana
na
a
These are all represented in the suffix tree.
Applications
Pattern matching
Longest repeated substring
Longest common substring
String compression
16. What is a Suffix Array?
Definition
A Suffix Array is a sorted array of all suffixes of a given string, where each element stores the starting index of a suffix in lexicographical (dictionary) order.
Key Properties
Stores indices of suffixes, not the suffix strings themselves
Suffixes are arranged in lexicographically sorted order
More space-efficient than a Suffix Tree
Often used with LCP (Longest Common Prefix) array for optimization
Structure
For a string of length n:
Generate all n suffixes
Sort them lexicographically
Store their starting indices in an array
Example
For string "banana":
Suffixes:
banana (index 0)
anana (index 1)
nana (index 2)
ana (index 3)
na (index 4)
a (index 5)
Sorted suffixes:
a (5)
ana (3)
anana (1)
banana (0)
na (4)
nana (2)
Suffix Array: [5, 3, 1, 0, 4, 2]
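The "banana" walkthrough above can be reproduced with a deliberately naive one-line build: sort the suffix starting indices by suffix text. This is O(n² log n) because of string slicing; production code uses O(n log n) algorithms instead.

```python
def suffix_array(s):
    # Sort starting indices by the suffix that begins at each index.
    return sorted(range(len(s)), key=lambda i: s[i:])
```

`suffix_array("banana")` returns `[5, 3, 1, 0, 4, 2]`, matching the table above.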
Time Complexity
Construction: O(n log n) with efficient algorithms (O(n² log n) naively); pattern search: O(m log n) using binary search
Key Characteristics
Uses less memory compared to Suffix Trees
Suitable for efficient string searching
Often combined with binary search for pattern matching
Applications
Pattern matching
Longest common prefix problems
String sorting problems
Data compression algorithms
Key Points
Stores sorted suffix indices
More space-efficient than suffix trees
Useful for fast substring and pattern queries
17. What is a Skip List?
Definition
A Skip List is a probabilistic data structure that allows fast search, insertion, and deletion operations by maintaining multiple levels of linked lists.
It works by "skipping" over elements using higher-level links, making operations efficient similar to balanced trees.
Key Properties
Consists of multiple levels of linked lists
The bottom level contains all elements in sorted order
Higher levels act as express lanes for faster traversal
Uses randomization to decide levels of nodes
Maintains elements in sorted order
Structure
Level 0 → Contains all elements (like a normal linked list)
Higher levels → Contain fewer elements, skipping multiple nodes
Each node may appear in multiple levels
Operations Supported
Search – Traverse from top level to bottom
Insertion – Insert element and randomly assign levels
Deletion – Remove element from all levels
Time Complexity (Average Case)
Search, insertion, and deletion: O(log n) on average (O(n) in the worst case)
Key Characteristics
Simpler to implement than balanced trees
Uses randomization instead of strict balancing
Provides performance similar to balanced BSTs
Applications
Databases and indexing systems
Ordered data structures
Memory-efficient alternatives to balanced trees
Key Points
Probabilistic alternative to balanced trees
Uses multiple levels for fast traversal
Maintains sorted data with efficient operations
18. What is a Deque and How is it Implemented?
Definition
A Deque (Double-Ended Queue) is a linear data structure in which insertion and deletion can be performed from both ends (front and rear).
Key Properties
Supports operations at both front and rear
Can function as both queue and stack
Maintains elements in a sequential order
Operations Supported
Insert Front – Add element at the front
Insert Rear – Add element at the rear
Delete Front – Remove element from the front
Delete Rear – Remove element from the rear
Peek Front / Rear – Access elements without removing
Types of Deque
Input Restricted Deque – Insertion allowed at only one end, deletion at both
Output Restricted Deque – Deletion allowed at only one end, insertion at both
Implementation
1. Array Implementation
Use a circular array to efficiently utilize space
Maintain two pointers: front and rear
Handle wrap-around using modulo operation
Characteristics
Fast access using indices
Fixed size (unless dynamically resized)
2. Linked List Implementation
Use a doubly linked list
Each node has pointers to previous and next nodes
Characteristics
Dynamic size
Efficient insertion and deletion at both ends
Time Complexity
Insertion and deletion at both ends: O(1)
Applications
Sliding window problems
Implementing queues and stacks
Palindrome checking
Task scheduling
Key Points
Deque allows bidirectional operations
Can be implemented using arrays or doubly linked lists
All operations are O(1)
More flexible than simple queues and stacks
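In Python the standard library already provides a deque: `collections.deque` supports O(1) appends and pops at both ends, matching the operations listed above.

```python
from collections import deque

d = deque()
d.append(1)          # insert rear
d.append(2)          # insert rear
d.appendleft(0)      # insert front -> deque is [0, 1, 2]
front = d.popleft()  # delete front, returns 0
rear = d.pop()       # delete rear, returns 2
```

After these operations only `1` remains, having entered at the rear and survived deletions at both ends.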
19. What are the Applications of Deque?
Definition
A Deque (Double-Ended Queue) is used in applications where insertion and deletion from both ends are required efficiently.
Applications
1. Sliding Window Problems
Used to efficiently find maximum/minimum in a sliding window
Example: Maximum of all subarrays of size k
2. Implementing Stack and Queue
Can act as both stack (LIFO) and queue (FIFO)
Provides flexibility in operations
3. Palindrome Checking
Compare elements from front and rear simultaneously
Efficient for checking if a string is a palindrome
4. Task Scheduling
Used in systems where tasks can be added or removed from both ends
Example: CPU scheduling, job processing
5. Breadth-First Search (BFS)
Used in graph traversal
Efficiently handles insertion and deletion during traversal
6. Undo/Redo Operations
Stores operations where actions can be added/removed from both ends
Useful in editors and applications
Key Points
Supports operations from both ends
Useful for range-based and dynamic problems
All operations can be performed in O(1) time
Widely used in sliding window and real-time processing problems
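The sliding-window application above is worth sketching, since it is the classic deque trick: keep a deque of indices whose values are decreasing, so the front always holds the window maximum.

```python
from collections import deque

def window_max(nums, k):
    dq = deque()                          # indices; values are decreasing
    out = []
    for i, x in enumerate(nums):
        while dq and nums[dq[-1]] <= x:   # drop smaller elements from rear
            dq.pop()
        dq.append(i)
        if dq[0] <= i - k:                # front index fell out of the window
            dq.popleft()
        if i >= k - 1:                    # window is full: record maximum
            out.append(nums[dq[0]])
    return out
```

Each index is pushed and popped at most once, so the whole pass is O(n) even though every window is answered.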
20. How is LRU Cache Implemented?
Definition
An LRU (Least Recently Used) Cache is a data structure that stores a limited number of items and removes the least recently used item when the cache reaches its capacity.
Approach
An efficient LRU Cache is implemented using a combination of:
Hash Map (Dictionary) → for O(1) access to elements
Doubly Linked List → to maintain the order of usage
Working Principle
The most recently used (MRU) item is kept at the front
The least recently used (LRU) item is kept at the end
On every access or insertion:
Move the item to the front
When capacity is exceeded:
Remove the item from the end
Operations
1. Get(key)
If key exists:
Move the node to the front
Return value
Else:
Return -1
2. Put(key, value)
If key exists:
Update value
Move node to front
Else:
Insert new node at front
If capacity exceeded:
Remove node from rear (LRU)
Time Complexity
Get and Put: O(1)
Key Characteristics
Combines fast lookup (hash map) and order tracking (linked list)
Maintains recent usage order efficiently
Ensures constant time operations
Applications
Memory caching
Database query caching
Web browser cache
Operating systems (page replacement)
Key Points
Uses Hash Map + Doubly Linked List
Maintains recently used order
Removes least recently used element when full
All operations are O(1)
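A short sketch of the scheme above: Python's `OrderedDict` is internally a hash map plus a doubly linked list, so `move_to_end` and `popitem` give exactly the O(1) front/rear moves described (in an interview you may be asked to build the linked list by hand instead).

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()          # hash map + usage-ordered list

    def get(self, key):
        if key not in self.data:
            return -1
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```

With capacity 2, putting keys 1 and 2, reading 1, then putting 3 evicts key 2 (the least recently used), not key 1.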
21. Which Data Structures are Used in LRU Cache?
Definition
An LRU Cache is implemented using a combination of two data structures to achieve O(1) time complexity for both access and update operations.
Data Structures Used
1. Hash Map (Dictionary)
Stores key → node reference
Enables O(1) lookup to check if a key exists
Helps directly access the corresponding node in the linked list
2. Doubly Linked List
Maintains the order of usage
Stores elements from most recently used (front) to least recently used (rear)
Supports O(1) insertion and deletion
Why This Combination?
Hash Map provides fast access to elements
Doubly Linked List maintains usage order efficiently
Together, they ensure both Get and Put operations run in O(1) time
Key Characteristics
Uses Hash Map + Doubly Linked List
Hash map handles fast lookup
Linked list handles ordering of elements
Essential for maintaining LRU policy efficiently
22. What is the Difference Between Heap and Priority Queue?
Definition
A Heap is a tree-based data structure that satisfies the heap property, while a Priority Queue is an abstract data type (ADT) where elements are processed based on their priority.
Key Differences
Heap → a concrete data structure: a complete binary tree satisfying the heap property
Priority Queue → an abstract data type defined by its operations: insert, peek, and extract the highest-priority element
A priority queue can be implemented with a heap, a sorted array, or a linked list; the heap is the most efficient common choice
Relationship Between Them
A heap is commonly used to implement a priority queue
Using a heap allows priority queue operations to run in O(log n) time
Key Points
Heap → Implementation (data structure)
Priority Queue → Concept (ADT)
Heap provides efficient way to build priority queues
Priority queues can also be implemented using other structures but less efficiently
23. How Does Java HashMap Handle Collisions?
Definition
A collision in a HashMap occurs when multiple keys generate the same hash index.
Java HashMap handles collisions using a combination of chaining (linked list) and tree-based structure (Red-Black Tree).
Collision Handling Mechanism
1. Hashing and Index Calculation
The key’s hashCode() is computed
It is converted into an index using a hashing function
Multiple keys may map to the same index (bucket)
2. Chaining using Linked List
Each bucket stores elements as a linked list
If multiple keys map to the same index:
They are stored as nodes in the linked list
During search:
Traverse the list to find the matching key
3. Conversion to Red-Black Tree
If the number of elements in a bucket exceeds a threshold (typically 8):
The linked list is converted into a Red-Black Tree
This improves performance from O(n) to O(log n) in worst case
4. Tree to List Conversion
If elements reduce below a threshold (typically 6):
The structure is converted back to a linked list
Time Complexity
Average case: O(1) for get/put; worst case: O(log n) once a bucket has been converted to a Red-Black Tree
Key Points
Uses hashing + chaining
Collisions handled via linked list initially
Converts to Red-Black Tree for high collisions
Ensures efficient performance even in worst cases
24. What is Rehashing in HashMap?
Definition
Rehashing is the process of resizing the HashMap and redistributing its elements into a new larger bucket array when the number of elements exceeds a certain limit.
Why Rehashing is Needed
Prevents too many collisions
Maintains efficient performance (O(1) average time)
Avoids long linked lists or trees in buckets
Load Factor
HashMap uses a load factor (default = 0.75)
When:
Number of elements > Capacity × Load Factor
→ Rehashing is triggered
Rehashing Process
Create a new bucket array with larger capacity (usually double)
Recalculate the hash index for each existing key
Insert all elements into the new array based on new indices
Time Complexity
Rehashing takes O(n) time (since all elements are reinserted)
Happens occasionally, so average operations remain efficient
Key Characteristics
Triggered when load factor threshold is exceeded
Increases capacity of HashMap
Reduces collisions and improves performance
Ensures average O(1) time complexity for operations
25. What is Consistent Hashing?
Definition
Consistent Hashing is a technique used to distribute data across multiple servers (or nodes) such that when a server is added or removed, only a small portion of data needs to be redistributed.
Working
Both data (keys) and servers (nodes) are mapped to a circular hash space (hash ring)
Each key is assigned to the next available server in the clockwise direction
How It Works
Hash all servers and place them on a circular ring
Hash each key (data) and place it on the same ring
Assign each key to the nearest server clockwise
Handling Changes
1. Adding a Server
Only the keys between the new server and its previous node are reassigned
Rest of the data remains unaffected
2. Removing a Server
Its keys are reassigned to the next server on the ring
Minimal data movement
Advantages
Minimizes data redistribution
Provides scalability
Ensures load balancing
Avoids massive reshuffling of data
Applications
Distributed caching systems
Load balancing
Distributed databases
Content delivery networks (CDNs)
Key Characteristics
Uses a hash ring (circular structure)
Maps both servers and data using hash functions
Ensures minimal data movement on scaling
Widely used in distributed systems
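The ring described above can be sketched with a sorted list and binary search (all names here are illustrative, and real systems add virtual nodes per server for smoother balancing):

```python
import bisect
import hashlib

def _hash(s):
    # Deterministic hash onto a large integer ring.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers):
        self.ring = sorted((_hash(s), s) for s in servers)

    def add(self, server):
        bisect.insort(self.ring, (_hash(server), server))

    def remove(self, server):
        self.ring.remove((_hash(server), server))

    def get(self, key):
        # First server clockwise from the key's position on the ring.
        i = bisect.bisect(self.ring, (_hash(key),))
        return self.ring[i % len(self.ring)][1]
```

The defining property: after removing one server, every key that was not assigned to it still maps to the same server as before, so only that server's share of keys moves.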
26. What is a Distributed Hash Table (DHT)?
Definition
A Distributed Hash Table (DHT) is a decentralized system that stores and retrieves key-value pairs across multiple nodes in a distributed network.
It extends the concept of a hash table by distributing data across different machines instead of storing it in a single location.
Key Properties
Stores data as key → value pairs
Data is distributed across multiple nodes
Each node is responsible for a portion of the key space
Provides scalable and fault-tolerant storage
No central authority (decentralized system)
How It Works
A hash function is used to map keys to positions in a logical space
Each node is assigned a range of keys
When storing or retrieving data:
The system routes the request to the node responsible for that key
Key Characteristics
Supports efficient lookup, insertion, and deletion
Lookup operations typically take O(log n) time
Handles node joins and failures dynamically
Ensures data availability and load distribution
Applications
Peer-to-peer (P2P) networks
Distributed file systems
Distributed caching systems
Content sharing systems (e.g., BitTorrent)
Key Points
Distributed version of a hash table
Provides scalable and decentralized data storage
Uses hashing to distribute data across nodes
Efficient for large-scale distributed systems
27. What is Union-Find (Disjoint Set)?
Definition
Union-Find, also known as a Disjoint Set, is a data structure used to keep track of a set of elements partitioned into disjoint (non-overlapping) subsets.
It efficiently supports operations to find which set an element belongs to and to merge two sets.
Key Operations
1. Find(x)
Returns the representative (parent/root) of the set containing element x
Used to check if two elements belong to the same set
2. Union(x, y)
Merges the sets containing elements x and y
Ensures both elements belong to the same set after operation
Key Properties
Each set has a representative (root node)
Initially, each element is in its own set
Sets are merged using union operation
Uses a tree-based structure internally
Optimizations
Path Compression → Flattens the tree during find operation
Union by Rank/Size → Attaches smaller tree under larger tree
These optimizations make operations nearly constant time (O(α(n))).
Time Complexity
Find and Union: O(α(n)) amortized with path compression and union by rank
(α(n) is the inverse Ackermann function, very slow-growing)
Applications
Detecting cycles in graphs
Kruskal’s Algorithm (Minimum Spanning Tree)
Network connectivity problems
Grouping or clustering problems
Key Characteristics
Manages disjoint sets efficiently
Supports find and union operations
Uses path compression and union by rank
Nearly constant time operations
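Both optimizations named above fit in a short sketch (the class name `DSU` is illustrative): path compression inside `find`, union by rank inside `union`.

```python
class DSU:
    def __init__(self, n):
        self.parent = list(range(n))   # each element starts as its own root
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            # Path compression: point x directly at the root.
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False               # already in the same set
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx           # attach smaller tree under larger
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

A typical use is cycle detection: if `union(u, v)` returns False for an edge, u and v were already connected and the edge closes a cycle.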
28. What is Path Compression in Union-Find?
Definition
Path Compression is an optimization technique used in Union-Find (Disjoint Set) to make the Find operation faster by flattening the structure of the tree.
Key Idea
During the Find(x) operation, all nodes along the path from x to the root are directly connected to the root
This reduces the height of the tree, making future operations faster
How It Works
When finding the root of an element:
Recursively find the root
Update each node’s parent to point directly to the root
Example
Before path compression:
A chain like 1 → 2 → 3 → 4 (where 4 is the root)
After applying Find(1):
Nodes 1, 2, and 3 all point directly to the root 4
Benefits
Reduces tree height significantly
Speeds up future Find operations
Works best when combined with Union by Rank/Size
Time Complexity
Makes operations O(α(n)) amortized (almost constant time)
Key Characteristics
Flattens the tree during Find operation
Improves efficiency of Union-Find
Ensures very fast repeated operations
Commonly used with Union by Rank
29. What is a Strongly Connected Component in a Graph?
Definition
A Strongly Connected Component (SCC) is a maximal group of vertices in a directed graph such that every vertex is reachable from every other vertex in the same group.
Key Idea
In a directed graph, for any two vertices u and v in an SCC:
There exists a path from u → v
And a path from v → u
Key Properties
Applicable only to directed graphs
Each SCC is a maximal set (cannot be extended further)
A graph can have one or multiple SCCs
Every vertex belongs to exactly one SCC
Example
If nodes A, B, C are all reachable from each other, they form one SCC.
Another group of nodes may form a separate SCC.
Algorithms to Find SCC
Kosaraju’s Algorithm
Tarjan’s Algorithm
Kosaraju uses two DFS traversals
Applications
Detecting cycles in directed graphs
Analyzing social networks
Dependency resolution in systems
Compiler design
Key Characteristics
SCC = mutually reachable vertices
Exists only in directed graphs
Forms independent components in a graph
Found using DFS-based algorithms
30. What is Kosaraju’s Algorithm?
Definition
Kosaraju’s Algorithm is a graph algorithm used to find all Strongly Connected Components (SCCs) in a directed graph using two Depth-First Search (DFS) traversals.
Key Idea
The algorithm works by:
Ordering vertices based on finishing times in DFS
Reversing the graph
Running DFS again to identify SCCs
Steps of Algorithm
Perform DFS on the original graph
Push vertices onto a stack in order of their finishing time
Reverse (transpose) the graph
Reverse the direction of all edges
Perform DFS on the reversed graph
Process vertices in the order of the stack (top to bottom)
Each DFS traversal gives one Strongly Connected Component
Key Properties
Uses two DFS traversals
Works only on directed graphs
Identifies all SCCs efficiently
Time Complexity
O(V + E)
(V = number of vertices, E = number of edges)
Applications
Finding strongly connected components
Detecting cycles in directed graphs
Analyzing graph connectivity
Key Characteristics
Based on DFS and graph reversal
Uses a stack to track finishing order
Efficient for large graphs with O(V + E) complexity
31. What is Tarjan’s Algorithm?
Definition
Tarjan’s Algorithm is a graph algorithm used to find all Strongly Connected Components (SCCs) in a directed graph using a single Depth-First Search (DFS) traversal.
Key Idea
The algorithm assigns each node:
A discovery time (disc) → when the node is first visited
A low value (low) → the smallest discovery time reachable from that node
Using these values, it identifies SCCs during DFS.
Steps of Algorithm
Perform DFS traversal of the graph
Assign each node:
disc[u] = discovery time
low[u] = minimum reachable discovery time
Maintain a stack to store visited nodes
For each node:
Update low values based on neighbors
If disc[u] == low[u], it is the root of an SCC
Pop nodes from the stack until the current node is reached → forms one SCC
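A minimal sketch of these steps in Python (names like disc, low, and on_stack follow the description above; the rest is illustrative):

```python
from collections import defaultdict

def tarjan_scc(n, edges):
    """Find SCCs of a directed graph with vertices 0..n-1 in one DFS."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)

    disc = [-1] * n        # discovery time of each node (-1 = unvisited)
    low = [0] * n          # smallest discovery time reachable from the node
    on_stack = [False] * n
    stack, sccs = [], []
    timer = [0]

    def dfs(u):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        stack.append(u)
        on_stack[u] = True
        for v in graph[u]:
            if disc[v] == -1:
                dfs(v)
                low[u] = min(low[u], low[v])
            elif on_stack[v]:          # back edge into the current SCC
                low[u] = min(low[u], disc[v])
        if disc[u] == low[u]:          # u is the root of an SCC
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == u:
                    break
            sccs.append(comp)

    for u in range(n):
        if disc[u] == -1:
            dfs(u)
    return sccs
```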
Key Properties
Uses single DFS traversal
Maintains a stack to track nodes in current SCC
Efficient and avoids graph reversal
Time Complexity
O(V + E), where V = number of vertices and E = number of edges
Applications
Finding Strongly Connected Components
Cycle detection in directed graphs
Dependency analysis
Key Points
Uses disc and low arrays
Identifies SCCs in one DFS pass
More efficient than algorithms requiring multiple traversals
Based on DFS and stack mechanism
32. What is Floyd-Warshall Algorithm?
Definition
The Floyd-Warshall Algorithm is a dynamic programming algorithm used to find the shortest paths between all pairs of vertices in a weighted graph.
It works for both directed and undirected graphs, and can handle negative edge weights (but not negative cycles).
Key Idea
It gradually improves the shortest path between every pair of vertices
Uses an intermediate vertex to check if a shorter path exists
Core Concept
d[i][j] = min(d[i][j], d[i][k] + d[k][j])
This means:
Either keep the current shortest path from i to j
Or update it using an intermediate vertex k
Steps of Algorithm
Initialize a distance matrix with given edge weights
Set distance of a node to itself as 0
For each vertex k:
Update all pairs (i, j) using the formula above
Repeat until all vertices are considered as intermediates
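These steps can be written as a short triple loop in Python (a sketch; the edge-list input format is assumed for illustration):

```python
INF = float('inf')

def floyd_warshall(n, edges):
    """All-pairs shortest paths; edges = [(u, v, w)] of a directed graph."""
    # Initialize: 0 on the diagonal, edge weights elsewhere, INF if no edge.
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
    for k in range(n):                 # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```

For example, with edges 0→1 (3), 1→2 (2), and 0→2 (10), the direct path 0→2 is replaced by the cheaper route through vertex 1, giving distance 5.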
Key Properties
Works on weighted graphs
Can detect negative cycles
Computes shortest paths between all pairs of nodes
Time Complexity
O(V³), due to the three nested loops over the vertices
Applications
All-pairs shortest path problems
Network routing
Graph analysis
Detecting negative cycles
Key Characteristics
Based on dynamic programming
Uses matrix representation
Handles negative weights (not negative cycles)
Simple but less efficient for very large graphs
33. What is A* Search Algorithm?
Definition
A* (A-star) is a pathfinding and graph traversal algorithm used to find the shortest path from a source node to a target node using both actual cost and heuristic estimates.
Key Idea
A* selects the next node based on a function that combines:
Actual cost from start (g(n))
Estimated cost to goal (h(n))
Evaluation Function
f(n) = g(n) + h(n)
f(n) → Total estimated cost
g(n) → Cost from start to current node
h(n) → Heuristic estimate to goal
How It Works
Start from the source node
Maintain an open list (nodes to explore)
Select node with lowest f(n)
Expand neighbors and update costs
Repeat until the goal node is reached
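The loop above can be sketched with a priority queue in Python (the neighbors and h callables are assumptions made for illustration; h must be admissible for the result to be optimal):

```python
import heapq

def a_star(start, goal, neighbors, h):
    """neighbors(n) yields (next_node, step_cost); h(n) estimates cost to goal."""
    open_heap = [(h(start), 0, start)]   # entries are (f, g, node)
    g_cost = {start: 0}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)  # node with lowest f(n)
        if node == goal:
            return g                     # g is the cost of the shortest path
        if g > g_cost.get(node, float('inf')):
            continue                     # stale heap entry, skip it
        for nxt, w in neighbors(node):
            ng = g + w
            if ng < g_cost.get(nxt, float('inf')):
                g_cost[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None                          # goal unreachable
```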
Key Properties
Uses both actual distance and heuristic
Guarantees shortest path if heuristic is admissible
More efficient than Dijkstra when heuristic is good
Time Complexity
Depends on implementation and heuristic
Worst case: O(b^d)
(b = branching factor, d = depth)
Applications
Pathfinding in maps and games
Navigation systems (GPS)
Robotics and AI
Maze solving
Key Points
Combines Dijkstra + Greedy Best-First Search
Uses heuristic function for efficiency
Finds optimal path with proper heuristic
Widely used in AI and pathfinding problems
34. What is Network Flow Problem?
Definition
The Network Flow Problem is a problem of finding the maximum amount of flow that can be sent from a source node to a sink node in a directed graph, subject to capacity constraints on edges.
Key Components
Source (s) → Starting node where flow originates
Sink (t) → Destination node where flow ends
Capacity → Maximum flow allowed through an edge
Flow → Amount of data passing through edges
Key Constraints
Flow on an edge cannot exceed its capacity
Flow entering a node equals flow leaving it (flow conservation)
Flow moves from source to sink through valid paths
Objective
Maximize the total flow from source to sink
Key Algorithms
Ford-Fulkerson Algorithm
Edmonds-Karp Algorithm (BFS-based)
Dinic’s Algorithm
Time Complexity
Depends on the algorithm used
Example: Edmonds-Karp → O(VE²)
Applications
Network bandwidth optimization
Traffic and transportation systems
Bipartite matching problems
Supply chain and logistics
Key Characteristics
Directed graph with capacity constraints
Goal is to find maximum flow
Uses augmenting paths
Fundamental problem in graph theory and optimization
35. What is Ford-Fulkerson Algorithm?
Definition
The Ford-Fulkerson Algorithm is a method used to find the maximum flow in a flow network by repeatedly finding augmenting paths from the source to the sink.
Key Idea
Start with zero flow
Find a path from source to sink with available capacity
Push flow through that path
Repeat until no more augmenting paths exist
Key Concept
Augmenting Path → A path from source to sink where all edges have remaining capacity
Residual Graph → Graph showing remaining capacity after each flow update
Steps of Algorithm
Initialize flow in all edges as 0
While there exists an augmenting path:
Find the minimum residual capacity (bottleneck) in the path
Add this flow to all edges in the path
Update the residual graph
Repeat until no augmenting path is found
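A minimal sketch of these steps, using a DFS to find augmenting paths on an adjacency-matrix residual graph (the matrix input format is an assumption for illustration):

```python
def ford_fulkerson(capacity, s, t):
    """Max flow from s to t; capacity is an n x n matrix of edge capacities."""
    n = len(capacity)
    residual = [row[:] for row in capacity]   # remaining capacity per edge

    def dfs(u, flow, visited):
        if u == t:
            return flow
        visited.add(u)
        for v in range(n):
            if v not in visited and residual[u][v] > 0:
                pushed = dfs(v, min(flow, residual[u][v]), visited)
                if pushed > 0:
                    residual[u][v] -= pushed   # consume forward capacity
                    residual[v][u] += pushed   # add reverse (undo) capacity
                    return pushed
        return 0

    max_flow = 0
    while True:
        pushed = dfs(s, float('inf'), set())  # one augmenting path per round
        if pushed == 0:
            return max_flow                   # no augmenting path left
        max_flow += pushed
```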
Time Complexity
O(E × max_flow)
Depends on the value of maximum flow
Key Properties
Works on directed graphs with capacities
Uses residual graph for updates
May not be efficient for large capacities
Applications
Network flow problems
Bipartite matching
Resource allocation
Traffic systems
Key Characteristics
Based on finding augmenting paths repeatedly
Uses residual capacity concept
Forms the basis for advanced algorithms like Edmonds-Karp
Computes maximum flow from source to sink
36. What is Edmonds-Karp Algorithm?
Definition
The Edmonds-Karp Algorithm is an implementation of the Ford-Fulkerson Algorithm used to find the maximum flow in a flow network, where the augmenting paths are found using Breadth-First Search (BFS).
Key Idea
Instead of choosing any augmenting path, it always selects the shortest path (in terms of number of edges) from source to sink
This ensures better and predictable performance
How It Works
Start with zero flow
Use BFS to find the shortest augmenting path in the residual graph
Find the minimum capacity (bottleneck) along the path
Add flow along the path
Update the residual graph
Repeat until no augmenting path exists
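These steps differ from plain Ford-Fulkerson only in using BFS to pick the path; a sketch in Python (adjacency-matrix input is an assumption for illustration):

```python
from collections import deque

def edmonds_karp(capacity, s, t):
    """Max flow using BFS to pick the shortest augmenting path each round."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    max_flow = 0
    while True:
        # BFS for the shortest augmenting path; parent[v] records the path.
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return max_flow               # no augmenting path exists
        # Find the bottleneck capacity along the path.
        bottleneck = float('inf')
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Update the residual graph along the path.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        max_flow += bottleneck
```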
Key Properties
Uses BFS for path selection
Guarantees finding shortest augmenting paths
Improves efficiency over basic Ford-Fulkerson
Time Complexity
O(V × E²), where V = vertices and E = edges
Applications
Maximum flow problems
Bipartite matching
Network routing and traffic systems
Key Points
A specific implementation of Ford-Fulkerson
Uses BFS instead of DFS
Ensures polynomial time complexity
Works with residual graphs and augmenting paths
38. What is Topological Sorting Used For?
Definition
Topological Sorting is used to order the vertices of a Directed Acyclic Graph (DAG) such that for every directed edge u → v, node u appears before v in the ordering.
Uses of Topological Sorting
1. Task Scheduling
Determines the correct order of tasks based on dependencies
Example: Completing tasks where some tasks must be done before others
2. Course Scheduling
Used to find a valid order of courses based on prerequisites
Example: Course A must be completed before Course B
3. Dependency Resolution
Resolves dependencies in systems like:
Build systems
Package managers
4. Job Scheduling in Operating Systems
Helps in scheduling jobs where certain jobs depend on others
5. Detecting Cycles in Directed Graphs
If topological sorting is not possible, the graph contains a cycle
Key Characteristics
Applicable only to Directed Acyclic Graphs (DAGs)
Ensures dependency order is maintained
Used in scheduling and dependency problems
Can be implemented using DFS or BFS (Kahn’s Algorithm)
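As a sketch, Kahn's BFS-based approach covers both use cases at once: it produces a dependency-respecting order and reports a cycle when none exists (the edge-list input format is assumed for illustration):

```python
from collections import deque, defaultdict

def topo_sort(n, edges):
    """Kahn's algorithm: a valid order of a DAG, or None if a cycle exists."""
    graph = defaultdict(list)
    indegree = [0] * n
    for u, v in edges:
        graph[u].append(v)
        indegree[v] += 1
    q = deque(u for u in range(n) if indegree[u] == 0)
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v in graph[u]:
            indegree[v] -= 1          # "remove" the edge u -> v
            if indegree[v] == 0:
                q.append(v)
    # If some nodes were never processed, they sit on a cycle.
    return order if len(order) == n else None
```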
39. What is Dynamic Programming Optimization?
Definition
Dynamic Programming (DP) Optimization refers to techniques used to improve the efficiency of DP solutions by reducing their time and/or space complexity.
Key Idea
Standard DP solutions may have high time complexity (e.g., O(n²), O(n³))
Optimization techniques aim to reduce redundant computations and speed up transitions
Common DP Optimization Techniques
1. Memoization and Tabulation Optimization
Avoid recomputation by storing results
Convert recursive DP to iterative (bottom-up) for better performance
2. Space Optimization
Reduce space from O(n²) to O(n) or O(1)
Example: Using rolling arrays instead of full DP tables
3. Binary Search Optimization
Apply binary search within DP transitions
Used when states are monotonic
4. Knuth Optimization
Reduces time complexity of certain DP problems from O(n³) to O(n²)
Applicable when quadrangle inequality and monotonicity hold
5. Divide and Conquer Optimization
Splits DP computation into smaller parts
Reduces complexity for specific problems
6. Convex Hull Trick
Optimizes DP involving linear functions
Reduces complexity using geometric properties
Key Characteristics
Applied to optimize existing DP solutions
Requires identifying patterns or properties in transitions
Often reduces time complexity significantly
Focuses on improving DP efficiency
Uses mathematical and structural insights
Reduces time and/or space complexity
Important for solving large-scale DP problems efficiently
Applications
Matrix chain multiplication optimization
Range DP problems
Optimization problems in competitive programming
40. What is Memoization?
Definition
Memoization is a technique used in dynamic programming where the results of expensive function calls are stored (cached) so that they can be reused when the same inputs occur again.
Key Idea
Avoid recomputing the same subproblems
Store results in a cache (array/map)
Return stored result if already computed
How It Works
When a function is called:
Check if result is already stored
If yes:
Return the stored value
If no:
Compute the result
Store it in memory
Return the result
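The check-compute-store pattern above, applied to Fibonacci (a minimal sketch using a plain dictionary as the cache):

```python
def fib(n, cache=None):
    """Memoized Fibonacci: each subproblem is computed once, then cached."""
    if cache is None:
        cache = {}
    if n in cache:
        return cache[n]                              # already stored: reuse
    cache[n] = n if n < 2 else fib(n - 1, cache) + fib(n - 2, cache)
    return cache[n]                                  # computed and stored
```

Python's standard library offers the same idea as a decorator, functools.lru_cache.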
Key Properties
Typically used with recursive solutions
Converts exponential time complexity to polynomial
Uses extra space for caching results
Example
In Fibonacci:
Without memoization → repeated calculations
With memoization → each value computed only once
Time Complexity
Reduced from O(2ⁿ) to O(n) in many cases
Key Points
Stores results of previous computations
Eliminates redundant work
Improves efficiency of recursive algorithms
Core concept in dynamic programming
41. What is Tabulation in Dynamic Programming?
Definition
Tabulation is a bottom-up approach in dynamic programming where problems are solved by iteratively filling a table (array) starting from base cases and building up to the final solution.
Key Idea
Solve smaller subproblems first
Store results in a table (DP array)
Use these results to compute larger subproblems
How It Works
Initialize a DP table with base cases
Iterate through all subproblems in a systematic order
Use previously computed values to fill the table
The final answer is stored in the table
Key Properties
Uses an iterative approach (no recursion)
Avoids function call overhead
Ensures each subproblem is solved only once
Example
Fibonacci using tabulation:
Start with base cases:
dp[0] = 0, dp[1] = 1
Fill table:
dp[i] = dp[i-1] + dp[i-2]
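The same example as a complete function (a minimal sketch):

```python
def fib_tab(n):
    """Bottom-up Fibonacci: fill the table from the base cases upward."""
    if n < 2:
        return n
    dp = [0] * (n + 1)
    dp[0], dp[1] = 0, 1                  # base cases
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]    # built from smaller subproblems
    return dp[n]                         # final answer sits in the table
```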
Time Complexity
Typically O(n) (depends on problem)
Key Characteristics
Bottom-up DP approach
Uses a table to store results
Eliminates recursion
Efficient and easy to implement for many problems
42. What is the Difference Between Greedy and Backtracking?
Definition
Greedy algorithms make the locally optimal choice at each step hoping to achieve a global optimum, while Backtracking explores all possible solutions and selects the valid or optimal one by undoing choices when needed.
Key Differences
Approach: Greedy commits to one choice per step and never revisits it; Backtracking tries a choice, recurses, and undoes it if it leads nowhere
Optimality: Greedy is optimal only when the greedy-choice property holds; Backtracking can examine the entire solution space
Speed: Greedy typically runs in polynomial time; Backtracking can be exponential in the worst case
Examples
Greedy:
Activity Selection
Huffman Coding
Dijkstra (without negative weights)
Backtracking:
N-Queens
Sudoku Solver
Permutations / Subsets
Key Points
Greedy: Fast, simple, but not always optimal
Backtracking: Slower, but guarantees correct solution
Use greedy when problem has greedy-choice property
Use backtracking when all possibilities must be explored
43. What is Space-Time Tradeoff?
Definition
The space-time tradeoff is a concept in computer science where an algorithm uses more memory (space) to reduce execution time, or uses less memory at the cost of increased execution time.
Key Idea
You can save time by using extra space
Or save space by doing more computations (taking more time)
Examples
1. Using Extra Space to Save Time
Hashing (Hash Map)
Lookup in O(1) time
Uses extra memory
2. Saving Space but Increasing Time
Brute Force Search
No extra memory
Takes O(n) time
3. Dynamic Programming
Stores intermediate results (extra space)
Avoids recomputation → reduces time
Key Characteristics
Tradeoff between memory usage and speed
Choice depends on problem constraints
Important in system design and optimization
More space → faster execution
Less space → slower execution
Balance depends on requirements and constraints
Applications
Caching systems
Dynamic programming
Database indexing
Algorithm optimization
44. What is Tail Recursion?
Definition
Tail Recursion is a type of recursion where the recursive call is the last operation performed in the function, and there is no pending work after the call returns.
Key Idea
The function returns the result of the recursive call directly
No computation is done after the recursive call
Example
Tail Recursive Function (Factorial)
def factorial(n, result=1):
    if n == 0:
        return result
    return factorial(n - 1, n * result)
Here, the recursive call is the last statement, and no further work is needed after it.
Non-Tail Recursive Example
def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n - 1)
Here, multiplication happens after the recursive call, so it is not tail recursion.
Key Properties
Recursive call is the final operation
No pending operations after the call
Can be optimized by the compiler into iteration (tail call optimization); note that some languages, such as Python, do not perform this optimization
Benefits
Reduces stack space usage
Prevents stack overflow in deep recursion
Can be as efficient as loops
Key Points
Last operation is the recursive call
No work after recursion
Enables tail call optimization (TCO)
Improves space efficiency
45. What is Iterative Deepening Search?
Definition
Iterative Deepening Search (IDS) is a graph traversal algorithm that combines the advantages of Depth-First Search (DFS) and Breadth-First Search (BFS) by performing depth-limited DFS repeatedly with increasing depth limits.
Key Idea
Perform DFS with a depth limit
Increase the depth limit step by step
Repeat until the goal node is found
How It Works
Start with depth limit = 0
Perform Depth-Limited DFS
If goal is not found:
Increase depth limit by 1
Repeat until the goal is reached
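A minimal sketch of the loop above (the neighbors callable and max_depth cap are assumptions for illustration; the function returns the depth at which the goal was found):

```python
def iddfs(start, goal, neighbors, max_depth=20):
    """Repeated depth-limited DFS with growing limits."""
    def dls(node, depth, limit):
        if node == goal:
            return depth
        if depth == limit:
            return None                  # hit the depth limit, back off
        for nxt in neighbors(node):
            found = dls(nxt, depth + 1, limit)
            if found is not None:
                return found
        return None

    for limit in range(max_depth + 1):   # depth limit 0, 1, 2, ...
        found = dls(start, 0, limit)
        if found is not None:
            return found
    return None
```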
Key Properties
Combines space efficiency of DFS
Ensures optimal solution like BFS (for uniform cost)
Explores nodes level by level
Time Complexity
O(b^d)
(b = branching factor, d = depth of solution)
Space Complexity
O(d) (same as DFS)
Applications
AI search problems
Game tree exploration
Pathfinding in unknown depth graphs
Key Points
Repeated depth-limited DFS
Finds optimal solution like BFS
Uses less memory than BFS
Suitable when depth of solution is unknown
46. What is the Difference Between BFS, DFS, and Dijkstra?
Definition
BFS (Breadth-First Search): Traverses a graph level by level using a queue
DFS (Depth-First Search): Traverses as deep as possible before backtracking using a stack/recursion
Dijkstra’s Algorithm: Finds the shortest path from a source node to all other nodes in a weighted graph
Key Differences
Time Complexity
Use Cases
BFS: Shortest path in unweighted graphs, level-order traversal
DFS: Cycle detection, topological sorting, backtracking
Dijkstra: Shortest path in weighted graphs (e.g., maps, routing)
Key Characteristics
BFS: Uses queue, explores level-wise
DFS: Uses stack/recursion, explores depth-wise
Dijkstra: Uses priority queue, finds minimum distance paths
BFS & DFS are traversal algorithms, while Dijkstra is a shortest path algorithm
47. How Do You Detect Cycles in Directed Graphs?
Definition
Cycle detection in a directed graph involves checking whether there exists a path that starts and ends at the same node following edge directions.
Methods to Detect Cycles
1. Using DFS (Recursion Stack)
Key Idea
Track visited nodes and maintain a recursion stack
If a node is visited again while still in the recursion stack → cycle exists
Steps
Mark node as visited
Add node to recursion stack
Recur for all neighbors:
If neighbor is not visited → DFS
If neighbor is in recursion stack → cycle detected
Remove node from recursion stack after DFS
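These steps can be sketched in Python (the edge-list input format is an assumption for illustration):

```python
from collections import defaultdict

def has_cycle(n, edges):
    """Detect a cycle in a directed graph via DFS with a recursion stack."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
    visited = [False] * n
    in_stack = [False] * n       # nodes on the current recursion path

    def dfs(u):
        visited[u] = True
        in_stack[u] = True
        for v in graph[u]:
            if not visited[v]:
                if dfs(v):
                    return True
            elif in_stack[v]:    # back edge to an ancestor: cycle found
                return True
        in_stack[u] = False      # leave the recursion stack
        return False

    for u in range(n):
        if not visited[u] and dfs(u):
            return True
    return False
```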
2. Using Topological Sorting (Kahn’s Algorithm)
Key Idea
A directed graph has a cycle if topological sorting is not possible
Steps
Compute in-degree of all nodes
Add nodes with in-degree = 0 to a queue
Process nodes and reduce in-degrees
If all nodes are processed → no cycle
If some nodes remain → cycle exists
Time Complexity
O(V + E) for both methods
Key Characteristics
Applicable only to directed graphs
DFS with recursion stack is most commonly used
Topological sort failure indicates a cycle
Both methods run in O(V + E) time
48. How Do You Check if a Binary Tree is a BST?
Definition
To check if a binary tree is a Binary Search Tree (BST), we verify that for every node:
All nodes in the left subtree are less than the node
All nodes in the right subtree are greater than the node
Method 1: Using Range (Min-Max Approach)
Key Idea
Each node must lie within a valid range (min, max)
Update range while traversing the tree
Steps
Start with range (-∞, +∞)
For each node:
Check if value is within range
Recursively check:
Left subtree → range (min, node.value)
Right subtree → range (node.value, max)
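The min-max steps above as a sketch (the Node class is a minimal assumption; duplicates are treated as invalid, matching the strict ordering described):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def is_bst(node, low=float('-inf'), high=float('inf')):
    """Every node's value must lie strictly inside the range (low, high)."""
    if node is None:
        return True
    if not (low < node.val < high):
        return False
    return (is_bst(node.left, low, node.val) and    # left values < node.val
            is_bst(node.right, node.val, high))     # right values > node.val
```

Note that checking only immediate children is not enough: a node deep in the left subtree can still violate an ancestor's bound, which is exactly what the shrinking range catches.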
Method 2: Inorder Traversal
Key Idea
Inorder traversal of a BST gives a sorted sequence
Steps
Perform inorder traversal
Check if elements are in strictly increasing order
Time Complexity
O(n), where n is the number of nodes
Space Complexity
O(h) (recursion stack, where h = height of tree)
Key Characteristics
Use range validation for correctness
Inorder traversal should be sorted
Must check entire subtree, not just immediate children
Works in linear time O(n)
49. How Do You Find the Lowest Common Ancestor (LCA) in a Tree?
Definition
The Lowest Common Ancestor (LCA) of two nodes in a tree is the lowest (deepest) node that has both nodes as descendants.
Method 1: General Binary Tree (DFS Approach)
Key Idea
Traverse the tree recursively
If a node matches either of the target nodes, return it
If both left and right subtrees return non-null → current node is LCA
Steps
If root is null, return null
If root equals p or q, return root
Recursively search in left and right subtrees
If both sides return non-null → current node is LCA
Else return the non-null result
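The DFS approach for a general binary tree as a sketch (the Node class is a minimal assumption made for illustration):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def lca(root, p, q):
    """Lowest common ancestor of nodes p and q in a general binary tree."""
    if root is None or root is p or root is q:
        return root
    left = lca(root.left, p, q)
    right = lca(root.right, p, q)
    if left and right:        # p and q found in different subtrees
        return root
    return left or right      # otherwise bubble up the non-null side
```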
Method 2: Binary Search Tree (BST)
Key Idea
Use BST property to find LCA efficiently
Steps
If both nodes are less than root → move to left subtree
If both nodes are greater than root → move to right subtree
Else → current node is LCA
Time Complexity
O(n) for a general binary tree; O(h) for a BST (h = height of tree)
Space Complexity
O(h) (recursion stack)
Key Characteristics
LCA is the lowest node common to both nodes
Use DFS for general trees
Use BST properties for optimization
Works in linear or logarithmic time depending on tree type
50. How Do You Serialize and Deserialize a Binary Tree?
Definition
Serialization is the process of converting a binary tree into a string or array representation, while Deserialization is the process of reconstructing the tree from that representation.
Key Idea
Preserve both node values and structure
Use a special marker (e.g., "null") to represent missing nodes
Method: Using Level Order Traversal (BFS)
Serialization Steps
Use a queue for level-order traversal
For each node:
Add its value to the result
If node is null, add "null"
Continue until all nodes are processed
Deserialization Steps
Read values from the serialized data
Create the root node
Use a queue to reconstruct the tree
For each node:
Assign left and right children using next values
Add non-null children to the queue
Alternative Method: Using Preorder Traversal (DFS)
Serialization
Traverse in preorder (root → left → right)
Add "null" for missing nodes
Deserialization
Reconstruct tree recursively using the same order
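The preorder variant is compact enough to sketch in full (the Node class and comma-separated string format are assumptions for illustration):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def serialize(node):
    """Preorder traversal with 'null' markers for missing children."""
    if node is None:
        return "null"
    return f"{node.val},{serialize(node.left)},{serialize(node.right)}"

def deserialize(data):
    """Rebuild the tree by consuming values in the same preorder."""
    values = iter(data.split(","))
    def build():
        val = next(values)
        if val == "null":
            return None
        node = Node(int(val))
        node.left = build()      # preorder: root, then left, then right
        node.right = build()
        return node
    return build()
```

Round-tripping a tree through serialize and deserialize reproduces the exact same structure, because the null markers pin down where each subtree ends.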
Time Complexity
O(n) for both serialization and deserialization, where n is the number of nodes
Key Points
Must store structure + values
Use "null" markers for missing nodes
Can use BFS or DFS approaches
Ensures accurate reconstruction of the tree
51. What is the Diameter of a Binary Tree?
Definition
The diameter of a binary tree is the length of the longest path between any two nodes in the tree.
This path may or may not pass through the root.
Key Idea
The longest path between two nodes can be:
Entirely in the left subtree
Entirely in the right subtree
Passing through the current node
Approach
For each node:
Compute:
Height of left subtree
Height of right subtree
Diameter through that node = left height + right height
Take the maximum over all nodes
Algorithm
Traverse the tree using DFS
For each node:
Calculate left and right heights
Update maximum diameter
Return height to parent
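The single-DFS approach above as a sketch (the Node class is a minimal assumption; the diameter is counted in edges):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def diameter(root):
    """Longest path (in edges) between any two nodes, via one DFS."""
    best = [0]
    def height(node):
        if node is None:
            return 0
        lh = height(node.left)
        rh = height(node.right)
        best[0] = max(best[0], lh + rh)  # longest path through this node
        return 1 + max(lh, rh)           # height reported to the parent
    height(root)
    return best[0]
```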
Time Complexity
O(n), since each node is visited once
Space Complexity
O(h) (recursion stack, where h = height of tree)
Key Points
Diameter = longest path between any two nodes
May or may not pass through root
Computed using height of subtrees
Solved efficiently using DFS in O(n) time
52. How Do You Detect a Loop in a Linked List?
Definition
Loop detection in a linked list involves checking whether the list contains a cycle, i.e., a node that points back to a previous node instead of NULL.
Method: Floyd’s Cycle Detection (Tortoise and Hare)
Key Idea
Use two pointers:
Slow pointer → moves one step at a time
Fast pointer → moves two steps at a time
If a cycle exists, both pointers will eventually meet
Steps
Initialize:
slow = head
fast = head
Traverse the list:
Move slow by 1 step
Move fast by 2 steps
If at any point:
slow == fast → cycle detected
If fast reaches NULL:
No cycle exists
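The two-pointer steps above as a sketch (the ListNode class is a minimal assumption made for illustration):

```python
class ListNode:
    def __init__(self, val):
        self.val = val
        self.next = None

def has_loop(head):
    """Floyd's tortoise and hare: fast catches slow iff a cycle exists."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next           # moves 1 step
        fast = fast.next.next      # moves 2 steps
        if slow is fast:
            return True            # pointers met inside the cycle
    return False                   # fast fell off the end: no cycle
```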
Time Complexity
O(n), where n is the number of nodes in the list
Space Complexity
O(1) (no extra space used)
Alternative Method
Using Hash Set
Store visited nodes in a set
If a node is visited again → cycle exists
Key Points
Most efficient method: Floyd’s Algorithm
Uses two-pointer technique
Works in O(n) time and O(1) space
Detects cycle without modifying the list