1. What is an AVL Tree?

Definition

An AVL Tree is a self-balancing Binary Search Tree (BST) in which the difference between the heights of the left and right subtrees (called the balance factor) of any node is at most 1.

It automatically maintains balance after every insertion and deletion operation.


Key Properties

  • It is a Binary Search Tree (BST)

  • Maintains height balance

  • Balance factor of each node is -1, 0, or +1

  • Ensures O(log n) time complexity for operations


Balance Factor

The balance factor of a node is:

Balance Factor = Height of Left Subtree − Height of Right Subtree

  • +1 → Left heavy

  • 0 → Balanced

  • -1 → Right heavy

If the balance factor goes beyond this range, the tree becomes unbalanced and requires rebalancing.
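The balance factor can be computed directly from subtree heights; a minimal Python sketch (the node class and helper names are illustrative, not from the text):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(node):
    # Height of an empty subtree is taken as 0 here
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    # Balance Factor = height of left subtree - height of right subtree
    return height(node.left) - height(node.right)

# A left-heavy chain: 10 -> 5 -> 2, nothing on the right
root = Node(10, left=Node(5, left=Node(2)))
print(balance_factor(root))       # 2 -> outside [-1, +1], needs rebalancing
print(balance_factor(root.left))  # 1 -> left heavy, still acceptable
```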


Rotations in AVL Tree

To restore balance, AVL Trees use rotations:

  • Left Rotation (RR Case)

  • Right Rotation (LL Case)

  • Left-Right Rotation (LR Case)

  • Right-Left Rotation (RL Case)

These rotations help maintain the height-balanced property.


Time Complexity

  • Search → O(log n)

  • Insertion → O(log n)

  • Deletion → O(log n)


Example

During insertion, if a node becomes unbalanced (balance factor < -1 or > +1), appropriate rotations are applied to rebalance the tree.

2. What are AVL Tree Rotations?

Definition

AVL Tree rotations are operations used to rebalance an AVL Tree when it becomes unbalanced after insertion or deletion.

They restructure the tree while maintaining the Binary Search Tree (BST) property.


Types of Rotations

There are four types of AVL rotations:


1. Left Rotation (RR Case)

Applied when the tree is right-heavy (imbalance occurs in the right subtree of the right child).

Condition

Balance factor < -1 and new node inserted in right subtree of right child.

Operation

The right child becomes the new root, and the original root becomes its left child.


2. Right Rotation (LL Case)

Applied when the tree is left-heavy (imbalance occurs in the left subtree of the left child).

Condition

Balance factor > +1 and new node inserted in left subtree of left child.

Operation

The left child becomes the new root, and the original root becomes its right child.


3. Left-Right Rotation (LR Case)

Applied when the tree is left-heavy, but insertion occurs in the right subtree of the left child.

Condition

Balance factor > +1 and new node inserted in right subtree of left child.

Operation

  • First perform left rotation on left child

  • Then perform right rotation on the root


4. Right-Left Rotation (RL Case)

Applied when the tree is right-heavy, but insertion occurs in the left subtree of the right child.

Condition

Balance factor < -1 and new node inserted in left subtree of right child.

Operation

  • First perform right rotation on right child

  • Then perform left rotation on the root
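The four rotations above can be sketched as pointer manipulations on a plain BST node; a minimal illustration (the `Node` class is an assumption for the sketch, and height/balance bookkeeping of a full AVL insert is omitted):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_left(root):
    # RR case: right child becomes the new root,
    # the old root becomes its left child
    new_root = root.right
    root.right = new_root.left
    new_root.left = root
    return new_root

def rotate_right(root):
    # LL case: left child becomes the new root,
    # the old root becomes its right child
    new_root = root.left
    root.left = new_root.right
    new_root.right = root
    return new_root

def rotate_left_right(root):
    # LR case: left-rotate the left child, then right-rotate the root
    root.left = rotate_left(root.left)
    return rotate_right(root)

def rotate_right_left(root):
    # RL case: right-rotate the right child, then left-rotate the root
    root.right = rotate_right(root.right)
    return rotate_left(root)

# RR example: keys 10, 20, 30 inserted in order form a right-heavy chain
root = Node(10, right=Node(20, right=Node(30)))
root = rotate_left(root)
print(root.key, root.left.key, root.right.key)  # 20 10 30
```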


3. What is a Red-Black Tree?

Definition

A Red-Black Tree is a self-balancing Binary Search Tree (BST) in which each node has an extra attribute called color (Red or Black).

The tree maintains balance using a set of rules, ensuring that operations like search, insertion, and deletion take O(log n) time.


Properties of Red-Black Tree

A Red-Black Tree follows these rules:

  1. Every node is either Red or Black

  2. The root node is always Black

  3. No two consecutive red nodes are allowed (no red node has a red parent or child)

  4. Every path from a node to its descendant NULL nodes has the same number of black nodes (black height)

  5. All leaf nodes (NULL nodes) are considered Black


Key Characteristics

  • It is a Binary Search Tree (BST)

  • Maintains approximate balance (not strictly balanced like AVL)

  • Uses color properties instead of strict height balancing

  • Guarantees O(log n) time complexity


Time Complexity

  • Search → O(log n)

  • Insertion → O(log n)

  • Deletion → O(log n)


4. What are the Properties of Red-Black Trees?

Definition

A Red-Black Tree follows a set of properties that ensure the tree remains balanced, allowing operations to run in O(log n) time.

Properties

  1. Node Color Property
    Every node is either Red or Black.

  2. Root Property
    The root node is always Black.

  3. Red Property
    A Red node cannot have a Red parent or child (no two consecutive Red nodes).

  4. Black Height Property
    Every path from a node to its descendant NULL nodes contains the same number of Black nodes.

  5. Leaf Property
    All leaf nodes (NULL nodes) are considered Black.


Key Characteristics

  • These properties ensure the tree remains balanced

  • Prevent the tree from becoming skewed

  • Maintain logarithmic height

  • Enable efficient search, insertion, and deletion operations


6. What is a B-Tree?

Definition

A B-Tree is a self-balancing multi-way search tree in which each node can have more than two children.

It is designed to efficiently store and manage large amounts of data, especially on disk, by keeping the tree height small.


Key Properties

  • A node can have multiple keys and multiple children

  • Keys within a node are sorted

  • All leaf nodes are at the same level

  • The tree remains balanced after insertion and deletion

  • Each node (except root) has a minimum and maximum number of keys


Order of B-Tree

A B-Tree of order m has:

  • Maximum m children

  • Maximum m − 1 keys

  • Minimum ⌈m/2⌉ children (except root)

  • Minimum ⌈m/2⌉ − 1 keys


Key Characteristics

  • It is a balanced tree

  • Has low height, reducing disk access

  • Suitable for large datasets and external storage

  • Supports efficient search, insertion, and deletion
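Search in a multi-way node follows directly from the sorted keys: scan (or binary-search) the keys, then descend into the child between the two keys that bracket the target. A simplified sketch, assuming a ready-built tree (the `BTreeNode` class is illustrative; insertion with node splitting is omitted):

```python
from bisect import bisect_left

class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted keys within the node
        self.children = children or []    # internal node: len(children) == len(keys) + 1

def btree_search(node, key):
    # Position of the first key >= target within this node
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node                       # key found in this node
    if not node.children:                 # reached a leaf without finding it
        return None
    return btree_search(node.children[i], key)

# Order-3 example: root holds one key, children hold 1-2 keys each
root = BTreeNode([10], [BTreeNode([5]), BTreeNode([15, 20])])
print(btree_search(root, 15) is not None)  # True
print(btree_search(root, 7))               # None
```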


Time Complexity

  • Search → O(log n)

  • Insertion → O(log n)

  • Deletion → O(log n)


Applications

  • Used in databases and file systems

  • Helps in indexing large data efficiently

  • Minimizes disk reads and writes

7. Where are B-Trees Used in Databases?

Definition

In databases, B-Trees are primarily used for indexing, which allows data to be searched, inserted, and deleted efficiently without scanning the entire table.


Main Uses of B-Trees in Databases

1. Indexing

B-Trees are used to create indexes on database tables.

  • They store keys in sorted order

  • Enable fast searching using logarithmic time

  • Avoid full table scans

Example: Searching a record by ID or username


2. Primary Index (Clustered Index)

  • Used to store data in sorted order based on primary key

  • The actual table data is organized using a B-Tree

  • Improves range queries and sequential access


3. Secondary Index (Non-Clustered Index)

  • Stores index keys separately from actual data

  • Points to the location of data rows

  • Used for faster lookup on non-primary attributes

Example: Searching by email or name


4. Range Queries

B-Trees are efficient for queries like:

  • WHERE age BETWEEN 20 AND 30

  • WHERE salary > 50000

Because data is sorted, range-based retrieval is fast.


5. Disk-Based Storage Optimization

  • B-Trees are designed for minimizing disk I/O

  • Each node can store multiple keys, reducing tree height

  • Fewer disk accesses compared to binary trees


Use Cases 

  • Used for database indexing

  • Supports fast search, insertion, and deletion

  • Efficient for range queries

  • Optimized for disk-based storage systems

  • Used in most database systems like MySQL, PostgreSQL, and Oracle

8. What is a B+ Tree?

Definition

A B+ Tree is a self-balancing multi-way search tree, similar to a B-Tree, in which all actual data records are stored only at the leaf nodes, while internal nodes store only keys for indexing.

It is widely used in databases and file systems for efficient data retrieval.


Key Properties

  • It is a balanced tree

  • Internal nodes store only keys (no actual data)

  • All data is stored in the leaf nodes

  • Leaf nodes are connected using linked pointers

  • All leaf nodes are at the same level


Structure

  • Internal Nodes → Act as index nodes to guide search

  • Leaf Nodes → Store actual data records and are linked for sequential access


Key Characteristics

  • Supports efficient searching, insertion, and deletion

  • Enables fast range queries due to linked leaf nodes

  • Has lower height, reducing disk access

  • More efficient than B-Trees for database indexing


Time Complexity

  • Search → O(log n)

  • Insertion → O(log n)

  • Deletion → O(log n)


Applications

  • Used in database indexing

  • Used in file systems

  • Efficient for range queries and ordered data retrieval

9. Difference Between B-Tree and B+ Tree

Definition

Both B-Tree and B+ Tree are self-balancing multi-way search trees used in databases and file systems, but they differ in how they store and access data.


Key Differences

  • Data Storage: B-Tree stores data in all nodes (internal + leaf); B+ Tree stores data only in leaf nodes

  • Internal Nodes: B-Tree internal nodes store keys and data; B+ Tree internal nodes store only keys (index)

  • Leaf Nodes: B-Tree leaves may or may not hold all data; B+ Tree leaves store all actual data records

  • Search Path: In a B-Tree, search may end at an internal or leaf node; in a B+ Tree, search always goes to a leaf node

  • Sequential Access: Not efficient in a B-Tree; efficient in a B+ Tree (linked leaf nodes)

  • Leaf Node Linking: B-Tree leaves are not linked; B+ Tree leaves are linked (like a linked list)

  • Range Queries: Less efficient in a B-Tree; highly efficient in a B+ Tree

  • Disk Access: More in a B-Tree; less in a B+ Tree (better optimized)


Key Points

  • B-Tree stores data at multiple levels, making it slightly faster for single record access

  • B+ Tree stores all data at leaf level, making it better for range queries and sequential access

  • B+ Trees are preferred in databases due to better disk I/O performance


Use Case Difference

  • B-Tree: Suitable for general-purpose searching

  • B+ Tree: Widely used in database indexing and file systems due to efficient range queries and sequential traversal

10. What is a Segment Tree?

Definition

A Segment Tree is a tree-based data structure used to efficiently perform range queries and updates on an array.

It divides the array into segments (intervals) and stores information about these segments in a tree structure.


Key Properties

  • It is a binary tree

  • Each node represents a range (segment) of the array

  • The root represents the entire array

  • Leaf nodes represent individual elements

  • Used for range queries and point/range updates


Operations Supported

  • Range Query – Find sum, minimum, maximum, etc., over a given range

  • Update – Modify a value in the array and update the tree accordingly


Time Complexity

  • Build Tree → O(n)

  • Range Query → O(log n)

  • Update → O(log n)


Space Complexity

  • Requires approximately 4 × n space for implementation


Example Use Cases

  • Range sum queries

  • Range minimum/maximum queries

  • Frequency counting

  • Interval-based problems
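A range-sum segment tree covering the operations above can be sketched as follows (class and method names are illustrative; other aggregates like min or max only change the combine step):

```python
class SegmentTree:
    """Range-sum segment tree over a fixed-size array, stored in ~4n slots."""
    def __init__(self, data):
        self.n = len(data)
        self.tree = [0] * (4 * self.n)
        self._build(data, 1, 0, self.n - 1)

    def _build(self, data, node, lo, hi):
        if lo == hi:
            self.tree[node] = data[lo]   # leaf holds one array element
            return
        mid = (lo + hi) // 2
        self._build(data, 2 * node, lo, mid)
        self._build(data, 2 * node + 1, mid + 1, hi)
        self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]

    def query(self, l, r, node=1, lo=0, hi=None):
        if hi is None:
            hi = self.n - 1
        if r < lo or hi < l:             # segment disjoint from [l, r]
            return 0
        if l <= lo and hi <= r:          # segment fully inside [l, r]
            return self.tree[node]
        mid = (lo + hi) // 2
        return (self.query(l, r, 2 * node, lo, mid) +
                self.query(l, r, 2 * node + 1, mid + 1, hi))

    def update(self, i, value, node=1, lo=0, hi=None):
        if hi is None:
            hi = self.n - 1
        if lo == hi:
            self.tree[node] = value
            return
        mid = (lo + hi) // 2
        if i <= mid:
            self.update(i, value, 2 * node, lo, mid)
        else:
            self.update(i, value, 2 * node + 1, mid + 1, hi)
        self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]

st = SegmentTree([2, 1, 5, 3, 4])
print(st.query(1, 3))   # 1 + 5 + 3 = 9
st.update(2, 10)        # array becomes [2, 1, 10, 3, 4]
print(st.query(1, 3))   # 1 + 10 + 3 = 14
```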

11. What are the Applications of Segment Trees?

Definition

Segment Trees are used in scenarios where we need to perform efficient range queries and updates on an array.


Applications

1. Range Sum Queries

  • Find the sum of elements in a given range

  • Example: Sum of elements from index l to r


2. Range Minimum/Maximum Queries

  • Find the minimum or maximum element in a range

  • Commonly used in optimization problems


3. Dynamic Updates

  • Efficiently handle frequent updates in an array

  • Example: Updating a value and recalculating range queries


4. Interval Queries

  • Solve problems involving intervals and overlapping ranges

  • Example: Counting elements in a range


5. Frequency Counting

  • Count occurrences of elements in a given range

  • Useful in query-based problems


6. Lazy Propagation (Advanced Use)

  • Used when performing range updates efficiently

  • Avoids updating each element individually


Key Points

  • Ideal for problems with multiple range queries and updates

  • Reduces time complexity from O(n) to O(log n) per query

  • Widely used in competitive programming and real-time data processing

12. What is a Fenwick Tree (Binary Indexed Tree)?

Definition

A Fenwick Tree, also known as a Binary Indexed Tree (BIT), is a data structure used to efficiently perform prefix sum queries and updates on an array.

It stores cumulative frequency information in a way that allows both operations to be performed in O(log n) time.


Key Properties

  • Used for prefix sum and cumulative frequency queries

  • Supports point updates efficiently

  • Based on binary representation of indices

  • Uses an array-based implementation (no explicit tree structure)


Operations Supported

  • Prefix Sum Query – Compute sum from index 1 to i

  • Update – Add a value to a specific index


Time Complexity

  • Update → O(log n)

  • Prefix Query → O(log n)


Space Complexity

  • Requires O(n) space


Key Characteristics

  • More space-efficient than Segment Trees

  • Easier to implement

  • Best suited for prefix-based queries


Example Use Cases

  • Prefix sum calculations

  • Frequency counting

  • Inversion count in arrays

  • Cumulative data processing
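The binary-representation trick is the whole implementation: each index is responsible for a block whose size is its lowest set bit, `i & (-i)`. A compact sketch (1-indexed, as is conventional for BITs):

```python
class FenwickTree:
    """1-indexed Binary Indexed Tree supporting point updates and prefix sums."""
    def __init__(self, n):
        self.tree = [0] * (n + 1)

    def update(self, i, delta):
        # Add delta at index i, then hop to the next responsible node
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & (-i)        # add the lowest set bit

    def prefix_sum(self, i):
        # Sum of elements 1..i
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & (-i)        # strip the lowest set bit
        return total

    def range_sum(self, l, r):
        return self.prefix_sum(r) - self.prefix_sum(l - 1)

bit = FenwickTree(5)
for i, v in enumerate([2, 1, 5, 3, 4], start=1):
    bit.update(i, v)
print(bit.prefix_sum(3))    # 2 + 1 + 5 = 8
print(bit.range_sum(2, 4))  # 1 + 5 + 3 = 9
```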

13. What is a Trie Data Structure?

Definition

A Trie (also called a Prefix Tree) is a tree-based data structure used to store and retrieve strings efficiently, especially for prefix-based operations.

Each node represents a character, and paths from the root to nodes represent words or prefixes.


Key Properties

  • It is a tree structure for storing strings

  • Each node represents a character

  • Words are formed by traversing from root to leaf

  • Common prefixes are shared among words

  • Uses a special marker to indicate the end of a word


Operations Supported

  • Insertion – Add a word to the Trie

  • Search – Check if a word exists

  • Prefix Search – Check if any word starts with a given prefix


Time Complexity

  • Insertion → O(L)

  • Search → O(L)

  • Prefix Search → O(L)

(L = length of the word)


Key Characteristics

  • Efficient for string and prefix-based queries

  • Avoids repeated storage of common prefixes

  • Faster than hashing in some prefix problems


Example Use Cases

  • Autocomplete systems

  • Spell checking

  • Dictionary implementations

  • Search engines (prefix matching)
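The three operations above can be sketched with a dictionary of child links per node; the end-of-word marker is the `is_end` flag (class and method names are illustrative):

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_end = False   # marks that a word ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def search(self, word):
        node = self._walk(word)
        return node is not None and node.is_end

    def starts_with(self, prefix):
        return self._walk(prefix) is not None

    def _walk(self, s):
        # Follow child links character by character; None if the path breaks
        node = self.root
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

t = Trie()
t.insert("car")
t.insert("card")            # shares the "car" prefix path
print(t.search("car"))      # True
print(t.search("ca"))       # False (prefix only, not a stored word)
print(t.starts_with("ca"))  # True
```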


14. What are the Applications of Trie?

Definition

A Trie is mainly used in applications that involve efficient storage and retrieval of strings, especially when dealing with prefix-based operations.


Applications

1. Autocomplete Systems

  • Suggest words based on a given prefix

  • Example: Search engines, mobile keyboards


2. Spell Checking

  • Check if a word exists in a dictionary

  • Suggest corrections for misspelled words


3. Dictionary Implementation

  • Store a large set of words efficiently

  • Support fast search and prefix queries


4. Prefix Search

  • Find all words starting with a given prefix

  • Used in search bars and filtering systems


5. IP Routing (Longest Prefix Matching)

  • Used in networking to find the longest matching prefix

  • Helps in routing packets efficiently


6. Word Games

  • Used in games like Scrabble or Boggle

  • Helps in checking valid words quickly


7. Pattern Matching

  • Used in problems involving string matching and searching

  • Can be extended for advanced structures like suffix tries


15. What is a Suffix Tree?

Definition

A Suffix Tree is a compressed trie (prefix tree) that stores all suffixes of a given string.

It allows efficient solving of various string processing problems such as pattern matching and substring queries.


Key Properties

  • Represents all suffixes of a string

  • Each edge stores a substring (not just a single character)

  • No two edges from a node start with the same character

  • It is a compressed structure, reducing space compared to a normal trie

  • Can be built in O(n) time for a string of length n (e.g., with Ukkonen's algorithm)


Structure

  • Root → Represents empty string

  • Internal Nodes → Represent shared prefixes

  • Leaf Nodes → Represent suffixes of the string


Time Complexity

  • Construction → O(n)

  • Pattern Search → O(m)

(n = length of text, m = length of pattern)


Key Characteristics

  • Very efficient for substring search problems

  • Stores all suffixes in a compact form

  • Faster than naive string matching approaches


Example

For string "banana", suffixes are:

  • banana

  • anana

  • nana

  • ana

  • na

  • a

These are all represented in the suffix tree.


Applications

  • Pattern matching

  • Longest repeated substring

  • Longest common substring

  • String compression


16. What is a Suffix Array?

Definition

A Suffix Array is a sorted array of all suffixes of a given string, where each element stores the starting index of a suffix in lexicographical (dictionary) order.


Key Properties

  • Stores indices of suffixes, not the suffix strings themselves

  • Suffixes are arranged in lexicographically sorted order

  • More space-efficient than a Suffix Tree

  • Often used with LCP (Longest Common Prefix) array for optimization


Structure

For a string of length n:

  • Generate all n suffixes

  • Sort them lexicographically

  • Store their starting indices in an array


Example

For string "banana":

Suffixes:

  • banana (index 0)

  • anana (index 1)

  • nana (index 2)

  • ana (index 3)

  • na (index 4)

  • a (index 5)

Sorted suffixes:

  • a (5)

  • ana (3)

  • anana (1)

  • banana (0)

  • na (4)

  • nana (2)

Suffix Array: [5, 3, 1, 0, 4, 2]
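The construction above can be reproduced with a short sketch. This naive version simply sorts the suffixes directly (O(n² log n) in the worst case); practical builders use the O(n log n) or O(n) algorithms mentioned below, but the output is the same:

```python
def build_suffix_array(s):
    # Sort suffix starting positions by the suffixes they denote
    return sorted(range(len(s)), key=lambda i: s[i:])

print(build_suffix_array("banana"))  # [5, 3, 1, 0, 4, 2]
```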


Time Complexity

  • Construction → O(n log n) (can be optimized to O(n))

  • Pattern Search → O(log n) (using binary search)


Key Characteristics

  • Uses less memory compared to Suffix Trees

  • Suitable for efficient string searching

  • Often combined with binary search for pattern matching


Applications

  • Pattern matching

  • Longest common prefix problems

  • String sorting problems

  • Data compression algorithms


Key Points

  • Stores sorted suffix indices

  • More space-efficient than suffix trees

  • Useful for fast substring and pattern queries

17. What is a Skip List?

Definition

A Skip List is a probabilistic data structure that allows fast search, insertion, and deletion operations by maintaining multiple levels of linked lists.

It works by "skipping" over elements using higher-level links, making operations nearly as efficient as in balanced trees.


Key Properties

  • Consists of multiple levels of linked lists

  • The bottom level contains all elements in sorted order

  • Higher levels act as express lanes for faster traversal

  • Uses randomization to decide levels of nodes

  • Maintains elements in sorted order


Structure

  • Level 0 → Contains all elements (like a normal linked list)

  • Higher levels → Contain fewer elements, skipping multiple nodes

  • Each node may appear in multiple levels


Operations Supported

  • Search – Traverse from top level to bottom

  • Insertion – Insert element and randomly assign levels

  • Deletion – Remove element from all levels


Time Complexity (Average Case)

  • Search → O(log n)

  • Insertion → O(log n)

  • Deletion → O(log n)


Key Characteristics

  • Simpler to implement than balanced trees

  • Uses randomization instead of strict balancing

  • Provides performance similar to balanced BSTs


Applications

  • Databases and indexing systems

  • Ordered data structures

  • Memory-efficient alternatives to balanced trees


Key Points

  • Probabilistic alternative to balanced trees

  • Uses multiple levels for fast traversal

  • Maintains sorted data with efficient operations

18. What is a Deque and How is it Implemented?

Definition

A Deque (Double-Ended Queue) is a linear data structure in which insertion and deletion can be performed from both ends (front and rear).


Key Properties

  • Supports operations at both front and rear

  • Can function as both queue and stack

  • Maintains elements in a sequential order


Operations Supported

  • Insert Front – Add element at the front

  • Insert Rear – Add element at the rear

  • Delete Front – Remove element from the front

  • Delete Rear – Remove element from the rear

  • Peek Front / Rear – Access elements without removing
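In Python these operations map directly onto `collections.deque`, which supports O(1) insertion and deletion at both ends:

```python
from collections import deque

d = deque([2, 3])
d.appendleft(1)      # Insert Front
d.append(4)          # Insert Rear
print(list(d))       # [1, 2, 3, 4]
print(d[0], d[-1])   # Peek Front / Peek Rear -> 1 4
d.popleft()          # Delete Front
d.pop()              # Delete Rear
print(list(d))       # [2, 3]
```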


Types of Deque

  • Input Restricted Deque – Insertion allowed at only one end, deletion at both

  • Output Restricted Deque – Deletion allowed at only one end, insertion at both


Implementation

1. Array Implementation

  • Use a circular array to efficiently utilize space

  • Maintain two pointers: front and rear

  • Handle wrap-around using modulo operation

Characteristics

  • Fast access using indices

  • Fixed size (unless dynamically resized)


2. Linked List Implementation

  • Use a doubly linked list

  • Each node has pointers to previous and next nodes

Characteristics

  • Dynamic size

  • Efficient insertion and deletion at both ends


Time Complexity

  • Insert Front → O(1)

  • Insert Rear → O(1)

  • Delete Front → O(1)

  • Delete Rear → O(1)


Applications

  • Sliding window problems

  • Implementing queues and stacks

  • Palindrome checking

  • Task scheduling


Key Points

  • Deque allows bidirectional operations

  • Can be implemented using arrays or doubly linked lists

  • All operations are O(1)

  • More flexible than simple queues and stacks

19. What are the Applications of Deque?

Definition

A Deque (Double-Ended Queue) is used in applications where insertion and deletion from both ends are required efficiently.


Applications

1. Sliding Window Problems

  • Used to efficiently find maximum/minimum in a sliding window

  • Example: Maximum of all subarrays of size k
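The classic monotonic-deque solution to this problem keeps indices in decreasing order of value, so the window maximum is always at the front (a sketch of the standard technique):

```python
from collections import deque

def sliding_window_max(nums, k):
    dq, result = deque(), []   # dq holds indices, values decreasing front to rear
    for i, x in enumerate(nums):
        if dq and dq[0] <= i - k:        # front index has left the window
            dq.popleft()
        while dq and nums[dq[-1]] <= x:  # drop smaller elements from the rear
            dq.pop()
        dq.append(i)
        if i >= k - 1:                   # window is full: record its maximum
            result.append(nums[dq[0]])
    return result

print(sliding_window_max([1, 3, -1, -3, 5, 3, 6, 7], 3))
# [3, 3, 5, 5, 6, 7]
```

Each index enters and leaves the deque at most once, so the whole pass is O(n) rather than O(n·k).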


2. Implementing Stack and Queue

  • Can act as both stack (LIFO) and queue (FIFO)

  • Provides flexibility in operations


3. Palindrome Checking

  • Compare elements from front and rear simultaneously

  • Efficient for checking if a string is a palindrome


4. Task Scheduling

  • Used in systems where tasks can be added or removed from both ends

  • Example: CPU scheduling, job processing


5. Breadth-First Search (BFS)

  • Used in graph traversal

  • Efficiently handles insertion and deletion during traversal


6. Undo/Redo Operations

  • Stores operations where actions can be added/removed from both ends

  • Useful in editors and applications


Key Points

  • Supports operations from both ends

  • Useful for range-based and dynamic problems

  • All operations can be performed in O(1) time

  • Widely used in sliding window and real-time processing problems

20. How is LRU Cache Implemented?

Definition

An LRU (Least Recently Used) Cache is a data structure that stores a limited number of items and removes the least recently used item when the cache reaches its capacity.


Approach

An efficient LRU Cache is implemented using a combination of:

  • Hash Map (Dictionary) → for O(1) access to elements

  • Doubly Linked List → to maintain the order of usage


Working Principle

  • The most recently used (MRU) item is kept at the front

  • The least recently used (LRU) item is kept at the end

  • On every access or insertion:

    • Move the item to the front

  • When capacity is exceeded:

    • Remove the item from the end


Operations

1. Get(key)

  • If key exists:

    • Move the node to the front

    • Return value

  • Else:

    • Return -1


2. Put(key, value)

  • If key exists:

    • Update value

    • Move node to front

  • Else:

    • Insert new node at front

    • If capacity exceeded:

      • Remove node from rear (LRU)


Time Complexity

  • Get → O(1)

  • Put → O(1)


Key Characteristics

  • Combines fast lookup (hash map) and order tracking (linked list)

  • Maintains recent usage order efficiently

  • Ensures constant time operations


Applications

  • Memory caching

  • Database query caching

  • Web browser cache

  • Operating systems (page replacement)


Key Points

  • Uses Hash Map + Doubly Linked List

  • Maintains recently used order

  • Removes least recently used element when full

  • All operations are O(1)

21. Which Data Structures are Used in LRU Cache?

Definition

An LRU Cache is implemented using a combination of two data structures to achieve O(1) time complexity for both access and update operations.


Data Structures Used

1. Hash Map (Dictionary)

  • Stores key → node reference

  • Enables O(1) lookup to check if a key exists

  • Helps directly access the corresponding node in the linked list


2. Doubly Linked List

  • Maintains the order of usage

  • Stores elements from most recently used (front) to least recently used (rear)

  • Supports O(1) insertion and deletion


Why This Combination?

  • Hash Map provides fast access to elements

  • Doubly Linked List maintains usage order efficiently

  • Together, they ensure both Get and Put operations run in O(1) time


Key Characteristics

  • Uses Hash Map + Doubly Linked List

  • Hash map handles fast lookup

  • Linked list handles ordering of elements

  • Essential for maintaining LRU policy efficiently

22. What is the Difference Between Heap and Priority Queue?

Definition

A Heap is a tree-based data structure that satisfies the heap property, while a Priority Queue is an abstract data type (ADT) where elements are processed based on their priority.


Key Differences

  • Type: A Heap is a data structure; a Priority Queue is an abstract data type (ADT)

  • Purpose: A Heap maintains the heap property (min/max); a Priority Queue retrieves elements based on priority

  • Implementation: A Heap is typically implemented using arrays (binary heap); a Priority Queue can be implemented using a heap, array, or linked list

  • Ordering: A Heap keeps a partial ordering (parent-child relationship); a Priority Queue serves elements based on priority

  • Operations: A Heap supports insert, delete, and heapify; a Priority Queue supports insert and delete of the highest/lowest-priority element


Relationship Between Them

  • A heap is commonly used to implement a priority queue

  • Using a heap allows priority queue operations to run in O(log n) time
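This relationship is visible in Python's standard library: the `heapq` module maintains a binary min-heap over a plain list, and using it as a priority queue means the smallest priority value is served first:

```python
import heapq

pq = []  # plain list kept in heap order by heapq
heapq.heappush(pq, (2, "write report"))
heapq.heappush(pq, (1, "fix outage"))
heapq.heappush(pq, (3, "reply to email"))

# Lowest priority number = highest urgency comes out first
print(heapq.heappop(pq))  # (1, 'fix outage')
print(heapq.heappop(pq))  # (2, 'write report')
```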


Key Points

  • Heap → Implementation (data structure)

  • Priority Queue → Concept (ADT)

  • Heap provides efficient way to build priority queues

  • Priority queues can also be implemented using other structures but less efficiently

23. How Does Java HashMap Handle Collisions?

Definition

A collision in a HashMap occurs when multiple keys generate the same hash index.

Java HashMap handles collisions using a combination of chaining (linked list) and tree-based structure (Red-Black Tree).


Collision Handling Mechanism

1. Hashing and Index Calculation

  • The key’s hashCode() is computed

  • It is converted into an index using a hashing function

  • Multiple keys may map to the same index (bucket)


2. Chaining using Linked List

  • Each bucket stores elements as a linked list

  • If multiple keys map to the same index:

    • They are stored as nodes in the linked list

  • During search:

    • Traverse the list to find the matching key
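The bucket-array-plus-chaining layout can be illustrated with a simplified Python sketch; this is an illustration of the idea, not the real `java.util.HashMap` code (class name and capacity are chosen for the example, and Java uses linked nodes rather than Python lists):

```python
class ChainedHashMap:
    def __init__(self, capacity=8):
        # Each bucket is a chain of (key, value) pairs
        self.buckets = [[] for _ in range(capacity)]

    def _index(self, key):
        return hash(key) % len(self.buckets)   # hash code -> bucket index

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                       # key already present: update in place
                bucket[i] = (key, value)
                return
        bucket.append((key, value))            # collision: append to the chain

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:  # traverse the chain
            if k == key:
                return v
        return None

m = ChainedHashMap(capacity=2)   # tiny capacity forces collisions
m.put("a", 1)
m.put("b", 2)
m.put("c", 3)
print(m.get("b"))  # 2
print(m.get("z"))  # None
```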


3. Conversion to Red-Black Tree

  • If the number of elements in a bucket exceeds a threshold (typically 8):

    • The linked list is converted into a Red-Black Tree

  • This improves performance from O(n) to O(log n) in worst case


4. Tree to List Conversion

  • If elements reduce below a threshold (typically 6):

    • The structure is converted back to a linked list


Time Complexity

  • Get / Put → O(1) average case, O(log n) worst case


Key Points

  • Uses hashing + chaining

  • Collisions handled via linked list initially

  • Converts to Red-Black Tree for high collisions

  • Ensures efficient performance even in worst cases

24. What is Rehashing in HashMap?

Definition

Rehashing is the process of resizing the HashMap and redistributing its elements into a new larger bucket array when the number of elements exceeds a certain limit.


Why Rehashing is Needed

  • Prevents too many collisions

  • Maintains efficient performance (O(1) average time)

  • Avoids long linked lists or trees in buckets


Load Factor

  • HashMap uses a load factor (default = 0.75)

  • When:
    Number of elements > Capacity × Load Factor
    → Rehashing is triggered


Rehashing Process

  1. Create a new bucket array with larger capacity (usually double)

  2. Recalculate the hash index for each existing key

  3. Insert all elements into the new array based on new indices
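The trigger condition is simple arithmetic; a small sketch of the threshold check (the function is illustrative, not HashMap's actual code):

```python
def needs_rehash(size, capacity, load_factor=0.75):
    # Rehashing is triggered once size exceeds capacity x load factor
    return size > capacity * load_factor

capacity = 16                      # threshold = 16 * 0.75 = 12
print(needs_rehash(12, capacity))  # False (exactly at the threshold)
print(needs_rehash(13, capacity))  # True -> double capacity and reinsert all keys
```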


Time Complexity

  • Rehashing takes O(n) time (since all elements are reinserted)

  • Happens occasionally, so average operations remain efficient


Key Characteristics

  • Triggered when load factor threshold is exceeded

  • Increases capacity of HashMap

  • Reduces collisions and improves performance

  • Ensures average O(1) time complexity for operations

25. What is Consistent Hashing?

Definition

Consistent Hashing is a technique used to distribute data across multiple servers (or nodes) such that when a server is added or removed, only a small portion of data needs to be redistributed.

How It Works

  1. Both data (keys) and servers (nodes) are hashed onto a circular hash space (hash ring)

  2. Each key is placed on the ring using the same hash function

  3. Each key is assigned to the nearest server in the clockwise direction
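A minimal hash ring can be sketched with a sorted list of server positions and a binary search for the clockwise successor (server names and the MD5 choice are illustrative; production systems typically also add virtual nodes for smoother balancing):

```python
import bisect
import hashlib

def _hash(key):
    # Map any string to a stable point on the ring
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers):
        self.ring = sorted((_hash(s), s) for s in servers)

    def server_for(self, key):
        # Clockwise walk: first server position at or after the key's position
        i = bisect.bisect(self.ring, (_hash(key), ""))
        if i == len(self.ring):   # wrap around the circle
            i = 0
        return self.ring[i][1]

    def add_server(self, server):
        bisect.insort(self.ring, (_hash(server), server))

ring = HashRing(["server-a", "server-b", "server-c"])
owner_before = ring.server_for("user:42")
ring.add_server("server-d")
owner_after = ring.server_for("user:42")
# The key either keeps its owner or moves to the new server -- nothing else changes
print(owner_after == owner_before or owner_after == "server-d")  # True
```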


Handling Changes

1. Adding a Server

  • Only the keys between the new server and its previous node are reassigned

  • Rest of the data remains unaffected


2. Removing a Server

  • Its keys are reassigned to the next server on the ring

  • Minimal data movement


Advantages

  • Minimizes data redistribution

  • Provides scalability

  • Ensures load balancing

  • Avoids massive reshuffling of data


Applications

  • Distributed caching systems

  • Load balancing

  • Distributed databases

  • Content delivery networks (CDNs)


Key Characteristics

  • Uses a hash ring (circular structure)

  • Maps both servers and data using hash functions

  • Ensures minimal data movement on scaling

  • Widely used in distributed systems

26. What is a Distributed Hash Table (DHT)?

Definition

A Distributed Hash Table (DHT) is a decentralized system that stores and retrieves key-value pairs across multiple nodes in a distributed network.

It extends the concept of a hash table by distributing data across different machines instead of storing it in a single location.


Key Properties

  • Stores data as key → value pairs

  • Data is distributed across multiple nodes

  • Each node is responsible for a portion of the key space

  • Provides scalable and fault-tolerant storage

  • No central authority (decentralized system)


How It Works

  1. A hash function is used to map keys to positions in a logical space

  2. Each node is assigned a range of keys

  3. When storing or retrieving data:

    • The system routes the request to the node responsible for that key


Key Characteristics

  • Supports efficient lookup, insertion, and deletion

  • Lookup operations typically take O(log n) time

  • Handles node joins and failures dynamically

  • Ensures data availability and load distribution


Applications

  • Peer-to-peer (P2P) networks

  • Distributed file systems

  • Distributed caching systems

  • Content sharing systems (e.g., BitTorrent)


Key Points

  • Distributed version of a hash table

  • Provides scalable and decentralized data storage

  • Uses hashing to distribute data across nodes

  • Efficient for large-scale distributed systems

27. What is Union-Find (Disjoint Set)?

Definition

Union-Find, also known as a Disjoint Set, is a data structure used to keep track of a set of elements partitioned into disjoint (non-overlapping) subsets.

It efficiently supports operations to find which set an element belongs to and to merge two sets.


Key Operations

1. Find(x)

  • Returns the representative (parent/root) of the set containing element x

  • Used to check if two elements belong to the same set


2. Union(x, y)

  • Merges the sets containing elements x and y

  • Ensures both elements belong to the same set after operation


Key Properties

  • Each set has a representative (root node)

  • Initially, each element is in its own set

  • Sets are merged using union operation

  • Uses a tree-based structure internally


Optimizations

  • Path Compression → Flattens the tree during find operation

  • Union by Rank/Size → Attaches smaller tree under larger tree

These optimizations make operations nearly constant time (O(α(n))).
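A compact Python sketch combining both optimizations (the class and method names are illustrative):

```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))   # each element starts in its own set
        self.rank = [0] * n

    def find(self, x):
        # Path compression: point x (and its ancestors) directly at the root.
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False               # already in the same set
        # Union by rank: attach the shorter tree under the taller one.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```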


Time Complexity

| Operation | Time Complexity |
|-----------|-----------------|
| Find      | O(α(n))         |
| Union     | O(α(n))         |

(α(n) is the inverse Ackermann function, very slow-growing)


Applications

  • Detecting cycles in graphs

  • Kruskal’s Algorithm (Minimum Spanning Tree)

  • Network connectivity problems

  • Grouping or clustering problems


Key Characteristics

  • Manages disjoint sets efficiently

  • Supports find and union operations

  • Uses path compression and union by rank

  • Nearly constant time operations

28. What is Path Compression in Union-Find?

Definition

Path Compression is an optimization technique used in Union-Find (Disjoint Set) to make the Find operation faster by flattening the structure of the tree.


Key Idea

  • During the Find(x) operation, all nodes along the path from x to the root are directly connected to the root

  • This reduces the height of the tree, making future operations faster


How It Works

  • When finding the root of an element:

    • Recursively find the root

    • Update each node’s parent to point directly to the root


Example

Before path compression:
A chain: 1 → 2 → 3 → 4 (node 4 is the root)

After applying Find(1):
Nodes 1, 2, and 3 all point directly to the root 4
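The flattening step can be demonstrated directly on a parent map (a minimal sketch; the dict-based representation is just for illustration):

```python
def find(parent, x):
    # Recursively locate the root, then point every node on the
    # path directly at it (path compression).
    if parent[x] != x:
        parent[x] = find(parent, parent[x])
    return parent[x]

# Chain 1 -> 2 -> 3 -> 4 (node 4 is its own parent, i.e. the root).
parent = {1: 2, 2: 3, 3: 4, 4: 4}
find(parent, 1)
# Afterwards 1, 2, and 3 all point directly at the root 4.
```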


Benefits

  • Reduces tree height significantly

  • Speeds up future Find operations

  • Works best when combined with Union by Rank/Size


Time Complexity

  • Makes operations nearly O(α(n)) (almost constant time)


Key Characteristics

  • Flattens the tree during Find operation

  • Improves efficiency of Union-Find

  • Ensures very fast repeated operations

  • Commonly used with Union by Rank

29. What is a Strongly Connected Component in a Graph?

Definition

A Strongly Connected Component (SCC) is a maximal group of vertices in a directed graph such that every vertex is reachable from every other vertex in the same group.


Key Idea

  • In a directed graph, for any two vertices u and v in an SCC:

    • There exists a path from u → v

    • And a path from v → u


Key Properties

  • Applicable only to directed graphs

  • Each SCC is a maximal set (cannot be extended further)

  • A graph can have one or multiple SCCs

  • Every vertex belongs to exactly one SCC


Example

If nodes A, B, C are all reachable from each other, they form one SCC.
Another group of nodes may form a separate SCC.


Algorithms to Find SCC

  • Kosaraju’s Algorithm (uses two DFS traversals)

  • Tarjan’s Algorithm (uses a single DFS traversal)


Applications

  • Detecting cycles in directed graphs

  • Analyzing social networks

  • Dependency resolution in systems

  • Compiler design


Key Characteristics

  • SCC = mutually reachable vertices

  • Exists only in directed graphs

  • Forms independent components in a graph

  • Found using DFS-based algorithms

30. What is Kosaraju’s Algorithm?

Definition

Kosaraju’s Algorithm is a graph algorithm used to find all Strongly Connected Components (SCCs) in a directed graph using two Depth-First Search (DFS) traversals.


Key Idea

The algorithm works by:

  • Ordering vertices based on finishing times in DFS

  • Reversing the graph

  • Running DFS again to identify SCCs


Steps of Algorithm

  1. Perform DFS on the original graph

    • Push vertices onto a stack in order of their finishing time

  2. Reverse (transpose) the graph

    • Reverse the direction of all edges

  3. Perform DFS on the reversed graph

    • Process vertices in the order of the stack (top to bottom)

    • Each DFS traversal gives one Strongly Connected Component
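The two passes can be sketched in Python (recursive DFS for clarity; very deep graphs would need an iterative version to avoid the recursion limit):

```python
from collections import defaultdict

def kosaraju(n, edges):
    graph, rgraph = defaultdict(list), defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        rgraph[v].append(u)        # transpose (reversed) graph

    # Pass 1: DFS on the original graph, recording finishing order.
    visited, order = [False] * n, []
    def dfs1(u):
        visited[u] = True
        for v in graph[u]:
            if not visited[v]:
                dfs1(v)
        order.append(u)            # appended after all descendants finish

    for u in range(n):
        if not visited[u]:
            dfs1(u)

    # Pass 2: DFS on the reversed graph in reverse finishing order.
    visited, sccs = [False] * n, []
    def dfs2(u, comp):
        visited[u] = True
        comp.append(u)
        for v in rgraph[u]:
            if not visited[v]:
                dfs2(v, comp)

    for u in reversed(order):
        if not visited[u]:
            comp = []
            dfs2(u, comp)
            sccs.append(comp)      # each second-pass DFS yields one SCC
    return sccs
```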


Key Properties

  • Uses two DFS traversals

  • Works only on directed graphs

  • Identifies all SCCs efficiently


Time Complexity

| Operation | Time Complexity |
|-----------|-----------------|
| Overall   | O(V + E)        |

(V = number of vertices, E = number of edges)


Applications

  • Finding strongly connected components

  • Detecting cycles in directed graphs

  • Analyzing graph connectivity


Key Characteristics

  • Based on DFS and graph reversal

  • Uses a stack to track finishing order

  • Efficient for large graphs with O(V + E) complexity

31. What is Tarjan’s Algorithm?

Definition

Tarjan’s Algorithm is a graph algorithm used to find all Strongly Connected Components (SCCs) in a directed graph using a single Depth-First Search (DFS) traversal.


Key Idea

The algorithm assigns each node:

  • A discovery time (disc) → when the node is first visited

  • A low value (low) → the smallest discovery time reachable from that node

Using these values, it identifies SCCs during DFS.


Steps of Algorithm

  1. Perform DFS traversal of the graph

  2. Assign each node:

    • disc[u] = discovery time

    • low[u] = minimum reachable discovery time

  3. Maintain a stack to store visited nodes

  4. For each node:

    • Update low values based on neighbors

    • If disc[u] == low[u], it is the root of an SCC

  5. Pop nodes from the stack until the current node is reached → forms one SCC
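A Python sketch of these steps (adjacency lists indexed by vertex number; recursive DFS, so very deep graphs would need an iterative variant):

```python
def tarjan(n, adj):
    disc = [-1] * n                # discovery times; -1 = unvisited
    low = [0] * n
    on_stack = [False] * n
    stack, sccs = [], []
    timer = 0

    def dfs(u):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        stack.append(u)
        on_stack[u] = True
        for v in adj[u]:
            if disc[v] == -1:
                dfs(v)
                low[u] = min(low[u], low[v])
            elif on_stack[v]:              # back edge inside the current SCC
                low[u] = min(low[u], disc[v])
        if disc[u] == low[u]:              # u is the root of an SCC
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == u:
                    break
            sccs.append(comp)

    for u in range(n):
        if disc[u] == -1:
            dfs(u)
    return sccs
```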


Key Properties

  • Uses single DFS traversal

  • Maintains a stack to track nodes in current SCC

  • Efficient and avoids graph reversal


Time Complexity

| Operation | Time Complexity |
|-----------|-----------------|
| Overall   | O(V + E)        |

(V = number of vertices, E = number of edges)


Applications

  • Finding Strongly Connected Components

  • Cycle detection in directed graphs

  • Dependency analysis


Key Points

  • Uses disc and low arrays

  • Identifies SCCs in one DFS pass

  • More efficient than algorithms requiring multiple traversals

  • Based on DFS and stack mechanism

32. What is Floyd-Warshall Algorithm?

Definition

The Floyd-Warshall Algorithm is a dynamic programming algorithm used to find the shortest paths between all pairs of vertices in a weighted graph.

It works for both directed and undirected graphs, and can handle negative edge weights (but not negative cycles).


Key Idea

  • It gradually improves the shortest path between every pair of vertices

  • Uses an intermediate vertex to check if a shorter path exists


Core Concept

d[i][j] = min(d[i][j], d[i][k] + d[k][j])

This means:

  • Either keep the current shortest path from i to j

  • Or update it using an intermediate vertex k


Steps of Algorithm

  1. Initialize a distance matrix with given edge weights

  2. Set distance of a node to itself as 0

  3. For each vertex k:

    • Update all pairs (i, j) using the formula above

  4. Repeat until all vertices are considered as intermediates
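A minimal Python sketch of these steps, taking an edge list as input (the input format is an illustrative assumption):

```python
INF = float('inf')

def floyd_warshall(n, edges):
    # dist[i][j] starts as the direct edge weight (INF if no edge),
    # and the distance from a node to itself is 0.
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
    # Allow each vertex k in turn as an intermediate point.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```

A negative cycle shows up afterwards as a negative value on the diagonal (`dist[i][i] < 0`).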


Key Properties

  • Works on weighted graphs

  • Can detect negative cycles

  • Computes shortest paths between all pairs of nodes


Time Complexity

| Operation | Time Complexity |
|-----------|-----------------|
| Overall   | O(V³)           |


Applications

  • All-pairs shortest path problems

  • Network routing

  • Graph analysis

  • Detecting negative cycles


Key Characteristics

  • Based on dynamic programming

  • Uses matrix representation

  • Handles negative weights (not negative cycles)

  • Simple but less efficient for very large graphs

33. What is A* Search Algorithm?

Definition

A* (A-star) is a pathfinding and graph traversal algorithm used to find the shortest path from a source node to a target node using both actual cost and heuristic estimates.


Key Idea

A* selects the next node based on a function that combines:

  • Actual cost from start (g(n))

  • Estimated cost to goal (h(n))


Evaluation Function

f(n) = g(n) + h(n)

  • f(n) → Total estimated cost

  • g(n) → Cost from start to current node

  • h(n) → Heuristic estimate to goal


How It Works

  1. Start from the source node

  2. Maintain an open list (nodes to explore)

  3. Select node with lowest f(n)

  4. Expand neighbors and update costs

  5. Repeat until the goal node is reached
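A grid-based Python sketch using Manhattan distance as the heuristic (the grid format and 4-directional movement are illustrative assumptions):

```python
import heapq

def a_star(grid, start, goal):
    # grid: 0 = free cell, 1 = wall; start/goal: (row, col) tuples.
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for 4-directional unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (f, g, node)
    best_g = {start: 0}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g                     # length of the shortest path
        if g > best_g.get(node, float('inf')):
            continue                     # stale heap entry, skip it
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float('inf')):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return -1                            # goal unreachable
```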


Key Properties

  • Uses both actual distance and heuristic

  • Guarantees shortest path if heuristic is admissible

  • More efficient than Dijkstra when heuristic is good


Time Complexity

  • Depends on implementation and heuristic

  • Worst case: O(b^d)
    (b = branching factor, d = depth)


Applications

  • Pathfinding in maps and games

  • Navigation systems (GPS)

  • Robotics and AI

  • Maze solving


Key Points

  • Combines Dijkstra + Greedy Best-First Search

  • Uses heuristic function for efficiency

  • Finds optimal path with proper heuristic

  • Widely used in AI and pathfinding problems

34. What is Network Flow Problem?

Definition

The Network Flow Problem is a problem of finding the maximum amount of flow that can be sent from a source node to a sink node in a directed graph, subject to capacity constraints on edges.


Key Components

  • Source (s) → Starting node where flow originates

  • Sink (t) → Destination node where flow ends

  • Capacity → Maximum flow allowed through an edge

  • Flow → Amount of data passing through edges


Key Constraints

  • Flow on an edge cannot exceed its capacity

  • Flow entering a node equals flow leaving it (flow conservation)

  • Flow moves from source to sink through valid paths


Objective

  • Maximize the total flow from source to sink


Key Algorithms

  • Ford-Fulkerson Algorithm

  • Edmonds-Karp Algorithm (BFS-based)

  • Dinic’s Algorithm


Time Complexity

  • Depends on the algorithm used

  • Example: Edmonds-Karp → O(VE²)


Applications

  • Network bandwidth optimization

  • Traffic and transportation systems

  • Bipartite matching problems

  • Supply chain and logistics


Key Characteristics

  • Directed graph with capacity constraints

  • Goal is to find maximum flow

  • Uses augmenting paths

  • Fundamental problem in graph theory and optimization


35. What is Ford-Fulkerson Algorithm?

Definition

The Ford-Fulkerson Algorithm is a method used to find the maximum flow in a flow network by repeatedly finding augmenting paths from the source to the sink.


Key Idea

  • Start with zero flow

  • Find a path from source to sink with available capacity

  • Push flow through that path

  • Repeat until no more augmenting paths exist


Key Concept

  • Augmenting Path → A path from source to sink where all edges have remaining capacity

  • Residual Graph → Graph showing remaining capacity after each flow update


Steps of Algorithm

  1. Initialize flow in all edges as 0

  2. While there exists an augmenting path:

    • Find the minimum residual capacity (bottleneck) in the path

    • Add this flow to all edges in the path

    • Update the residual graph

  3. Repeat until no augmenting path is found


Time Complexity

  • O(E × max_flow)

  • Depends on the value of maximum flow


Key Properties

  • Works on directed graphs with capacities

  • Uses residual graph for updates

  • May not be efficient for large capacities


Applications

  • Network flow problems

  • Bipartite matching

  • Resource allocation

  • Traffic systems


Key Characteristics

  • Based on finding augmenting paths repeatedly

  • Uses residual capacity concept

  • Forms the basis for advanced algorithms like Edmonds-Karp

  • Computes maximum flow from source to sink

36. What is Edmonds-Karp Algorithm?

Definition

The Edmonds-Karp Algorithm is an implementation of the Ford-Fulkerson Algorithm used to find the maximum flow in a flow network, where the augmenting paths are found using Breadth-First Search (BFS).


Key Idea

  • Instead of choosing any augmenting path, it always selects the shortest path (in terms of number of edges) from source to sink

  • This ensures better and predictable performance


How It Works

  1. Start with zero flow

  2. Use BFS to find the shortest augmenting path in the residual graph

  3. Find the minimum capacity (bottleneck) along the path

  4. Add flow along the path

  5. Update the residual graph

  6. Repeat until no augmenting path exists
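These steps can be sketched in Python with an adjacency-matrix representation (the matrix input format is an illustrative choice; the matrix is updated in place as the residual graph):

```python
from collections import deque

def edmonds_karp(capacity, source, sink):
    # capacity: n x n matrix of edge capacities, mutated into residuals.
    n = len(capacity)
    max_flow = 0
    while True:
        # BFS for the shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and capacity[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break                        # no augmenting path remains
        # Minimum residual capacity (bottleneck) along the path.
        bottleneck, v = float('inf'), sink
        while v != source:
            u = parent[v]
            bottleneck = min(bottleneck, capacity[u][v])
            v = u
        # Push the bottleneck flow: update forward and reverse edges.
        v = sink
        while v != source:
            u = parent[v]
            capacity[u][v] -= bottleneck
            capacity[v][u] += bottleneck
            v = u
        max_flow += bottleneck
    return max_flow
```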


Key Properties

  • Uses BFS for path selection

  • Guarantees finding shortest augmenting paths

  • Improves efficiency over basic Ford-Fulkerson


Time Complexity

| Operation | Time Complexity |
|-----------|-----------------|
| Overall   | O(V × E²)       |

(V = vertices, E = edges)


Applications

  • Maximum flow problems

  • Bipartite matching

  • Network routing and traffic systems


Key Points

  • A specific implementation of Ford-Fulkerson

  • Uses BFS instead of DFS

  • Ensures polynomial time complexity

  • Works with residual graphs and augmenting paths

38. What is Topological Sorting Used For?

Definition

Topological Sorting is used to order the vertices of a Directed Acyclic Graph (DAG) such that for every directed edge u → v, node u appears before v in the ordering.


Uses of Topological Sorting

1. Task Scheduling

  • Determines the correct order of tasks based on dependencies

  • Example: Completing tasks where some tasks must be done before others


2. Course Scheduling

  • Used to find a valid order of courses based on prerequisites

  • Example: Course A must be completed before Course B


3. Dependency Resolution

  • Resolves dependencies in systems like:

    • Build systems

    • Package managers


4. Job Scheduling in Operating Systems

  • Helps in scheduling jobs where certain jobs depend on others


5. Detecting Cycles in Directed Graphs

  • If topological sorting is not possible, the graph contains a cycle


Key Characteristics

  • Applicable only to Directed Acyclic Graphs (DAGs)

  • Ensures dependency order is maintained

  • Used in scheduling and dependency problems

  • Can be implemented using DFS or BFS (Kahn’s Algorithm)
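Kahn's Algorithm, the BFS-based implementation, both produces a valid ordering and detects cycles; a minimal Python sketch:

```python
from collections import deque

def topological_sort(n, edges):
    # Kahn's algorithm: repeatedly remove vertices with in-degree 0.
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    queue = deque(u for u in range(n) if indeg[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) < n:
        return None        # a cycle prevents a complete ordering
    return order
```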

39. What is Dynamic Programming Optimization?

Definition

Dynamic Programming (DP) Optimization refers to techniques used to improve the efficiency of DP solutions by reducing their time and/or space complexity.


Key Idea

  • Standard DP solutions may have high time complexity (e.g., O(n²), O(n³))

  • Optimization techniques aim to reduce redundant computations and speed up transitions


Common DP Optimization Techniques

1. Memoization and Tabulation Optimization

  • Avoid recomputation by storing results

  • Convert recursive DP to iterative (bottom-up) for better performance


2. Space Optimization

  • Reduce space from O(n²) to O(n) or O(1)

  • Example: Using rolling arrays instead of full DP tables


3. Binary Search Optimization

  • Apply binary search within DP transitions

  • Used when states are monotonic


4. Knuth Optimization

  • Reduces time complexity of certain DP problems from O(n³) to O(n²)

  • Applicable when quadrangle inequality and monotonicity hold


5. Divide and Conquer Optimization

  • Splits DP computation into smaller parts

  • Reduces complexity for specific problems


6. Convex Hull Trick

  • Optimizes DP involving linear functions

  • Reduces complexity using geometric properties


Key Characteristics

  • Applied to optimize existing DP solutions

  • Requires identifying patterns or properties in transitions (e.g., monotonicity, convexity)

  • Uses mathematical and structural insights

  • Often reduces time and/or space complexity significantly

  • Important for solving large-scale DP problems efficiently


Applications

  • Matrix chain multiplication optimization

  • Range DP problems

  • Optimization problems in competitive programming

40. What is Memoization?

Definition

Memoization is a technique used in dynamic programming where the results of expensive function calls are stored (cached) so that they can be reused when the same inputs occur again.


Key Idea

  • Avoid recomputing the same subproblems

  • Store results in a cache (array/map)

  • Return stored result if already computed


How It Works

  1. When a function is called:

    • Check if result is already stored

  2. If yes:

    • Return the stored value

  3. If no:

    • Compute the result

    • Store it in memory

    • Return the result


Key Properties

  • Typically used with recursive solutions

  • Converts exponential time complexity to polynomial

  • Uses extra space for caching results


Example

In Fibonacci:

Without memoization → repeated calculations
With memoization → each value computed only once
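A Python sketch of memoized Fibonacci using the standard-library `functools.lru_cache` decorator:

```python
from functools import lru_cache

@lru_cache(maxsize=None)       # caches each result, keyed by n
def fib(n):
    if n < 2:
        return n
    # Each subproblem is computed once; repeats hit the cache.
    return fib(n - 1) + fib(n - 2)
```

Without the decorator this runs in O(2ⁿ); with it, each of the n subproblems is computed once.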


Time Complexity

  • Reduced from O(2ⁿ) to O(n) in many cases


Key Points

  • Stores results of previous computations

  • Eliminates redundant work

  • Improves efficiency of recursive algorithms

  • Core concept in dynamic programming

41. What is Tabulation in Dynamic Programming?

Definition

Tabulation is a bottom-up approach in dynamic programming where problems are solved by iteratively filling a table (array) starting from base cases and building up to the final solution.


Key Idea

  • Solve smaller subproblems first

  • Store results in a table (DP array)

  • Use these results to compute larger subproblems


How It Works

  1. Initialize a DP table with base cases

  2. Iterate through all subproblems in a systematic order

  3. Use previously computed values to fill the table

  4. The final answer is stored in the table


Key Properties

  • Uses an iterative approach (no recursion)

  • Avoids function call overhead

  • Ensures each subproblem is solved only once


Example

Fibonacci using tabulation:

  • Start with base cases:
    dp[0] = 0, dp[1] = 1

  • Fill table:
    dp[i] = dp[i-1] + dp[i-2]
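The tabulated version in Python:

```python
def fib_tab(n):
    if n < 2:
        return n
    dp = [0] * (n + 1)
    dp[1] = 1                          # base cases: dp[0] = 0, dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]  # build up from smaller subproblems
    return dp[n]
```

Since only the last two values are ever read, the table can be reduced to two variables for O(1) space.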


Time Complexity

  • Typically O(n) (depends on problem)


Key Characteristics

  • Bottom-up DP approach

  • Uses a table to store results

  • Eliminates recursion

  • Efficient and easy to implement for many problems

42. What is the Difference Between Greedy and Backtracking?

Definition

Greedy algorithms make the locally optimal choice at each step hoping to achieve a global optimum, while Backtracking explores all possible solutions and selects the valid or optimal one by undoing choices when needed.


Key Differences

| Feature | Greedy | Backtracking |
|---|---|---|
| Approach | Chooses best option at each step | Explores all possible solutions |
| Decision Making | Local optimum | Global optimum (via exploration) |
| Reconsideration | Does not reconsider choices | Backtracks (undoes choices) |
| Complexity | Usually efficient (O(n), O(log n)) | Often exponential (O(2ⁿ), O(n!)) |
| Implementation | Simple and fast | More complex (uses recursion) |
| Guarantee | May not always give optimal solution | Guarantees correct solution |


Examples

  • Greedy:

    • Activity Selection

    • Huffman Coding

    • Dijkstra (without negative weights)

  • Backtracking:

    • N-Queens

    • Sudoku Solver

    • Permutations / Subsets


Key Points

  • Greedy: Fast, simple, but not always optimal

  • Backtracking: Slower, but guarantees correct solution

  • Use greedy when problem has greedy-choice property

  • Use backtracking when all possibilities must be explored

43. What is Space-Time Tradeoff?

Definition

The space-time tradeoff is a concept in computer science where an algorithm uses more memory (space) to reduce execution time, or uses less memory at the cost of increased execution time.


Key Idea

  • You can save time by using extra space

  • Or save space by doing more computations (taking more time)


Examples

1. Using Extra Space to Save Time

  • Hashing (Hash Map)

    • Lookup in O(1) time

    • Uses extra memory


2. Saving Space but Increasing Time

  • Brute Force Search

    • No extra memory

    • Takes O(n) time


3. Dynamic Programming

  • Stores intermediate results (extra space)

  • Avoids recomputation → reduces time


Key Characteristics

  • Tradeoff between memory usage and speed

  • More space → faster execution; less space → slower execution

  • Important in system design and optimization

  • The right balance depends on problem requirements and constraints


Applications

  • Caching systems

  • Dynamic programming

  • Database indexing

  • Algorithm optimization

44. What is Tail Recursion?

Definition

Tail Recursion is a type of recursion where the recursive call is the last operation performed in the function, and there is no pending work after the call returns.


Key Idea

  • The function returns the result of the recursive call directly

  • No computation is done after the recursive call


Example

Tail Recursive Function (Factorial)

def factorial(n, result=1):
    if n == 0:
        return result
    return factorial(n - 1, n * result)


Here, the recursive call is the last statement, and no further work is needed after it.


Non-Tail Recursive Example

def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n - 1)


Here, multiplication happens after the recursive call, so it is not tail recursion.


Key Properties

  • Recursive call is the final operation

  • No pending operations after the call

  • Can be optimized by the compiler into iteration (tail call optimization); note that some language implementations, such as CPython, do not perform this optimization


Benefits

  • Reduces stack space usage

  • Prevents stack overflow in deep recursion

  • Can be as efficient as loops


Key Points

  • Last operation is the recursive call

  • No work after recursion

  • Enables tail call optimization (TCO)

  • Improves space efficiency

45. What is Iterative Deepening Search?

Definition

Iterative Deepening Search (IDS) is a graph traversal algorithm that combines the advantages of Depth-First Search (DFS) and Breadth-First Search (BFS) by performing depth-limited DFS repeatedly with increasing depth limits.


Key Idea

  • Perform DFS with a depth limit

  • Increase the depth limit step by step

  • Repeat until the goal node is found


How It Works

  1. Start with depth limit = 0

  2. Perform Depth-Limited DFS

  3. If goal is not found:

    • Increase depth limit by 1

  4. Repeat until the goal is reached
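A minimal Python sketch of these steps (the dict-based adjacency list and the `max_depth` cap are illustrative assumptions):

```python
def iterative_deepening(adj, start, goal, max_depth=20):
    def dls(node, depth):
        # Depth-limited DFS: stop descending once the limit is hit.
        if node == goal:
            return True
        if depth == 0:
            return False
        return any(dls(nb, depth - 1) for nb in adj.get(node, []))

    for limit in range(max_depth + 1):
        if dls(start, limit):
            return limit       # depth at which the goal was first found
    return -1                  # goal not found within max_depth
```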


Key Properties

  • Combines space efficiency of DFS

  • Ensures optimal solution like BFS (for uniform cost)

  • Explores nodes level by level


Time Complexity

  • O(b^d)
    (b = branching factor, d = depth of solution)


Space Complexity

  • O(d) (same as DFS)


Applications

  • AI search problems

  • Game tree exploration

  • Pathfinding in unknown depth graphs


Key Points

  • Repeated depth-limited DFS

  • Finds optimal solution like BFS

  • Uses less memory than BFS

  • Suitable when depth of solution is unknown

46. What is the Difference Between BFS, DFS, and Dijkstra?

Definition

  • BFS (Breadth-First Search): Traverses a graph level by level using a queue

  • DFS (Depth-First Search): Traverses as deep as possible before backtracking using a stack/recursion

  • Dijkstra’s Algorithm: Finds the shortest path from a source node to all other nodes in a weighted graph


Key Differences

| Feature | BFS | DFS | Dijkstra |
|---|---|---|---|
| Type | Traversal algorithm | Traversal algorithm | Shortest path algorithm |
| Data Structure | Queue | Stack / recursion | Priority queue (min-heap) |
| Graph Type | Unweighted | Unweighted | Weighted (non-negative weights) |
| Goal | Visit all nodes level-wise | Explore paths deeply | Find shortest distances |
| Shortest Path | Yes (only in unweighted graphs) | No guarantee | Yes (weighted graphs) |
| Approach | Level-by-level | Depth-first | Greedy (minimum distance first) |


Time Complexity

| Algorithm | Time Complexity  |
|-----------|------------------|
| BFS       | O(V + E)         |
| DFS       | O(V + E)         |
| Dijkstra  | O((V + E) log V) |


Use Cases

  • BFS: Shortest path in unweighted graphs, level-order traversal

  • DFS: Cycle detection, topological sorting, backtracking

  • Dijkstra: Shortest path in weighted graphs (e.g., maps, routing)


Key Characteristics

  • BFS: Uses queue, explores level-wise

  • DFS: Uses stack/recursion, explores depth-wise

  • Dijkstra: Uses priority queue, finds minimum distance paths

  • BFS & DFS are traversal algorithms, while Dijkstra is a shortest path algorithm

47. How Do You Detect Cycles in Directed Graphs?

Definition

Cycle detection in a directed graph involves checking whether there exists a path that starts and ends at the same node following edge directions.


Methods to Detect Cycles

1. Using DFS (Recursion Stack)

Key Idea

  • Track visited nodes and maintain a recursion stack

  • If a node is visited again while still in the recursion stack → cycle exists

Steps

  1. Mark node as visited

  2. Add node to recursion stack

  3. Recur for all neighbors:

    • If neighbor is not visited → DFS

    • If neighbor is in recursion stack → cycle detected

  4. Remove node from recursion stack after DFS
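The recursion-stack method can be sketched in Python using the common white/gray/black coloring (gray marks nodes currently on the recursion stack):

```python
def has_cycle(n, adj):
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited / on recursion stack / done
    color = [WHITE] * n

    def dfs(u):
        color[u] = GRAY
        for v in adj[u]:
            if color[v] == GRAY:   # back edge: v is on the current path
                return True
            if color[v] == WHITE and dfs(v):
                return True
        color[u] = BLACK           # leave the recursion stack
        return False

    return any(color[u] == WHITE and dfs(u) for u in range(n))
```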


2. Using Topological Sorting (Kahn’s Algorithm)

Key Idea

  • A directed graph has a cycle if topological sorting is not possible

Steps

  1. Compute in-degree of all nodes

  2. Add nodes with in-degree = 0 to a queue

  3. Process nodes and reduce in-degrees

  4. If all nodes are processed → no cycle

  5. If some nodes remain → cycle exists


Time Complexity

| Method           | Time Complexity |
|------------------|-----------------|
| DFS              | O(V + E)        |
| Kahn’s Algorithm | O(V + E)        |


Key Characteristics

  • Applicable only to directed graphs

  • DFS with recursion stack is most commonly used

  • Topological sort failure indicates a cycle

  • Both methods run in O(V + E) time

48. How Do You Check if a Binary Tree is a BST?

Definition

To check if a binary tree is a Binary Search Tree (BST), we verify that for every node:

  • All nodes in the left subtree are less than the node

  • All nodes in the right subtree are greater than the node


Method 1: Using Range (Min-Max Approach)

Key Idea

  • Each node must lie within a valid range (min, max)

  • Update range while traversing the tree

Steps

  1. Start with range (-∞, +∞)

  2. For each node:

    • Check if value is within range

    • Recursively check:

      • Left subtree → range (min, node.value)

      • Right subtree → range (node.value, max)
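The min-max approach in Python (the `Node` class is a minimal illustrative tree node):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def is_bst(node, low=float('-inf'), high=float('inf')):
    # An empty subtree is trivially a valid BST.
    if node is None:
        return True
    if not (low < node.val < high):
        return False
    # Left values must stay below node.val, right values above it.
    return (is_bst(node.left, low, node.val) and
            is_bst(node.right, node.val, high))
```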


Method 2: Inorder Traversal

Key Idea

  • Inorder traversal of a BST gives a sorted sequence

Steps

  1. Perform inorder traversal

  2. Check if elements are in strictly increasing order


Time Complexity

| Method       | Time Complexity |
|--------------|-----------------|
| Both methods | O(n)            |


Space Complexity

  • O(h) (recursion stack, where h = height of tree)


Key Characteristics

  • Use range validation for correctness

  • Inorder traversal should be sorted

  • Must check entire subtree, not just immediate children

  • Works in linear time O(n)

49. How Do You Find the Lowest Common Ancestor (LCA) in a Tree?

Definition

The Lowest Common Ancestor (LCA) of two nodes in a tree is the lowest (deepest) node that has both nodes as descendants.


Method 1: General Binary Tree (DFS Approach)

Key Idea

  • Traverse the tree recursively

  • If a node matches either of the target nodes, return it

  • If both left and right subtrees return non-null → current node is LCA

Steps

  1. If root is null, return null

  2. If root equals p or q, return root

  3. Recursively search in left and right subtrees

  4. If both sides return non-null → current node is LCA

  5. Else return the non-null result
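A Python sketch of the DFS approach (here `p` and `q` are passed as values, and the `Node` class is illustrative):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def lca(root, p, q):
    # Returns the lowest node having both p and q as descendants.
    if root is None or root.val == p or root.val == q:
        return root
    left = lca(root.left, p, q)
    right = lca(root.right, p, q)
    if left and right:       # p and q found in different subtrees
        return root
    return left or right     # both on one side (or neither found)
```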


Method 2: Binary Search Tree (BST)

Key Idea

  • Use BST property to find LCA efficiently

Steps

  1. If both nodes are less than root → move to left subtree

  2. If both nodes are greater than root → move to right subtree

  3. Else → current node is LCA


Time Complexity

| Case        | Time Complexity |
|-------------|-----------------|
| Binary Tree | O(n)            |
| BST         | O(h)            |

(h = height of tree)


Space Complexity

  • O(h) (recursion stack)


Key Characteristics

  • LCA is the lowest node common to both nodes

  • Use DFS for general trees

  • Use BST properties for optimization

  • Works in linear or logarithmic time depending on tree type

50. How Do You Serialize and Deserialize a Binary Tree?

Definition

Serialization is the process of converting a binary tree into a string or array representation, while Deserialization is the process of reconstructing the tree from that representation.


Key Idea

  • Preserve both node values and structure

  • Use a special marker (e.g., "null") to represent missing nodes


Method: Using Level Order Traversal (BFS)

Serialization Steps

  1. Use a queue for level-order traversal

  2. For each node:

    • Add its value to the result

    • If node is null, add "null"

  3. Continue until all nodes are processed


Deserialization Steps

  1. Read values from the serialized data

  2. Create the root node

  3. Use a queue to reconstruct the tree

  4. For each node:

    • Assign left and right children using next values

    • Add non-null children to the queue
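A Python sketch of BFS-based serialization and deserialization (the comma-separated string format and the `Node` class are illustrative choices):

```python
from collections import deque

class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def serialize(root):
    if root is None:
        return ""
    out, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            out.append("null")          # marker for a missing child
        else:
            out.append(str(node.val))
            queue.append(node.left)
            queue.append(node.right)
    return ",".join(out)

def deserialize(data):
    if not data:
        return None
    values = iter(data.split(","))
    root = Node(int(next(values)))
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Consume the next two tokens as this node's children.
        for side in ("left", "right"):
            v = next(values, "null")
            if v != "null":
                child = Node(int(v))
                setattr(node, side, child)
                queue.append(child)
    return root
```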


Alternative Method: Using Preorder Traversal (DFS)

Serialization

  • Traverse in preorder (root → left → right)

  • Add "null" for missing nodes

Deserialization

  • Reconstruct tree recursively using the same order


Time Complexity

| Operation   | Time Complexity |
|-------------|-----------------|
| Serialize   | O(n)            |
| Deserialize | O(n)            |


Key Points

  • Must store structure + values

  • Use "null" markers for missing nodes

  • Can use BFS or DFS approaches

  • Ensures accurate reconstruction of the tree

51. What is the Diameter of a Binary Tree?

Definition

The diameter of a binary tree is the length of the longest path between any two nodes in the tree.

This path may or may not pass through the root.


Key Idea

  • The longest path between two nodes can be:

    • Entirely in the left subtree

    • Entirely in the right subtree

    • Passing through the current node


Approach

For each node:

  • Compute:

    • Height of left subtree

    • Height of right subtree

  • Diameter through that node = left height + right height

  • Take the maximum over all nodes


Algorithm

  1. Traverse the tree using DFS

  2. For each node:

    • Calculate left and right heights

    • Update maximum diameter

  3. Return height to parent
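A Python sketch computing height and diameter in one DFS pass (diameter measured in edges; the `Node` class is illustrative):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def diameter(root):
    best = 0

    def height(node):
        nonlocal best
        if node is None:
            return 0
        lh, rh = height(node.left), height(node.right)
        best = max(best, lh + rh)   # longest path passing through node
        return 1 + max(lh, rh)      # height reported to the parent

    height(root)
    return best
```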


Time Complexity

| Operation | Time Complexity |
|-----------|-----------------|
| Overall   | O(n)            |


Space Complexity

  • O(h) (recursion stack, where h = height of tree)


Key Points

  • Diameter = longest path between any two nodes

  • May or may not pass through root

  • Computed using height of subtrees

  • Solved efficiently using DFS in O(n) time

52. How Do You Detect a Loop in a Linked List?

Definition

Loop detection in a linked list involves checking whether the list contains a cycle, i.e., a node that points back to a previous node instead of NULL.


Method: Floyd’s Cycle Detection (Tortoise and Hare)

Key Idea

  • Use two pointers:

    • Slow pointer → moves one step at a time

    • Fast pointer → moves two steps at a time

  • If a cycle exists, both pointers will eventually meet


Steps

  1. Initialize:

    • slow = head

    • fast = head

  2. Traverse the list:

    • Move slow by 1 step

    • Move fast by 2 steps

  3. If at any point:

    • slow == fast → cycle detected

  4. If fast reaches NULL:

    • No cycle exists
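Floyd's algorithm in Python (the `ListNode` class is a minimal illustrative node):

```python
class ListNode:
    def __init__(self, val):
        self.val, self.next = val, None

def has_cycle(head):
    slow = fast = head
    # Fast moves two steps per iteration, slow moves one.
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:         # pointers meet inside the cycle
            return True
    return False                 # fast reached the end: no cycle
```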


Time Complexity

| Operation | Time Complexity |
|-----------|-----------------|
| Detection | O(n)            |


Space Complexity

  • O(1) (no extra space used)


Alternative Method

Using Hash Set

  • Store visited nodes in a set

  • If a node is visited again → cycle exists


Key Points

  • Most efficient method: Floyd’s Algorithm

  • Uses two-pointer technique

  • Works in O(n) time and O(1) space

  • Detects cycle without modifying the list