Advanced Data Structure Interview Questions

Last updated: Mar 5, 2026

Author :

ExamAdda Team

1. What is an AVL Tree?

Definition

An AVL Tree is a self-balancing binary search tree (BST) in which the balance factor (difference between heights of left and right subtrees) of every node is at most 1.

Explanation

Ensures tree height remains logarithmic
Balance Factor = height(left) − height(right)
Values must be -1, 0, or 1
If violated, rotations are applied

Key Points

Strictly balanced compared to Red-Black Trees
Guarantees faster lookups
May require more rotations during insert/delete

Pseudo Code (Insertion with Balancing)

insert(node, key):

if node == NULL:

return new Node(key)

if key < node.data:

node.left = insert(node.left, key)

else:

node.right = insert(node.right, key)

updateHeight(node)

balance = height(node.left) - height(node.right)

if balance > 1:

if key < node.left.data:

return rightRotate(node)

else:

node.left = leftRotate(node.left)

return rightRotate(node)

if balance < -1:

if key > node.right.data:

return leftRotate(node)

else:

node.right = rightRotate(node.right)

return leftRotate(node)

return node

2. What are AVL Tree rotations?

Definition

Rotations are operations used to rebalance an AVL Tree when its balance factor goes out of range.

Types

LL Rotation (Right Rotation)
RR Rotation (Left Rotation)
LR Rotation (Left + Right)
RL Rotation (Right + Left)

Key Idea

Rearrange nodes while preserving BST property
Restore balance in O(1) time per rotation

Pseudo Code (Right Rotation)

rightRotate(y):

x = y.left

T2 = x.right

x.right = y

y.left = T2

updateHeight(y)

updateHeight(x)

return x

3. What is a Red-Black Tree?

Definition

A Red-Black Tree is a self-balancing BST where nodes are colored red or black to maintain balance.

Explanation

Less strict balancing than AVL
Ensures height ≤ 2 log n
Uses coloring + rotations

Key Points

Fewer rotations than AVL
Used in standard libraries (TreeMap, TreeSet)

Pseudo Code (Basic Fix)

if uncle is RED:

recolor(parent, uncle, grandparent)

else:

perform rotation

4. What are the properties of Red-Black Trees?

Properties

Every node is red or black
Root is black
No two consecutive red nodes
Same number of black nodes on all paths
Leaves are black (NULL nodes)

Key Idea

Maintains approximate balance
Prevents skewed trees

5. What is a B-Tree?

Definition

A B-Tree is a multi-way self-balancing search tree where nodes can have multiple keys and children.

Explanation

Designed for disk storage systems
Reduces height using high branching factor

Key Points

All leaves at same level
Nodes contain multiple keys
Efficient for large data

Pseudo Code (Search)

search(node, key):

find index i

if key == node.keys[i]:

return found

if leaf:

return not found

return search(node.children[i], key)

6. Where are B-Trees used in databases?

Explanation

Used in database indexing
Reduces number of disk reads
Efficient for range queries

Key Points

Used in MySQL, PostgreSQL indexing
Used in file systems (NTFS, HFS+)

7. What is a B+ Tree?

Definition

A B+ Tree is a variation of B-Tree where all data is stored in leaf nodes, and internal nodes store only keys.

Explanation

Leaf nodes are linked
Efficient for range queries and sequential access

Key Points

Faster search in databases
Used in indexing systems

8. Difference between B-Tree and B+ Tree

Key Differences

Feature	B-Tree	B+ Tree
Data Storage	Internal + Leaf	Leaf only
Leaf Nodes	Not linked	Linked
Range Queries	Slower	Faster

9. What is a Segment Tree?

Definition

A Segment Tree is a binary tree used for efficient range queries and updates.

Explanation

Each node stores information about a segment (range)
Useful for sum, min, max queries

Pseudo Code (Query)

query(node, start, end, l, r):

if r < start or end < l:

return 0

if l <= start and end <= r:

return tree[node]

mid = (start + end) // 2

return query(left) + query(right)

10. What are the applications of Segment Trees?

Applications

Range sum queries
Range minimum queries
Lazy propagation

Key Idea

Efficient for multiple queries + updates

11. What is a Fenwick Tree (Binary Indexed Tree)?

Definition

A Fenwick Tree is used to efficiently compute prefix sums and updates.

Explanation

Uses binary indexing
Compact and efficient

Pseudo Code

update(i, val):

while i <= n:

BIT[i] += val

i += i & (-i)

12. What is a Trie Data Structure?

Definition

A Trie is a tree structure used to store strings based on prefixes.

Explanation

Each node represents a character
Efficient for prefix search

Pseudo Code

search(word):

node = root

for char in word:

if char not in node.children:

return false

node = node.children[char]

return node.isEnd

13. What are the applications of Trie?

Applications

Autocomplete
Spell check
Dictionary search

Key Idea

Fast prefix matching → O(L)

14. What is a Suffix Tree?

Definition

A Suffix Tree is a compressed trie containing all suffixes of a string.

Explanation

Supports fast pattern matching
Built in linear time

Key Points

Used in string processing problems

15. What is a Suffix Array?

Definition

A Suffix Array is a sorted array of all suffixes of a string.

Explanation

Stores indices of suffixes
Uses sorting + binary search

Pseudo Code

suffixes = []

for i in range(n):

suffixes.append((string[i:], i))

sort(suffixes)

Key Points

Space-efficient alternative to suffix tree
Used in string matching

16. What is a Skip List?

Definition

A Skip List is a probabilistic data structure that allows fast search, insertion, and deletion by maintaining multiple levels of linked lists.

Explanation

Consists of multiple layers where each higher level skips more elements
Each element may appear in multiple levels based on probability
Higher levels act as express lanes

Working Principle

Start from the top level
Move forward while next element is smaller
Drop down levels until the target is found

Time Complexity

Average → O(log n)
Worst → O(n)

Pseudo Code

search(key):

node = head

for level from top to bottom:

while node.next[level] != NULL and node.next[level].value < key:

node = node.next[level]

node = node.next[0]

return node.value == key

17. What is a Deque and how is it implemented?

Definition

A Deque (Double Ended Queue) is a data structure that allows insertion and deletion from both front and rear.

Implementation

Using circular array
Using doubly linked list

Working Principle

Maintain two pointers: front and rear
Wrap around using modulo in array implementation

Pseudo Code

insertFront(x):

front = (front - 1 + size) % size

arr[front] = x

insertRear(x):

rear = (rear + 1) % size

arr[rear] = x

18. What are the applications of Deque?

Applications

Sliding window problems
Palindrome checking
Undo/Redo systems
Task scheduling

Explanation

Supports both stack and queue operations
Highly flexible for real-time processing

19. How is LRU Cache implemented?

Definition

An LRU Cache removes the least recently used element when capacity is exceeded.

Implementation

HashMap + Doubly Linked List

Working Principle

HashMap → O(1) lookup
Doubly Linked List → maintain order

Pseudo Code

get(key):

if key not in map:

return -1

move node to front

return value

put(key, value):

if key exists:

update and move to front

else:

if capacity full:

remove tail

insert at front

20. Which data structures are used in LRU Cache?

HashMap → for fast lookup
Doubly Linked List → to track usage order

Explanation

Map stores key → node
List keeps most recent at front

21. What is the difference between heap and priority queue?

Definition

A heap is a data structure, while a priority queue is an abstract concept implemented using a heap.

Key Differences

Feature	Heap	Priority Queue
Type	Data structure	Abstract data type
Implementation	Binary heap	Heap/array
Purpose	Maintain order	Process by priority

Explanation

Heap is commonly used to implement priority queue

22. How does Java HashMap handle collisions?

Explanation

Uses chaining (linked list) initially
Converts to Red-Black Tree when collisions increase

Working Principle

If bucket size exceeds threshold → convert to tree
Improves worst-case performance

23. What is rehashing in HashMap?

Definition

Rehashing is the process of resizing the hash table and redistributing elements.

Working Principle

Create new table with larger size
Recompute hash values
Insert elements again

Reason

Maintain performance
Reduce collisions

24. What is consistent hashing?

Definition

Consistent hashing is a technique used in distributed systems to minimize data movement when nodes change.

Explanation

Maps keys and nodes on a circular ring
Each key assigned to nearest node

Benefits

Scalable
Minimal re-distribution

25. What is a distributed hash table?

Definition

A Distributed Hash Table (DHT) distributes data across multiple nodes using hashing.

Explanation

Each node stores part of data
Lookup done using hash

Applications

Peer-to-peer systems
Distributed storage

26. What is union-find (disjoint set)?

Definition

Union-Find is a data structure used to manage disjoint sets.

Operations

Find → find root
Union → merge sets

Pseudo Code

find(x):

if parent[x] != x:

return find(parent[x])

return x

27. What is path compression in union-find?

Definition

Path compression is an optimization that flattens the tree structure.

Working Principle

While finding root, update parent to root

Pseudo Code

find(x):

if parent[x] != x:

parent[x] = find(parent[x])

return parent[x]

Benefit

Reduces complexity to nearly O(1)

28. What is a strongly connected component in graph?

Definition

A Strongly Connected Component (SCC) is a subgraph where every vertex is reachable from every other vertex.

Explanation

Applicable to directed graphs
Ensures mutual reachability

Applications

Network analysis
Graph decomposition

29. What is Kosaraju’s Algorithm?

Definition

Kosaraju’s Algorithm is used to find Strongly Connected Components in a graph.

Steps

Perform DFS and store nodes in stack
Reverse the graph
Perform DFS in order of stack

Pseudo Code

dfs(node):

mark visited

for neighbor:

dfs(neighbor)

push node to stack

reverse graph

while stack not empty:

node = pop

if not visited:

dfs(reverse_graph)

Time Complexity

O(V + E)

30. What is Tarjan’s Algorithm?

Definition

Tarjan’s Algorithm is an efficient algorithm used to find all Strongly Connected Components (SCCs) in a directed graph using a single DFS traversal.

Explanation

Uses DFS traversal with timestamps
Maintains two arrays:

disc[] → discovery time of node
low[] → lowest reachable node

Uses a stack to track current path

Working Principle

During DFS, assign discovery time
Update low values using back edges
If disc[node] == low[node], it is the root of an SCC

Pseudo Code

dfs(u):

disc[u] = low[u] = time++

push u to stack

mark u as inStack

for v in neighbors:

if not visited:

dfs(v)

low[u] = min(low[u], low[v])

else if v in stack:

low[u] = min(low[u], disc[v])

if disc[u] == low[u]:

while stack not empty:

pop nodes → SCC

Time Complexity

O(V + E)

31. What is Floyd-Warshall Algorithm?

Definition

The Floyd-Warshall Algorithm is used to find the shortest paths between all pairs of vertices in a weighted graph.

Explanation

Works for directed and undirected graphs
Handles negative weights (but no negative cycles)
Uses dynamic programming

Working Principle

Gradually updates shortest paths using intermediate vertices

Pseudo Code

for k in range(n):

for i in range(n):

for j in range(n):

dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])

Time Complexity

O(n³)

32. What is A* search algorithm?

Definition

A* (A-star) is a pathfinding algorithm that finds the shortest path using heuristics.

Explanation

Combines:

g(n) → cost from start
h(n) → estimated cost to goal

Total cost → f(n) = g(n) + h(n)

Working Principle

Always expands node with lowest f(n)
Uses priority queue

Applications

GPS navigation
Game pathfinding

33. What is network flow problem?

Definition

The network flow problem involves finding the maximum flow from a source to a sink in a graph with capacity constraints.

Explanation

Each edge has a capacity
Flow must not exceed capacity
Flow into node = flow out (except source/sink)

Applications

Traffic networks
Data routing
Bipartite matching

34. What is Ford-Fulkerson Algorithm?

Definition

The Ford-Fulkerson Algorithm is used to find the maximum flow in a flow network.

Working Principle

Find augmenting paths from source to sink
Push flow along path
Update residual capacities

Pseudo Code

while path exists from source to sink:

find minimum capacity in path

update residual graph

add flow

Time Complexity

O(E × max_flow)

35. What is Edmonds-Karp Algorithm?

Definition

The Edmonds-Karp Algorithm is an implementation of Ford-Fulkerson that uses BFS to find shortest augmenting paths.

Explanation

Always chooses shortest path (in terms of edges)
Improves performance and guarantees termination

Pseudo Code

while BFS finds path:

find bottleneck capacity

update residual graph

Time Complexity

O(V × E²)

36. What is topological sorting used for?

Definition

Topological sorting is used to order vertices in a Directed Acyclic Graph (DAG) such that dependencies are respected.

Applications

Task scheduling
Course prerequisite planning
Build systems

Explanation

Ensures that a node appears before all nodes dependent on it

37. What is dynamic programming optimization?

Definition

Dynamic programming optimization refers to techniques used to reduce time or space complexity in DP problems.

Techniques

Space optimization (reduce DP table size)
Memoization improvements
Divide and conquer DP

Explanation

Focuses on eliminating redundant computations

38. What is memoization?

Definition

Memoization is a top-down DP technique where results of recursive calls are stored to avoid recomputation.

Explanation

Uses recursion + caching
Stores computed values in array/map

Pseudo Code

dp(n):

if n in cache:

return cache[n]

cache[n] = dp(n-1) + dp(n-2)

return cache[n]

Time Complexity

Reduced from exponential → O(n)

39. What is tabulation in dynamic programming?

Definition

Tabulation is a bottom-up DP approach where the solution is built iteratively using a table.

Explanation

Start from base cases
Fill table step by step

Pseudo Code

dp[0] = 0

dp[1] = 1

for i in range(2, n):

dp[i] = dp[i-1] + dp[i-2]

Advantage

No recursion overhead
More efficient than memoization

40. What is the difference between greedy and backtracking?

Definition

Greedy and backtracking are two problem-solving approaches with different strategies.

Key Differences

Feature	Greedy	Backtracking
Approach	Local optimal choice	Try all possibilities
Optimality	Not always optimal	Guarantees optimal
Complexity	Faster	Slower
Example	Kruskal	N-Queens

Explanation

Greedy → no reconsideration
Backtracking → explores and backtracks

41. What is space-time tradeoff?

Definition

Space-time tradeoff is a concept where extra memory is used to reduce execution time, or vice versa.

Explanation

More memory → faster computation
Less memory → slower computation

Examples

Hashing (more space, faster lookup)
Recursion vs iteration

42. What is tail recursion?

Definition

Tail recursion is a form of recursion where the recursive call is the last operation in the function.

Explanation

No work after recursive call
Can be optimized into iteration

Pseudo Code

fact(n, result):

if n == 0:

return result

return fact(n-1, n * result)

43. What is iterative deepening search?

Definition

Iterative Deepening Search (IDS) is a search technique that combines DFS and BFS, by repeatedly performing DFS with increasing depth limits.

Explanation

Start with depth = 0
Increase depth gradually
Combines DFS memory efficiency with BFS completeness

Pseudo Code

for depth = 0 to max:

result = DFS(root, depth)

if found:

return result

Advantages

Uses less memory than BFS
Guarantees shortest path like BFS

44. What is the difference between BFS, DFS, and Dijkstra?

Definition

BFS, DFS, and Dijkstra are graph traversal/pathfinding algorithms that differ in approach and purpose.

Key Differences

Feature	BFS	DFS	Dijkstra
Type	Traversal	Traversal	Shortest path
Approach	Level-wise	Depth-first	Greedy
Data Structure	Queue	Stack/Recursion	Priority Queue
Shortest Path	Yes (unweighted)	No guarantee	Yes (weighted)
Graph Type	Unweighted	Any	Weighted (non-negative)

Explanation

BFS → explores neighbors first, finds shortest path in unweighted graphs
DFS → explores deeply, useful for traversal and backtracking
Dijkstra → computes shortest path using weights

Pseudo Code (BFS)

queue.push(source)

while queue not empty:

node = queue.pop()

for neighbor:

if not visited:

queue.push(neighbor)

Pseudo Code (Dijkstra)

dist[source] = 0

push into priority queue

while pq not empty:

node = extract min

for neighbor:

if dist[u] + weight < dist[v]:

update distance

45. How do you detect cycles in directed graphs?

Definition

Cycle detection in directed graphs checks whether there exists a path that starts and ends at the same node.

Approach (DFS + Recursion Stack)

Maintain:

visited[]
recStack[]

Working Principle

If a node is revisited while still in recursion stack → cycle exists

Pseudo Code

dfs(node):

visited[node] = true

recStack[node] = true

for neighbor:

if not visited:

if dfs(neighbor):

return true

else if recStack[neighbor]:

return true

recStack[node] = false

return false

Time Complexity

O(V + E)

46. How do you check if a binary tree is a BST?

Definition

A binary tree is a BST if for every node:

Left subtree contains smaller values
Right subtree contains larger values

Approach (Range Validation)

Each node must lie within a valid range

Pseudo Code

isBST(node, min, max):

if node == NULL:

return true

if node.data <= min or node.data >= max:

return false

return isBST(node.left, min, node.data) and

isBST(node.right, node.data, max)

Time Complexity

O(n)

47. How do you find the Lowest Common Ancestor in a tree?

Definition

The Lowest Common Ancestor (LCA) is the lowest node in a tree that has both given nodes as descendants.

Approach (Recursive)

If current node matches one of the nodes → return it
Search left and right
If both return non-null → current node is LCA

Pseudo Code

LCA(root, p, q):

if root == NULL or root == p or root == q:

return root

left = LCA(root.left, p, q)

right = LCA(root.right, p, q)

if left != NULL and right != NULL:

return root

return left if left else right

Time Complexity

O(n)

48. How do you serialize and deserialize a binary tree?

Definition

Serialization → Convert tree into a string
Deserialization → Reconstruct tree from string

Approach (Preorder Traversal)

Use special marker (e.g., NULL) for empty nodes

Pseudo Code (Serialize)

serialize(root):

if root == NULL:

return "N"

return root.val + "," +

serialize(root.left) + "," +

serialize(root.right)

Pseudo Code (Deserialize)

deserialize(data):

if value == "N":

return NULL

node = new Node(value)

node.left = deserialize(next)

node.right = deserialize(next)

return node

Time Complexity

O(n)

49. What is the diameter of a binary tree?

Definition

The diameter of a binary tree is the length of the longest path between any two nodes.

Explanation

Path may or may not pass through root
Measured in terms of edges or nodes

Approach

For each node:

Compute height of left and right subtree
Diameter = max(left height + right height)

Pseudo Code

diameter(root):

if root == NULL:

return 0

left = height(root.left)

right = height(root.right)

return max(left + right,

diameter(root.left),

diameter(root.right))

Optimized Approach

Compute height and diameter together in one traversal

Time Complexity

Naive → O(n²)
Optimized → O(n)

50. How do you detect a loop in a linked list?

Definition

Detecting a loop means checking if a linked list has a cycle where a node points back to a previous node.

Approach (Floyd’s Cycle Detection)

Use two pointers:

Slow pointer (1 step)
Fast pointer (2 steps)

Working Principle

If there is a cycle, both pointers will eventually meet

Pseudo Code

detectLoop(head):

slow = head

fast = head

while fast and fast.next:

slow = slow.next

fast = fast.next.next

if slow == fast:

return true

return false

Time Complexity

O(n)
Space → O(1)