1. What is a Transaction in DBMS?

A transaction is a sequence of one or more database operations executed as a single logical unit of work.

Key Idea

  • Either all operations succeed or none are applied

Characteristics

  • Begins with BEGIN / START TRANSACTION
  • Ends with COMMIT or ROLLBACK

Example (Bank Transfer)

Key Insight

  • Ensures data consistency and reliability

2. What are the ACID Properties of a Transaction?

ACID ensures reliable transaction processing.

A – Atomicity

  • All operations execute completely or not at all
  • Failure → rollback

C – Consistency

  • Database moves from one valid state to another
  • Constraints must be maintained

I – Isolation

  • Transactions run independently
  • No interference from other transactions

D – Durability

  • Once committed → changes are permanent
  • Survive system crash

Example

Bank transfer must either:

  • Deduct + add → success
  • OR → rollback fully

3. What is Transaction Management?

Transaction management is the process of handling transactions to ensure ACID properties.

Key Responsibilities

  • Start and end transactions
  • Maintain consistency
  • Handle failures and recovery

Components

  • Transaction Manager
  • Log Manager
  • Recovery Manager

Goal

  • Ensure safe execution of concurrent transactions

4. What is Concurrency Control?

Concurrency control ensures that multiple transactions execute simultaneously without conflict.

Purpose

  • Maintain data consistency
  • Avoid anomalies

Example

Two users updating the same bank balance simultaneously

5. Why is Concurrency Control Required?

Problems Without It

1. Lost Update

  • One transaction overwrites another

2. Dirty Read

  • Reading uncommitted data

3. Non-Repeatable Read

  • Same query returns different results

4. Phantom Read

  • New rows appear during execution

Conclusion

  • Ensures correct and predictable results

6. What are the Different Types of Locks in DBMS?

Locks are used to control access to data during transactions.

Main Types

1. Shared Lock (S Lock)

  • Read-only access

2. Exclusive Lock (X Lock)

  • Read + write access

Other Lock Types

  • Binary Lock (Locked/Unlocked)
  • Intent Locks (for hierarchy)
  • Update Lock (intermediate lock)

7. What is a Shared Lock?

Definition

A shared lock (S lock) allows multiple transactions to read a resource simultaneously.

Key Properties

  • No modification allowed
  • Multiple readers allowed

Example

  • Many users viewing the same data

8. What is an Exclusive Lock?

An exclusive lock (X lock) allows a transaction to read and modify data.

Key Properties

  • Only one transaction allowed
  • Blocks both reads and writes from others

Example

  • Updating account balance

9. What is a Deadlock in DBMS?

A deadlock occurs when two or more transactions are waiting indefinitely for each other’s resources.

Condition

  • Circular waiting

Example

  • T1 locks A, waits for B
  • T2 locks B, waits for A

→ Deadlock occurs

10. How Can Deadlocks be Prevented?

1. Lock Ordering

  • Acquire locks in a fixed order

2. Timeout

  • Abort transaction if it waits too long

3. Deadlock Prevention Algorithms

Wait-Die

  • Older transaction waits, younger aborts

Wound-Wait

  • Older transaction aborts younger

4. Resource Allocation Strategy

  • Avoid circular wait

5. Detection & Recovery

  • Detect deadlock → rollback one transaction

11. What is Deadlock Detection?

Deadlock detection is a technique used by DBMS to identify deadlocks after they occur.

How It Works

  • The system maintains a Wait-For Graph (WFG)
  • Nodes → Transactions
  • Edges → Waiting relationships

👉 If a cycle exists → deadlock detected

Steps

  1. Build wait-for graph
  2. Check for cycles
  3. If found → deadlock exists

Resolution

  • Abort one or more transactions
  • Release locks

Key Insight

  • Detection = Allow deadlock, then fix it

12. What is Deadlock Prevention?

Deadlock prevention ensures that deadlocks never occur by restricting system behavior.

Approach

Break at least one of the four necessary conditions of deadlock:

  1. Mutual exclusion
  2. Hold and wait
  3. No preemption
  4. Circular wait

Techniques

1. Resource Ordering

  • Request resources in fixed order

2. Preemption

  • Force release of resources

3. No Hold and Wait

  • Request all resources at once

Key Insight

  • Prevention = Avoid possibility of deadlock

14. What is Deadlock Avoidance?

Deadlock avoidance ensures the system never enters an unsafe state.

Concept

  • Before granting a request → check if it's safe

Safe State

  • System can complete all transactions without deadlock

Algorithm

  • Banker’s Algorithm

Key Idea

  • Grant request only if:
    → System remains in safe state

Difference from Prevention

PreventionAvoidance
RestrictiveDynamic decision
No deadlocks possibleAvoid unsafe states

15. What is a Transaction Log?

A transaction log is a file that records all changes made to the database.

Contents

  • Transaction start
  • Data changes (before/after values)
  • Commit / rollback status

Purpose

  • Recovery after crash
  • Maintain durability

16. What is Recovery Management?

Definition

Recovery management ensures the database is restored to a consistent state after failure.

Types of Failures

  • System crash
  • Power failure
  • Disk failure

Techniques

1. Log-Based Recovery

  • Uses transaction logs

2. Shadow Paging

  • Maintain old + new copies

Goal

  • Ensure Atomicity + Durability

17. What is a Checkpoint in DBMS?

Definition

A checkpoint is a mechanism to save the current state of the database to reduce recovery time.

How It Works

  • Flush all changes to disk
  • Mark a checkpoint in log

Benefits

  • Faster recovery
  • Reduce log scanning

Example

Instead of replaying entire log → start from checkpoint

18. What is Durability in DBMS?

Definition

Durability ensures that once a transaction is committed, its changes are permanently stored.

Key Points

  • Survives system crash
  • Stored in disk/log

Example

  • Money transferred → must not disappear

Ensured By

  • Logs
  • Checkpoints
  • Stable storage

19. What is Isolation in DBMS?

Isolation ensures that transactions execute independently without interference.

Goal

  • Prevent data inconsistency

Example

Two users updating same account:

  • Should not see intermediate results

Key Insight

  • Makes concurrent execution appear serial

20. What are Isolation Levels?

Isolation levels define the degree of visibility between transactions.

1. Read Uncommitted

  • Can read uncommitted data
  • Allows dirty reads

2. Read Committed

  • Only committed data visible
  • Prevents dirty reads

3. Repeatable Read

  • Same query gives same result
  • Prevents non-repeatable reads

4. Serializable (Highest Level)

  • Full isolation
  • Equivalent to serial execution

Comparison Table

LevelDirty ReadNon-Repeatable ReadPhantom Read
Read Uncommitted❌ Yes❌ Yes❌ Yes
Read Committed✅ No❌ Yes❌ Yes
Repeatable Read✅ No✅ No❌ Yes
Serializable✅ No✅ No✅ No

21. What are Dirty Reads?

A dirty read occurs when a transaction reads data that has been modified but not yet committed by another transaction.

Problem

  • If the first transaction rolls back → the read data becomes invalid

Example

  • T1 updates balance → not committed
  • T2 reads updated balance
  • T1 rolls back → T2 used incorrect data

Prevention

  • Use Read Committed isolation level or higher

22. What are Non-Repeatable Reads?

A non-repeatable read occurs when the same query returns different results within the same transaction.

Cause

  • Another transaction modifies data between reads

Example

  • T1 reads salary = 5000
  • T2 updates salary to 7000 and commits
  • T1 reads again → 7000

Prevention

  • Use Repeatable Read isolation level

23. What are Phantom Reads?

A phantom read occurs when new rows appear or disappear during a transaction.

Cause

  • Another transaction inserts or deletes rows

Example

  • T1: SELECT all employees with salary > 5000
  • T2: Inserts new employee with salary 6000
  • T1 runs again → extra row appears

Prevention

  • Use Serializable isolation level

24. What is Normalization?

Normalization is the process of organizing data to reduce redundancy and improve data integrity.

Goals

  • Eliminate duplicate data
  • Ensure logical data storage
  • Prevent anomalies

Key Idea

  • Divide large tables into smaller, related tables

25. What are Normal Forms?

Normal forms are a set of rules used to evaluate and improve database design.

Main Types

  1. 1NF (First Normal Form)
  2. 2NF (Second Normal Form)
  3. 3NF (Third Normal Form)
  4. BCNF (Boyce-Codd Normal Form)

26. Explain 1NF (First Normal Form)

Rule

  • All attributes must contain atomic (indivisible) values

Requirements

  • No repeating groups
  • Each field contains single value

Violation Example

IDPhones
1123, 456

1NF Solution

IDPhone
1123
1456

27. Explain 2NF (Second Normal Form)

Rule

  • Must be in 1NF
  • No partial dependency

Partial Dependency

  • When non-key attribute depends on part of a composite key

Example

Student_IDCourse_IDStudent_Name
  • Student_Name depends only on Student_ID

Solution

Split into:

  • Student(Student_ID, Student_Name)
  • Enrollment(Student_ID, Course_ID)

28. Explain 3NF (Third Normal Form)

Rule

  • Must be in 2NF
  • No transitive dependency

Transitive Dependency

  • Non-key attribute depends on another non-key attribute

Example

| Student_ID | Dept_ID | Dept_Name |

  • Dept_Name depends on Dept_ID

Solution

  • Student(Student_ID, Dept_ID)
  • Department(Dept_ID, Dept_Name)

29. Explain BCNF (Boyce-Codd Normal Form)

BCNF is a stronger version of 3NF.

Rule

  • Every determinant must be a candidate key

Why Needed?

  • Handles anomalies not covered by 3NF

Example

If:

  • A → B
  • B → A

Both must be candidate keys

30. What is Data Redundancy and How is it Reduced?

Data redundancy is the unnecessary duplication of data in a database.

Problems Caused

  • Increased storage
  • Data inconsistency
  • Update anomalies

How to Reduce Redundancy

1. Normalization

  • Split tables logically

2. Proper Key Design

  • Use primary and foreign keys

3. Avoid Duplicate Storage

  • Store data in one place

4. Use Relationships

  • Link tables instead of repeating data

Example

Instead of:
| Student | Course | Course_Name |

Use:

  • Student table
  • Course table
  • Relationship table 

31. What is a Functional Dependency (FD)?

A functional dependency describes a relationship where one attribute uniquely determines another.

Notation

  • ABA \rightarrow B
    → Attribute A determines B

Key Idea

  • If two rows have the same A → they must have the same B

Example

  • Student_ID → Name
    → Each ID corresponds to one Name

Types

  • Trivial FD: A → A
  • Non-Trivial FD: A → B (B not part of A)

32. What is a Multivalued Dependency (MVD)?

A multivalued dependency occurs when one attribute determines multiple independent values of another attribute.

Notation

  • ABA \twoheadrightarrow B

Key Idea

  • B values are independent of other attributes

Example

Student:

  • Courses and Hobbies independent
StudentCourseHobby
ADBMSMusic
AOSMusic

Problem

  • Redundant combinations

Solution

  • Decompose into separate tables

33. What is Join Dependency (JD)?

A join dependency exists when a table can be reconstructed by joining multiple smaller tables.

Key Idea

  • Table = Join of projections

Example

Table R(A, B, C):

  • Can be split into (A, B) and (B, C)
  • Joined back without loss

Use

  • Leads to 5NF (Fifth Normal Form)

34. What is Indexing in DBMS?

Definition

Indexing is a technique to improve the speed of data retrieval.

Concept

  • Similar to index in a book
  • Avoids full table scan

Benefits

  • Faster queries
  • Efficient searching

Trade-offs

  • Extra storage
  • Slower inserts/updates

35. What are the Different Types of Indexes?

1. Primary Index

  • Based on primary key

2. Secondary Index

  • On non-key columns

3. Clustered Index

  • Data stored physically in sorted order

4. Non-Clustered Index

  • Separate structure

5. Composite Index

  • Based on multiple columns

6. Unique Index

  • Enforces uniqueness

36. What is a Clustered Index?

A clustered index determines the physical order of data in a table.

Key Characteristics

  • Data is stored sorted
  • Only one clustered index per table

Example

  • Table sorted by Student_ID

Advantage

  • Faster range queries

37. What is a Non-Clustered Index?

A non-clustered index stores index data separately from actual table data.

Structure

  • Contains pointers to actual rows

Key Characteristics

  • Multiple allowed per table
  • Does not affect physical storage

Example

  • Index on Name column

38. Difference: Clustered vs Non-Clustered Index

FeatureClustered IndexNon-Clustered Index
StoragePhysical orderSeparate structure
Count per tableOneMultiple
SpeedFaster for rangeFaster for lookup
Data locationSame as tablePointer-based

Key Insight

  • Clustered → actual data sorted
  • Non-clustered → reference to data

39. What is Query Optimization?

Query optimization is the process of selecting the most efficient execution plan for a query.

Goal

  • Minimize execution time
  • Reduce resource usage

Techniques

  • Index usage
  • Join reordering
  • Predicate pushdown

Example

  • Choosing index scan vs full table scan

40. What is Query Processing?

Query processing is the complete process of executing a SQL query.

Steps Involved

1. Parsing

  • Check syntax and validity

2. Translation

  • Convert into internal representation

3. Optimization

  • Choose best execution plan

4. Execution

  • Run the query

Flow

SQL Query → Parser → Optimizer → Execution Engine → Result

41. What is a Stored Procedure?

A stored procedure is a precompiled collection of SQL statements stored in the database and executed as a single unit.

Key Characteristics

  • Stored inside the DBMS
  • Can accept parameters (IN, OUT, INOUT)
  • Supports control logic (loops, conditions)

Example

Key Insight

  • Reusable, efficient, and centralized business logic

42. What are the Advantages of Stored Procedures?

1. Performance Improvement

  • Precompiled → faster execution

2. Code Reusability

  • Write once, use multiple times

3. Security

  • Restrict direct table access

4. Reduced Network Traffic

  • Execute logic on server side

5. Maintainability

  • Centralized logic → easy updates

44. What is a Trigger?

Definition

A trigger is a special type of stored procedure that automatically executes in response to specific events.

Trigger Events

  • INSERT
  • UPDATE
  • DELETE

Types

  • BEFORE Trigger
  • AFTER Trigger

Example

Key Insight

  • Event-driven automation

45. Difference Between Trigger and Stored Procedure

FeatureStored ProcedureTrigger
ExecutionManual callAutomatic
InvocationExplicitEvent-based
ControlUser-controlledSystem-controlled
Use CaseBusiness logicAuditing, validation

46. What is a Materialized View?

A materialized view is a view that stores data physically instead of just the query.

Key Characteristics

  • Stores actual data
  • Requires periodic refresh
  • Improves performance

Example

  • Precomputed sales summary

Use Case

  • Reporting systems
  • Data warehousing

47. Difference Between View and Materialized View

FeatureViewMaterialized View
StorageVirtualPhysical
Data freshnessAlways updatedNeeds refresh
PerformanceSlower (computed each time)Faster (precomputed)

Key Insight

  • View = dynamic
  • Materialized view = cached

48. What is Database Partitioning?

Partitioning divides a large table into smaller, manageable pieces.

Purpose

  • Improve performance
  • Easier maintenance
  • Better scalability

Types

  • Horizontal
  • Vertical

49. Difference Between Horizontal and Vertical Partitioning

Horizontal Partitioning

Definition

  • Rows are divided into partitions

Example

  • Students split by region

Vertical Partitioning

Definition

  • Columns are divided into partitions

Example

  • Separate personal info and academic info

Comparison

FeatureHorizontal PartitioningVertical Partitioning
DivisionRowsColumns
Use CaseLarge datasetsColumn optimization
ExampleRegion-wise dataSensitive columns

50. What is Database Replication?

Database replication is the process of copying and maintaining data across multiple databases.

Purpose

  • Improve availability
  • Enhance performance
  • Provide fault tolerance

Types

1. Master-Slave Replication

  • One master, multiple read replicas

2. Master-Master Replication

  • Multiple writable nodes

Benefits

  • Load balancing
  • Backup & disaster recovery
  • High availability

Challenges

  • Data consistency
  • Replication lag 

51. Types of Database Replication

Database replication involves copying data across multiple servers to ensure availability, performance, and fault tolerance.

1. Master-Slave (Primary–Replica) Replication

  • One master handles writes
  • Replicas handle read operations

Advantages

  • Simple setup
  • Good for read-heavy systems

Limitation

  • Single point of write failure

2. Master-Master Replication

  • Multiple nodes can read and write

Advantages

  • High availability
  • No single point of failure

Challenges

  • Conflict resolution

3. Synchronous Replication

  • Data written to all replicas at the same time

Pros

  • Strong consistency

Cons

  • Slower performance

4. Asynchronous Replication

  • Data replicated after commit

Pros

  • Faster writes

Cons

  • Possible data lag

5. Snapshot Replication

  • Copies entire dataset at intervals

6. Transactional Replication

  • Replicates changes in real-time

52. What is Database Sharding?

Sharding is a technique of splitting a large database into smaller, distributed pieces (shards) across multiple servers.

Key Idea

  • Each shard contains part of the data

Example

Users split by:

  • Region
  • User_ID range

Benefits

  • Horizontal scalability
  • Improved performance

Challenge

  • Complex queries across shards

53. Difference Between Sharding and Partitioning

FeatureShardingPartitioning
ScopeAcross multiple serversWithin a single database
TypeHorizontal scalingLogical division
ComplexityHighLower
Use CaseLarge distributed systemsPerformance optimization

Key Insight

  • Partitioning = inside DB
  • Sharding = across servers

54. What is OLTP (Online Transaction Processing)?

OLTP systems are designed to handle real-time transactional operations.

Characteristics

  • Large number of short transactions
  • High concurrency
  • Fast response time

Examples

  • Banking systems
  • E-commerce orders

55. What is OLAP (Online Analytical Processing)?

OLAP systems are designed for complex queries and data analysis.

Characteristics

  • Large datasets
  • Complex aggregations
  • Read-heavy workload

Examples

  • Business intelligence
  • Data analytics dashboards

56. Difference Between OLTP and OLAP

FeatureOLTPOLAP
PurposeTransactionsAnalysis
QueriesSimple, fastComplex, long-running
DataCurrentHistorical
UsersEnd-usersAnalysts

58. What is Data Warehousing?

A data warehouse is a centralized repository for storing historical and analytical data.

Key Characteristics

  • Subject-oriented
  • Integrated
  • Time-variant
  • Non-volatile

Purpose

  • Support decision-making
  • Enable OLAP operations

Architecture Components

  • Data sources
  • ETL (Extract, Transform, Load)
  • Warehouse storage
  • BI tools

59. Difference Between ER Diagram and Relational Schema

FeatureER DiagramRelational Schema
RepresentationGraphicalTabular
LevelConceptual designLogical design
ComponentsEntities, attributes, relationsTables, columns, keys
PurposeDesign understandingImplementation

Key Insight

  • ER Diagram → Design phase
  • Schema → Implementation phase

60. What is UNION?

UNION is an SQL operator used to combine results of two or more SELECT queries.

Key Rules

  • Same number of columns
  • Compatible data types
  • Removes duplicates

Types

1. UNION

  • Removes duplicates

2. UNION ALL

  • Includes duplicates

Example

Key Insight

  • UNION → unique results
  • UNION ALL → faster, includes duplicates 

61. What is UNION ALL?

Definition

UNION ALL is an SQL operator used to combine results of two or more SELECT queries including duplicates.

Key Characteristics

  • Does not remove duplicates
  • Faster than UNION (no duplicate check)
  • Requires same number of columns and compatible data types

Example

Key Insight

  • Use when duplicate values are acceptable and performance matters

62. Difference Between UNION and UNION ALL

FeatureUNIONUNION ALL
Duplicate RowsRemovedAllowed
PerformanceSlower (deduplication)Faster
Use CaseUnique results neededAll data needed

Key Insight

  • UNION = DISTINCT
  • UNION ALL = RAW merge

63. What is CHAR vs VARCHAR?

Both are used to store string data, but differ in storage behavior.

CHAR

  • Fixed-length
  • Pads extra spaces

VARCHAR

  • Variable-length
  • Stores only actual data

Comparison

FeatureCHARVARCHAR
LengthFixedVariable
StorageAlways full lengthActual data only
PerformanceFaster (fixed size)Slightly slower
Use CaseFixed values (e.g., Gender)Dynamic text (e.g., Name)

Example

64. What is a Cursor and When is it Used?

A cursor is a database object used to process query results row by row.

Why Needed?

  • SQL is set-based
  • Cursor allows procedural logic

Types

  • Implicit Cursor
  • Explicit Cursor

When to Use

  • Complex row-by-row operations
  • Conditional logic per record
  • Data transformations

Drawback

  • Slower than set-based operations

65. What is Database Schema Evolution?

Schema evolution is the process of modifying database structure over time without breaking existing systems.

Examples

  • Adding new columns
  • Changing data types
  • Renaming tables

Challenges

  • Backward compatibility
  • Data migration
  • Application impact

Best Practices

  • Use versioning
  • Apply migrations carefully
  • Maintain compatibility

66. What is ORM (Object Relational Mapping)?

ORM is a technique that maps database tables to programming language objects.

Purpose

  • Simplify database interaction
  • Avoid writing raw SQL

How It Works

  • Table → Class
  • Row → Object
  • Column → Attribute

Examples

  • Hibernate (Java)
  • Sequelize (Node.js)
  • Django ORM (Python)

Advantages

  • Faster development
  • Cleaner code

Disadvantages

  • Performance overhead
  • Less control over queries

67. What is Referential Action (CASCADE, SET NULL)?

Referential actions define what happens to child table records when parent data changes.

1. CASCADE

  • Automatically update/delete child rows

Example

  • Delete parent → delete all related child rows

2. SET NULL

  • Set foreign key to NULL

Example

  • Parent deleted → child reference becomes NULL

Other Actions

  • RESTRICT → Prevent action
  • NO ACTION → Similar to restrict

68. What is a Many-to-Many Relationship and How is it Implemented?

A many-to-many (M:N) relationship occurs when:

  • One entity relates to multiple records of another, and vice versa

Example

  • Students ↔ Courses
  • One student → many courses
  • One course → many students

Implementation

Step 1: Create Two Tables

  • Student
  • Course

Step 2: Create Junction Table

  • Enrollment

Structure

Key Features

  • Junction table contains foreign keys
  • Often uses composite primary key

Key Insight

  • M:N relationships are always broken into two 1:N relationships