1Q. What are multi-document ACID transactions in MongoDB?

Multi-document ACID transactions in MongoDB allow multiple operations across documents and collections to be executed atomically, ensuring ACID properties.

ACID Properties:

  • Atomicity → All operations succeed or fail together
  • Consistency → Database remains valid
  • Isolation → Transactions are isolated from others
  • Durability → Data persists after commit
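
To make atomicity concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB implements transactions; the `runAtomically` helper, the `transfer` operations, and the account data are all hypothetical. Operations run against a working copy, and the copy is published only if every operation succeeds.

```javascript
// Toy model of atomicity: apply every operation to a working copy and
// publish the copy only if none of them throws. All names are illustrative.
function runAtomically(state, operations) {
  const working = JSON.parse(JSON.stringify(state)); // isolated working copy
  try {
    for (const op of operations) op(working);
  } catch (err) {
    // Any failure discards the working copy; the original is untouched.
    return { committed: false, state, error: err.message };
  }
  return { committed: true, state: working };
}

function transfer(from, to, amount) {
  return [
    (s) => {
      if (s[from] < amount) throw new Error("insufficient funds");
      s[from] -= amount;
    },
    (s) => {
      if (!(to in s)) throw new Error("unknown account: " + to);
      s[to] += amount;
    },
  ];
}

const accounts = { alice: 100, bob: 50 };
const ok = runAtomically(accounts, transfer("alice", "bob", 30));
// The debit succeeds on the copy, the credit fails, so nothing is kept.
const failed = runAtomically(accounts, transfer("alice", "carol", 30));

console.log(ok.state);     // { alice: 70, bob: 80 }
console.log(failed.state); // { alice: 100, bob: 50 }
```

The second call shows the "fail together" half of the rule: the debit had already been applied to the working copy, yet the caller never sees it.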

Key Points:

  • Introduced in MongoDB 4.0 (replica sets), 4.2 (sharded clusters)
  • Supports multiple collections and documents
  • Uses sessions to manage transactions

Example:

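// Run in mongosh against a replica set; "shop" is a placeholder database name.
const session = db.getMongo().startSession();
const shop = session.getDatabase("shop");

session.startTransaction();
try {
  // Collections obtained through the session take part in the transaction.
  shop.users.insertOne({ name: "Jitendra" });
  shop.orders.insertOne({ item: "Book" });
  session.commitTransaction();
} catch (error) {
  session.abortTransaction(); // undo both inserts on any failure
  throw error;
} finally {
  session.endSession();
}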

Use Case:

  • Banking systems
  • Order processing systems

2Q. How do transactions work in replica sets?

In a replica set, transactions ensure consistent and atomic operations across multiple nodes.

Key Points:

  • All writes go to the primary node
  • Transaction operations are recorded in oplog
  • Secondaries replicate committed transactions
  • The primary applies the commit atomically; two-phase commit is used only when a transaction spans multiple shards

Working Flow:

  1. Client starts a session
  2. Transaction begins on primary
  3. Operations executed
  4. Commit request sent
  5. Data written and replicated
  6. Transaction marked committed

Important:

  • Commit with write concern "majority" so that a committed transaction survives a failover
  • Ensures consistency across nodes

3Q. How do transactions work in sharded clusters?

Transactions in sharded clusters allow atomic operations across multiple shards (distributed data).

Key Points:

  • Introduced in MongoDB 4.2
  • Supports cross-shard transactions
  • Uses two-phase commit protocol

Working Flow:

  1. Transaction starts via mongos
  2. Operations sent to relevant shards
  3. Each shard executes locally
  4. Coordinator shard manages commit
  5. All shards commit or abort
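
The commit step in the flow above can be sketched as a toy two-phase commit. This is a simplified model in plain JavaScript, not MongoDB's coordinator code; the shard objects and their `prepare`, `commit`, and `abort` methods are hypothetical.

```javascript
// Toy two-phase commit: the coordinator asks every participant to prepare,
// then commits only if all of them voted yes; otherwise everyone aborts.
function makeShard(name, canPrepare) {
  return {
    name,
    state: "idle",
    prepare() {
      this.state = canPrepare ? "prepared" : "aborted";
      return canPrepare;
    },
    commit() { this.state = "committed"; },
    abort() { this.state = "aborted"; },
  };
}

function twoPhaseCommit(shards) {
  // Phase 1: collect a vote from every participant.
  const votes = shards.map((shard) => shard.prepare());
  const allPrepared = votes.every(Boolean);
  // Phase 2: apply the same decision on every participant.
  for (const shard of shards) {
    if (allPrepared) shard.commit();
    else shard.abort();
  }
  return allPrepared ? "committed" : "aborted";
}

const happy = [makeShard("shardA", true), makeShard("shardB", true)];
const unhappy = [makeShard("shardA", true), makeShard("shardB", false)];
const happyResult = twoPhaseCommit(happy);
const unhappyResult = twoPhaseCommit(unhappy);

console.log(happyResult, happy.map((s) => s.state));     // committed [ 'committed', 'committed' ]
console.log(unhappyResult, unhappy.map((s) => s.state)); // aborted [ 'aborted', 'aborted' ]
```

The extra round trip for the prepare phase is one reason cross-shard transactions tend to have higher latency than transactions on a single replica set.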

Challenges:

  • Higher latency than replica sets
  • More complex coordination

4Q. What are limitations of MongoDB transactions?

MongoDB transactions have certain constraints that affect performance and scalability.

Key Limitations:

  • Higher overhead compared to single operations
  • Default lifetime limit of 60 seconds (transactionLifetimeLimitSeconds); transactions that run longer are aborted
  • Increased memory usage
  • Not ideal for large batch operations
  • Slower in sharded clusters

Other Constraints:

  • Bounded transaction size (in 4.0 all changes had to fit in one 16 MB oplog entry; 4.2+ splits them across several entries)
  • Locks resources during execution
  • Performance impact under heavy load

Best Practice:

  • Keep transactions short and small

5Q. Explain MongoDB replication architecture in detail

MongoDB replication architecture is based on replica sets, where multiple nodes maintain copies of the same data.

Components:

  • Primary Node → handles all writes
  • Secondary Nodes → replicate data
  • Arbiter (optional) → votes in elections but holds no data

Architecture Flow:

Primary → Oplog → Secondary Nodes

Key Points:

  • Asynchronous replication
  • Ensures redundancy and fault tolerance
  • Supports automatic failover

Data Flow:

  1. Write operation goes to primary
  2. Logged in oplog
  3. Secondary nodes read oplog
  4. Apply changes
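
The data flow above can be sketched with a toy primary and secondary. This mirrors the idea of the oplog only; the entry format, the `ts` counter, and the helper names are assumptions made for the example.

```javascript
// Toy replication: the primary appends each write to an ordered log and a
// secondary replays the entries it has not applied yet.
function createPrimary() {
  return { data: new Map(), oplog: [] };
}

function writeToPrimary(primary, key, value) {
  primary.data.set(key, value);
  primary.oplog.push({ ts: primary.oplog.length + 1, key, value });
}

function createSecondary() {
  return { data: new Map(), lastApplied: 0 };
}

function syncSecondary(secondary, primary) {
  // Only entries newer than the last applied position are replayed.
  const pending = primary.oplog.filter((e) => e.ts > secondary.lastApplied);
  for (const entry of pending) {
    secondary.data.set(entry.key, entry.value);
    secondary.lastApplied = entry.ts;
  }
  return pending.length; // number of entries applied in this round
}

const primary = createPrimary();
const secondary = createSecondary();
writeToPrimary(primary, "user:1", { name: "Jitendra" });
writeToPrimary(primary, "order:1", { item: "Book" });
const firstSync = syncSecondary(secondary, primary);
writeToPrimary(primary, "order:2", { item: "Pen" });
const secondSync = syncSecondary(secondary, primary);

console.log(firstSync, secondSync);  // 2 1
console.log(secondary.data.size);    // 3
```

Between the third write and the second sync the secondary is briefly behind the primary, which is the replication lag that asynchronous replication allows.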

6Q. What happens during replica set elections?

Replica set election is the process of selecting a new primary when the current primary fails.

Key Points:

  • Triggered when primary becomes unavailable
  • Secondary nodes vote to elect new primary
  • Based on priority and freshness of data

Election Process:

  1. Primary fails
  2. Secondaries detect failure
  3. Election initiated
  4. Nodes vote
  5. Node with majority votes becomes primary
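
The majority rule in the last step is simple arithmetic. The sketch below is a simplified model (it ignores priority and data freshness), and the function names are hypothetical.

```javascript
// Votes needed for a majority among voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// A candidate can only win if the voters it can reach form a majority.
function canElectPrimary(totalVoting, reachableVoting) {
  return reachableVoting >= majority(totalVoting);
}

console.log(majority(3));            // 2
console.log(majority(5));            // 3
console.log(canElectPrimary(3, 2));  // true
console.log(canElectPrimary(4, 2));  // false
```

The last line shows why an even number of voting members adds little: a four-member set split two and two cannot elect a primary on either side.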

Important Factors:

  • Node priority
  • Replication lag
  • Network latency

Result:

  • New primary takes over
  • System continues operation

7Q. What is majority write concern?

Majority write concern ensures that a write operation is acknowledged only after it is committed to a majority of replica set members.

Key Points:

  • Ensures strong data durability
  • Prevents data loss during failover
  • Recommended (not required) for transaction commits

Example:

db.users.insertOne(
  { name: "Jitendra" },
  { writeConcern: { w: "majority" } }
)

How it Works:

  • Write sent to primary
  • Replicated to secondary nodes
  • Acknowledged after majority confirms
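
As a rough sketch of the acknowledgement rule, assume each member reports the log position it has replicated up to. The function and the positions below are hypothetical; the real server counts voting, data-bearing members.

```javascript
// Toy check for w: "majority": a write at position `writeTs` counts as
// acknowledged once a majority of members have replicated up to it.
function isMajorityAcknowledged(writeTs, memberPositions) {
  const needed = Math.floor(memberPositions.length / 2) + 1;
  const caughtUp = memberPositions.filter((ts) => ts >= writeTs).length;
  return caughtUp >= needed;
}

// Positions: primary first, then two secondaries.
console.log(isMajorityAcknowledged(10, [10, 7, 6]));   // false
console.log(isMajorityAcknowledged(10, [10, 10, 6]));  // true
```

In the first call only the primary has the write, so a failover could still lose it; in the second, any newly elected primary drawn from the majority already holds it.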

Trade-off:

  • High reliability → slower performance
  • Low reliability → faster performance

8Q. How does MongoDB ensure durability?

Durability ensures that once a write operation is acknowledged, the data will not be lost even in case of crashes or failures.

Key Mechanisms:

  • Journaling
  • Write Concern (majority)
  • Replication

How it Works:

  1. Write operation occurs
  2. Data written to journal file (WAL)
  3. Then written to data files
  4. Replicated to secondary nodes
  5. Acknowledged after durability guarantees
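
Steps 2 and 3 follow the write-ahead logging idea, which can be sketched as below. This is a toy model, not WiredTiger's journal format; the store shape and the `crashBeforeApply` flag are assumptions for the example.

```javascript
// Toy write-ahead log: record the change in the journal first, then apply
// it to the data files. After a crash, replaying the journal restores
// writes that never reached the data files.
function createStore() {
  return { journal: [], dataFiles: {} };
}

function write(store, key, value, { crashBeforeApply = false } = {}) {
  store.journal.push({ key, value }); // step 1: journal entry
  if (crashBeforeApply) return;       // simulated crash before step 2
  store.dataFiles[key] = value;       // step 2: data files
}

function recover(store) {
  for (const { key, value } of store.journal) {
    store.dataFiles[key] = value; // replaying a set operation is idempotent
  }
}

const store = createStore();
write(store, "a", 1);
write(store, "b", 2, { crashBeforeApply: true });
const beforeRecovery = { ...store.dataFiles };
recover(store);

console.log(beforeRecovery);   // { a: 1 }
console.log(store.dataFiles);  // { a: 1, b: 2 }
```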

Key Points:

  • Journaling enables crash recovery
  • Majority write concern ensures replication
  • Data persists even after system failure

9Q. WiredTiger vs MMAPv1 – deep comparison

WiredTiger and MMAPv1 are MongoDB storage engines that manage how data is stored and accessed.

Comparison Table:

Feature        | WiredTiger     | MMAPv1
Default Engine | Yes            | No (deprecated)
Compression    | Supported      | Not supported
Locking        | Document-level | Collection-level
Performance    | High           | Moderate
Concurrency    | High           | Limited
Memory Usage   | Efficient      | Less efficient
Journaling     | Yes            | Yes

WiredTiger Key Points:

  • Uses document-level locking
  • Supports compression (Snappy by default, plus zlib and, from 4.2, zstd)
  • Better concurrency and performance
  • Uses cache for memory optimization

MMAPv1 Key Points:

  • Uses collection-level locking
  • No compression
  • Poor concurrency
  • Deprecated in MongoDB 4.0 and removed in 4.2

Conclusion:

WiredTiger is faster, more efficient, and preferred for production systems.

10Q. How does MongoDB handle concurrency?

Concurrency in MongoDB refers to how multiple operations are handled simultaneously without conflicts.

Key Mechanisms:

  • Locking system
  • WiredTiger storage engine
  • Multi-version concurrency control (MVCC)

Key Points:

  • Supports multiple reads and writes concurrently
  • Uses fine-grained locking
  • Avoids blocking operations

How it Works:

  • Each operation gets a snapshot of data
  • Writes do not block reads
  • Conflicts are minimized
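
The snapshot idea behind MVCC can be sketched with a small versioned store. This is an illustrative model only; the commit counter and the helper names are hypothetical and much simpler than WiredTiger's internals.

```javascript
// Toy multi-version store: each write adds a new version stamped with a
// commit number; a reader only sees versions at or below its snapshot.
function createVersionedStore() {
  return { versions: new Map(), clock: 0 };
}

function put(store, key, value) {
  store.clock += 1;
  const history = store.versions.get(key) ?? [];
  history.push({ commit: store.clock, value });
  store.versions.set(key, history);
}

function takeSnapshot(store) {
  return store.clock;
}

function get(store, key, snapshot) {
  const history = store.versions.get(key) ?? [];
  // Walk backwards to find the newest version visible to this snapshot.
  for (let i = history.length - 1; i >= 0; i -= 1) {
    if (history[i].commit <= snapshot) return history[i].value;
  }
  return undefined;
}

const mvcc = createVersionedStore();
put(mvcc, "stock", 5);
const snapshot = takeSnapshot(mvcc); // a reader starts here
put(mvcc, "stock", 4);               // a writer commits afterwards
const seenByOldReader = get(mvcc, "stock", snapshot);
const seenByNewReader = get(mvcc, "stock", takeSnapshot(mvcc));

console.log(seenByOldReader); // 5
console.log(seenByNewReader); // 4
```

The earlier reader keeps seeing the value 5 without waiting for the writer, which is the sense in which writes do not block reads.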

11Q. What is document-level locking?

Document-level locking allows MongoDB to lock only the specific document being modified, instead of locking the entire collection.

Key Points:

  • Enabled by WiredTiger
  • Improves concurrency
  • Multiple operations can run in parallel

Example:

  • Two users update different documents → both succeed simultaneously

Benefit:

  • Faster performance
  • Reduced contention

12Q. How does WiredTiger cache work?

WiredTiger cache is an in-memory cache used to store frequently accessed data for faster read/write operations.

Key Points:

  • Default cache size is the larger of 50% of (RAM − 1 GB) or 256 MB
  • Stores frequently accessed documents and indexes
  • Uses eviction policy to remove old data

How it Works:

  1. Data loaded into cache
  2. Reads served from memory (fast)
  3. Writes buffered in cache
  4. Periodically flushed to disk

Important Concept:

  • Dirty Data → modified data in cache not yet written to disk
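
For sizing, the default internal cache is the larger of 50% of (RAM − 1 GB) and 256 MB. A small helper (the function name is hypothetical) makes the arithmetic easy to check:

```javascript
// Default WiredTiger internal cache size in GiB:
// the larger of 50% of (RAM - 1 GiB) and 256 MiB (0.25 GiB).
function defaultCacheSizeGiB(ramGiB) {
  return Math.max(0.5 * (ramGiB - 1), 0.25);
}

console.log(defaultCacheSizeGiB(16)); // 7.5
console.log(defaultCacheSizeGiB(4));  // 1.5
console.log(defaultCacheSizeGiB(1));  // 0.25
```

On small hosts or in containers the 256 MB floor applies, so it is worth checking the effective value rather than assuming "half of RAM".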

13Q. How does MongoDB manage memory?

MongoDB manages memory using the WiredTiger cache and OS-level memory management.

Key Points:

  • Uses WiredTiger cache for active data
  • Relies on OS for file system caching
  • Automatically adjusts memory usage

Memory Usage Components:

  • WiredTiger cache
  • Indexes
  • Connections
  • Aggregation operations

Best Practice:

  • Keep working set in RAM for optimal performance

14Q. What is MapReduce and when should it be avoided?

MapReduce is a data processing model used to process large datasets using map and reduce functions.

Key Points:

  • Uses JavaScript functions
  • Processes data in two steps:
    • Map → transform data
    • Reduce → aggregate data

Example:

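db.collection.mapReduce(
  function () { emit(this.category, this.amount); },     // map: one (key, value) pair per document
  function (key, values) { return Array.sum(values); },  // reduce: total per category
  { out: { inline: 1 } }                                 // required: where to return the results
)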
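
To see what the two phases do to the data, here is a plain-JavaScript imitation that runs without a server. It is a sketch of the model, not the server's engine; the `mapReduce` helper and the sample `sales` array are hypothetical.

```javascript
// Imitation of map and reduce over an in-memory array of documents.
function mapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Map phase: each document emits (key, value) pairs; `this` is the document.
  for (const doc of docs) mapFn.call(doc, emit);
  // Reduce phase: collapse the values collected for each key.
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduceFn(key, values);
  }
  return result;
}

const sales = [
  { category: "books", amount: 12 },
  { category: "games", amount: 30 },
  { category: "books", amount: 8 },
];

const totals = mapReduce(
  sales,
  function (emit) { emit(this.category, this.amount); },
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);

console.log(totals); // { books: 20, games: 30 }
```

In the aggregation framework the same totals come from a single `$group` stage with a `$sum` accumulator, which is why the custom JavaScript is rarely needed.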

When to Avoid:

  • Slower than aggregation framework
  • Not optimized for performance
  • Deprecated since MongoDB 5.0 in favor of the aggregation pipeline

Use Case:

  • Complex custom computations (rare cases)

15Q. MapReduce vs Aggregation Framework

Both are used for data processing, but aggregation framework is the modern and efficient approach.

Comparison Table:

Feature      | MapReduce     | Aggregation Framework
Performance  | Slow          | Fast
Language     | JavaScript    | Native operators
Complexity   | High          | Low
Optimization | Limited       | Highly optimized
Use Case     | Complex logic | Most data processing

Key Points:

  • Aggregation is preferred in modern MongoDB
  • MapReduce is rarely used today
  • Aggregation is faster and easier

Conclusion:

Use Aggregation Framework instead of MapReduce in most cases.

16Q. How does full-text search work in MongoDB?

Full-text search in MongoDB allows searching text content within documents using text indexes and text search queries.

Key Points:

  • Uses text indexes on string fields
  • Supports keyword-based search
  • Performs tokenization and stemming
  • Returns results based on relevance score

How it Works:

  1. Text index created on fields
  2. MongoDB tokenizes words (breaks into terms)
  3. Removes stop words (e.g., “the”, “is”)
  4. Applies stemming (running → run)
  5. Matches search query against indexed terms
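
The steps above can be sketched as a toy pipeline. The stop-word list and the suffix-stripping stemmer are simplified stand-ins chosen for this example, not the language rules MongoDB actually applies.

```javascript
// Toy text-index pipeline: tokenize, drop stop words, apply a crude stemmer.
const STOP_WORDS = new Set(["the", "is", "a", "an", "and", "of", "in"]);

function stem(word) {
  if (word.length > 5 && word.endsWith("ing")) {
    const base = word.slice(0, -3);
    // Undo a doubled final consonant: "runn" -> "run".
    const last = base[base.length - 1];
    return base[base.length - 2] === last ? base.slice(0, -1) : base;
  }
  if (word.length > 3 && word.endsWith("s")) return word.slice(0, -1);
  return word;
}

function toTerms(text) {
  return text
    .toLowerCase()                           // case-insensitive matching
    .split(/[^a-z0-9]+/)                     // tokenize on non-alphanumerics
    .filter((w) => w && !STOP_WORDS.has(w))  // drop empties and stop words
    .map(stem);
}

// A document matches if it shares at least one term with the query.
function matches(docText, query) {
  const indexed = new Set(toTerms(docText));
  return toTerms(query).some((term) => indexed.has(term));
}

console.log(toTerms("The database is running"));               // [ 'database', 'run' ]
console.log(matches("Running MongoDB in production", "run"));  // true
console.log(matches("Running MongoDB in production", "redis")); // false
```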

Example:

db.articles.find({ $text: { $search: "mongodb database" } })

Important:

  • Supports ranking using score
  • Case-insensitive search

17Q. What is a text index?

A text index is a special index type used to support text search queries in MongoDB.

Key Points:

  • Created on string fields
  • Enables $text queries
  • Stores tokenized words instead of raw text
  • Only one text index per collection (can include multiple fields)

Example:

db.articles.createIndex({ title: "text", content: "text" })

Features:

  • Supports language-specific rules
  • Provides relevance scoring

18Q. What is Atlas Search?

MongoDB Atlas Search is an advanced full-text search feature in MongoDB Atlas powered by Apache Lucene.

Key Points:

  • More powerful than basic text search
  • Supports fuzzy search, autocomplete, synonyms
  • Built into MongoDB Atlas
  • No separate search engine required

Features:

  • Relevance ranking
  • Highlighting
  • Complex queries (phrase, wildcard)

Use Case:

  • E-commerce search
  • Advanced filtering systems
  • Search-as-you-type

19Q. How does MongoDB handle large file storage?

MongoDB handles large files using GridFS, which splits files into smaller chunks and stores them across collections.

Key Points:

  • Used for files larger than 16MB
  • Stores files in chunks
  • Efficient retrieval and storage
  • Avoids BSON size limitation

Storage Structure:

  • fs.files → metadata
  • fs.chunks → actual file data

Use Cases:

  • Video storage
  • Image storage
  • File systems

20Q. GridFS internal working

GridFS internally stores large files by dividing them into smaller chunks and managing them across collections.

Key Points:

  • Default chunk size: 255KB
  • Each chunk stored as separate document
  • Files reconstructed during retrieval

Internal Flow:

  1. File uploaded
  2. Split into chunks
  3. Stored in fs.chunks
  4. Metadata stored in fs.files
  5. File retrieved by combining chunks
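
The split-and-reassemble flow can be sketched in plain JavaScript. The chunk fields `files_id`, `n`, and `data` follow the GridFS layout; the `toChunks` and `reassemble` helpers and the sample file are hypothetical and are not a driver API.

```javascript
// Split a byte array into GridFS-style chunk documents and join them again.
const DEFAULT_CHUNK_SIZE = 255 * 1024; // 255 KiB

function toChunks(fileId, bytes, chunkSize = DEFAULT_CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < bytes.length; offset += chunkSize, n += 1) {
    // n is the chunk's sequence number within the file.
    chunks.push({ files_id: fileId, n, data: bytes.subarray(offset, offset + chunkSize) });
  }
  return chunks;
}

function reassemble(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n);
  const total = ordered.reduce((sum, c) => sum + c.data.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of ordered) {
    out.set(c.data, offset);
    offset += c.data.length;
  }
  return out;
}

const file = new Uint8Array(600 * 1024).fill(7); // 600 KiB of sample data
const chunks = toChunks("file-1", file);
const restored = reassemble(chunks);

console.log(chunks.length);                    // 3
console.log(chunks.map((c) => c.data.length)); // [ 261120, 261120, 92160 ]
console.log(restored.length === file.length);  // true
```

Only the last chunk is shorter than the chunk size, and sorting by `n` is what lets a reader stream the file back in order.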

Example Structure:

fs.files → file metadata
fs.chunks → binary data chunks

Benefit:

  • Efficient handling of large files
  • Supports streaming

21Q. What is mongos?

mongos is a query router in MongoDB that directs client requests to the appropriate shard in a sharded cluster.

Key Points:

  • Acts as entry point for clients
  • Does not store data
  • Routes queries based on shard key
  • Works with config servers

Role:

  • Receives query
  • Determines target shard
  • Sends request
  • Merges results

22Q. Role of config servers

Config servers store metadata and configuration information about the sharded cluster.

Key Points:

  • Store shard mapping information
  • Maintain chunk distribution
  • Essential for cluster operation
  • Usually deployed as replica set

Data Stored:

  • Shard details
  • Chunk locations
  • Database configuration

Importance:

  • mongos depends on config servers
  • Without config servers, cluster cannot function

23Q. How does query routing work in sharded clusters?

Query routing is the process by which MongoDB directs queries to the correct shard(s) using mongos.

Key Points:

  • Uses shard key to identify target shard
  • Minimizes unnecessary data access
  • Improves performance

Working Flow:

  1. Client sends query to mongos
  2. mongos checks config server metadata
  3. Determines which shard(s) contain data
  4. Sends query to relevant shard(s)
  5. Collects and merges results
  6. Returns final response

Types of Queries:

🔹 Targeted Query:

  • Query includes shard key
  • Sent to specific shard

🔹 Scatter-Gather Query:

  • Query without shard key
  • Sent to all shards

Optimization Tip:

  • Always include shard key in queries for better performance
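
The difference between the two query types can be sketched with a toy router. The chunk ranges, shard names, and the `userId` shard key below are invented for the example and do not reflect real config server metadata.

```javascript
// Toy router: chunk ranges map shard-key values to shards. Ranges are [min, max).
const chunkMap = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 2000, shard: "shardB" },
  { min: 2000, max: Infinity, shard: "shardC" },
];

function route(query, shardKey = "userId") {
  const allShards = [...new Set(chunkMap.map((c) => c.shard))];
  if (!(shardKey in query)) {
    // No shard key in the query: every shard has to be asked.
    return { type: "scatter-gather", shards: allShards };
  }
  const value = query[shardKey];
  const chunk = chunkMap.find((c) => value >= c.min && value < c.max);
  return { type: "targeted", shards: [chunk.shard] };
}

const targeted = route({ userId: 1500 });
const scattered = route({ status: "active" });

console.log(targeted);  // { type: 'targeted', shards: [ 'shardB' ] }
console.log(scattered); // { type: 'scatter-gather', shards: [ 'shardA', 'shardB', 'shardC' ] }
```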

24Q. How do you reshard a collection in MongoDB?

Resharding is the process of changing the shard key of an existing collection to improve data distribution and performance.

Key Points:

  • Introduced in MongoDB 5.0
  • Allows changing the shard key while the collection stays online (writes pause briefly at cutover)
  • Data is redistributed automatically
  • Uses background process

How it Works:

  1. New shard key is defined
  2. MongoDB creates temporary collection
  3. Data copied and redistributed
  4. Writes synchronized between old & new
  5. Switch happens seamlessly

Command:

sh.reshardCollection("db.collection", { newShardKey: 1 })

Use Case:

  • Fix poor shard key
  • Improve performance and scalability

25Q. What are common shard key design mistakes?

Shard key mistakes lead to uneven data distribution, hotspots, and poor performance.

Common Mistakes:

🔹 Low Cardinality

  • Few unique values → uneven distribution

🔹 Monotonically Increasing Keys

  • Example: timestamps, auto-increment IDs
  • Causes hot shard problem

🔹 Not Including in Queries

  • Queries without shard key → scatter-gather

🔹 Large Chunk Sizes

  • Leads to inefficient balancing

Best Practices:

  • Choose high cardinality field
  • Ensure even distribution
  • Frequently used in queries
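
The hot-shard effect of a monotonically increasing key can be shown with a small simulation. The range boundaries and shard count are invented, and FNV-1a is used as a stand-in hash; it is not the hash function MongoDB uses for hashed shard keys.

```javascript
// Compare where new inserts land for an auto-increment key under
// range-based placement and under hash-based placement.
const SHARDS = 3;

function rangeShard(id, boundaries) {
  // boundaries[i] is the exclusive upper bound of shard i; the last shard is open-ended.
  const index = boundaries.findIndex((upper) => id < upper);
  return index === -1 ? boundaries.length : index;
}

function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i += 1) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

function hashedShard(id) {
  return fnv1a(String(id)) % SHARDS;
}

function countByShard(ids, pickShard) {
  const counts = new Array(SHARDS).fill(0);
  for (const id of ids) counts[pickShard(id)] += 1;
  return counts;
}

// New inserts continue the sequence past the existing range boundaries.
const newIds = Array.from({ length: 300 }, (_, i) => 2001 + i);
const byRange = countByShard(newIds, (id) => rangeShard(id, [1000, 2000]));
const byHash = countByShard(newIds, hashedShard);

console.log(byRange); // [ 0, 0, 300 ]
console.log(byHash);  // counts spread over more than one shard
```

With range placement every new document lands on the shard that owns the highest range, while hashing the same key spreads the inserts at the cost of turning range queries on that key into scatter-gather.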

26Q. How do you migrate data from SQL to MongoDB?

Data migration is the process of converting relational data into document-based structure.

Key Steps:

  1. Analyze Schema
    • Identify tables and relationships
  2. Design MongoDB Schema
    • Use embedding or referencing
  3. Transform Data
    • Convert rows → documents
  4. Migrate Data
    • Use tools or scripts

Tools:

  • mongoimport
  • ETL tools (like Talend, Apache NiFi)
  • Custom scripts

Example Transformation:

SQL:

Users + Orders (JOIN)

MongoDB:

{
name: "Jitendra",
orders: [...]
}
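
The transformation step can be sketched as a small script that folds order rows into their user document. The table and column names (`id`, `user_id`, `item`) and the sample rows are hypothetical.

```javascript
// Turn relational rows into user documents with embedded orders.
const users = [
  { id: 1, name: "Jitendra" },
  { id: 2, name: "Asha" },
];
const orders = [
  { id: 10, user_id: 1, item: "Book" },
  { id: 11, user_id: 1, item: "Pen" },
  { id: 12, user_id: 2, item: "Lamp" },
];

function embedOrders(userRows, orderRows) {
  // Group orders by their foreign key, dropping the key itself.
  const ordersByUser = new Map();
  for (const { user_id, ...order } of orderRows) {
    if (!ordersByUser.has(user_id)) ordersByUser.set(user_id, []);
    ordersByUser.get(user_id).push(order);
  }
  // The primary key becomes _id; the grouped orders become an array field.
  return userRows.map(({ id, ...rest }) => ({
    _id: id,
    ...rest,
    orders: ordersByUser.get(id) ?? [],
  }));
}

const documents = embedOrders(users, orders);
console.log(JSON.stringify(documents[0]));
// {"_id":1,"name":"Jitendra","orders":[{"id":10,"item":"Book"},{"id":11,"item":"Pen"}]}
```

Embedding suits orders that are always read with their user; if orders grow without bound or are queried on their own, referencing them from a separate collection is usually the safer design.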

27Q. How do you perform rolling upgrades?

Rolling upgrade is a process of upgrading MongoDB nodes one at a time without downtime.

Key Points:

  • Upgrade secondary nodes first
  • Primary upgraded last
  • Ensures continuous availability

Steps:

  1. Upgrade secondary nodes
  2. Restart each secondary
  3. Step down primary
  4. Upgrade former primary
  5. Verify cluster health
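
The ordering rule in these steps can be expressed as a tiny planner. It only produces a list of steps and does not talk to a cluster; the member objects and host names are hypothetical.

```javascript
// Plan a rolling upgrade: non-primary members first, then step down the
// primary and upgrade it last.
function planRollingUpgrade(members) {
  const primary = members.find((m) => m.state === "PRIMARY");
  const others = members.filter((m) => m.state !== "PRIMARY");
  const steps = others.map((m) => `upgrade ${m.host}`);
  if (primary) {
    steps.push(`step down ${primary.host}`, `upgrade ${primary.host}`);
  }
  steps.push("verify cluster health");
  return steps;
}

const members = [
  { host: "node1:27017", state: "PRIMARY" },
  { host: "node2:27017", state: "SECONDARY" },
  { host: "node3:27017", state: "SECONDARY" },
];
const plan = planRollingUpgrade(members);
console.log(plan);
```

Upgrading the primary last means the set only goes through one election during the whole procedure.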

Benefit:

  • No planned downtime (clients see only a brief election when the primary steps down)
  • Maintains availability

28Q. Zero-downtime MongoDB upgrade strategy

A strategy to upgrade MongoDB without affecting application availability.

Key Approach:

  • Use replica sets or sharded clusters
  • Upgrade nodes sequentially

Steps:

  1. Ensure replication is healthy
  2. Upgrade secondaries
  3. Perform primary step-down
  4. Upgrade primary
  5. Monitor system

Key Points:

  • No downtime for users
  • Requires proper monitoring
  • Backup before upgrade

29Q. How do you secure MongoDB in production?

Securing MongoDB involves protecting data from unauthorized access and ensuring safe communication.

Key Security Measures:

  • Enable authentication
  • Use role-based access control (RBAC)
  • Enable TLS/SSL encryption
  • Restrict network access (IP whitelist)
  • Disable unnecessary ports
  • Enable auditing

Best Practices:

  • Never expose database publicly
  • Use strong passwords
  • Regularly update MongoDB

30Q. Authentication mechanisms in MongoDB

Authentication verifies the identity of users accessing MongoDB.

Types:

SCRAM (Default)

  • Username/password-based

X.509

  • Certificate-based authentication

LDAP

  • Directory-based authentication (MongoDB Enterprise)

Kerberos

  • Ticket-based network authentication protocol (MongoDB Enterprise)

Key Points:

  • Ensures only authorized users access data
  • Integrated with RBAC

31Q. Role-based access control (RBAC)

RBAC restricts access based on user roles and permissions.

Key Points:

  • Users assigned roles
  • Roles define permissions
  • Fine-grained access control

Example Roles:

  • read
  • readWrite
  • dbAdmin
  • clusterAdmin

Example:

db.createUser({
  user: "admin",
  pwd: passwordPrompt(),   // prompt for the password instead of hard-coding it
  roles: ["readWrite"]     // role applies to the current database
})

32Q. How does TLS/SSL work in MongoDB?

TLS/SSL encrypts data transmitted between MongoDB clients and servers to ensure secure communication.

Key Points:

  • Prevents data interception
  • Uses certificates for encryption
  • Supports mutual authentication

How it Works:

  1. Client connects to server
  2. TLS handshake occurs
  3. Certificates verified
  4. Secure encrypted connection established

Benefit:

  • Data security in transit
  • Protection against attacks

33Q. How does MongoDB handle disaster recovery?

Disaster recovery ensures data can be restored after failures like crashes, data loss, or outages.

Key Strategies:

  • Replication (replica sets)
  • Regular backups (mongodump)
  • Point-in-time recovery (oplog)
  • Multi-region deployment

Recovery Process:

  1. Detect failure
  2. Failover to secondary
  3. Restore from backup if needed
  4. Sync data

Best Practices:

  • Maintain backups
  • Test recovery process
  • Use geographically distributed clusters

34Q. How do you troubleshoot slow queries in MongoDB?

Troubleshooting slow queries involves identifying and resolving performance bottlenecks in query execution.

Key Steps:

Use explain()

  • Analyze query execution plan
  • Check for COLLSCAN vs IXSCAN

db.users.find({ age: 25 }).explain("executionStats")

Check Index Usage

  • Ensure proper indexes exist
  • Use compound indexes if needed

Analyze Query Patterns

  • Avoid unnecessary fields
  • Use projection

Monitor Metrics

  • Query execution time
  • CPU, memory, disk usage

Enable Profiling

  • Level 1 records only operations slower than slowms; level 2 records every operation and is costly in production

db.setProfilingLevel(1, { slowms: 100 })

Common Fixes:

  • Add indexes
  • Optimize query structure
  • Reduce data scanned
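
Checking for COLLSCAN can be automated by walking the plan tree in the explain output. The sketch below assumes the classic plan shape (`queryPlanner.winningPlan` with `stage`, `inputStage`, and `inputStages`); newer server versions may nest the plan differently, and both sample plans are hand-written for illustration.

```javascript
// Walk an explain() winning plan and collect the stage names.
function collectStages(plan, stages = []) {
  if (!plan) return stages;
  stages.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, stages);
  for (const child of plan.inputStages ?? []) collectStages(child, stages);
  return stages;
}

function usesCollectionScan(explainOutput) {
  return collectStages(explainOutput.queryPlanner.winningPlan).includes("COLLSCAN");
}

const withoutIndex = {
  queryPlanner: { winningPlan: { stage: "COLLSCAN", filter: { age: { $eq: 25 } } } },
};
const withIndex = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "age_1" },
    },
  },
};

console.log(usesCollectionScan(withoutIndex)); // true
console.log(usesCollectionScan(withIndex));    // false
```

In mongosh the function could be fed the result of `db.users.find({ age: 25 }).explain()`, and a `true` result is a hint that an index on the filtered field is missing.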

35Q. How does MongoDB handle high write throughput?

MongoDB handles high write throughput using scaling, batching, and efficient storage mechanisms.

Key Techniques:

  • Horizontal scaling (sharding)
  • WiredTiger storage engine
  • Bulk writes
  • Asynchronous replication

Key Points:

  • Writes go to primary node
  • Buffered in memory (cache)
  • Flushed to disk efficiently

Optimization Tips:

  • Use unordered bulk writes
  • Choose good shard key
  • Reduce write concern if acceptable
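
The difference between ordered and unordered bulk writes can be modelled without a server. This toy only imitates the error-handling semantics against a unique `_id`; the `bulkInsert` helper is hypothetical.

```javascript
// Toy bulk insert: an ordered batch stops at the first error, an unordered
// batch keeps going and reports every error at the end.
function bulkInsert(existingIds, docs, { ordered = true } = {}) {
  const ids = new Set(existingIds);
  const result = { inserted: 0, errors: [] };
  for (const [index, doc] of docs.entries()) {
    if (ids.has(doc._id)) {
      result.errors.push({ index, message: `duplicate key ${doc._id}` });
      if (ordered) break; // ordered: remaining documents are skipped
      continue;           // unordered: move on to the next document
    }
    ids.add(doc._id);
    result.inserted += 1;
  }
  return result;
}

const batch = [{ _id: 1 }, { _id: 2 }, { _id: 2 }, { _id: 3 }];
const orderedResult = bulkInsert([], batch, { ordered: true });
const unorderedResult = bulkInsert([], batch, { ordered: false });

console.log(orderedResult.inserted);   // 2
console.log(unorderedResult.inserted); // 3
```

Because unordered operations do not depend on each other, the server is also free to execute them in parallel, which is where the throughput gain comes from.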

36Q. MongoDB in microservices architecture

MongoDB is widely used in microservices as a flexible, scalable database per service.

Key Points:

  • Each service can have its own database
  • Supports independent scaling
  • Flexible schema fits evolving services

Benefits:

  • Loose coupling
  • Faster development
  • Independent deployments

Example:

  • User Service → MongoDB (users)
  • Order Service → MongoDB (orders)

37Q. One database per service – pros & cons

Each microservice owns its own database, ensuring data isolation.

Pros:

  • Independent scaling
  • Better fault isolation
  • No cross-service dependencies

Cons:

  • Data duplication
  • Complex joins across services
  • Distributed transactions required

Key Point:

  • Encourages event-driven architecture

38Q. MongoDB with containers & Kubernetes

MongoDB can be deployed using containers (Docker) and orchestrated with Kubernetes for scalability and automation.

Key Points:

  • Use StatefulSets in Kubernetes
  • Persistent volumes for storage
  • Replica sets for high availability

Benefits:

  • Easy deployment
  • Auto-scaling
  • Self-healing systems

Tools:

  • Kubernetes Operator for MongoDB
  • Helm charts

39Q. How does MongoDB handle schema evolution?

Schema evolution is the ability to modify data structure over time without downtime.

Key Points:

  • Flexible schema: collections do not enforce a fixed structure unless validation rules are added
  • Documents can have different structures
  • No migration required for small changes

Strategies:

  • Versioning fields
  • Backward compatibility
  • Gradual migration

Example:

// Old document
{ name: "Jitendra" }

// New document
{ name: "Jitendra", age: 22 }
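
One way to apply the versioning strategy is to upgrade documents lazily when they are read. The sketch below is one possible approach, not a MongoDB feature; the `schemaVersion` field name and the `null` placeholder for a missing age are assumptions.

```javascript
// Upgrade documents on read, one schema version at a time.
const CURRENT_VERSION = 2;

const migrations = {
  // v1 documents have no age field; v2 adds it with a null placeholder.
  1: (doc) => ({ ...doc, age: doc.age ?? null, schemaVersion: 2 }),
};

function upgrade(doc) {
  let current = { schemaVersion: 1, ...doc }; // unversioned documents count as v1
  while (current.schemaVersion < CURRENT_VERSION) {
    current = migrations[current.schemaVersion](current);
  }
  return current;
}

const oldDoc = upgrade({ name: "Jitendra" });
const newDoc = upgrade({ name: "Jitendra", age: 22, schemaVersion: 2 });

console.log(oldDoc); // { schemaVersion: 2, name: 'Jitendra', age: null }
console.log(newDoc); // { schemaVersion: 2, name: 'Jitendra', age: 22 }
```

Application code then only deals with the current shape, and old documents can be rewritten gradually in the background.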

Benefit:

  • Faster development cycles
  • Easy feature updates

40Q. Common production deployment best practices

Best practices ensure MongoDB runs efficiently, securely, and reliably in production.

Key Practices:

Performance

  • Use proper indexes
  • Optimize queries
  • Monitor regularly

Scalability

  • Use sharding for large datasets
  • Choose proper shard key

High Availability

  • Use replica sets
  • Enable automatic failover

Security

  • Enable authentication
  • Use TLS/SSL
  • Apply RBAC

Backup & Recovery

  • Regular backups
  • Test restore process

Monitoring

  • Use tools like mongostat, mongotop
  • Track performance metrics

Golden Rule:

Design based on query patterns, not just data structure