Beginner Level
1. You are given a list with duplicate values. How would you remove duplicates while preserving order?
Removing duplicates while preserving order means keeping the first occurrence of each element in the list and removing later repetitions without changing the original order of elements.
Efficient Approach
Idea
- Traverse the list.
- Maintain a set to store seen elements.
- If an element is not in the set, add it to:
  - the result list
  - the set of seen elements.
- This ensures duplicates are skipped while maintaining order.
Code
```python
def remove_duplicates(lst):
    seen = set()
    result = []
    for item in lst:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
```
2. How would you handle a missing key in a dictionary?
A missing dictionary key occurs when we try to access a key that does not exist in a dictionary.
In Python, this raises a KeyError, which can cause the function or program to crash if not handled properly.
Example problem:
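A minimal reproduction (dictionary contents are illustrative):

```python
data = {"name": "Alice", "age": 25}

try:
    salary = data["salary"]   # "salary" is not a key in the dictionary
except KeyError as e:
    print("Missing key:", e)  # Missing key: 'salary'
```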
Since "salary" does not exist in the dictionary, Python throws a KeyError.
The get() method safely retrieves a value from a dictionary.
If the key does not exist, it returns None or a default value instead of raising an error.
Example
```python
data = {"name": "Alice", "age": 25}
salary = data.get("salary", 0)
print(salary)  # 0
```
3. How would you read a very large file efficiently in Python?
In Python, a file object acts as an iterator, meaning it automatically reads the file line by line using buffering.
Only a small portion of the file is kept in memory at any time, making it ideal for large files.
Example
```python
with open("large_file.txt", "r") as file:
    for line in file:
        process(line)
```
Explanation:
- `open()` opens the file.
- `with` ensures the file is automatically closed after use.
- `for line in file` reads one line at a time.
- `process(line)` represents any processing logic.
Why This Is Efficient
| Method | Memory Usage | Suitable for Large Files |
|---|---|---|
| file.read() | High (loads entire file) | No |
| for line in file | Low (line-by-line) | Yes |
4. Why is logging important in production, and how would you use it in Python?
In production systems, logging helps track issues, debug problems, monitor system behavior, and maintain application reliability.
Log Levels in Python
| Log Level | Purpose |
|---|---|
| DEBUG | Detailed information for debugging |
| INFO | General application events |
| WARNING | Something unexpected but not critical |
| ERROR | A failure occurred |
| CRITICAL | Serious failure that may stop the program |
Example hierarchy:
DEBUG < INFO < WARNING < ERROR < CRITICAL
Basic Logging Setup
Explanation
In production scripts, logging is usually configured to write errors to a log file instead of printing them.
Example
```python
import logging

logging.basicConfig(
    filename="app.log",
    level=logging.ERROR,
    format="%(asctime)s - %(levelname)s - %(message)s"
)

try:
    x = 10 / 0
except Exception as e:
    logging.error("An error occurred: %s", e)
```
5. How would you validate user input such as emails and phone numbers?
A clean validation structure usually follows this approach:
User Input
│
▼
Validation Layer
│
┌─────────────┬─────────────┐
▼ ▼ ▼
Email Check Phone Check Other Validations
│
▼
Valid Data → Continue Processing
Invalid Data → Return Error
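The flow above can be sketched as a small validation layer; the regex patterns here are deliberately simplified assumptions, not production-grade validators:

```python
import re

# Simplified patterns for illustration only
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")
PHONE_RE = re.compile(r"^\+?\d{10,15}$")

def validate_user(data):
    """Return (True, cleaned_data) on success or (False, error_message)."""
    errors = []
    if not EMAIL_RE.match(data.get("email", "")):
        errors.append("invalid email")
    if not PHONE_RE.match(data.get("phone", "")):
        errors.append("invalid phone")
    if errors:
        return False, ", ".join(errors)
    return True, data
```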
6. How would you manage configuration with environment variables?
Environment variables store configuration outside the application code, making them secure and easy to change across environments (development, testing, production).
Example variables: API_KEY, DB_URL.
Example
```python
import os

api_key = os.getenv("API_KEY")
db_url = os.getenv("DB_URL")
```
Here:
- `os.getenv()` safely reads the environment variable.
- If the variable is missing, it returns None or a default value.
.env Files
Explanation
During development, configuration values are often stored in a .env file and loaded using libraries like python-dotenv.
Example .env file:
Python code:
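As a sketch, here is what a .env file contains and a minimal hand-rolled loader; in real projects python-dotenv's load_dotenv() does this work:

```python
import os

# Contents of a typical .env file (illustrative values):
#   API_KEY=abc123
#   DB_URL=postgres://localhost/mydb

def load_env(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv()."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())

# load_env()                      # call at startup, before reading config
# api_key = os.getenv("API_KEY")
```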
7. How would you handle exceptions gracefully in a production application?
Exception Handling Flow
Application Code
│
▼
Exception Occurs
│
▼
Catch Exception (try-except)
│
┌──────────────┬──────────────┐
▼ ▼
Log Error Show Generic Message
(Internal) (User Safe)
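The flow above can be sketched in plain Python (process() and the message wording are illustrative):

```python
import logging

logger = logging.getLogger("app")

def process(data):
    # Illustrative business logic that may fail
    return 100 / data["divisor"]

def handle_request(data):
    try:
        return {"status": "ok", "result": process(data)}
    except Exception:
        logger.exception("Request failed")            # full details go to the log
        return {"status": "error",
                "message": "Something went wrong"}    # generic, user-safe reply
```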
8. You need to sort objects based on multiple attributes. How would you do it?
To sort objects based on multiple attributes in Python, you can use the key parameter of sorted() or list.sort() and return a tuple of the attributes you want to sort by. Python compares tuples in order, meaning it first compares the first attribute, and if those are equal, it compares the next one.
For example, suppose you have objects with attributes salary and age, and you want to sort first by salary and then by age:
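A sketch with an assumed Employee dataclass:

```python
from dataclasses import dataclass

@dataclass
class Employee:
    name: str
    salary: int
    age: int

employees = [
    Employee("Alice", 50000, 30),
    Employee("Bob", 50000, 25),
    Employee("Carol", 40000, 35),
]

# Sort by salary first, then by age for equal salaries
result = sorted(employees, key=lambda e: (e.salary, e.age))
print([e.name for e in result])  # ['Carol', 'Bob', 'Alice']
```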
How it works:
- Python first sorts by salary.
- If two employees have the same salary, it then sorts them by age.
Intermediate Level
9. Two threads are modifying shared data and causing inconsistent results. How would you fix it?
When two threads modify the same shared data simultaneously, it can lead to a race condition. A race condition occurs when the final result depends on the timing of thread execution, which can cause incorrect or inconsistent outcomes.
To fix this issue, we must ensure that only one thread accesses or modifies the shared resource at a time. This is done using synchronization mechanisms, most commonly a lock (mutex).
Using a Lock (threading.Lock)
A lock ensures that the section of code modifying shared data (called the critical section) is executed by only one thread at a time.
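A minimal sketch of a shared counter protected by a lock:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:              # only one thread at a time enters here
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 200000: always correct with the lock
```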
How This Fixes the Problem
- `threading.Lock()` creates a mutual exclusion lock.
- `with lock:` ensures only one thread enters the critical section at a time.
- Other threads must wait until the lock is released.
- This prevents simultaneous modification of the shared variable and eliminates inconsistent results.
10. You need to cache expensive function results. How would you implement caching?
When a function performs an expensive operation (e.g., heavy computation, database query, API call), repeatedly executing it with the same inputs wastes time and resources.
Caching solves this by storing the result of the function for given inputs, so if the function is called again with the same arguments, the stored result is returned instead of recomputing it.
Using Built-in lru_cache
Python provides a built-in decorator in the functools module called lru_cache (Least Recently Used cache). It automatically stores and reuses function results.
Usage:
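A minimal sketch using functools.lru_cache (the computation is illustrative):

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def expensive_square(n):
    print(f"computing {n}...")   # runs only on a cache miss
    return n * n

expensive_square(4)   # computes and prints
expensive_square(4)   # served from the cache; nothing printed
```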
Output behavior:
- The first call computes the result.
- The second call retrieves the result from the cache, avoiding recomputation.
How It Works
- The function arguments act as the cache key.
- The result is stored in memory.
- If the same arguments appear again, Python returns the cached value.
- `maxsize` limits the number of stored results. When the cache is full, the least recently used entry is removed.
11. You have a nested JSON response from an API. How would you safely extract deeply nested values?
When working with API responses, JSON objects are often deeply nested dictionaries and lists. Directly accessing nested keys using dict["key"] can raise KeyError or TypeError if any level of the structure is missing. Therefore, values should be extracted safely to prevent runtime errors.
Using .get() Method
A common and safe way is to use the dict.get() method, which returns None or a default value if the key does not exist instead of throwing an error.
Example JSON:
Safe extraction:
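Both pieces together, with an illustrative payload:

```python
response = {
    "user": {
        "profile": {
            "address": {"city": "Mumbai"}
        }
    }
}

city = response.get("user", {}).get("profile", {}).get("address", {}).get("city")
print(city)  # Mumbai

# A missing level simply yields None instead of raising
zip_code = response.get("user", {}).get("profile", {}).get("zip")
print(zip_code)  # None
```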
How it works:
- Each `.get()` retrieves the key if it exists.
- If the key is missing, it returns `{}` (an empty dictionary).
- This prevents the next `.get()` call from failing.
If the final value doesn't exist, the result will simply be None instead of an exception.
12. A Python API built with Flask becomes slow under load. What steps would you take to debug it?
When a Flask API becomes slow under load, the goal is to identify the bottleneck causing the performance issue. The slowdown could come from the application code, database queries, external APIs, server configuration, or infrastructure. Debugging should be done systematically.
1. Monitor Application Performance
Start by monitoring the API to see where time is being spent.
Useful tools:
- Flask Debug Toolbar
- New Relic / Datadog / Prometheus
- Application logs
These tools help identify:
- Slow endpoints
- High response times
- Error rates
- CPU or memory spikes
2. Profile the Application
Use Python profiling tools to find slow parts of the code.
Common tools:
- cProfile
- line_profiler
- py-spy
These tools show:
- Which functions consume the most time
- Expensive loops or computations
Example:
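A minimal cProfile sketch (the profiled function is illustrative):

```python
import cProfile
import io
import pstats

def slow_function():
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
slow_function()
profiler.disable()

# Print the five most time-consuming functions
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```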
3. Check Database Performance
A common cause of slow APIs is inefficient database queries.
Things to check:
- Slow queries
- Missing indexes
- N+1 query problems
- Large result sets
Possible fixes:
- Add indexes
- Optimize queries
- Use pagination
- Use connection pooling
4. Analyze External API Calls
If the Flask API depends on other services, those calls might be slow.
Check for:
- Blocking API requests
- Long network latency
Solutions:
- Add timeouts
- Use async requests
- Implement caching
5. Add Caching
If certain responses are repeatedly requested, caching can significantly reduce load.
Common caching strategies:
- Redis
- Memcached
- Flask caching libraries
Caching helps avoid repeated database or computation work.
13. You need to retry failed API calls with exponential backoff. How would you design it?
When calling external APIs, failures can occur due to temporary issues such as network instability, server overload, or rate limiting. Instead of immediately failing, the system should retry the request after waiting for increasing intervals. This approach is called exponential backoff.
In exponential backoff, the delay between retries doubles after each failed attempt, reducing pressure on the external service and improving the chances of success.
Design Approach
- Set a maximum number of retries to avoid infinite attempts.
- Start with a small delay (e.g., 1 second).
- After each failure, increase the delay exponentially (e.g., 1s → 2s → 4s → 8s).
- Stop retrying when:
  - The request succeeds, or
  - The maximum retry limit is reached.
- Optionally add jitter (random delay) to avoid multiple clients retrying simultaneously.
Example Implementation
```python
import time
import requests

def fetch_data(url, max_retries=5, base_delay=1):
    for attempt in range(max_retries):
        try:
            response = requests.get(url)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt)
            time.sleep(delay)
```
How It Works
- `max_retries` limits how many attempts are made.
- `2 ** attempt` creates exponential growth in delay.
- `time.sleep()` pauses before the next retry.
- If the final retry fails, the exception is raised.
14. You must process 1 million records from a CSV file. How would you optimize it?
When processing a very large CSV file (e.g., 1 million records), the main challenges are memory usage, processing time, and I/O efficiency. The goal is to process the data incrementally and efficiently without loading the entire dataset into memory.
1. Process the File in a Streaming Manner
Instead of reading the entire file at once, read it row by row using a streaming approach. This keeps memory usage low because only one record is processed at a time.
Example using Python’s csv module:
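A sketch assuming the CSV has an amount column:

```python
import csv

def process_large_csv(path):
    total = 0
    with open(path, newline="") as f:
        reader = csv.DictReader(f)       # yields one row at a time
        for row in reader:
            total += int(row["amount"])  # "amount" column is illustrative
    return total
```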
This avoids loading the full dataset into memory.
2. Use Chunk-Based Processing
For heavy computations, reading the file in chunks (batches) improves efficiency.
Example using pandas:
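A sketch, again assuming an amount column; the chunksize value is tuned per workload:

```python
import pandas as pd

def total_amount(path, chunksize=100_000):
    total = 0
    # read_csv with chunksize returns an iterator of DataFrames
    for chunk in pd.read_csv(path, chunksize=chunksize):
        total += chunk["amount"].sum()   # vectorized work per chunk
    return total
```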
Benefits:
- Only a portion of the data is loaded at a time.
- Works well for large datasets.
3. Use Efficient Data Processing
Optimize the processing logic itself:
- Avoid unnecessary loops.
- Use vectorized operations if using pandas.
- Reduce repeated calculations.
15. You need to write unit tests for legacy code with tight coupling. How would you approach it?
Legacy code often has tight coupling, meaning different components are strongly dependent on each other. This makes unit testing difficult because a function may depend on databases, APIs, file systems, or other modules. The goal is to introduce testability without breaking existing functionality.
1. Understand the Existing Code
Before writing tests, analyze the legacy code to understand:
- Dependencies (database, APIs, services)
- Side effects
- Critical business logic
This helps identify what needs to be isolated during testing.
2. Start with Characterization Tests
When working with legacy code, the first step is to write characterization tests. These tests capture the current behavior of the system, even if the behavior is not ideal.
Purpose:
- Ensure the code continues to behave the same during refactoring.
- Provide a safety net before making changes.
3. Use Mocking to Isolate Dependencies
Since tightly coupled code depends on external systems, use mocking to replace those dependencies during testing.
Example dependencies to mock:
- Database calls
- API requests
- File operations
Conceptually:
Original Code
│
Depends on DB / API / Services
│
Unit Test
│
Replace dependencies with mocks
This allows testing only the logic of the function.
4. Introduce Dependency Injection Gradually
Refactor the code so that dependencies are passed into functions or classes instead of being created inside them.
Example idea:
Instead of a function directly calling a database inside it, pass the database object as a parameter. This makes it easy to replace it with a mock during testing.
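A sketch of the idea using unittest.mock (the function and method names are assumptions):

```python
from unittest.mock import Mock

# After refactoring: the database is injected instead of created inside
def get_user_discount(user_id, db):
    user = db.fetch_user(user_id)
    return 0.1 if user["is_premium"] else 0.0

# In the unit test, a mock replaces the real database
fake_db = Mock()
fake_db.fetch_user.return_value = {"is_premium": True}

assert get_user_discount(42, fake_db) == 0.1
fake_db.fetch_user.assert_called_once_with(42)
```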
16. How would you prevent circular imports in a growing Python project?
Circular imports occur when two or more modules depend on each other, causing Python to get stuck during the import process. For example, if module A imports module B and module B also imports module A, Python may fail because one module is not fully initialized yet.
To prevent this in a growing project, the focus should be on better project structure and dependency management.
1. Restructure the Code (Best Solution)
Circular imports often indicate poor module design. The most effective solution is to refactor the code structure.
Move shared logic into a separate module that both modules can import.
Example concept:
module_a → shared_module
module_b → shared_module
Instead of:
module_a ↔ module_b
This removes the circular dependency.
2. Use Local Imports When Necessary
If restructuring is not immediately possible, you can move the import inside a function or method so it executes only when needed.
Example:
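A runnable sketch, with the stdlib json module standing in for the second module in the cycle:

```python
def build_report(data):
    # The import happens at call time, not at module load time,
    # which breaks the import-time cycle (json stands in for "module_b")
    import json
    return json.dumps(data)

print(build_report({"ok": True}))  # {"ok": true}
```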
This delays the import until runtime and avoids circular initialization during module loading.
3. Use Dependency Injection
Instead of importing modules directly, pass dependencies as parameters.
Conceptually:
Main Module
│
├── Module A
└── Module B
Both modules receive dependencies from a higher-level module, eliminating direct imports between them.
4. Organize Code by Layers
Large Python projects benefit from a layered architecture, such as:
controllers
↓
services
↓
repositories
↓
models
Rules:
- Higher layers depend on lower layers.
- Lower layers should not import higher layers.
This structure prevents circular dependencies.
17. You need to schedule background tasks in a web application. How would you implement it?
In a web application, some tasks should not run during the request–response cycle because they may take a long time. Examples include sending emails, generating reports, processing uploads, or calling external APIs. Running these tasks synchronously can slow down the application, so they should be executed in the background.
Using a Task Queue (Recommended for Production)
A common solution is to use a task queue system where tasks are sent to a queue and processed by separate worker processes.
Popular tools in Python:
- Celery
- Redis / RabbitMQ (message broker)
- RQ (Redis Queue)
Basic design:
Web Application
│
▼
Send Task to Queue
│
▼
Message Broker (Redis / RabbitMQ)
│
▼
Worker Process Executes Task
Conceptually, a Celery task is defined with the @app.task decorator, and the web application enqueues it by calling task.delay(args) instead of invoking it directly.
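Celery itself needs a running broker, so as a self-contained sketch of the same pattern, here is a thread-based worker queue; the send_email task is illustrative, and task_queue.put plays the role of Celery's .delay():

```python
import queue
import threading

task_queue = queue.Queue()
sent = []                              # records side effects for the demo

def send_email(address):               # illustrative task
    sent.append(address)

def worker():
    while True:
        func, args = task_queue.get()
        func(*args)
        task_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

# The request handler only enqueues and returns immediately;
# in Celery this is send_email.delay("user@example.com")
task_queue.put((send_email, ("user@example.com",)))
task_queue.join()   # only for demonstration; a web app would not block here
```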
Here:
- `.delay()` sends the task to the queue.
- Workers process it asynchronously in the background.
18. How would you handle file uploads securely in a Python web application?
File uploads can introduce security risks such as malicious files, large file attacks, or unauthorized access. To handle uploads securely, the application must validate files, restrict access, and store them safely.
1. Validate File Type
Never trust the file type provided by the user. Validate the file extension and, if possible, check the file’s MIME type.
Example idea:
- Allow only specific formats like jpg, png, or pdf.
- Reject executable or script files.
This prevents attackers from uploading harmful files such as .exe or .php.
2. Limit File Size
Large file uploads can exhaust server resources or cause denial-of-service attacks. Set a maximum upload size at the application or server level.
For example:
- Limit uploads to a few MB depending on the use case.
3. Sanitize File Names
User-provided filenames may contain malicious paths like:
../../etc/passwd
Always sanitize filenames before saving them to avoid path traversal attacks. Use utilities that generate safe filenames or create unique identifiers.
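A hand-rolled sketch of such a utility (frameworks like Werkzeug provide secure_filename for this):

```python
import os
import re
import uuid

def safe_filename(user_filename):
    # Keep only the base name, dropping any directory components
    base = os.path.basename(user_filename.replace("\\", "/"))
    # Allow only a conservative character set
    base = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    # Prefix a unique id so uploads cannot collide or be guessed
    return f"{uuid.uuid4().hex}_{base}"

print(safe_filename("../../etc/passwd"))
```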
4. Store Files Outside the Application Directory
Uploaded files should not be stored in directories that are directly executable by the web server.
Best practice:
- Store files in a separate storage location (e.g., cloud storage or dedicated upload directory).
- Serve them through controlled endpoints instead of direct access.
Advanced Level
19. You suspect a memory leak in production. How would you investigate it?
A memory leak occurs when an application continues to consume memory without releasing it, eventually causing performance degradation or crashes. Investigating it requires identifying where memory usage increases and what objects are not being released.
1. Monitor Memory Usage
First, confirm that memory usage is continuously increasing over time.
Tools that can help:
- System monitoring tools (top, htop, Docker stats)
- Application monitoring tools (Prometheus, Grafana, Datadog)
- Cloud monitoring dashboards
Look for patterns such as:
- Memory steadily increasing after each request
- Memory not being freed after tasks complete
2. Reproduce the Issue
Try to reproduce the problem in a staging or test environment by simulating production load. This makes it easier to analyze the issue without affecting users.
Load testing tools can help:
- Locust
- JMeter
- k6
3. Use Memory Profiling Tools
Python provides several tools to analyze memory usage and detect leaks.
Common tools:
- tracemalloc – tracks memory allocations
- memory-profiler – measures memory usage line-by-line
- objgraph – identifies growing object references
- py-spy or pyflame – runtime analysis
Example concept:
Application
│
Memory Allocation
│
Profiler Tracks Objects
│
Find Objects That Keep Growing
These tools help identify which objects or functions are consuming memory.
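A minimal tracemalloc sketch that surfaces the biggest allocation sites (the simulated growth is illustrative):

```python
import tracemalloc

tracemalloc.start()

leaky = []
for i in range(10_000):
    leaky.append("x" * 100)       # simulated unbounded growth

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)                   # top allocation sites by size

tracemalloc.stop()
```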
20. When would you use threading vs multiprocessing vs asyncio?
These three approaches are used to handle concurrency and parallelism in Python, but each is suitable for different types of tasks.
1. Threading
Threading is useful for tasks that are I/O-bound, meaning the program spends time waiting for external resources such as network responses, file operations, or database queries.
Because of Python’s Global Interpreter Lock (GIL), threads cannot execute Python bytecode in true parallel for CPU-heavy tasks. However, while one thread waits for I/O, another can run.
Typical use cases:
- Network requests
- File reading/writing
- Database queries
- Web scraping
Example scenario:
A program downloading data from multiple APIs can use threads so that while one request waits for a response, others continue executing.
2. Multiprocessing
Multiprocessing is used for CPU-bound tasks where heavy computations need to run in parallel. It creates separate processes, each with its own Python interpreter and memory space, allowing true parallel execution across multiple CPU cores.
Typical use cases:
- Data processing
- Image processing
- Machine learning computations
- Large numerical calculations
Example scenario:
If a program processes millions of records with complex calculations, multiprocessing can distribute the work across multiple CPU cores.
3. Asyncio
Asyncio is designed for highly concurrent I/O operations using an event loop and asynchronous programming (async/await). It allows a single thread to handle thousands of tasks efficiently without creating multiple threads.
Typical use cases:
- High-concurrency web servers
- Real-time applications
- Web scraping with many requests
- Asynchronous APIs
Example scenario:
A service making thousands of HTTP requests can handle them efficiently using asyncio without creating thousands of threads.
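A sketch with asyncio.sleep standing in for real network I/O (an HTTP client such as aiohttp or httpx would replace it):

```python
import asyncio

async def fetch(url):
    await asyncio.sleep(0.1)          # stands in for network latency
    return f"response from {url}"

async def main():
    urls = [f"https://example.com/{i}" for i in range(100)]
    # All 100 "requests" wait concurrently on a single thread
    return await asyncio.gather(*(fetch(u) for u in urls))

results = asyncio.run(main())
print(len(results))  # 100
```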
Comparison
| Approach | Best For | Key Characteristic |
|---|---|---|
| Threading | I/O-bound tasks | Multiple threads share memory |
| Multiprocessing | CPU-bound tasks | True parallel execution across CPU cores |
| Asyncio | High concurrency I/O | Event-driven asynchronous programming |
21. A database query in a Django application is slow. How would you optimize it?
If a database query in a Django application is slow, the goal is to identify the bottleneck and reduce unnecessary database work. Optimization usually involves analyzing the query, improving how data is fetched, and reducing database load.
1. Inspect the Generated SQL Query
Django ORM converts queries into SQL. First, inspect the actual SQL being executed to understand what the database is doing.
For example, printing str(queryset.query) shows the exact SQL Django will run for that queryset.
This helps identify:
- Complex joins
- Missing filters
- Inefficient queries
2. Use Query Profiling Tools
Use tools that show slow queries and execution time.
Common options:
- Django Debug Toolbar
- Database logs
- `EXPLAIN` query analysis
Example concept:
Slow Query
↓
Analyze Execution Plan
↓
Identify Full Table Scan or Missing Index
This helps determine whether the issue is indexing, joins, or query structure.
3. Add Database Indexes
Slow queries often occur because the database must scan entire tables.
Adding indexes on frequently filtered fields improves performance.
For example, adding db_index=True to a model field (or an entry in Meta.indexes) creates a database index on that column.
Indexes are useful for:
- `WHERE` conditions
- `ORDER BY`
- `JOIN` operations
4. Avoid N+1 Query Problems
The N+1 query problem occurs when Django executes multiple queries inside loops.
Bad pattern:
1 query to fetch objects
+ N queries to fetch related objects
Solution:
- Use `select_related()` for foreign keys
- Use `prefetch_related()` for many-to-many and reverse relationships
This loads related data in fewer queries.
22. You need to handle 10,000 concurrent requests. How would you design the system?
Handling 10,000 concurrent requests requires designing the system for high scalability, efficient resource usage, and fault tolerance. The goal is to ensure the application remains responsive under heavy load.
1. Use a Scalable Web Server
The application should run on a production-grade server that supports concurrency.
Examples:
- Gunicorn
- uWSGI
- ASGI servers like Uvicorn or Daphne
For high concurrency, an async framework (ASGI) can handle many requests efficiently without blocking.
Example architecture:
Clients
│
▼
Load Balancer
│
▼
Multiple App Servers (Gunicorn/Uvicorn Workers)
2. Add a Load Balancer
A load balancer distributes incoming traffic across multiple application instances.
Common options:
- Nginx
- AWS ELB / ALB
- HAProxy
Benefits:
- Prevents a single server from being overloaded
- Improves availability
- Enables horizontal scaling
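A minimal Nginx sketch of this setup (addresses and names are illustrative):

```nginx
upstream app_servers {
    least_conn;                  # route to the least busy instance
    server 10.0.0.1:8000;
    server 10.0.0.2:8000;
    server 10.0.0.3:8000;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_servers;
    }
}
```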
3. Scale Horizontally
Instead of relying on a single server, run multiple application instances.
Example:
Server 1
Server 2
Server 3
Server 4
The load balancer distributes requests among them.
This approach allows the system to handle thousands of requests by adding more servers.
4. Use Asynchronous Processing
Long-running operations should not block request handling.
Move tasks like:
- Email sending
- Report generation
- Image processing

to background workers using:

- Celery
- Redis / RabbitMQ
23. You are building a payment system with multiple payment methods. How would you structure it using OOP principles?
When designing a payment system that supports multiple payment methods (e.g., credit card, PayPal, UPI), the goal is to make the system extensible, maintainable, and loosely coupled. Object-Oriented Programming helps achieve this by using abstraction, inheritance, and polymorphism.
1. Use an Abstract Base Class for Payment
Create a base class or interface that defines the common behavior for all payment types. This ensures that every payment method implements the same method, such as pay().
Example:
This establishes a contract that every payment method must implement.
2. Implement Concrete Payment Classes
Each payment method becomes a separate class that inherits from the base class and implements its own payment logic.
Example:
Each class handles its specific payment implementation.
3. Use Polymorphism in the Payment Processor
Create a payment processor that works with the abstract payment type, not specific implementations.
Usage:
The processor doesn't need to know which payment type is used.
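Putting the three steps together in one sketch (class and method names are illustrative):

```python
from abc import ABC, abstractmethod

class PaymentMethod(ABC):
    @abstractmethod
    def pay(self, amount):
        ...

class CreditCardPayment(PaymentMethod):
    def pay(self, amount):
        return f"Paid {amount} by credit card"

class PayPalPayment(PaymentMethod):
    def pay(self, amount):
        return f"Paid {amount} via PayPal"

class PaymentProcessor:
    def process(self, method: PaymentMethod, amount):
        # Polymorphism: the processor never checks the concrete type
        return method.pay(amount)

processor = PaymentProcessor()
print(processor.process(CreditCardPayment(), 100))  # Paid 100 by credit card
print(processor.process(PayPalPayment(), 50))       # Paid 50 via PayPal
```

Adding a new payment method means adding one new subclass; the processor and existing classes stay untouched.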
24. Your Python service must integrate with multiple third-party APIs. How would you design a resilient integration layer?
When integrating with multiple third-party APIs, the main challenges are network failures, rate limits, slow responses, and API changes. A resilient integration layer should isolate external dependencies and ensure the system continues functioning even when external services fail.
1. Create a Dedicated Integration Layer
Instead of calling external APIs directly from business logic, create a separate service layer or client module responsible for all API interactions.
Example structure:
app/
├── services/
│ ├── payment_service.py
│ ├── notification_service.py
│
├── integrations/
│ ├── stripe_client.py
│ ├── paypal_client.py
│ ├── sms_client.py
This keeps external integrations isolated from core application logic.
2. Use Adapter/Wrapper Classes
Each third-party API should have its own client wrapper class that handles:
- Authentication
- Request formatting
- Response parsing
- Error handling
Example concept:
Application Logic
│
▼
Integration Layer
│
┌──────────────┬──────────────┐
StripeClient PayPalClient SMSClient
│ │ │
External APIs
This ensures the rest of the system interacts with consistent interfaces, regardless of the external API.
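A sketch of one wrapper with an injected transport; the class, endpoint, and response fields are assumptions for illustration, not Stripe's real API:

```python
class StripeClient:
    """Wraps one external API behind a consistent interface."""

    def __init__(self, transport, api_key, timeout=5):
        self._transport = transport      # injected, so tests can fake it
        self._api_key = api_key
        self._timeout = timeout

    def charge(self, amount_cents, currency="usd"):
        raw = self._transport.post(
            "/v1/charges",
            data={"amount": amount_cents, "currency": currency},
            timeout=self._timeout,
        )
        # Normalize the provider-specific response for the rest of the app
        return {"id": raw["id"], "ok": raw["status"] == "succeeded"}
```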
3. Implement Retry with Exponential Backoff
External APIs may fail temporarily due to network issues or server overload. Implement retry logic with exponential backoff to retry failed requests safely.
This prevents:
- Immediate repeated failures
- Overloading external services
4. Add Timeouts
Never allow API calls to wait indefinitely. Configure request timeouts so the system fails fast if a service is unresponsive.
Benefits:
- Prevents threads/workers from being blocked
- Keeps the application responsive
25. How would you implement centralized logging across multiple services?
In a system with multiple services (microservices or distributed systems), logs generated by different services need to be collected, stored, and analyzed in one central location. Centralized logging helps with debugging issues, monitoring system behavior, and tracing requests across services.
1. Standardize Logging Format
First, ensure all services log information in a consistent format (preferably structured logs like JSON). This makes logs easier to search and analyze.
Example fields typically included:
- Timestamp
- Service name
- Log level
- Request ID / Trace ID
- Message
Example concept:
Structured logs allow centralized systems to index and query logs efficiently.
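A minimal structured-logging sketch using only the standard library; the service name is illustrative, and libraries like python-json-logger do this more completely:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "service": "order-service",        # illustrative service name
            "level": record.levelname,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("order-service")
logger.addHandler(handler)
logger.warning("Payment gateway slow")   # emitted as one JSON line
```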
2. Use a Logging Agent
Each service writes logs locally, and a log collector agent gathers them and forwards them to a central logging system.
Common log collectors:
- Fluentd
- Logstash
- Filebeat
Flow:
Service Logs
│
▼
Log Collector (Fluentd / Filebeat)
│
▼
Central Logging System
26. You need to implement role-based access control (RBAC). How would you design it?
Role-Based Access Control (RBAC) is used to restrict system access based on user roles. Instead of assigning permissions directly to users, permissions are assigned to roles, and users are assigned roles. This makes access management scalable and easier to maintain.
Core Concepts
RBAC typically involves three main entities:
- Users – people using the system
- Roles – groups that define access levels (Admin, Manager, User)
- Permissions – actions allowed in the system (read, write, delete)
Relationship:
User → Role → Permissions
Example:
Alice → Admin → create_user, delete_user, view_reports
Bob → Viewer → view_reports
Database Structure
A common RBAC schema includes these tables:
users
roles
permissions
user_roles
role_permissions
Relationships:
Users ─── user_roles ─── Roles ─── role_permissions ─── Permissions
Example:
| User | Role |
|---|---|
| Alice | Admin |
| Bob | Manager |

| Role | Permission |
|---|---|
| Admin | create_user |
| Admin | delete_user |
| Manager | view_reports |
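The tables above can be mirrored with plain mappings for a minimal permission check:

```python
# Roles and permissions as simple mappings (mirrors the tables above)
role_permissions = {
    "Admin": {"create_user", "delete_user", "view_reports"},
    "Manager": {"view_reports"},
}
user_roles = {
    "Alice": ["Admin"],
    "Bob": ["Manager"],
}

def has_permission(user, permission):
    return any(
        permission in role_permissions.get(role, set())
        for role in user_roles.get(user, [])
    )

print(has_permission("Bob", "delete_user"))    # False
print(has_permission("Alice", "delete_user"))  # True
```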
27. How would you ensure backward compatibility when releasing new API versions?
When releasing new versions of an API, it is important to ensure that existing clients continue to work without breaking. Backward compatibility ensures that applications already using the API do not fail when changes are introduced.
1. Use API Versioning
The most common approach is to introduce explicit versioning so that older clients can continue using the previous version while new clients use the updated one.
Common strategies:
- URL versioning: /api/v1/users, /api/v2/users
- Header versioning: clients specify the API version in request headers.
- Query parameter versioning: /users?version=2
URL versioning is the most commonly used because it is clear and easy to manage.
2. Avoid Breaking Changes
Whenever possible, changes should be additive rather than destructive.
Safe changes:
- Adding new fields
- Adding optional parameters
- Adding new endpoints
Risky changes:
- Removing fields
- Renaming parameters
- Changing response formats
Example:
Instead of removing a field:
Add a new field:
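A sketch of an additive change (field names are illustrative):

```python
# v1 response shape, left unchanged so old clients keep working
v1_user = {"name": "Alice", "phone": "1234567890"}

# v2 adds structured contact info without removing the old field
v2_user = {
    "name": "Alice",
    "phone": "1234567890",   # kept for backward compatibility
    "contact": {"phone": "1234567890", "email": "alice@example.com"},
}

# An old client that reads only "phone" still works against v2
assert v2_user["phone"] == v1_user["phone"]
```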
3. Maintain Multiple Versions
Older API versions should remain active for a period of time so existing clients can continue using them.
Example:
v1 → legacy clients
v2 → new features
Eventually, older versions can be deprecated gradually.
28. You need to serialize complex Python objects for network transmission. What options would you consider?
When transmitting Python objects over a network (between services, APIs, message queues), the objects must be converted into a format that can be sent over the network and reconstructed on the receiving side. This process is called serialization.
Several serialization formats can be used depending on performance, compatibility, and security requirements.
1. JSON (Most Common for APIs)
JSON is widely used for network communication because it is human-readable and language-independent.
Advantages:
- Easy to debug
- Supported by almost all programming languages
- Common for REST APIs
Limitations:
- Cannot directly serialize complex Python objects like classes, datetime, or sets without custom handling.
Example concept:
Python Object → JSON → Network → JSON → Python Object
Libraries often used:
- json
- pydantic
- marshmallow
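A sketch of the custom handling JSON needs for non-native types such as datetime and set:

```python
import json
from datetime import date, datetime

def to_jsonable(obj):
    # Called by json.dumps only for types it cannot serialize natively
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    if isinstance(obj, set):
        return sorted(obj)
    raise TypeError(f"Not serializable: {type(obj)!r}")

payload = {"user": "Alice", "joined": datetime(2024, 1, 15), "tags": {"a", "b"}}
encoded = json.dumps(payload, default=to_jsonable)
print(encoded)
```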
2. Pickle (Python-Specific Serialization)
Pickle can serialize almost any Python object, including custom classes.
Advantages:
- Handles complex Python structures automatically
- Very easy to use
Limitations:
- Python-specific (not cross-language)
- Unsafe for untrusted data because it can execute arbitrary code.
Typical use cases:
- Internal communication between trusted Python services
- Caching or object persistence.
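A minimal round-trip sketch (remember: only unpickle data from trusted sources):

```python
import pickle

class Order:
    def __init__(self, order_id, items):
        self.order_id = order_id
        self.items = items

order = Order(1, ["book", "pen"])

data = pickle.dumps(order)      # bytes, ready for a socket or a cache
restored = pickle.loads(data)   # never do this with untrusted input

print(restored.order_id, restored.items)  # 1 ['book', 'pen']
```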
3. MessagePack
MessagePack is a binary serialization format similar to JSON but more compact and faster.
Advantages:
- Smaller payload size
- Faster serialization/deserialization
- Cross-language support
Common use cases:
- High-performance APIs
- Microservice communication
Senior / Architect Level
29. Design a scalable microservices architecture in Python handling 1M+ daily users.
To support 1M+ daily users, the system must be designed for scalability, fault tolerance, and high availability. A microservices architecture divides the application into independent services that can scale individually.
1. Break the System into Microservices
Instead of a monolithic system, split the application into domain-based services.
Example services:
- User Service
- Auth Service
- Payment Service
- Order Service
- Notification Service
- Analytics Service
Each service:
- Handles a single responsibility
- Can be developed, deployed, and scaled independently
2. Use an API Gateway
An API Gateway acts as the single entry point for clients.
Responsibilities:
- Routing requests to services
- Authentication
- Rate limiting
- Request aggregation
Architecture:
Clients
│
▼
API Gateway
│
├── User Service
├── Order Service
├── Payment Service
└── Notification Service
Common tools:
- Nginx
- Kong
- AWS API Gateway
3. Containerization and Orchestration
Each microservice should run inside containers.
Tools:
- Docker for containerization
- Kubernetes for orchestration
Benefits:
- Auto-scaling
- Self-healing services
- Easy deployments
Example:
Kubernetes Cluster
├── User Service Pods
├── Payment Service Pods
├── Order Service Pods
30. How would you implement rate limiting in a high-traffic API?
Rate limiting is used to control how many requests a client can make within a given time window. It protects APIs from abuse, traffic spikes, and denial-of-service attacks, while ensuring fair usage among users.
1. Choose a Rate Limiting Strategy
Several algorithms can be used depending on the use case.
Common algorithms:
- Token Bucket – Allows bursts but limits the overall rate.
- Leaky Bucket – Smooths traffic by processing requests at a constant rate.
- Fixed Window Counter – Limits requests per fixed time window (e.g., 100 requests/minute).
- Sliding Window – A more accurate version of the fixed window.
For high-traffic APIs, Token Bucket or Sliding Window are typically preferred.
Example concept:
Client Requests
│
▼
Rate Limiter
│
┌───────────────┐
│ Within Limit? │
└───┬───────┬───┘
Yes │       │ No
    ▼       ▼
Allow     Return 429
Request   (Too Many Requests)
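A minimal in-process Token Bucket sketch (the rate and capacity are illustrative; a production limiter would keep this state in shared storage such as Redis):

```python
import time

class TokenBucket:
    """Allows short bursts up to `capacity`, refilling at `rate` tokens/sec."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)
results = [bucket.allow() for _ in range(12)]
print(results.count(True))  # the burst is allowed, then requests are throttled
```

The burst of roughly `capacity` requests succeeds immediately; after that, requests are admitted only as tokens refill.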
2. Use Redis for Distributed Rate Limiting
In high-traffic systems with multiple API servers, rate limits must be shared across instances. Redis is commonly used because it is fast and supports atomic operations.
Example logic:
- Use a key per user/IP
- Increment the request counter
- Set an expiration for the time window
Example concept:
User Request
│
▼
Check Redis Counter
│
├─ If < limit → allow request
└─ If ≥ limit → reject request
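The counter logic above can be sketched in pure Python, using an in-memory dict as a stand-in for Redis `INCR`/`EXPIRE` (the key names, limit, and window size are illustrative; in production the same logic runs against a shared Redis instance so all API servers see one counter per client):

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 60
LIMIT = 100
counters = defaultdict(lambda: [0, 0.0])  # key -> [count, window_start]

def allow_request(client_id, now=None):
    now = time.monotonic() if now is None else now
    count, start = counters[client_id]
    if now - start >= WINDOW_SECONDS:       # window expired: start a new one
        counters[client_id] = [1, now]
        return True
    if count < LIMIT:                       # still under the limit
        counters[client_id][0] += 1
        return True
    return False                            # over the limit -> HTTP 429

allowed = sum(allow_request("user-1", now=0.0) for _ in range(150))
print(allowed)  # 100
```

With Redis, the increment and expiry must be atomic (e.g., a Lua script or a pipeline), otherwise concurrent servers can race past the limit.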
3. Apply Rate Limits at the API Gateway
For large systems, rate limiting should be enforced before requests reach backend services.
Tools that support this:
- Nginx
- Kong
- AWS API Gateway
- Cloudflare
Architecture:
Clients
│
▼
API Gateway (Rate Limiting)
│
▼
Backend Services
This reduces load on application servers.
31. A Python service intermittently times out in production. How would you troubleshoot it?
Intermittent timeouts usually indicate issues such as slow dependencies, resource exhaustion, network latency, or blocking operations. The key is to identify where the delay occurs and why it happens only occasionally.
1. Check Logs and Error Patterns
Start by analyzing application logs to identify:
- Which endpoints are timing out
- When the issue occurs (traffic spikes, specific operations)
- Correlation with external services or database queries
Look for patterns such as:
- The same endpoint repeatedly failing
- Timeouts during peak traffic
- Specific users or requests causing delays
2. Measure Request Latency
Add detailed request timing to understand where the time is spent.
Break down the request lifecycle:
Incoming Request
│
├── Authentication
├── Business Logic
├── Database Query
└── External API Calls
By measuring each step, you can determine the slow component.
Tools commonly used:
- Application performance monitoring (APM)
- Custom timing logs
- Middleware timing metrics
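A simple timing decorator illustrates the idea of per-stage timing logs (the stage name and the simulated query are illustrative):

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("timing")

def timed(stage):
    """Log how long each stage of the request lifecycle takes."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                log.info("%s took %.1f ms", stage, elapsed_ms)
        return wrapper
    return decorator

@timed("database_query")
def fetch_user(user_id):
    time.sleep(0.05)  # stands in for a real query
    return {"id": user_id}

fetch_user(7)
```

Tagging each log line with a stage name makes it easy to aggregate timings per component and spot which one degrades intermittently.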
3. Investigate External Dependencies
Timeouts often occur because the service is waiting for:
- Databases
- Third-party APIs
- Message queues
- File storage systems
Check:
- Response times of these services
- Connection errors
- Network latency
Add timeouts and retry logic if needed.
32. How would you design a distributed task processing system (like background job queues)?
A distributed task processing system allows long-running or resource-intensive tasks to run in the background instead of blocking the main application. This improves performance, scalability, and reliability.
Examples of tasks:
- Sending emails
- Image/video processing
- Payment processing
- Report generation
1. Core Components
A distributed task system typically includes the following components:
Client / Web App
│
▼
Task Queue (Message Broker)
│
▼
Worker Nodes
│
▼
Task Result Storage
Components:
- Producer – The service that submits tasks.
- Message Broker / Queue – Stores tasks until workers process them.
- Workers – Process tasks asynchronously.
- Result Backend – Stores results or task states.
Common technologies:
- Redis
- RabbitMQ
- Kafka
- Celery / RQ / Dramatiq
2. Task Submission (Producer)
The application sends tasks to the queue instead of executing them directly.
Example flow:
User Request
│
▼
API Server
│
▼
Push Task to Queue
This keeps the API response fast and non-blocking.
3. Message Broker (Queue)
The queue stores tasks and distributes them to workers.
Responsibilities:
- Task persistence
- Task ordering
- Handling retries
- Load distribution
Popular brokers:
- Redis (simple and fast)
- RabbitMQ (reliable messaging)
- Kafka (high-throughput systems)
4. Worker Processes
Workers continuously pull tasks from the queue and execute them.
Example architecture:
Queue
│
├── Worker 1
├── Worker 2
├── Worker 3
Workers can be scaled horizontally to handle increasing workloads.
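The producer/queue/worker pattern above can be sketched locally with `queue.Queue` and threads as a stand-in for a real broker and worker fleet (the doubling "task" is illustrative):

```python
import queue
import threading

tasks = queue.Queue()         # stand-in for Redis/RabbitMQ
results = []
results_lock = threading.Lock()

def worker():
    while True:
        payload = tasks.get()
        if payload is None:               # sentinel: shut this worker down
            tasks.task_done()
            break
        with results_lock:
            results.append(payload * 2)   # "process" the task
        tasks.task_done()

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

for n in range(10):                       # producer: submit tasks
    tasks.put(n)
for _ in workers:                         # one shutdown sentinel per worker
    tasks.put(None)

tasks.join()                              # wait until every task is processed
print(sorted(results))
```

In a real system the queue lives in a broker process, workers run on separate machines, and a result backend replaces the in-memory `results` list, but the pull-and-process loop is the same shape.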
33. How would you implement circuit breaker and retry mechanisms?
When a system depends on external services (APIs, databases, microservices), failures can cascade and bring down the entire system. Two common resilience patterns used to handle this are retry mechanisms and circuit breakers.
Retry Mechanism
A retry mechanism automatically re-attempts a failed operation when the failure is temporary (network issues, transient service failures).
Typical design:
- Limit the number of retries
- Use exponential backoff between attempts
- Retry only for recoverable errors
Retry timing example:
Attempt 1 → immediate
Attempt 2 → wait 2s
Attempt 3 → wait 4s
This prevents overwhelming the external service.
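A minimal retry helper implementing this backoff schedule (the retriable exception types and delays are illustrative):

```python
import time

def retry(func, attempts=3, base_delay=2, retriable=(ConnectionError, TimeoutError)):
    """Re-run func on transient errors, doubling the wait each time."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retriable:
            if attempt == attempts:
                raise                                   # out of retries: surface the error
            delay = base_delay * 2 ** (attempt - 1)     # 2s, 4s, 8s, ...
            time.sleep(delay)

calls = {"count": 0}
def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient network failure")
    return "ok"

print(retry(flaky, base_delay=0.01))  # succeeds on the third attempt
```

Limiting retries to recoverable error types matters: retrying a `ValueError` or an HTTP 400 just repeats a deterministic failure.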
Circuit Breaker Pattern
A circuit breaker prevents the system from continuously calling a failing service.
Instead of retrying indefinitely, the circuit breaker temporarily stops requests to the failing service.
Circuit breaker states:
Closed → Normal operation
Open → Requests blocked due to failures
Half-open → Test request allowed to check recovery
Flow:
Service Call
│
▼
Failure Threshold Reached
│
▼
Circuit Opens
│
Requests Blocked Temporarily
│
After Cooldown → Half-open
│
Success → Close Circuit
Failure → Open Circuit Again
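A minimal sketch of these three states (the threshold and cooldown values are illustrative):

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; retry after `cooldown` seconds."""
    def __init__(self, threshold=3, cooldown=30):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def call(self, func):
        if self.opened_at is not None:                      # circuit is open
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: request blocked")
            self.opened_at = None                           # half-open: allow one test call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()           # trip the breaker
            raise
        self.failures = 0                                   # success closes the circuit
        return result

breaker = CircuitBreaker(threshold=2, cooldown=60)
def failing():
    raise ConnectionError("service down")

for _ in range(2):                                          # two failures trip the breaker
    try:
        breaker.call(failing)
    except ConnectionError:
        pass

try:
    breaker.call(failing)                                   # now blocked without calling the service
except RuntimeError as e:
    print(e)
```

Production-grade libraries add thread safety, per-endpoint state, and metrics, but the state machine is the same.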
34. You need to deploy and containerize a Python application using Docker. What best practices would you follow?
When containerizing a Python application with Docker, the goal is to create secure, lightweight, reproducible, and scalable containers that can run consistently across environments.
1. Use a Lightweight Base Image
Choose a small and secure base image to reduce the container size and attack surface.
Common choices:
- `python:3.x-slim`
- `python:3.x-alpine` (very small but sometimes requires extra dependencies)
Example:
FROM python:3.11-slim
This keeps images smaller and faster to deploy.
2. Use a .dockerignore File
Avoid copying unnecessary files into the container.
Example .dockerignore:
__pycache__/
*.pyc
.git
.env
node_modules
tests/
Benefits:
- Smaller image size
- Faster build times
- Avoids leaking sensitive data
3. Use Multi-Stage Builds (When Needed)
Multi-stage builds help keep production images minimal by separating build dependencies from runtime dependencies.
Example concept:
Build Stage
│
Install dependencies
│
Final Stage
│
Copy only required files
This prevents unnecessary tools from ending up in the final image.
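Putting these practices together, a sketch of a multi-stage Dockerfile (it assumes a `requirements.txt` and a `main.py` entry point, which are illustrative):

```dockerfile
# --- Build stage: build wheels with pip's build tooling available ---
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

# --- Final stage: runtime only, no build caches ---
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels
COPY . .
# Run as a non-root user for security
RUN useradd --create-home appuser
USER appuser
CMD ["python", "main.py"]
```

The final image contains only the installed packages and application code; the wheel-building layer is discarded.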
35. How would you design CI/CD pipelines for Python microservices?
A CI/CD pipeline automates building, testing, and deploying microservices so that changes can be released quickly, safely, and consistently. For Python microservices, the pipeline should ensure code quality, containerized builds, and automated deployments.
1. Source Control and Branch Strategy
All services should be stored in version control (e.g., Git) with a clear branching strategy.
Common approach:
main → production-ready code
develop → integration branch
feature/* → new features
Whenever code is pushed or a pull request is created, the CI pipeline should trigger automatically.
2. Continuous Integration (CI)
The CI stage ensures that every code change is tested and validated before merging.
Typical CI steps:
Code Push
│
▼
Install Dependencies
│
Run Linters
│
Run Unit Tests
│
Build Docker Image
Key checks include:
- Linting
  - flake8
  - pylint
  - black (formatting)
- Unit Tests
  - pytest
  - unittest
- Security Scanning
  - Bandit
  - Dependency vulnerability scanning
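As one possible realization of these checks, a hypothetical GitHub Actions workflow (the file paths, Python version, and image name are assumptions):

```yaml
name: ci
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: flake8 .            # linting
      - run: black --check .     # formatting
      - run: pytest              # unit tests
      - run: bandit -r src/      # security scanning
  build:
    needs: test                  # build the image only if checks pass
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t service-name:${{ github.sha }} .
```

Gating the build job on the test job ensures untested code never produces a deployable image.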
3. Container Build Stage
Each microservice should be packaged as a Docker container.
Pipeline step:
Build Docker Image
│
▼
Tag Image with Version
│
▼
Push Image to Registry
Container registries:
- Docker Hub
- AWS ECR
- Google Artifact Registry
Example tagging:
service-name:1.2.0
service-name:latest
4. Continuous Deployment (CD)
Once the image is built and tested, it can be deployed automatically.
Deployment flow:
Docker Registry
│
▼
Deployment Pipeline
│
▼
Kubernetes Cluster
Deployment methods:
- Helm charts
- Kubernetes manifests
- Infrastructure-as-Code tools
36. Your system must support real-time updates (like chat). How would you design it?
To support real-time updates like chat messages, notifications, or live dashboards, the system must allow instant bidirectional communication between clients and the server instead of traditional request-response APIs.
The most common solution is using WebSockets with a scalable backend architecture.
1. Use WebSockets for Real-Time Communication
Unlike HTTP, WebSockets keep a persistent connection open between the client and server, allowing messages to be pushed instantly.
Flow:
Client
│
WebSocket Connection
│
Real-Time Server
This allows:
- Instant message delivery
- Low-latency communication
- Bidirectional messaging
Python frameworks commonly used:
- FastAPI (WebSockets)
- Django Channels
- Socket.IO
- aiohttp
2. Real-Time Server Layer
Instead of standard REST servers, use an event-driven server capable of handling many concurrent connections.
Example stack:
Clients
│
Load Balancer
│
WebSocket Servers
Servers should be asynchronous to handle thousands of connections efficiently.
3. Use a Message Broker for Scaling
In a distributed system with multiple WebSocket servers, messages must be shared between instances.
Use a pub/sub message broker.
Example:
User A sends message
│
▼
WebSocket Server
│
▼
Message Broker (Redis/Kafka)
│
▼
Other WebSocket Servers
│
▼
User B receives message
Common technologies:
- Redis Pub/Sub
- Kafka
- RabbitMQ
37. How would you scale CPU-bound workloads in Python despite the GIL?
Python’s Global Interpreter Lock (GIL) prevents multiple threads from executing Python bytecode simultaneously within the same process. Because of this, threading does not improve performance for CPU-bound tasks. To scale CPU-heavy workloads, alternative strategies must be used.
1. Use Multiprocessing (Most Common Solution)
The multiprocessing module creates separate processes instead of threads. Each process has its own Python interpreter and memory space, so the GIL does not limit parallel execution.
Example concept:
Main Process
│
├── Worker Process 1
├── Worker Process 2
├── Worker Process 3
Each process runs on a separate CPU core, enabling true parallelism.
Common tools:
- `multiprocessing.Pool`
- `ProcessPoolExecutor`
Typical use cases:
- Data processing
- Image/video processing
- Machine learning workloads
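A minimal `ProcessPoolExecutor` sketch (the workload is an illustrative pure-Python computation that the GIL would serialize across threads):

```python
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # Pure-Python arithmetic: exactly the kind of work the GIL serializes
    # across threads but not across processes.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    inputs = [200_000, 300_000, 400_000, 500_000]
    with ProcessPoolExecutor(max_workers=4) as pool:
        # Each task runs in its own process, so cores work in parallel.
        results = list(pool.map(cpu_heavy, inputs))
    print(len(results))
```

The `if __name__ == "__main__"` guard is required on platforms that spawn worker processes by re-importing the module (Windows, macOS).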
2. Use Distributed Task Processing
For very large workloads, distribute computation across multiple machines.
Architecture:
Task Producer
│
Message Queue
│
├── Worker Node 1
├── Worker Node 2
├── Worker Node 3
Tools commonly used:
- Celery
- RabbitMQ / Redis
- Kafka
This allows scaling CPU-heavy jobs across many servers.
3. Use Native Extensions (Release the GIL)
Some libraries written in C/C++ release the GIL during computation. This allows multiple threads to execute native code simultaneously.
Examples:
- NumPy
- Pandas
- TensorFlow
- PyTorch
These libraries perform heavy computations outside Python, bypassing the GIL.
38. How would you secure secrets and configuration in production environments?
In production systems, secrets such as API keys, database credentials, tokens, and encryption keys must be protected to prevent unauthorized access. Proper secret management ensures that sensitive information is stored securely, rotated regularly, and never exposed in code or logs.
1. Avoid Hardcoding Secrets
Secrets should never be stored directly in the source code or committed to version control.
Bad practice: hardcoding a credential as a literal in source code, e.g. `API_KEY = "hardcoded-value"` committed to the repository.
Instead, secrets should be stored outside the application code and injected at runtime.
2. Use Environment Variables
A common practice is storing configuration values as environment variables.
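A sketch of reading configuration from environment variables (the variable names are illustrative, and the `setdefault` call only simulates the platform — systemd, Docker, Kubernetes — setting the value):

```python
import os

# Simulate the deployment environment providing a secret; in production
# this value is injected by the platform, never set in code.
os.environ.setdefault("PAYMENT_API_KEY", "demo-value")

db_url = os.environ.get("DATABASE_URL", "sqlite:///dev.db")  # safe dev fallback
api_key = os.environ["PAYMENT_API_KEY"]                      # required: KeyError if missing
print(bool(api_key))
```

Using indexing (`os.environ[...]`) for required secrets makes the service fail fast at startup instead of running half-configured.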
Advantages:
- Keeps secrets separate from code
- Easy to change across environments (dev, staging, production)
However, environment variables alone are not sufficient for large-scale systems.
3. Use Secret Management Systems
Production systems typically use dedicated secret management services.
Common solutions:
- AWS Secrets Manager
- HashiCorp Vault
- Azure Key Vault
- Google Secret Manager
Architecture example:
Application
│
▼
Secret Manager
│
Secure Storage
Benefits:
- Encrypted secret storage
- Access control
- Secret rotation
- Auditing
4. Encrypt Secrets at Rest and in Transit
Secrets should always be encrypted:
- At rest (in databases or secret stores)
- In transit (using TLS)
39. A monolithic Python application needs to be migrated to microservices. How would you approach the migration strategy?
Migrating a monolithic system to microservices should be done incrementally, not as a big-bang rewrite. The goal is to gradually extract services while keeping the system stable and minimizing risk.
1. Analyze and Understand the Monolith
Start by analyzing the existing system to identify:
- Core business domains
- Module dependencies
- Database structure
- Performance bottlenecks
Tools such as architecture diagrams and dependency analysis help determine logical service boundaries.
2. Identify Service Boundaries (Domain Decomposition)
Break the monolith into domain-driven components based on business functionality.
Example:
Monolith
│
├── User Management
├── Authentication
├── Orders
├── Payments
└── Notifications
These domains can later become independent microservices.
3. Apply the Strangler Fig Pattern
Instead of rewriting everything, use the Strangler Pattern, where new microservices gradually replace parts of the monolith.
Flow:
Clients
│
▼
API Gateway
│
├── Monolith
└── New Microservices
Over time:
- Features move from the monolith to microservices
- The monolith shrinks until it can be removed
4. Extract the First Service
Choose a low-risk, loosely coupled component as the first microservice.
Good candidates:
- Notification service
- Authentication service
- Reporting modules
This allows teams to validate the architecture without risking critical systems.
40. How would you design observability (logging, metrics, tracing) for a distributed Python system?
In a distributed system with multiple services, observability is essential to understand system behavior, detect failures, and debug issues quickly. A good observability design includes three pillars: logging, metrics, and distributed tracing.
1. Centralized Logging
Each service should generate structured logs and send them to a centralized logging system.
Best practices:
- Use structured logs (JSON) instead of plain text.
- Include important fields such as:
  - Timestamp
  - Service name
  - Log level
  - Request ID / Trace ID
  - Error details
Example flow:
Services
│
Log Collector (Fluentd / Filebeat)
│
Central Storage (Elasticsearch / Loki)
│
Visualization (Kibana / Grafana)
Popular stack:
- ELK Stack (Elasticsearch, Logstash, Kibana)
- Grafana Loki
This allows searching logs across all services.
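A minimal structured-logging sketch using the standard `logging` module (the service name and field set are illustrative):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so collectors can parse fields."""
    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "service": "order-service",   # assumed service name
            "level": record.levelname,
            "message": record.getMessage(),
            "request_id": getattr(record, "request_id", None),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("orders")
log.addHandler(handler)
log.setLevel(logging.INFO)

# `extra` attaches the request_id field so one request can be traced
# across every service that logs it.
log.info("order created", extra={"request_id": "req-123"})
```

Because every line is valid JSON with consistent field names, a collector such as Fluentd or Filebeat can index it without fragile regex parsing.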
2. Metrics Collection
Metrics help monitor system performance and health over time.
Common metrics:
- Request latency
- Error rate
- Throughput (requests/sec)
- CPU and memory usage
- Queue length
Metrics flow:
Application
│
Metrics Exporter
│
Prometheus
│
Grafana Dashboards
Python tools:
- Prometheus client library
- StatsD
- OpenTelemetry metrics
These metrics help identify performance issues quickly.
3. Distributed Tracing
In distributed systems, a single request may pass through multiple services. Distributed tracing tracks the entire request path.
Example flow:
User Request
│
API Gateway
│
User Service
│
Order Service
│
Payment Service
Each service records a trace span, allowing visualization of request latency across services.
Common tools:
- OpenTelemetry
- Jaeger
- Zipkin
Tracing helps identify:
- Slow services
- Network delays
- Bottlenecks in request chains