placement brief / Interview Questions / brief / 08 Jun 2026

Top 26 DBMS Indexing Interview Questions & Answers (2026)

Q: Does an index guarantee faster query performance?

No. An index helps only when the query's selectivity is high and the optimizer chooses to use it. For low-selectivity queries or small tables, indexes add overhead without benefit.

Q: How many indexes should a table have?

There is no fixed rule. Typical OLTP tables have 3-7 indexes. Each index adds write overhead (INSERT/UPDATE/DELETE cost). Profile actual query patterns and add indexes for the most frequent, slow queries.

Q: What happens to indexes when you run ALTER TABLE?

Adding or removing columns may require rebuilding indexes. ALTER TABLE on large tables can be very slow and lock the table (or use online DDL with tools like pt-online-schema-change to avoid locking). ---

Indexing is tested extensively in DBMS rounds at backend SDE interviews. Candidates report that B+tree internals, index selection decisions, and EXPLAIN plan...

By Aditya SharmaPublished 8 Jun 20262 sources listedSpot an error? Corrections open

13 min read last revised 8 Jun 2026

on this page§ 06

Last Updated: June 2026 | Level: Beginner to Advanced | Format: Q&A with Diagrams, SQL, and Complexity Analysis

Indexing is tested extensively in DBMS rounds at backend SDE interviews. Candidates report that B+tree internals, index selection decisions, and EXPLAIN plan analysis appear at product companies including Google, Amazon, and Flipkart. Based on public preparation resources and candidate-reported accounts, questions on clustered vs non-clustered and covering indexes are common at mid-to-senior SDE rounds.

Index Basics (Q1-Q8)
B-Tree and B+Tree (Q9-Q15)
Index Types and Selection (Q16-Q22)
Query Optimization (Q23-Q26)

Index Basics

Q1. What is a database index? Why is it needed? `Easy`

A database index is a separate data structure that enables fast lookups on one or more columns without scanning every row in a table.

Without index:

SELECT * FROM employees WHERE department = 'Engineering';
-- Full table scan: reads every row, checks every department value.
-- For 10 million rows: reads 10 million rows.
-- Time complexity: O(n)

With index on department:

Index lookup: find 'Engineering' in the B+tree index.
Index directly provides the row locations.
-- Reads only matching rows (e.g., 5000 Engineering employees).
-- Time complexity: O(log n) for lookup + O(k) for k matching rows.

Cost of indexes:

Write overhead: every INSERT/UPDATE/DELETE must also update the index.
Storage: indexes consume disk space (often 10-50% of table size).
Memory: often cached in the buffer pool.

The trade-off: indexes speed up reads at the cost of writes and storage.

Q2. What is the difference between a clustered and non-clustered index? `Medium`

Aspect	Clustered Index	Non-Clustered Index
Number per table	One (only one physical ordering)	Many (multiple separate index structures)
Physical storage	Table rows stored in index key order	Separate structure with pointers to rows
Leaf nodes contain	Actual row data	Index key + row pointer (RID or PK)
Range scan	Very efficient (rows physically ordered)	May require many random I/Os (row lookup per entry)
MySQL InnoDB	Primary key is always the clustered index	Secondary indexes store PK, then look up cluster
SQL Server	Clustered or heap table	All non-PK indexes

Diagram:

Clustered index (InnoDB):
Leaf nodes of B+tree contain the actual rows:
[...| id=5, name=Alice, dept=Eng |...| id=6, name=Bob, dept=HR |...]

Non-clustered index (InnoDB secondary):
Leaf nodes contain: [dept_value | primary_key]
[...|dept=Eng, pk=5|...| dept=Eng, pk=12|...|dept=HR, pk=6|...]
To get full row: look up pk in clustered index (double lookup).

Q3. What is a composite index? What is the left-prefix rule? `Medium`

A composite index is an index on multiple columns.

CREATE INDEX idx_dept_salary ON employees(department, salary);

Left-prefix rule: The index can be used for queries that filter by the leftmost column(s). Skipping a column breaks the prefix.

Query	Uses idx_dept_salary?
`WHERE department = 'Eng'`	YES (leftmost column)
`WHERE department = 'Eng' AND salary > 50000`	YES (both columns)
`WHERE salary > 50000`	NO (skips department, leftmost column)
`WHERE department = 'Eng' AND salary = 70000`	YES
`WHERE department = 'Eng' OR salary > 50000`	PARTIAL (only the OR adds complexity)

Why left-prefix rule exists: A composite index sorts by (department, salary). You can binary search on department. But salary order is only meaningful within a department partition; standalone salary query requires full index scan.

Q4. What is selectivity? How does it affect index effectiveness? `Medium`

Selectivity is the fraction of rows that match a condition. High selectivity = few matching rows. Low selectivity = many matching rows.

Selectivity = (distinct values in column) / (total rows)

Column: gender (M/F, 2 distinct values in 1M rows) -> Selectivity = 2/1,000,000 = 0.000002
Column: email (1M unique values in 1M rows)         -> Selectivity = 1,000,000/1,000,000 = 1.0

Effect on index use:

High selectivity (email): Index very effective. Query returns very few rows; B+tree lookup is efficient.
Low selectivity (gender): Index often NOT used by optimizer. Scanning half the table via random I/Os is slower than a full sequential table scan.

Rule of thumb: MySQL optimizer typically uses an index if the query will return fewer than ~20-30% of rows. For queries returning > 30% of rows, a full table scan is often faster.

Cardinality: Database term for the number of distinct values. High cardinality (email) = high selectivity = good index candidate.

Q5. What is a covering index (index-only scan)? `Medium`

A covering index includes all columns needed by the query (both WHERE filter and SELECT projection columns). The query can be satisfied entirely from the index without touching the actual table.

-- Query:
SELECT name, salary FROM employees WHERE department = 'Engineering';

-- Regular index on department:
-- Step 1: Find 'Engineering' in index -> get row pointers
-- Step 2: Look up EACH row in the actual table to get name and salary
-- Two I/O operations per row.

-- Covering index: (department, name, salary)
CREATE INDEX idx_cover ON employees(department, name, salary);
-- Step 1: Find 'Engineering' in index
-- Leaf node already has name, salary. NO table lookup needed.
-- One I/O set total.

When to use covering indexes:

Frequent queries that select a small, fixed set of columns.
High-read, low-write tables (covering indexes add write overhead for all those extra columns).

MySQL EXPLAIN output: Look for Using index in the Extra column. This means an index-only scan is being used.

Q6. What is a partial index? `Medium`

A partial index (available in PostgreSQL, SQLite; not MySQL) indexes only a subset of rows that satisfy a WHERE condition.

-- Partial index: only indexes active orders (not completed/cancelled)
CREATE INDEX idx_active_orders ON orders(customer_id)
WHERE status = 'active';

-- This query benefits from the partial index:
SELECT * FROM orders WHERE customer_id = 42 AND status = 'active';

-- This query does NOT:
SELECT * FROM orders WHERE customer_id = 42 AND status = 'completed';

Advantages:

Smaller index size (only active rows; may be 10% of total rows).
Faster updates (completed orders don't need to update this index).
More likely to fit in memory.

Use cases:

Soft-deleted rows (WHERE deleted_at IS NULL).
Active/pending work items.
Specific status values that are queried far more than others.

Q7. What is the difference between a dense and sparse index? `Medium`

Aspect	Dense Index	Sparse Index
Entries	One index entry per data record	One index entry per block/page of data
Lookup	Direct: find entry, access record	Indirect: find closest block entry, scan block
Space	More (one entry per row)	Less (one entry per block, e.g., 1/100 of dense)
Updates	More overhead (update index per row)	Less overhead
Used in	Non-clustered indexes, hash indexes	Clustered indexes on sorted data

Dense index example:

Index: [key=1, ptr] [key=2, ptr] [key=3, ptr] ... one entry per row

Sparse index example:

Data blocks (sorted by key):
Block 1: rows 1-100
Block 2: rows 101-200
Block 3: rows 201-300

Sparse index: [key=1, ptr to Block 1] [key=101, ptr to Block 2] [key=201, ptr to Block 3]
To find row 150: find entry <=150 in index (entry 101), access Block 2, scan for 150.

Clustered indexes in most databases use sparse indexing at higher levels of the B+tree (internal nodes are sparse; leaf nodes are dense).

Q8. What are the ACID implications of indexes? `Hard`

Indexes must participate in ACID transactions:

Atomicity: If a transaction aborts, index changes made during the transaction must also be undone. WAL includes index modifications.

Consistency: Unique indexes enforce uniqueness constraints. If an INSERT violates a UNIQUE index, the insert is rejected, maintaining consistency.

Isolation: Index entries are locked just like data rows. Inserting a new row acquires an exclusive lock on the index entry. Reading via index acquires shared locks (or is handled via MVCC). Gap locks on B+tree indexes prevent phantom reads in some isolation levels.

Durability: Index changes are logged to WAL. On crash recovery, index state is restored along with data.

Performance implication: ACID-compliant index updates add write amplification. Every INSERT into a table with 5 indexes requires 6 write operations (1 table + 5 indexes), all logged to WAL.

B-Tree and B+Tree

Q9. What is a B-tree? `Medium`

A B-tree (Balanced Tree) is a self-balancing tree data structure designed for storage systems. Unlike binary trees, each node can have many children.

Properties of B-tree of order m:

Every node has at most m children.
Every non-leaf non-root node has at least ceil(m/2) children.
Root has at least 2 children (unless it is a leaf).
All leaves are at the same level (perfectly balanced).
A node with k children contains k-1 keys.

Structure:

B-tree of order 5 (each node: 2-4 keys, 3-5 children):

         [30      |      70]
        /         |         \
  [10|20]     [40|50|60]   [80|90]
  / | | \      / | | | \    /  | \
 ...         ...            ...
(data at each node)

Key property: Data is stored at ALL nodes (internal and leaf). This means to retrieve data, you may find it at any level.

Time complexity: Search, insert, delete: O(log n) where n is number of keys.

Q10. What is a B+tree? How does it differ from B-tree? `Medium`

A B+tree is a variant of B-tree where:

All data records are stored ONLY at leaf nodes.
Internal nodes store only keys (used for routing).
Leaf nodes are linked in a doubly-linked list.

Comparison:

Aspect	B-tree	B+tree
Data location	At all nodes	Only at leaf nodes
Internal nodes	Store keys + data pointers	Store only keys (routing)
Leaf links	None	Linked list (enables range scan)
Range queries	Slow (must traverse tree for each record)	Fast (scan leaf linked list)
Point queries	Slightly faster (may find data at internal node)	Slightly slower (always go to leaf)
Use in databases	Rarely	Dominant (MySQL, PostgreSQL, Oracle)

B+tree structure:

         [30    |    60]          <- Internal (routing keys only)
        /        |        \
  [10|20]     [30|40|50]   [60|70|80]    <- Leaves (data here)
  data data   data data data  data data data
     |            |               |
     +------------+---------------+  <- Linked list

Range query advantage:

Find all employees with salary BETWEEN 30000 AND 60000:
1. Find leaf node containing 30000 (log n time).
2. Scan linked list of leaves forward until key > 60000.
Total: O(log n + k) where k = number of matching records.
B-tree would require O(k log n) to fetch each record individually.

Q11. How does B+tree handle insertions? `Hard`

B+tree insertion algorithm:

1. Find the correct leaf node (binary search down the tree).
2. Insert the key-value pair into the leaf.
3. If leaf overflows (too many keys):
   a. Split: create a new leaf with upper half of keys.
   b. Promote the FIRST key of the new leaf (copy-up) to parent.
   c. Update leaf linked list pointers.
4. If parent overflows after promotion:
   a. Split the internal node: push middle key UP (push-up, not copy).
   b. Repeat up the tree until no overflow or a new root is created.

Example (order 3: max 2 keys per node):

Initial leaf: [10, 20]
Insert 15:    [10, 15, 20] <- OVERFLOW

Split: Left=[10,15], Right=[20], promote 20 to parent.
Parent was empty root: new root = [20]
Tree:
    [20]
   /    \
[10|15] [20]   <- leaf 20 is copy-upped

Key difference from B-tree: B+tree "copies up" the separator key (leaf keeps it). B-tree "pushes up" (internal node gets the key, not leaf).

Q12. How does B+tree handle deletions? `Hard`

B+tree deletion algorithm:

1. Find the leaf containing the key.
2. Delete the key from the leaf.
3. If leaf underflows (too few keys, below ceil(m/2)-1):
   Option A: BORROW from a sibling (redistribute).
   Option B: MERGE with a sibling (if borrowing not possible).
4. If merge: remove separator key from parent.
5. If parent underflows: recurse up.

Borrow example:

Left sibling: [5, 10, 15] (has extra)
Current leaf: [20] (underflow after deleting 25)
Parent separator: [20]

After borrow: redistribute -- move 15 to current, update separator to 15.
Left: [5, 10], Current: [15, 20], Parent separator: [15]

Merge example:

Left sibling: [5, 10] (cannot borrow)
Current leaf: [20] (underflow after deleting 25)
Parent separator: [20]

After merge: combine into [5, 10, 20], remove separator from parent.
Parent separator [20] removed. If parent underflows: repeat.

Amortized cost: Insertions and deletions are O(log n) amortized. Splits and merges are O(log n) worst case but rare in practice.

Q13. What is the height of a B+tree? How many I/Os does a lookup take? `Medium`

B+tree height:

For n total keys and order m (max m children per node):
Height h = O(log_m(n))

Each node holds up to m-1 keys.
Maximum keys at height h: (m-1) * m^h

For n=1 million keys, m=200 (page size 16KB, key+pointer ~80 bytes):
h = ceil(log_200(1,000,000)) = ceil(log(1M)/log(200)) = ceil(3.93) = 4

B+tree of height 4 with branching factor 200 holds hundreds of millions of keys.

I/Os per lookup:

Each node is one disk page.
Lookup = traverse from root to leaf = h I/Os.

With height 4: 4 disk I/Os per point query.
With root and upper levels cached in buffer pool: often 1-2 I/Os in practice.

Why large branching factor matters:

Higher branching factor = lower tree height = fewer I/Os per lookup.
Database designers tune page size (InnoDB: 16KB default) to maximize keys per node.

Q14. What is a B+tree fill factor? `Medium`

Fill factor is the percentage of each index page that is filled during an initial index build.

Fill factor = 70% means each index page starts 70% full.
30% is reserved for future insertions without immediate page splits.

High fill factor (90-100%):

Maximum space efficiency.
More keys per page, fewer pages, lower height.
Future inserts cause more page splits (worse write performance for insertions into existing ranges).

Low fill factor (50-70%):

More space reserved for future inserts.
More pages, higher height.
Fewer page splits as the index grows.
Better write performance for insert-heavy workloads.

SQL Server example:

CREATE INDEX idx_orders ON orders(order_date)
WITH FILLFACTOR = 80;  -- 80% filled, 20% reserved

PCTFREE (Oracle): Same concept, but expressed as percentage of free space to reserve (PCTFREE=20 means fill to 80%).

Q15. What is a hash index? When is it better than B+tree? `Medium`

A hash index uses a hash function to map keys to buckets containing row pointers.

Structure:

hash(key) -> bucket_number -> [pointer, pointer, ...]

Aspect	Hash Index	B+tree Index
Point lookup (=)	O(1) average	O(log n)
Range query (BETWEEN, >, <)	Not supported	O(log n + k)
Prefix queries (LIKE 'abc%')	Not supported	Supported
ORDER BY	Not supported	Supported (leaves are ordered)
Insert/delete	O(1) average	O(log n)
Memory vs disk	Better for in-memory	Better for disk-based

When to use hash index:

Exact equality lookups only (WHERE id = 42, WHERE email = '[email protected]').
In-memory tables or memory-optimized workloads.
No range queries needed.

MySQL: MEMORY storage engine uses hash indexes by default. InnoDB's adaptive hash index is a dynamic in-memory hash built automatically on frequently accessed B+tree entries (hidden from the user).

PostgreSQL: Has explicit hash index type (CREATE INDEX ... USING HASH). Not WAL-logged before PG 10; write-safe only from PG 10+.

Index Types and Selection

Q16. What is a bitmap index? When is it used? `Hard`

A bitmap index uses one bit-vector per distinct value in the indexed column. Each bit represents whether a row has that value.

gender column (M/F), 5 rows:
Row:    1   2   3   4   5
Data:   M   F   M   M   F

Bitmap for M: [1, 0, 1, 1, 0]
Bitmap for F: [0, 1, 0, 0, 1]

AND/OR operations for complex queries:

SELECT * FROM employees WHERE gender='M' AND department='Engineering';

Bitmap for M:          [1, 0, 1, 1, 0, 1, ...]
Bitmap for Engineering:[0, 0, 1, 0, 0, 1, ...]
AND result:            [0, 0, 1, 0, 0, 1, ...]  <- rows 3 and 6 match

When to use:

Low cardinality columns (few distinct values: gender, status, country, boolean flags).
Read-heavy data warehouse / OLAP workloads.
Queries combining multiple low-cardinality conditions.

Why NOT for OLTP:

Row-level locking is difficult (one update may affect bits across the entire vector).
Oracle: bitmap index pages locked at block level for updates -- high contention for DML-heavy tables.
PostgreSQL: does not support persistent bitmap indexes (only in-memory bitmap scans during query execution).

Q17. What is an index scan vs a table scan? When does the optimizer choose each? `Medium`

Scan Type	Method	When used
Full table scan (Seq scan)	Read every row in the table	No useful index, low selectivity, small table
Index scan	Traverse B+tree, look up rows	High selectivity, moderate result set
Index-only scan	Traverse B+tree, use index data (covering index)	All columns in query are in the index
Bitmap index scan	Collect row IDs from index, then fetch rows in heap order	Multiple indexes combined, moderate result set
Loose index scan	Skip-scan through distinct values	GROUP BY, DISTINCT with index on grouped column

Optimizer cost model: The query optimizer estimates:

Estimated rows returned (from table statistics).
Cost of random I/O (index lookup per row) vs sequential I/O (full scan).
Whether index fits in buffer pool.

When full scan beats index scan:

Query returns > 20-30% of rows. Random I/O per row from index is slower than one sequential scan.
Small table (< 1000 rows): full scan faster than B+tree traversal.
No usable index for the predicate.

Q18. What is an index on expression (functional index)? `Medium`

A functional index (expression index) stores the result of an expression applied to column values, enabling indexed lookups on computed values.

-- Without functional index: full scan for case-insensitive search
SELECT * FROM users WHERE LOWER(email) = '[email protected]';

-- With functional index:
CREATE INDEX idx_lower_email ON users(LOWER(email));  -- PostgreSQL
-- or in MySQL 8.0+:
CREATE INDEX idx_lower_email ON users((LOWER(email)));  -- note double parentheses

-- Now this query uses the index:
SELECT * FROM users WHERE LOWER(email) = '[email protected]';

Common use cases:

Case-insensitive search.
Extracted substrings (LEFT(phone, 3) for area code prefix queries).
Date parts (YEAR(created_at) for year-based filtering).
JSON field extraction (data->>'status' in PostgreSQL).

Constraint: The query WHERE clause must use the exact same expression as the index definition. LOWER(email) and LOWER(TRIM(email)) are different expressions.

Q19. What is index cardinality and how do you view it? `Easy`

Cardinality in the context of indexes is the estimated number of distinct values in an indexed column. Higher cardinality = more unique values = better index selectivity.

-- MySQL: view index cardinality
SHOW INDEX FROM employees;
-- Output columns: Table, Key_name, Cardinality, ...
-- Cardinality is an estimate (updated by ANALYZE TABLE or auto-update stats)

-- PostgreSQL: view from pg_stats
SELECT attname, n_distinct FROM pg_stats WHERE tablename = 'employees';
-- n_distinct = estimated distinct values (negative means fraction of total rows)

-- Update statistics:
ANALYZE TABLE employees;  -- MySQL
ANALYZE employees;        -- PostgreSQL

Use in query planning: The optimizer uses cardinality estimates to determine:

Which index to use when multiple are available.
Whether to use an index at all (low cardinality = skip index).
Join order in multi-table queries.

Stale statistics issue: If table data changes significantly after the last ANALYZE, cardinality estimates become inaccurate. The optimizer may make poor index choices. Run ANALYZE after large bulk inserts.

Q20. How do you use EXPLAIN to analyze index usage? `Medium`

EXPLAIN shows the query execution plan, revealing which indexes are used and estimated costs.

MySQL EXPLAIN:

EXPLAIN SELECT * FROM employees WHERE department = 'Engineering' AND salary > 50000;

+----+-------------+-----------+------+---------------+------------------+-------+-------+------+
| id | select_type | table     | type | possible_keys | key              | rows  | Extra |
+----+-------------+-----------+------+---------------+------------------+-------+-------+------+
|  1 | SIMPLE      | employees | ref  | idx_dept      | idx_dept         | 500   | Using where |
+----+-------------+-----------+------+---------------+------------------+-------+-------+------+

Key EXPLAIN columns:

Column	Meaning
`type`	Join/scan type: ALL=full scan, index=full index scan, range=index range, ref=index lookup, const=PK lookup
`possible_keys`	Indexes that could be used
`key`	Index actually used (NULL = full scan)
`rows`	Estimated rows to examine
`Extra`	`Using index` = covering index; `Using filesort` = sort needed; `Using temporary` = temp table

PostgreSQL EXPLAIN ANALYZE:

EXPLAIN ANALYZE SELECT * FROM employees WHERE department = 'Engineering';

-- Output:
Index Scan using idx_dept on employees (cost=0.42..150.5 rows=500 width=80)
  (actual time=0.032..1.2 rows=487 loops=1)
  Index Cond: (department = 'Engineering')
Planning time: 0.5 ms
Execution time: 1.4 ms

EXPLAIN ANALYZE actually executes the query and shows actual vs estimated rows. If estimated vs actual rows differ significantly, statistics need updating.

Q21. What is an index merge? `Hard`

Index merge is a query optimization where the database uses multiple single-column indexes for a single query by merging their results.

-- Two separate indexes:
CREATE INDEX idx_dept ON employees(department);
CREATE INDEX idx_salary ON employees(salary);

-- Query:
SELECT * FROM employees WHERE department = 'Engineering' OR salary > 100000;

-- Index merge (union):
Step 1: Use idx_dept to find all Engineering row IDs.
Step 2: Use idx_salary to find all salary>100000 row IDs.
Step 3: Union the two result sets.
Step 4: Fetch actual rows.

Types of index merge:

Type	Use case
Union	OR conditions (get rows matching either condition)
Intersection	AND conditions on separate indexes (get rows matching both)
Sort-union	Like union but sorts by row ID before fetching for better I/O locality

When the optimizer prefers index merge over composite index: If no composite index (dept, salary) exists, index merge may be better than full scan. But a well-designed composite index is usually faster than index merge due to single B+tree traversal.

MySQL EXPLAIN output: Extra: Using union(idx_dept, idx_salary); Using where

Q22. What are the guidelines for when to create an index? `Easy`

Create an index when:

Column is frequently used in WHERE clauses.
Column is used in JOIN conditions (foreign key columns).
Column is used in ORDER BY or GROUP BY (avoids filesort).
Column has high cardinality (many distinct values).
Table is large (small tables are often better served by full scan).

Avoid creating an index when:

Column has very low cardinality (boolean, gender, status with 2-3 values).
Table is small (< a few thousand rows; full scan is fast).
Table has heavy write-only or write-dominant workload (indexes slow down writes).
Column is rarely used in queries.

Index maintenance review:

-- MySQL: find unused indexes (queries not using them)
-- Use performance_schema.table_io_waits_summary_by_index_usage
SELECT object_name, index_name, count_read
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE count_read = 0 AND index_name != 'PRIMARY';
-- These indexes are write overhead without read benefit. Consider dropping.

Query Optimization

Q23. What is a query execution plan? How does the optimizer choose one? `Medium`

A query execution plan is the step-by-step procedure the database uses to execute a SQL query.

The optimizer's job:

Parse the SQL into an internal representation.
Generate multiple equivalent execution plans (different join orders, index choices).
Estimate the cost of each plan (CPU, I/O, memory).
Choose the plan with the lowest estimated cost.

Cost estimation uses:

Table statistics (row counts, column cardinalities, histograms of value distributions).
Index information (height, selectivity).
Hardware model (I/O cost of disk pages, CPU cost per comparison).

Example -- two-table join:

SELECT e.name, d.dept_name
FROM employees e JOIN departments d ON e.dept_id = d.dept_id
WHERE d.location = 'Mumbai';

Plans considered:
1. Scan departments (Mumbai filter), nested loop join employees.
2. Scan employees (all), hash join departments.
3. Use index on d.location, then join employees.

Optimizer picks plan 3 if d.location index exists and selectivity is high.

Q24. What are the common join algorithms and when are they used? `Hard`

Algorithm	Description	Best when
Nested Loop Join	For each row in outer, scan inner	Inner table small or has good index; low row counts
Hash Join	Build hash table on smaller relation, probe with larger	Large tables, no usable index, equality joins only
Sort-Merge Join	Sort both relations, merge sorted outputs	Both inputs already sorted or join on range; good for ORDER BY result
Index Nested Loop	Nested loop but use index to look up inner rows	Outer small, inner has index on join column

Complexity:

Algorithm	Time	Space
Nested Loop	O(n * m)	O(1)
Hash Join	O(n + m)	O(min(n,m)) for hash table
Sort-Merge	O(n log n + m log m)	O(n + m) for sort buffers
Index Nested Loop	O(n * log m)	O(1)

MySQL: Uses nested loop join and hash join (since 8.0.18). No sort-merge join. PostgreSQL: Supports all three algorithms; optimizer chooses based on cost.

Q25. What is the N+1 query problem? How is it solved? `Medium`

The N+1 problem occurs when an application fetches a list of N objects and then issues one additional query per object to fetch related data.

-- Fetching 100 orders and their customer names:
Query 1: SELECT * FROM orders LIMIT 100;  -- Returns 100 rows
Query 2: SELECT name FROM customers WHERE id=1;  -- For order 1's customer
Query 3: SELECT name FROM customers WHERE id=5;  -- For order 2's customer
...
Query 101: SELECT name FROM customers WHERE id=99;  -- For order 100's customer

Total: 101 queries instead of 1.

Solutions:

1. JOIN:

SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
LIMIT 100;
-- 1 query, 1 round trip.

2. Eager loading (ORMs):

# SQLAlchemy: N+1 problem
orders = session.query(Order).limit(100).all()
for order in orders:
    print(order.customer.name)  # Each .customer fires a SQL query!

# Fix with eager loading:
from sqlalchemy.orm import joinedload
orders = session.query(Order).options(joinedload(Order.customer)).limit(100).all()
# Single query with JOIN.

3. IN-clause batching:

-- Fetch customer IDs from orders, then batch lookup:
SELECT name FROM customers WHERE id IN (1, 5, 9, ..., 99);
-- 2 queries total instead of N+1.

Q26. How would you optimize a slow query? (Full interview answer) `Hard`

A structured approach for optimizing a slow query:

Step 1: Reproduce and measure.

EXPLAIN ANALYZE SELECT ... FROM ... WHERE ... ORDER BY ...;
Note: execution time, row counts, scan types.

Step 2: Check for full table scans.

EXPLAIN shows type=ALL or Seq Scan.
Check: is there an index on the WHERE/JOIN columns?
If not: add an index on the high-selectivity filter column.

Step 3: Check index selectivity.

Is the index being used but still slow?
Check cardinality. Low cardinality index may be ignored by optimizer (or should be).
Consider composite index to cover more of the WHERE clause.

Step 4: Check for covering index opportunity.

Does the query SELECT a small set of columns?
Add those columns to the index to enable index-only scan.

Step 5: Check ORDER BY / GROUP BY.

Using filesort in EXPLAIN Extra = no index for sort.
Add index on ORDER BY column(s) to eliminate sort.

Step 6: Check JOIN order and algorithm.

For hash joins: ensure the smaller table is the build input.
For nested loop: ensure the inner table has an index on the join column.

Step 7: Check query rewrite.

Can a subquery be replaced with a JOIN?
Can an OR be replaced with UNION (for better index use)?
Are there redundant GROUP BY or DISTINCT operations?

Step 8: Check statistics freshness.

ANALYZE TABLE orders;  -- Update statistics for accurate query planning

Step 9: Consider hardware/caching.

Is the working set index fitting in the buffer pool?
Increase innodb_buffer_pool_size (MySQL) or shared_buffers (PostgreSQL).

FAQ

Q: Does an index guarantee faster query performance?

No. An index helps only when the query's selectivity is high and the optimizer chooses to use it. For low-selectivity queries or small tables, indexes add overhead without benefit.

Q: How many indexes should a table have?

There is no fixed rule. Typical OLTP tables have 3-7 indexes. Each index adds write overhead (INSERT/UPDATE/DELETE cost). Profile actual query patterns and add indexes for the most frequent, slow queries.

Q: What happens to indexes when you run ALTER TABLE?

Adding or removing columns may require rebuilding indexes. ALTER TABLE on large tables can be very slow and lock the table (or use online DDL with tools like pt-online-schema-change to avoid locking).

Related PapersAdda guides:

Sources and review notesreviewed 8 Jun 2026

Article-specific sources

Verification window

Page last edited 8 Jun 2026 by Aditya Sharma. A review date records an editorial edit, not a guarantee that every external fact is still current.

Evidence labels

Official notices, candidate reports, offer documents, and editorial practice questions carry different confidence levels. The visible source list lets you inspect the evidence instead of relying on a blanket verification badge.

Verification policy: /editorial-standards/. Found something incorrect? Submit a correction - we respond within 48 hours.

topic cluster

Sat this this year? Share your story, earn ₹500.

First-person experience reports help future candidates prep smarter. We pay verified contributors ₹500 via UPI per accepted story with byline.

Submit your story →

ready to practice?

Take a free timed mock test

Put what you learned into practice. Our mock tests match the 2026 pattern with timer, navigator, reveal, and score breakdown. No signup.

Start free mock test →

related guides

Interview Questions

Share this guide

Twitter LinkedIn W WhatsApp

Top 26 DBMS Indexing Interview Questions & Answers (2026)

Table of Contents

Index Basics

Q1. What is a database index? Why is it needed? Easy

Q2. What is the difference between a clustered and non-clustered index? Medium

Q3. What is a composite index? What is the left-prefix rule? Medium

Q4. What is selectivity? How does it affect index effectiveness? Medium

Q5. What is a covering index (index-only scan)? Medium

Q6. What is a partial index? Medium

Q7. What is the difference between a dense and sparse index? Medium

Q8. What are the ACID implications of indexes? Hard

B-Tree and B+Tree

Q9. What is a B-tree? Medium

Q10. What is a B+tree? How does it differ from B-tree? Medium

Q11. How does B+tree handle insertions? Hard

Q12. How does B+tree handle deletions? Hard

Q13. What is the height of a B+tree? How many I/Os does a lookup take? Medium

Q14. What is a B+tree fill factor? Medium

Q15. What is a hash index? When is it better than B+tree? Medium

Index Types and Selection

Q16. What is a bitmap index? When is it used? Hard

Q17. What is an index scan vs a table scan? When does the optimizer choose each? Medium

Q18. What is an index on expression (functional index)? Medium

Q19. What is index cardinality and how do you view it? Easy

Q20. How do you use EXPLAIN to analyze index usage? Medium

Q21. What is an index merge? Hard

Q22. What are the guidelines for when to create an index? Easy

Query Optimization

Q23. What is a query execution plan? How does the optimizer choose one? Medium

Q24. What are the common join algorithms and when are they used? Hard

Q25. What is the N+1 query problem? How is it solved? Medium

Q26. How would you optimize a slow query? (Full interview answer) Hard

FAQ

Q: Does an index guarantee faster query performance?

Q: How many indexes should a table have?

Q: What happens to indexes when you run ALTER TABLE?

More resources in Interview Questions

Sat this this year? Share your story, earn ₹500.

Take a free timed mock test

Top 24 DBMS vs RDBMS Interview Questions & Answers (2026)

Top 26 DBMS Joins Interview Questions & Answers (2026)

Top 28 DBMS Normalization Interview Questions & Answers (2026)

Top 28 DBMS ACID Transactions Interview Questions (2026)

Top 26 OSI Model Interview Questions & Answers (2026)

Share this guide

Q1. What is a database index? Why is it needed? `Easy`

Q2. What is the difference between a clustered and non-clustered index? `Medium`

Q3. What is a composite index? What is the left-prefix rule? `Medium`

Q4. What is selectivity? How does it affect index effectiveness? `Medium`

Q5. What is a covering index (index-only scan)? `Medium`

Q6. What is a partial index? `Medium`

Q7. What is the difference between a dense and sparse index? `Medium`

Q8. What are the ACID implications of indexes? `Hard`

Q9. What is a B-tree? `Medium`

Q10. What is a B+tree? How does it differ from B-tree? `Medium`

Q11. How does B+tree handle insertions? `Hard`

Q12. How does B+tree handle deletions? `Hard`

Q13. What is the height of a B+tree? How many I/Os does a lookup take? `Medium`

Q14. What is a B+tree fill factor? `Medium`

Q15. What is a hash index? When is it better than B+tree? `Medium`

Q16. What is a bitmap index? When is it used? `Hard`

Q17. What is an index scan vs a table scan? When does the optimizer choose each? `Medium`

Q18. What is an index on expression (functional index)? `Medium`

Q19. What is index cardinality and how do you view it? `Easy`

Q20. How do you use EXPLAIN to analyze index usage? `Medium`

Q21. What is an index merge? `Hard`

Q22. What are the guidelines for when to create an index? `Easy`

Q23. What is a query execution plan? How does the optimizer choose one? `Medium`

Q24. What are the common join algorithms and when are they used? `Hard`

Q25. What is the N+1 query problem? How is it solved? `Medium`

Q26. How would you optimize a slow query? (Full interview answer) `Hard`