Top 28 DBMS Normalization Interview Questions & Answers (2026)

What changed in 2026 drives
Mass-recruiter offer letters are flatter for 2026 batch - the 4-5 LPA ASE band has barely budged in three years while inflation eats real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry - it is the entire interview.
What I'd actually study for this
- 01Two solid coding-round answers (1 medium-hard DSA each, with edge-case discussion) > five half-baked ones
- 02One real project you can defend end-to-end - file paths, design decisions, and what you would change
- 03One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
- 04Three behavioural STAR stories: failure recovered, conflict handled, ownership taken
Where most candidates trip up
The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.
Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.
Last Updated: June 2026 | Level: Beginner to Advanced | Format: Q&A with Worked Examples and SQL
Database normalization is one of the most consistently tested DBMS topics in campus placements and backend SDE interviews. Candidates report that normalization questions appear in written rounds at TCS, Infosys, Wipro, and Accenture, and in DBMS theory rounds at product companies. Based on public preparation resources and candidate-reported accounts, identifying normal form violations and deriving decompositions are the most frequent question types. This guide covers 28 questions from 1NF through BCNF and beyond.
Table of Contents
- Normalization Basics (Q1-Q8)
- Normal Forms 1NF-3NF (Q9-Q18)
- BCNF and Beyond (Q19-Q24)
- Practical Design (Q25-Q28)
Normalization Basics
Q1. What is normalization? Why is it needed? Easy
Normalization is the process of organizing a relational database schema to reduce data redundancy and improve data integrity by applying a series of normal forms.
Why it is needed -- the update anomalies problem:
Consider an unnormalized table:
StudentCourse(StudentID, StudentName, CourseID, CourseName, InstructorID, InstructorName)
Data:
101, Alice, C1, DBMS, I1, Prof. Kumar
101, Alice, C2, OS, I2, Prof. Mehta
102, Bob, C1, DBMS, I1, Prof. Kumar
Anomalies in this table:
| Anomaly | Example |
|---|---|
| Update anomaly | Prof. Kumar changes name: must update every row where InstructorID=I1. Miss one = inconsistency. |
| Insert anomaly | Cannot add a new course until a student is enrolled (no student = no row). |
| Delete anomaly | Delete Bob's enrollment in C1 = lose the only record of Prof. Kumar teaching DBMS. |
Normalization eliminates these by decomposing into tables where each fact is stored once.
Q2. What is a functional dependency (FD)? What are the types? Easy
A functional dependency A -> B means that the value of attribute set A uniquely determines the value of attribute set B.
StudentID -> StudentName (knowing StudentID uniquely gives StudentName)
CourseID -> CourseName (knowing CourseID uniquely gives CourseName)
StudentID, CourseID -> Grade (knowing both gives the grade)
Types of FDs:
| Type | Description | Example |
|---|---|---|
| Trivial FD | B is a subset of A (A -> B where B subset of A) | {StudentID, Name} -> StudentID |
| Non-trivial FD | B is not a subset of A | StudentID -> StudentName |
| Partial FD | B depends on a proper subset of a composite key | {StudentID, CourseID} -> StudentName (only StudentID needed) |
| Transitive FD | A -> B -> C (B is not a key; A determines C through B) | StudentID -> DeptID -> DeptName |
| Multivalued FD (MVD) | A ->> B means each value of A is associated with a set of B values (4NF concept) | CourseID ->> TextbookID |
Q3. What are Armstrong's Axioms? Medium
Armstrong's Axioms are a complete set of inference rules for deriving all functional dependencies that hold in a relation from a given set of FDs.
| Axiom | Name | Rule |
|---|---|---|
| A1 | Reflexivity | If B subset of A, then A -> B (trivial FDs) |
| A2 | Augmentation | If A -> B, then AC -> BC (adding same attributes to both sides) |
| A3 | Transitivity | If A -> B and B -> C, then A -> C |
Derived rules (from axioms):
- Union: If A -> B and A -> C, then A -> BC
- Decomposition: If A -> BC, then A -> B and A -> C
- Pseudotransitivity: If A -> B and BC -> D, then AC -> D
Completeness: Armstrong's Axioms are sound (only derive valid FDs) and complete (can derive all valid FDs from a given set).
Q4. What is a candidate key, primary key, and superkey? Easy
| Key Type | Definition |
|---|---|
| Superkey | A set of attributes that functionally determines ALL attributes in the relation. May have redundant attributes. |
| Candidate key | A minimal superkey: removing any attribute from it would mean it no longer determines all attributes. |
| Primary key | The candidate key chosen by the DBA to be the principal identifier. Only one per table. No NULLs allowed. |
| Prime attribute | An attribute that is part of ANY candidate key. |
| Non-prime attribute | An attribute that is not part of any candidate key. |
Example:
Employee(EmpID, Email, Name, DeptID)
FDs: EmpID -> all, Email -> all
Superkeys: {EmpID}, {Email}, {EmpID, Email}, {EmpID, Name}, ...
Candidate keys: {EmpID}, {Email} (both minimal)
Primary key: EmpID (chosen)
Prime attributes: EmpID, Email
Non-prime attributes: Name, DeptID
Q5. What is attribute closure? How do you use it to find candidate keys? Medium
The attribute closure of a set of attributes A under a set of FDs F, written A+, is the set of all attributes that can be determined from A using F.
Algorithm to compute A+ (closure of A):
Initialize: result = A
Repeat until no change:
For each FD X -> Y in F:
If X is a subset of result:
result = result union Y
A+ = result
Example:
Relation R(A, B, C, D, E)
FDs: AB -> C, C -> D, D -> E
Find AB+:
Start: {A, B}
AB -> C: {A, B} subset of {A,B} -> add C. result = {A,B,C}
C -> D: {C} subset of {A,B,C} -> add D. result = {A,B,C,D}
D -> E: {D} subset of {A,B,C,D} -> add E. result = {A,B,C,D,E}
No more changes.
AB+ = {A,B,C,D,E} = all attributes -> AB is a superkey (and candidate key if minimal)
Check if B+ = all attributes (to see if B alone is a key):
B+ = {B} (no FD has B on left without A) -> B alone is not a key
Similarly A+ = {A} -> A alone is not a key
Therefore AB is a candidate key (minimal).
Q6. What is the difference between lossless-join decomposition and dependency-preserving decomposition? Hard
Lossless-join decomposition: When a relation R is decomposed into R1 and R2, the decomposition is lossless-join if joining R1 and R2 reconstructs R exactly (no spurious tuples).
Test for lossless-join (for two-way split R1, R2): The decomposition is lossless-join if and only if:
(R1 intersect R2) -> R1 OR (R1 intersect R2) -> R2
i.e., the common attributes of R1 and R2 must be a superkey of at least one of them.
Dependency-preserving decomposition: All original FDs can be checked using only the attributes in individual decomposed relations (no join needed to verify an FD).
Example:
R(A, B, C) with FDs: A->B, B->C, A->C
Decomposition 1: R1(A,B), R2(B,C)
- Lossless-join? R1 ∩ R2 = {B}. B->C holds (B is superkey of R2). YES, lossless.
- Dependency-preserving? FDs A->B in R1, B->C in R2. A->C derivable transitively. YES.
Decomposition 2: R1(A,B), R2(A,C)
- Lossless-join? R1 ∩ R2 = {A}. A->B holds in R1 (A is superkey of R1). YES, lossless.
- Dependency-preserving? B->C cannot be checked without joining R1 and R2. NO (not dep-preserving).
Goal: Always aim for lossless-join. Also aim for dependency-preserving. BCNF always gives lossless but may sacrifice dependency preservation. 3NF can always achieve both.
Q7. What is denormalization? When is it used? Easy
Denormalization is the intentional introduction of redundancy into a normalized schema to improve read performance.
| Scenario | Why denormalize |
|---|---|
| Frequent read-heavy JOIN queries | Joining 5 normalized tables on every page load is expensive |
| Reporting / analytics | Pre-aggregated summary tables avoid recomputing sums/counts |
| High-traffic API endpoints | Flat tables eliminate multi-table joins in hot paths |
Trade-off:
| Normalized | Denormalized | |
|---|---|---|
| Write overhead | Low (update one place) | High (must update multiple redundant copies) |
| Read overhead | High (JOIN required) | Low (single table scan) |
| Storage | Less | More |
| Consistency risk | Low | High (redundant copies can diverge) |
Production example: An e-commerce product listing page stores category_name directly in the products table even though it is also in a categories table. Avoids a JOIN on every product list API call. Updates to category name trigger a batch update to products.
Q8. What is the minimal cover (canonical cover) of a set of FDs? Hard
The minimal cover (or canonical cover) of a set of FDs F is an equivalent set of FDs Fc that is:
- Equivalent to F (same closure).
- Has singleton right-hand sides (each FD has exactly one attribute on the right).
- No redundant FDs (removing any FD changes the closure).
- No redundant attributes on any left-hand side.
Algorithm to find minimal cover:
Step 1: Decompose all RHS to singletons.
AB -> CD becomes AB -> C, AB -> D
Step 2: Remove redundant attributes from LHS.
For each FD XA -> B:
If A is in (X)+, then XA -> B is redundant (X -> B already implied).
Remove A from LHS.
Step 3: Remove redundant FDs.
For each FD A -> B:
Compute A+ using F - {A->B}.
If B is in A+: A -> B is redundant, remove it.
Example:
F = {A->B, B->C, A->C, AB->D}
Step 1: All RHS already singletons.
Step 2: Check AB->D.
Can we remove A? Check B+ with F: B->C, no other FDs with B. B+ = {B,C}. D not in B+. Keep A.
Can we remove B? Check A+ with F: A->B->C, A->C. A+ = {A,B,C}. D not in A+. Keep B.
Step 3: Remove redundant FDs.
A->C: Compute A+ without A->C: A->B->C. C in A+. A->C is redundant. REMOVE.
Check others: A->B needed (A+ without it = {A,C} using A->C, not enough). Keep.
Minimal cover Fc = {A->B, B->C, AB->D}
Normal Forms 1NF-3NF
Q9. What is 1NF (First Normal Form)? How do you normalize to 1NF? Easy
A relation is in 1NF if:
- Every attribute contains only atomic (indivisible) values.
- No repeating groups or multi-valued attributes.
- Each row is uniquely identifiable (has a primary key).
Violation example:
Student(StudentID, Name, Courses)
101, Alice, "DBMS, OS, CN" <- multi-valued in one cell
Fix (normalize to 1NF):
Option 1: Separate rows
StudentCourse(StudentID, Name, Course)
101, Alice, DBMS
101, Alice, OS
101, Alice, CN
PK: (StudentID, Course)
Another violation: Repeating groups
Order(OrderID, Item1, Qty1, Item2, Qty2, Item3, Qty3) <- NOT 1NF
Fix:
OrderItem(OrderID, ItemNum, ItemName, Qty) <- 1NF
PK: (OrderID, ItemNum)
Q10. What is 2NF (Second Normal Form)? What is a partial dependency? Medium
A relation is in 2NF if:
- It is in 1NF.
- Every non-prime attribute is fully functionally dependent on the ENTIRE primary key (no partial dependencies).
Partial dependency: A non-prime attribute depends on only PART of a composite primary key.
Violation example:
StudentCourse(StudentID, CourseID, StudentName, CourseName, Grade)
PK: (StudentID, CourseID)
FDs:
StudentID -> StudentName <- PARTIAL (StudentName depends only on part of PK)
CourseID -> CourseName <- PARTIAL (CourseName depends only on part of PK)
StudentID, CourseID -> Grade <- FULL (Grade depends on entire PK)
Fix (decompose to 2NF):
Student(StudentID, StudentName) -- PK: StudentID
Course(CourseID, CourseName) -- PK: CourseID
Enrollment(StudentID, CourseID, Grade) -- PK: (StudentID, CourseID)
Now no partial dependencies exist. Every non-prime attribute depends on the full key.
Q11. What is 3NF (Third Normal Form)? What is a transitive dependency? Medium
A relation is in 3NF if:
- It is in 2NF.
- No non-prime attribute transitively depends on the primary key.
Transitive dependency: A -> B -> C where A is the PK, B is a non-key attribute, C is a non-key attribute. C depends on A indirectly through B.
Violation example:
Employee(EmpID, Name, DeptID, DeptName, DeptLocation)
PK: EmpID
FDs:
EmpID -> DeptID (direct)
DeptID -> DeptName (DeptName transitively depends on EmpID via DeptID)
DeptID -> DeptLocation
If Prof. Kumar's department moves: must update EVERY employee row in that department. Miss one = inconsistency.
Fix (decompose to 3NF):
Employee(EmpID, Name, DeptID) -- PK: EmpID
Department(DeptID, DeptName, DeptLocation) -- PK: DeptID
Now DeptName and DeptLocation are stored once per department. No transitive dependency.
Q12. How do you check if a relation is in 3NF? Walk through the formal definition. Medium
Formal 3NF definition: A relation R is in 3NF if for every non-trivial FD A -> B in R, at least one of these holds:
- A is a superkey of R.
- B is a prime attribute (member of some candidate key).
Checking example:
R(A, B, C, D)
FDs: A -> B, B -> C, A -> D
Candidate key: A (A+ = {A,B,C,D} = all attrs, A is CK)
Prime attributes: {A} (only A is in a CK)
Non-prime: B, C, D
Check each FD:
A -> B: Is A a superkey? YES (A is CK). OK.
B -> C: Is B a superkey? B+ = {B,C}, not all attrs. NO. Is C a prime attribute? NO. FAIL.
A -> D: Is A a superkey? YES. OK.
Conclusion: R is NOT in 3NF because of B -> C.
Fix: Decompose into R1(A,B,D) and R2(B,C).
Q13. What are the 3NF synthesis algorithm steps? Hard
The 3NF Synthesis Algorithm produces a lossless-join, dependency-preserving 3NF decomposition.
Steps:
1. Compute minimal cover Fc of the given FDs.
2. For each FD X -> Y in Fc:
Create a relation schema Ri containing the attributes X union Y.
(Left-hand side attributes + all attributes they determine)
3. If no schema in the result contains a candidate key of R:
Add a schema containing just the candidate key attributes.
4. Eliminate redundant schemas (if schema Ri subset of Rj, remove Ri).
Example:
R(A, B, C, D, E)
Minimal cover Fc: A->B, AB->C, C->D, D->E
Candidate key: A (check: A+ = {A,B,C,D,E})
Step 2:
A -> B => R1(A, B)
AB -> C => R2(A, B, C)
C -> D => R3(C, D)
D -> E => R4(D, E)
Step 3: Does any schema contain {A}? R1(A,B) contains A. YES. Skip.
Step 4: R1(A,B) subset of R2(A,B,C). Remove R1.
Result: R2(A,B,C), R3(C,D), R4(D,E)
Lossless-join? R2 ∩ R3 = {C}, C->D holds in R3 (C is superkey of R3). YES.
Dependency-preserving? AB->C in R2, C->D in R3, D->E in R4, A->B in R2. All preserved.
Q14. What anomalies does 3NF eliminate? Easy
3NF eliminates the three classic update anomalies caused by transitive dependencies:
| Anomaly | Unnormalized (transitive dep) | After 3NF decomposition |
|---|---|---|
| Update anomaly | Changing DeptName requires updating every employee row | DeptName stored once in Department table; single update |
| Insert anomaly | Cannot add a department until it has an employee | Department table exists independently of employees |
| Delete anomaly | Deleting last employee of a dept deletes the dept info | Department record persists even after all employees leave |
Residual risk: Even in 3NF, redundancy can exist if a relation has multiple candidate keys. This is what BCNF addresses.
Q15. Show a worked example of 1NF -> 2NF -> 3NF decomposition. Medium
Starting relation (not even 1NF):
OrderDetails(OrderID, CustomerID, CustomerName, CustomerCity, ProductID, ProductName, Qty, UnitPrice)
Issues: No single column is atomic violation (everything is atomic here, OK for 1NF)
But check FDs:
PK (assumed): (OrderID, ProductID)
OrderID -> CustomerID, CustomerName, CustomerCity
CustomerID -> CustomerName, CustomerCity (TRANSITIVE through CustomerID)
ProductID -> ProductName, UnitPrice
OrderID, ProductID -> Qty
Step 1: Check 1NF: All attributes are atomic. Already in 1NF.
Step 2: Remove partial dependencies (move to 2NF):
OrderID -> CustomerID, CustomerName, CustomerCity <- partial (on OrderID alone)
ProductID -> ProductName, UnitPrice <- partial (on ProductID alone)
Decompose:
Order(OrderID, CustomerID, CustomerName, CustomerCity) [partial FDs captured here]
Product(ProductID, ProductName, UnitPrice)
OrderItem(OrderID, ProductID, Qty) [only full dependency on composite key]
Now in 2NF. But check Order table for transitive deps.
Step 3: Remove transitive dependencies (move to 3NF):
In Order(OrderID, CustomerID, CustomerName, CustomerCity):
OrderID -> CustomerID -> CustomerName, CustomerCity <- TRANSITIVE
Decompose:
Orders(OrderID, CustomerID) [PK: OrderID]
Customer(CustomerID, CustomerName, CustomerCity) [PK: CustomerID]
Final 3NF schema:
Customer(CustomerID, CustomerName, CustomerCity) -- PK: CustomerID
Orders(OrderID, CustomerID) -- PK: OrderID, FK->Customer
Product(ProductID, ProductName, UnitPrice) -- PK: ProductID
OrderItem(OrderID, ProductID, Qty) -- PK: (OrderID, ProductID)
No partial, no transitive dependencies. No update anomalies.
Q16. What is the difference between 2NF and 3NF in simple terms? Easy
A simple way to remember:
| Normal Form | What it eliminates | Cause |
|---|---|---|
| 2NF | Partial dependency | Non-prime attribute depends on PART of a composite key |
| 3NF | Transitive dependency | Non-prime attribute depends on ANOTHER non-prime attribute |
2NF problem: "StudentName should not be in the StudentCourse table because it only depends on StudentID, not on the whole (StudentID, CourseID) key."
3NF problem: "DeptName should not be in the Employee table because it depends on DeptID, which is not the primary key."
Trick question: Can a relation be in 3NF but not 2NF? NO. 3NF implies 2NF (each normal form builds on the previous).
Q17. What is the role of candidate keys in 2NF and 3NF? Medium
The distinction between prime and non-prime attributes is critical:
2NF: Every non-prime attribute must be fully dependent on EACH candidate key (not just the chosen primary key). If there are multiple candidate keys, ALL must be tested.
3NF formal test: For every non-trivial FD A -> B, either:
- A is a superkey, OR
- B is a prime attribute.
The second condition (B is prime) is what makes 3NF "softer" than BCNF. BCNF removes this second option.
Example where 3NF is achieved via prime attribute condition:
R(CourseID, Instructor, Room)
FD: Instructor -> Room (each instructor always uses same room)
CourseID, Instructor -> (all) [composite CK]
CourseID, Room -> (all) [another composite CK]
Prime attributes: CourseID, Instructor, Room (all are in some CK!)
Non-prime: none
Instructor -> Room: Is Instructor a superkey? NO. Is Room a prime attribute? YES.
By 3NF condition 2, this is allowed. Relation is in 3NF.
But it is NOT in BCNF (Instructor is not a superkey).
Q18. What is the Heath's decomposition theorem? Hard
Heath's Theorem provides the condition for a lossless-join two-way decomposition:
If relation R(A, B, C) has FD A -> B, then decomposing R into R1(A, B) and R2(A, C) is lossless-join.
Proof sketch:
Given R1(A,B) and R2(A,C), take any tuple (a,b,c) in R.
After decomposition: R1 has (a,b), R2 has (a,c).
Natural join R1 ⋈ R2 on A gives (a,b,c). Original tuple recovered.
Could the join produce spurious tuples?
If (a,b') in R1 and (a,c') in R2 where (a,b',c') not in R:
Since A -> B: all tuples in R with A=a have same B value = b.
But (a,b') in R1 means (a,b',?) in R, so b' = b.
Similarly, (a,c') in R2 means (a,?,c') in R.
No spurious tuples.
Practical use: Apply Heath's theorem repeatedly to decompose a relation step by step into BCNF or 3NF, ensuring no information is lost at each step.
BCNF and Beyond
Q19. What is BCNF (Boyce-Codd Normal Form)? Medium
BCNF is a stronger version of 3NF. A relation is in BCNF if for EVERY non-trivial FD A -> B, A must be a superkey.
BCNF removes the "B is a prime attribute" exception that 3NF allows.
3NF allows: Instructor -> Room (where Room is a prime attribute). BCNF disallows it: Unless Instructor is a superkey.
Formal definition: R is in BCNF if for every non-trivial FD A -> B in R, A is a superkey of R.
Testing:
Find all FDs. For each non-trivial FD A -> B:
Compute A+.
If A+ = all attributes of R: A is a superkey. OK.
If A+ != all attributes: BCNF VIOLATION.
Q20. Show the difference between a 3NF relation that is NOT in BCNF. Hard
This is a classic exam question. The classic example involves a relation with overlapping candidate keys.
Relation: Enrollment(StudentID, CourseID, Instructor)
Constraints:
- Each course-instructor pair is for a specific course (Instructor teaches only one course).
- A student can be enrolled in each course only once.
FDs:
StudentID, CourseID -> Instructor (knowing student and course gives instructor)
Instructor -> CourseID (instructor teaches only one course)
Candidate keys:
(StudentID, CourseID): CK1
(StudentID, Instructor): CK2 [Instructor->CourseID so (StudentID,Instructor)->CourseID->all]
Prime attributes: StudentID, CourseID, Instructor (all are in some CK!)
Non-prime attributes: none.
3NF check: For Instructor -> CourseID: Is Instructor a superkey? Instructor+ = {Instructor, CourseID}, not all. NO. Is CourseID a prime attribute? YES (CourseID is in CK1). By 3NF rule 2: PASSES 3NF.
BCNF check: For Instructor -> CourseID: Is Instructor a superkey? NO. BCNF VIOLATION.
Problem: If 100 students are enrolled with Instructor Mehta (CourseID=DBMS), and Mehta switches to OS, must update 100 rows. Redundancy exists despite being in 3NF.
BCNF decomposition:
Instructor(Instructor, CourseID) -- PK: Instructor
Enrollment(StudentID, Instructor) -- PK: (StudentID, Instructor)
Now Instructor -> CourseID in first table (Instructor is PK = superkey). BCNF satisfied.
Cost: FD (StudentID, CourseID -> Instructor) is no longer directly checkable. Not dependency-preserving.
Q21. What is the BCNF decomposition algorithm? Hard
Algorithm BCNF_Decompose(R, F):
result = {R}
Compute F+ (closure of F)
While some schema Ri in result is NOT in BCNF:
Find a non-trivial FD A -> B in Ri that violates BCNF (A not superkey of Ri)
Decompose Ri into:
R1 = A union B (attributes of A and B together)
R2 = Ri - B (original schema minus B)
Replace Ri in result with {R1, R2}
Project FDs onto R1 and R2.
Return result
Example:
R(A, B, C, D)
FDs: A -> B, B -> C, C -> D, D -> A
All of {A}, {B}, {C}, {D} are candidate keys (all determine each other cyclically).
Every attribute is prime. Every FD has a superkey on LHS. R is already in BCNF.
Harder example:
R(A, B, C) FDs: AB -> C, C -> A
CKs: AB (AB+ = {A,B,C}), BC (BC+ = {B,A,C})
C -> A: Is C a superkey? C+ = {C,A}, not all. VIOLATION.
Decompose:
R1(C, A) with FD C -> A. C is superkey of R1. BCNF.
R2(A, B, C) - A = (B, C) wait: R2 = R - {A} but keep A through C.
Actually: R2 = {A, B, C} - {A} + {C} = {B, C}?
Standard: R2 = (attributes of R) - (B - A) = R - {A} + {C} = {B,C} union {C}
Correct decomposition: R1 = CK FD + determined = (C,A), R2 = (C,B)
Check lossless: R1 ∩ R2 = {C}. C -> A holds in R1. LOSSLESS.
Q22. What is 4NF? What are multivalued dependencies? Hard
Multivalued Dependency (MVD): A -> ->> B means: for every value of A, the set of B values is independent of the other attributes in the relation.
CourseTextbook(Course, Textbook, Instructor)
Each course has a set of textbooks and a set of instructors, INDEPENDENTLY.
Course ->> Textbook (set of textbooks for a course independent of instructor)
Course ->> Instructor
Problem with MVDs:
Data:
(DBMS, Book1, Prof.Kumar)
(DBMS, Book1, Prof.Mehta)
(DBMS, Book2, Prof.Kumar)
(DBMS, Book2, Prof.Mehta) <- spurious! Must add this just to maintain consistency.
Adding a new textbook (Book3): must add 2 rows (one per instructor).
Adding a new instructor: must add 2 rows (one per textbook).
4NF definition: A relation is in 4NF if for every non-trivial MVD A ->> B, A is a superkey.
Fix:
CoursBook(Course, Textbook) -- captures Course ->> Textbook
CourseInstructor(Course, Instructor) -- captures Course ->> Instructor
Now adding a textbook is 1 row. Adding an instructor is 1 row. No spurious combinations.
Q23. What is 5NF (PJNF)? Hard
5NF (Fifth Normal Form), also called Project-Join Normal Form (PJNF), addresses join dependencies.
A relation is in 5NF if every join dependency in R is a consequence of the candidate keys.
Join dependency: R satisfies JD(R1, R2, R3) if R = R1 ⋈ R2 ⋈ R3 (R can be reconstructed from three projections).
When 4NF is not enough:
Agent(Agent, Company, Product)
Meaning: Agent sells Product for Company.
FDs/MVDs: none that decompose cleanly.
But there is a join dependency:
R can be decomposed into three projections:
R1(Agent, Company)
R2(Agent, Product)
R3(Company, Product)
And R1 ⋈ R2 ⋈ R3 = R (exactly, no spurious tuples)
if the business rule holds: if Agent sells for Company, Agent sells Product, and Company sells Product,
then Agent sells that Product for that Company.
5NF says: Only decompose into 3+ relations (join dependency) if the dependency is forced by candidate keys. Otherwise, the relation is already in 5NF.
In practice: 5NF is rarely needed. Most real-world schemas stop at BCNF or 3NF. 5NF is a theoretical concern and may appear in advanced database theory exams.
Q24. Why do most production databases use 3NF rather than BCNF? Medium
Reasons to stop at 3NF in production:
-
Dependency preservation: 3NF decomposition always achieves both lossless-join AND dependency preservation. BCNF guarantees lossless-join but may lose some FDs. Losing FDs means constraints that were enforced by the schema now require triggers or application logic.
-
No practical benefit beyond 3NF for most schemas: Most business schemas do not have the overlapping candidate keys pattern that makes 3NF != BCNF. For the vast majority of tables, 3NF and BCNF are the same.
-
Join cost: BCNF decomposition can create more tables, requiring more joins in queries. At high volume, this matters.
-
ORM compatibility: Most ORMs (Hibernate, SQLAlchemy) work better with fewer, larger tables that correspond to domain objects.
When BCNF is worth it:
- Tables with multiple overlapping candidate keys and update-heavy workloads.
- When the violation of BCNF causes real update anomalies in measured production usage.
Practical Design
Q25. How do you normalize this table? (A complete worked example) Medium
Given table:
LibraryRecord(MemberID, MemberName, BookID, BookTitle, AuthorID, AuthorName, BorrowDate, ReturnDate)
Assume:
- One member can borrow one copy of a book at a time.
- PK: (MemberID, BookID, BorrowDate)
- FDs: MemberID -> MemberName
BookID -> BookTitle, AuthorID
AuthorID -> AuthorName
MemberID, BookID, BorrowDate -> ReturnDate
Check 1NF: All atomic. No repeating groups. OK.
Check 2NF: Composite PK = (MemberID, BookID, BorrowDate). Partial deps: MemberID -> MemberName (partial on MemberID alone). BookID -> BookTitle, AuthorID (partial on BookID alone). Violation: NOT in 2NF.
Decompose to 2NF:
Member(MemberID, MemberName)
Book(BookID, BookTitle, AuthorID)
Borrow(MemberID, BookID, BorrowDate, ReturnDate)
Check 3NF on Book: BookID -> AuthorID -> AuthorName. Transitive dependency. NOT in 3NF.
Decompose to 3NF:
Member(MemberID, MemberName) -- PK: MemberID
Author(AuthorID, AuthorName) -- PK: AuthorID
Book(BookID, BookTitle, AuthorID) -- PK: BookID, FK->Author
Borrow(MemberID, BookID, BorrowDate, ReturnDate) -- PK: (MemberID, BookID, BorrowDate)
No partial, no transitive dependencies. All in 3NF.
Q26. What SQL constraints enforce normalization rules? Medium
While normalization is a schema design principle, SQL constraints enforce similar rules at the data level:
| Normalization concept | SQL enforcement |
|---|---|
| Primary key (no duplicates, no NULLs) | PRIMARY KEY (col1, col2) |
| Candidate key | UNIQUE (col1) or UNIQUE (col1, col2) |
| Foreign key (referential integrity) | FOREIGN KEY (col) REFERENCES table(col) |
| Not NULL (required attribute) | NOT NULL constraint |
| Check constraint (domain constraint) | CHECK (salary > 0) |
| Functional dependency enforcement | Materialized via normalization + FK + UNIQUE; no direct SQL for arbitrary FDs |
Example: Enforcing 3NF via schema + constraints:
CREATE TABLE Department (
DeptID INT PRIMARY KEY,
DeptName VARCHAR(100) NOT NULL UNIQUE, -- UNIQUE enforces DeptName -> DeptID direction
Location VARCHAR(100)
);
CREATE TABLE Employee (
EmpID INT PRIMARY KEY,
Name VARCHAR(100) NOT NULL,
DeptID INT REFERENCES Department(DeptID) -- FK enforces referential integrity
);
The FK ensures no employee references a non-existent department. The schema structure ensures DeptName/Location stored once per department (3NF property).
Q27. What is denormalization via materialized views? Hard
A materialized view is a pre-computed, physically stored snapshot of a query result. It is the controlled form of denormalization: redundant data is stored but automatically maintained by the database.
-- Normalized base tables
CREATE TABLE Orders(OrderID, CustomerID, OrderDate, Amount);
CREATE TABLE Customer(CustomerID, Name, City);
-- Materialized view (denormalized for reporting)
CREATE MATERIALIZED VIEW OrderSummary AS
SELECT o.OrderID, c.Name, c.City, o.OrderDate, o.Amount
FROM Orders o JOIN Customer c ON o.CustomerID = c.CustomerID;
-- Usage: fast reads without JOIN
SELECT * FROM OrderSummary WHERE City = 'Mumbai';
Refresh modes:
ON COMMIT: Updated whenever base tables change. Always fresh but adds write overhead.ON DEMAND / COMPLETE: Manually refreshed. Fast writes, stale reads between refreshes.FAST (incremental): Only changed rows propagated. Requires materialized view log.
Used in: OLAP/data warehouses (star schema dimension tables), high-traffic reporting dashboards, PostgreSQL materialized views, Oracle MVs, Redshift/BigQuery equivalent pre-aggregated tables.
Q28. Common normalization interview traps and how to avoid them. Medium
Trap 1: Assuming the primary key is the only candidate key. Always find ALL candidate keys before checking 2NF and 3NF. Partial and transitive dependencies must be checked against ALL candidate keys.
Trap 2: Confusing 3NF and BCNF. 3NF: non-prime attributes can be determined by non-superkey if the determined attribute is prime. BCNF: NO non-trivial FD's LHS can be a non-superkey. Period.
Trap 3: Assuming BCNF decomposition is always better. BCNF may lose dependency-preservation. For exam questions: state whether your BCNF decomposition is dependency-preserving or not.
Trap 4: Forgetting to check lossless-join. Every decomposition must be lossless-join. Always verify: intersection of decomposed schemas is a superkey of at least one schema.
Trap 5: Applying NF rules to the wrong level. Normalization applies to the logical schema (table definitions + FDs), not to individual data values.
Quick checklist before answering a normalization question:
1. Find ALL candidate keys (compute closure of every subset).
2. Mark prime vs non-prime attributes.
3. List all non-trivial FDs.
4. Check 2NF: any non-prime attribute partially dependent on a CK?
5. Check 3NF: any non-prime attribute transitively dependent? Any FD violation?
6. Check BCNF: any FD where LHS is not a superkey?
7. Decompose and verify lossless-join.
FAQ
Q: What is the maximum number of times you need to decompose to reach 3NF? Each decomposition reduces at least one violation. In the worst case, you perform n-1 decompositions for n attributes. In practice, it takes 2-4 steps for typical schemas.
Q: Is a relation with a single attribute always in BCNF? Yes. Any FD involving a single attribute is trivial or has the attribute itself as the superkey.
Q: Can a relation be in BCNF but still have update anomalies? Technically, BCNF eliminates redundancy from FDs. But if there are multivalued dependencies (4NF violations), anomalies can still exist. BCNF only covers FD-based redundancy.
Related PapersAdda guides:
Methodology applied to this articlelast verified 8 Jun 2026
- No fabricated salary numbers or success rates. If we quote a range, it's sourced.
- No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
- No paid placements, sponsored coaching links, or affiliate-shilled course pushes.
Explore this topic cluster
More resources in Interview Questions
Use the category hub to browse similar questions, exam patterns, salary guides, and preparation resources related to this topic.
Paid contributor programme
Sat this this year? Share your story, earn ₹500.
First-person experience reports help future candidates prep smarter. We pay verified contributors ₹500 via UPI per accepted story - with byline.
Submit your story →Ready to practice?
Take a free timed mock test
Put what you learned into practice. Our mock tests match the 2026 pattern with timer, navigator, reveal, and score breakdown. No signup.
Start Free Mock Test →More from PapersAdda
Accenture Interview Questions 2026 (with Answers for Freshers)
Capgemini Interview Questions 2026 (with Answers for Freshers)
HCLTech Interview Questions 2026 (TechBee + TGT, with Answers)
IBM Interview Questions 2026 (with Answers for Freshers)