Union-Find (Disjoint Set Union) Questions 2026: 14+ Problems [Solved]

What changed in 2026 drives
Mass-recruiter offer letters are flatter for 2026 batch - the 4-5 LPA ASE band has barely budged in three years while inflation eats real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry - it is the entire interview.
What I'd actually study for this
- 01Two solid coding-round answers (1 medium-hard DSA each, with edge-case discussion) > five half-baked ones
- 02One real project you can defend end-to-end - file paths, design decisions, and what you would change
- 03One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
- 04Three behavioural STAR stories: failure recovered, conflict handled, ownership taken
Where most candidates trip up
The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.
Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.
Last Updated: June 2026
Why Union-Find is a Must-Know Pattern
Candidates report Union-Find appearing in roughly 10% of FAANG graph rounds and more frequently in competitive programming contexts. Based on public preparation resources and candidate-reported interview threads, problems involving dynamic connectivity, cycle detection in undirected graphs, and Kruskal's MST require Union-Find for the optimal solution.
Union-Find is also the clean alternative to DFS for connected components when you need to process edges one by one (streaming or online setting).
The Data Structure
class UnionFind:
def __init__(self, n):
self.parent = list(range(n)) # each node is its own parent
self.rank = [0] * n # tree height estimate
def find(self, x):
"""
Find root of x with path compression.
Time: O(alpha(n)) amortized - effectively O(1)
"""
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # path compression
return self.parent[x]
def union(self, x, y):
"""
Merge sets containing x and y. Union by rank.
Returns True if they were in different sets (new connection).
Time: O(alpha(n)) amortized
"""
px, py = self.find(x), self.find(y)
if px == py:
return False # already same set
# Union by rank: attach smaller tree under larger
if self.rank[px] < self.rank[py]:
px, py = py, px
self.parent[py] = px
if self.rank[px] == self.rank[py]:
self.rank[px] += 1
return True
def connected(self, x, y):
return self.find(x) == self.find(y)
Why path compression + union by rank? Without them, find() can be O(n) in the worst case (degenerate tree). Path compression flattens the tree after each traversal. Union by rank ensures the tree never gets deep. Together, they give amortized O(alpha(n)) per operation, which is practically constant for all real inputs.
Practice Questions with Solutions
Problem 1: Number of Provinces (MEDIUM)
Asked at: Amazon, Google, Microsoft - standard connected components
def find_circle_num_uf(is_connected):
"""
Union all directly connected cities. Count distinct roots.
Time: O(n^2 * alpha(n)) Space: O(n)
"""
n = len(is_connected)
uf = UnionFind(n)
for i in range(n):
for j in range(i + 1, n):
if is_connected[i][j] == 1:
uf.union(i, j)
return len({uf.find(i) for i in range(n)})
# Example
print(find_circle_num_uf([[1,1,0],[1,1,0],[0,0,1]])) # 2
Complexity: Time O(n^2), Space O(n)
Problem 2: Number of Islands II (Dynamic) (HARD)
Asked at: Google, Amazon, Facebook - dynamic version
Problem: Start with all water. For each land addition, return current island count.
def num_islands_ii(m, n, positions):
"""
Union-Find: add land cell, union with adjacent land neighbors.
Track count of components dynamically.
Time: O(k * alpha(m*n)) where k = number of positions Space: O(m*n)
"""
parent = [-1] * (m * n)
rank = [0] * (m * n)
count = 0
def find(x):
if parent[x] != x:
parent[x] = find(parent[x])
return parent[x]
def union(x, y):
nonlocal count
px, py = find(x), find(y)
if px == py:
return
if rank[px] < rank[py]:
px, py = py, px
parent[py] = px
if rank[px] == rank[py]:
rank[px] += 1
count -= 1
result = []
for r, c in positions:
idx = r * n + c
if parent[idx] != -1:
result.append(count)
continue
parent[idx] = idx
count += 1
for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
nr, nc = r + dr, c + dc
nidx = nr * n + nc
if 0 <= nr < m and 0 <= nc < n and parent[nidx] != -1:
union(idx, nidx)
result.append(count)
return result
# Example
print(num_islands_ii(3, 3, [[0,0],[0,1],[1,2],[2,1],[1,1]]))
# [1, 1, 2, 3, 1]
Complexity: Time O(k * alpha(mn)), Space O(mn)
Problem 3: Redundant Connection (MEDIUM)
Asked at: Amazon, Google, Microsoft
Problem: Find the edge that, when removed, makes the graph a tree (no cycle). Return last such edge.
def find_redundant_connection(edges):
"""
Process edges one by one with Union-Find.
The first edge connecting already-connected nodes creates a cycle.
Time: O(n * alpha(n)) Space: O(n)
"""
n = len(edges)
uf = UnionFind(n + 1)
for u, v in edges:
if not uf.union(u, v):
return [u, v]
return []
# Example
print(find_redundant_connection([[1,2],[1,3],[2,3]])) # [2,3]
print(find_redundant_connection([[1,2],[2,3],[3,4],[1,4],[1,5]])) # [1,4]
Complexity: Time O(n * alpha(n)), Space O(n)
Problem 4: Accounts Merge (MEDIUM)
Asked at: Amazon, Google, Facebook - string + Union-Find
Problem: Given list of accounts, each with name and emails, merge accounts sharing an email.
def accounts_merge(accounts):
"""
Union all emails within the same account.
Then group emails by root, sort, and attach name.
Time: O(n * k * alpha(n*k)) where k = avg emails per account
Space: O(n * k)
"""
email_to_id = {}
email_to_name = {}
id_counter = [0]
def get_id(email, name):
if email not in email_to_id:
email_to_id[email] = id_counter[0]
email_to_name[email] = name
id_counter[0] += 1
return email_to_id[email]
n = len(accounts) * 20 # upper bound on unique emails
uf = UnionFind(n + 1)
for account in accounts:
name = account[0]
first_id = get_id(account[1], name)
for email in account[2:]:
uf.union(first_id, get_id(email, name))
from collections import defaultdict
groups = defaultdict(list)
for email, eid in email_to_id.items():
groups[uf.find(eid)].append(email)
result = []
for root, emails in groups.items():
# get name from any email in this group
name = email_to_name[emails[0]]
result.append([name] + sorted(emails))
return result
Complexity: Time O(n * k * alpha(nk)), Space O(nk)
Problem 5: Smallest String With Swaps (MEDIUM)
Asked at: Amazon, Google
Problem: Given string and allowed swap pairs (indices), find lexicographically smallest string.
def smallest_string_with_swaps(s, pairs):
"""
Union all swap pairs. Within each component, sort characters.
Time: O(n log n) Space: O(n)
"""
from collections import defaultdict
n = len(s)
uf = UnionFind(n)
for a, b in pairs:
uf.union(a, b)
# Group indices by root
groups = defaultdict(list)
for i in range(n):
groups[uf.find(i)].append(i)
result = list(s)
for indices in groups.values():
chars = sorted(s[i] for i in indices)
for i, idx in enumerate(sorted(indices)):
result[idx] = chars[i]
return ''.join(result)
# Example
print(smallest_string_with_swaps("dcab", [[0,3],[1,2]])) # "bacd"
print(smallest_string_with_swaps("dcab", [[0,3],[1,2],[0,2]])) # "abcd"
Complexity: Time O(n log n), Space O(n)
Problem 6: Satisfiability of Equality Equations (MEDIUM)
Asked at: Amazon, Google
def equations_possible(equations):
"""
First pass: union all == pairs.
Second pass: for != pairs, if both variables have same root, contradiction.
Time: O(n) Space: O(26) = O(1)
"""
uf = UnionFind(26)
for eq in equations:
if eq[1] == '=':
uf.union(ord(eq[0]) - ord('a'), ord(eq[3]) - ord('a'))
for eq in equations:
if eq[1] == '!':
if uf.connected(ord(eq[0]) - ord('a'), ord(eq[3]) - ord('a')):
return False
return True
# Example
print(equations_possible(["a==b","b!=a"])) # False
print(equations_possible(["b==a","a==b"])) # True
Complexity: Time O(n), Space O(1)
Problem 7: Kruskal's Minimum Spanning Tree (MEDIUM)
Asked at: Amazon, Google, Goldman Sachs - graph algorithm
def kruskal_mst(n, edges):
"""
Sort edges by weight. Add edge if it connects two different components.
Stop when n-1 edges added.
Time: O(E log E) Space: O(n)
"""
edges.sort(key=lambda x: x[2]) # sort by weight
uf = UnionFind(n)
mst_weight = 0
mst_edges = []
for u, v, w in edges:
if uf.union(u, v):
mst_weight += w
mst_edges.append((u, v, w))
if len(mst_edges) == n - 1:
break
return mst_weight, mst_edges
# Example: n=4, edges: (0,1,10),(0,2,6),(0,3,5),(1,3,15),(2,3,4)
weight, mst = kruskal_mst(4, [[0,1,10],[0,2,6],[0,3,5],[1,3,15],[2,3,4]])
print(weight) # 19 (edges: 2-3 weight 4, 0-3 weight 5, 0-1 weight 10)
Complexity: Time O(E log E), Space O(n)
Problem 8: Making a Large Island (HARD)
Asked at: Google, Amazon
def largest_island(grid):
"""
Label each island with size using Union-Find.
For each 0 cell, try connecting adjacent islands: sum their sizes + 1.
Time: O(n^2) Space: O(n^2)
"""
n = len(grid)
uf = UnionFind(n * n)
for r in range(n):
for c in range(n):
if grid[r][c] == 1:
for dr, dc in [(0,1),(1,0)]:
nr, nc = r + dr, c + dc
if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] == 1:
uf.union(r * n + c, nr * n + nc)
# Compute size of each component
from collections import defaultdict
comp_size = defaultdict(int)
for r in range(n):
for c in range(n):
if grid[r][c] == 1:
comp_size[uf.find(r * n + c)] += 1
best = max(comp_size.values(), default=0)
# Try flipping each 0
for r in range(n):
for c in range(n):
if grid[r][c] == 0:
seen = set()
size = 1
for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
nr, nc = r + dr, c + dc
if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] == 1:
root = uf.find(nr * n + nc)
if root not in seen:
seen.add(root)
size += comp_size[root]
best = max(best, size)
return best
# Example
print(largest_island([[1,0],[0,1]])) # 3
print(largest_island([[1,1],[1,0]])) # 4
Complexity: Time O(n^2), Space O(n^2)
Problem 9: Graph Valid Tree (MEDIUM)
Asked at: Google, Amazon, Facebook
Problem: Given n nodes and a list of undirected edges, determine whether they form a valid tree (fully connected and acyclic).
def valid_tree(n, edges):
"""
A valid tree on n nodes has exactly n-1 edges and no cycle.
Union edges; a union that finds two nodes already connected
means a cycle. End with exactly one component.
Time: O(n * alpha(n)) Space: O(n)
"""
if len(edges) != n - 1:
return False # too many or too few edges
uf = UnionFind(n)
for u, v in edges:
if not uf.union(u, v): # already connected -> cycle
return False
return True # n-1 edges, no cycle -> connected tree
# Example
print(valid_tree(5, [[0,1],[0,2],[0,3],[1,4]])) # True
print(valid_tree(5, [[0,1],[1,2],[2,3],[1,3],[1,4]])) # False (cycle)
Key shortcut: the edge-count check (exactly n minus 1 edges) does most of the work. With that count guaranteed, a single cycle would imply a disconnected component elsewhere, so verifying no-cycle is sufficient for tree-ness.
Complexity: Time O(n * alpha(n)), Space O(n)
Problem 10: Number of Operations to Make Network Connected (MEDIUM)
Asked at: Amazon, Google
Problem: Given n computers and connections (cables), return the minimum cables to move to connect all computers, or -1 if impossible.
def make_connected(n, connections):
"""
To connect c components you need c-1 cables. You can only move
a cable that is redundant (connects already-connected nodes).
If spare cables >= components - 1, answer is components - 1.
Time: O(E * alpha(n)) Space: O(n)
"""
if len(connections) < n - 1:
return -1 # not enough cables to ever connect n nodes
uf = UnionFind(n)
for u, v in connections:
uf.union(u, v)
components = len({uf.find(i) for i in range(n)})
return components - 1
# Example
print(make_connected(4, [[0,1],[0,2],[1,2]])) # 1
print(make_connected(6, [[0,1],[0,2],[0,3],[1,2],[1,3]])) # 2
The counting argument: any spanning of c components needs at least c minus 1 connecting cables. As long as you have at least that many redundant cables (guaranteed when total cables is at least n minus 1), the answer is simply components minus 1.
Complexity: Time O(E * alpha(n)), Space O(n)
Problem 11: Most Stones Removed with Same Row or Column (MEDIUM)
Asked at: Amazon, Google
Problem: Stones on a grid. You may remove a stone if it shares a row or column with another remaining stone. Return the maximum stones removable.
def remove_stones(stones):
"""
Union stones sharing a row or column. Within each connected
component of size k, you can remove k-1 stones (leaving one).
Answer = total stones - number of components.
Time: O(n * alpha(n)) Space: O(n)
"""
parent = {}
def find(x):
parent.setdefault(x, x)
if parent[x] != x:
parent[x] = find(parent[x])
return parent[x]
def union(a, b):
parent[find(a)] = find(b)
for r, c in stones:
# union the row-key and column-key namespaces
union(("r", r), ("c", c))
roots = {find(("r", r)) for r, c in stones}
return len(stones) - len(roots)
# Example
print(remove_stones([[0,0],[0,1],[1,0],[1,2],[2,1],[2,2]])) # 5
The component insight: each connected component of k stones collapses to a single survivor, so you remove k minus 1 from each. Summed across components, removable equals total stones minus component count. Modeling rows and columns as separate node namespaces is the clever step.
Complexity: Time O(n * alpha(n)), Space O(n)
Why Path Compression and Union by Rank Both Matter
Candidates often implement only one optimization and assume it is enough. Both are needed to reach the near-constant inverse-Ackermann bound, and an interviewer may probe why.
Union by rank (or by size) keeps trees shallow during merges by always attaching the shorter tree under the taller one. Without it, a sequence of unions can build a degenerate linked-list-shaped tree of height n, making each find O(n). Union by rank alone caps tree height at O(log n).
Path compression flattens the tree during find by repointing every node on the path directly to the root. Without it, even balanced trees pay O(log n) per find repeatedly. Path compression alone also gives strong amortized bounds.
Together, the two optimizations yield O(alpha(n)) amortized per operation, where alpha is the inverse Ackermann function, which is at most 4 for any input you will ever see. The takeaway to state: union by rank controls how trees grow, path compression repairs them as you query, and the combination is what makes union-find effectively constant-time.
How to Recognize a Union-Find Problem
The pattern has a tight signature. The strongest signal is a question about connectivity, grouping, or components: how many groups, are these two connected, which items belong together. Number of Provinces, Accounts Merge, and Most Stones Removed are all "group the related things and count or query the groups."
A second signal is a dynamic graph where edges arrive over time and you must answer connectivity after each addition. DFS and BFS would re-scan the whole graph per query, but union-find processes each new edge in near-constant time and maintains the component structure online. Number of Islands II is the canonical dynamic case.
A third signal is cycle detection in an undirected graph or any "find the redundant edge" framing. When you union two endpoints and discover they already share a root, you have found the edge that closes a cycle. Redundant Connection and Graph Valid Tree both reduce to this single check.
A fourth signal is a minimum spanning tree request, where Kruskal's algorithm sorts edges by weight and uses union-find to skip any edge that would form a cycle.
If the graph is directed and you need strongly connected components, union-find does not apply; reach for Tarjan's or Kosaraju's DFS instead. Union-find models undirected, symmetric "same group" relationships.
Union-Find vs DFS vs BFS for Connectivity
| Use case | Best choice | Why |
|---|---|---|
| Static graph, single connectivity check | DFS or BFS | Simpler to implement |
| Dynamic graph (edges added over time) | Union-Find | O(alpha(n)) per edge, online |
| Cycle detection in undirected graph | Union-Find | Clean: if union returns false, cycle found |
| Minimum spanning tree | Union-Find (Kruskal's) | Pairs naturally with sorted edge list |
| Strongly connected components (directed) | DFS (Tarjan/Kosaraju) | Union-Find is for undirected only |
Path Compression Without Recursion
def find_iterative(self, x):
"""
Iterative path compression - avoids recursion limit on very deep trees.
"""
root = x
while self.parent[root] != root:
root = self.parent[root]
# Path compression: make all nodes point to root
while self.parent[x] != root:
next_x = self.parent[x]
self.parent[x] = root
x = next_x
return root
Complexity Summary
| Operation | Without optimization | With path compression + union by rank |
|---|---|---|
| find() | O(n) worst | O(alpha(n)) amortized |
| union() | O(n) worst | O(alpha(n)) amortized |
| n operations | O(n^2) worst | O(n * alpha(n)) effectively O(n) |
alpha(n) is the inverse Ackermann function. For all practical purposes (n < 10^600), alpha(n) <= 4.
Related Articles
Methodology applied to this articlelast verified 8 Jun 2026
- No fabricated salary numbers or success rates. If we quote a range, it's sourced.
- No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
- No paid placements, sponsored coaching links, or affiliate-shilled course pushes.
Explore this topic cluster
More resources in Uncategorized
Use the category hub to browse similar questions, exam patterns, salary guides, and preparation resources related to this topic.
Paid contributor programme
Sat this this year? Share your story, earn ₹500.
First-person experience reports help future candidates prep smarter. We pay verified contributors ₹500 via UPI per accepted story - with byline.
Submit your story →Ready to practice?
Take a free timed mock test
Put what you learned into practice. Our mock tests match the 2026 pattern with timer, navigator, reveal, and score breakdown. No signup.
Start Free Mock Test →More from PapersAdda
LeetCode Questions for TCS 2026: 50 Problems [Solved]
Accenture Coding Assessment 2026: 2 Problems, 45 Minutes, The Silent Filter
Accenture Placement Papers 2026: Cognitive + Coding [Solved]
Axis Bank Placement Papers 2026: 50+ Questions [Solved]