placement brief / Topics & Practice / brief / 08 Jun 2026

Union-Find (Disjoint Set Union) Questions 2026: 14+ Problems [Solved]

Q: What is Union-Find (Disjoint Set Union)?

Union-Find maintains a collection of disjoint sets with two operations: find(x) returns the representative (root) of x's set, and union(x, y) merges the sets containing x and y. With path compression and union by rank, both operations run in near-constant time O(alpha(n)) where alpha is the inverse Ackermann function.

Q: When should I use Union-Find vs DFS for connectivity problems?

Use Union-Find when you need to process edges one by one and dynamically maintain connectivity (online queries). Use DFS/BFS when the full graph is given upfront and you need a single connectivity check. Union-Find is also preferred when detecting cycle during Kruskal's MST algorithm.

Q: What are path compression and union by rank?

Path compression: when finding the root, make every node on the path point directly to the root, flattening the tree. Union by rank: always attach the shorter tree under the taller tree to keep trees balanced. Together, they give amortized O(alpha(n)) per operation, effectively O(1) in practice.

Candidates report Union-Find appearing in roughly 10% of FAANG graph rounds and more frequently in competitive programming contexts. Based on public...

By Aditya SharmaPublished 8 Jun 20262 sources listedSpot an error? Corrections open

6 min read last revised 8 Jun 2026

on this page§ 10

Last Updated: June 2026

Why Union-Find is a Must-Know Pattern

Candidates report Union-Find appearing in roughly 10% of FAANG graph rounds and more frequently in competitive programming contexts. Based on public preparation resources and candidate-reported interview threads, problems involving dynamic connectivity, cycle detection in undirected graphs, and Kruskal's MST require Union-Find for the optimal solution.

Union-Find is also the clean alternative to DFS for connected components when you need to process edges one by one (streaming or online setting).

The Data Structure

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))   # each node is its own parent
        self.rank = [0] * n            # tree height estimate

    def find(self, x):
        """
        Find root of x with path compression.
        Time: O(alpha(n)) amortized - effectively O(1)
        """
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        """
        Merge sets containing x and y. Union by rank.
        Returns True if they were in different sets (new connection).
        Time: O(alpha(n)) amortized
        """
        px, py = self.find(x), self.find(y)

        if px == py:
            return False  # already same set

        # Union by rank: attach smaller tree under larger
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1

        return True

    def connected(self, x, y):
        return self.find(x) == self.find(y)

Why path compression + union by rank? Without them, find() can be O(n) in the worst case (degenerate tree). Path compression flattens the tree after each traversal. Union by rank ensures the tree never gets deep. Together, they give amortized O(alpha(n)) per operation, which is practically constant for all real inputs.

Practice Questions with Solutions

Problem 1: Number of Provinces (MEDIUM)

Asked at: Amazon, Google, Microsoft - standard connected components

def find_circle_num_uf(is_connected):
    """
    Union all directly connected cities. Count distinct roots.
    Time: O(n^2 * alpha(n))  Space: O(n)
    """
    n = len(is_connected)
    uf = UnionFind(n)

    for i in range(n):
        for j in range(i + 1, n):
            if is_connected[i][j] == 1:
                uf.union(i, j)

    return len({uf.find(i) for i in range(n)})

# Example
print(find_circle_num_uf([[1,1,0],[1,1,0],[0,0,1]]))  # 2

Complexity: Time O(n^2), Space O(n)

Problem 2: Number of Islands II (Dynamic) (HARD)

Asked at: Google, Amazon, Facebook - dynamic version

Problem: Start with all water. For each land addition, return current island count.

def num_islands_ii(m, n, positions):
    """
    Union-Find: add land cell, union with adjacent land neighbors.
    Track count of components dynamically.
    Time: O(k * alpha(m*n)) where k = number of positions  Space: O(m*n)
    """
    parent = [-1] * (m * n)
    rank = [0] * (m * n)
    count = 0

    def find(x):
        if parent[x] != x:
            parent[x] = find(parent[x])
        return parent[x]

    def union(x, y):
        nonlocal count
        px, py = find(x), find(y)
        if px == py:
            return
        if rank[px] < rank[py]:
            px, py = py, px
        parent[py] = px
        if rank[px] == rank[py]:
            rank[px] += 1
        count -= 1

    result = []

    for r, c in positions:
        idx = r * n + c
        if parent[idx] != -1:
            result.append(count)
            continue

        parent[idx] = idx
        count += 1

        for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
            nr, nc = r + dr, c + dc
            nidx = nr * n + nc
            if 0 <= nr < m and 0 <= nc < n and parent[nidx] != -1:
                union(idx, nidx)

        result.append(count)

    return result

# Example
print(num_islands_ii(3, 3, [[0,0],[0,1],[1,2],[2,1],[1,1]]))
# [1, 1, 2, 3, 1]

Complexity: Time O(k * alpha(mn)), Space O(mn)

Problem 3: Redundant Connection (MEDIUM)

Asked at: Amazon, Google, Microsoft

Problem: Find the edge that, when removed, makes the graph a tree (no cycle). Return last such edge.

def find_redundant_connection(edges):
    """
    Process edges one by one with Union-Find.
    The first edge connecting already-connected nodes creates a cycle.
    Time: O(n * alpha(n))  Space: O(n)
    """
    n = len(edges)
    uf = UnionFind(n + 1)

    for u, v in edges:
        if not uf.union(u, v):
            return [u, v]

    return []

# Example
print(find_redundant_connection([[1,2],[1,3],[2,3]]))  # [2,3]
print(find_redundant_connection([[1,2],[2,3],[3,4],[1,4],[1,5]]))  # [1,4]

Complexity: Time O(n * alpha(n)), Space O(n)

Problem 4: Accounts Merge (MEDIUM)

Asked at: Amazon, Google, Facebook - string + Union-Find

Problem: Given list of accounts, each with name and emails, merge accounts sharing an email.

def accounts_merge(accounts):
    """
    Union all emails within the same account.
    Then group emails by root, sort, and attach name.
    Time: O(n * k * alpha(n*k)) where k = avg emails per account
    Space: O(n * k)
    """
    email_to_id = {}
    email_to_name = {}
    id_counter = [0]

    def get_id(email, name):
        if email not in email_to_id:
            email_to_id[email] = id_counter[0]
            email_to_name[email] = name
            id_counter[0] += 1
        return email_to_id[email]

    n = len(accounts) * 20  # upper bound on unique emails
    uf = UnionFind(n + 1)

    for account in accounts:
        name = account[0]
        first_id = get_id(account[1], name)
        for email in account[2:]:
            uf.union(first_id, get_id(email, name))

    from collections import defaultdict
    groups = defaultdict(list)
    for email, eid in email_to_id.items():
        groups[uf.find(eid)].append(email)

    result = []
    for root, emails in groups.items():
        # get name from any email in this group
        name = email_to_name[emails[0]]
        result.append([name] + sorted(emails))

    return result

Complexity: Time O(n * k * alpha(nk)), Space O(nk)

Problem 5: Smallest String With Swaps (MEDIUM)

Asked at: Amazon, Google

Problem: Given string and allowed swap pairs (indices), find lexicographically smallest string.

def smallest_string_with_swaps(s, pairs):
    """
    Union all swap pairs. Within each component, sort characters.
    Time: O(n log n)  Space: O(n)
    """
    from collections import defaultdict

    n = len(s)
    uf = UnionFind(n)

    for a, b in pairs:
        uf.union(a, b)

    # Group indices by root
    groups = defaultdict(list)
    for i in range(n):
        groups[uf.find(i)].append(i)

    result = list(s)

    for indices in groups.values():
        chars = sorted(s[i] for i in indices)
        for i, idx in enumerate(sorted(indices)):
            result[idx] = chars[i]

    return ''.join(result)

# Example
print(smallest_string_with_swaps("dcab", [[0,3],[1,2]]))   # "bacd"
print(smallest_string_with_swaps("dcab", [[0,3],[1,2],[0,2]]))  # "abcd"

Complexity: Time O(n log n), Space O(n)

Problem 6: Satisfiability of Equality Equations (MEDIUM)

Asked at: Amazon, Google

def equations_possible(equations):
    """
    First pass: union all == pairs.
    Second pass: for != pairs, if both variables have same root, contradiction.
    Time: O(n)  Space: O(26) = O(1)
    """
    uf = UnionFind(26)

    for eq in equations:
        if eq[1] == '=':
            uf.union(ord(eq[0]) - ord('a'), ord(eq[3]) - ord('a'))

    for eq in equations:
        if eq[1] == '!':
            if uf.connected(ord(eq[0]) - ord('a'), ord(eq[3]) - ord('a')):
                return False

    return True

# Example
print(equations_possible(["a==b","b!=a"]))  # False
print(equations_possible(["b==a","a==b"]))  # True

Complexity: Time O(n), Space O(1)

Problem 7: Kruskal's Minimum Spanning Tree (MEDIUM)

Asked at: Amazon, Google, Goldman Sachs - graph algorithm

def kruskal_mst(n, edges):
    """
    Sort edges by weight. Add edge if it connects two different components.
    Stop when n-1 edges added.
    Time: O(E log E)  Space: O(n)
    """
    edges.sort(key=lambda x: x[2])  # sort by weight
    uf = UnionFind(n)
    mst_weight = 0
    mst_edges = []

    for u, v, w in edges:
        if uf.union(u, v):
            mst_weight += w
            mst_edges.append((u, v, w))
            if len(mst_edges) == n - 1:
                break

    return mst_weight, mst_edges

# Example: n=4, edges: (0,1,10),(0,2,6),(0,3,5),(1,3,15),(2,3,4)
weight, mst = kruskal_mst(4, [[0,1,10],[0,2,6],[0,3,5],[1,3,15],[2,3,4]])
print(weight)  # 19 (edges: 2-3 weight 4, 0-3 weight 5, 0-1 weight 10)

Complexity: Time O(E log E), Space O(n)

Problem 8: Making a Large Island (HARD)

Asked at: Google, Amazon

def largest_island(grid):
    """
    Label each island with size using Union-Find.
    For each 0 cell, try connecting adjacent islands: sum their sizes + 1.
    Time: O(n^2)  Space: O(n^2)
    """
    n = len(grid)
    uf = UnionFind(n * n)

    for r in range(n):
        for c in range(n):
            if grid[r][c] == 1:
                for dr, dc in [(0,1),(1,0)]:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] == 1:
                        uf.union(r * n + c, nr * n + nc)

    # Compute size of each component
    from collections import defaultdict
    comp_size = defaultdict(int)
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 1:
                comp_size[uf.find(r * n + c)] += 1

    best = max(comp_size.values(), default=0)

    # Try flipping each 0
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 0:
                seen = set()
                size = 1
                for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] == 1:
                        root = uf.find(nr * n + nc)
                        if root not in seen:
                            seen.add(root)
                            size += comp_size[root]
                best = max(best, size)

    return best

# Example
print(largest_island([[1,0],[0,1]]))  # 3
print(largest_island([[1,1],[1,0]]))  # 4

Complexity: Time O(n^2), Space O(n^2)

Problem 9: Graph Valid Tree (MEDIUM)

Asked at: Google, Amazon, Facebook

Problem: Given n nodes and a list of undirected edges, determine whether they form a valid tree (fully connected and acyclic).

def valid_tree(n, edges):
    """
    A valid tree on n nodes has exactly n-1 edges and no cycle.
    Union edges; a union that finds two nodes already connected
    means a cycle. End with exactly one component.
    Time: O(n * alpha(n))  Space: O(n)
    """
    if len(edges) != n - 1:
        return False                  # too many or too few edges
    uf = UnionFind(n)
    for u, v in edges:
        if not uf.union(u, v):        # already connected -> cycle
            return False
    return True                       # n-1 edges, no cycle -> connected tree

# Example
print(valid_tree(5, [[0,1],[0,2],[0,3],[1,4]]))   # True
print(valid_tree(5, [[0,1],[1,2],[2,3],[1,3],[1,4]]))  # False (cycle)

Key shortcut: the edge-count check (exactly n minus 1 edges) does most of the work. With that count guaranteed, a single cycle would imply a disconnected component elsewhere, so verifying no-cycle is sufficient for tree-ness.

Complexity: Time O(n * alpha(n)), Space O(n)

Problem 10: Number of Operations to Make Network Connected (MEDIUM)

Asked at: Amazon, Google

Problem: Given n computers and connections (cables), return the minimum cables to move to connect all computers, or -1 if impossible.

def make_connected(n, connections):
    """
    To connect c components you need c-1 cables. You can only move
    a cable that is redundant (connects already-connected nodes).
    If spare cables >= components - 1, answer is components - 1.
    Time: O(E * alpha(n))  Space: O(n)
    """
    if len(connections) < n - 1:
        return -1                     # not enough cables to ever connect n nodes

    uf = UnionFind(n)
    for u, v in connections:
        uf.union(u, v)

    components = len({uf.find(i) for i in range(n)})
    return components - 1

# Example
print(make_connected(4, [[0,1],[0,2],[1,2]]))        # 1
print(make_connected(6, [[0,1],[0,2],[0,3],[1,2],[1,3]]))  # 2

The counting argument: any spanning of c components needs at least c minus 1 connecting cables. As long as you have at least that many redundant cables (guaranteed when total cables is at least n minus 1), the answer is simply components minus 1.

Complexity: Time O(E * alpha(n)), Space O(n)

Problem 11: Most Stones Removed with Same Row or Column (MEDIUM)

Asked at: Amazon, Google

Problem: Stones on a grid. You may remove a stone if it shares a row or column with another remaining stone. Return the maximum stones removable.

def remove_stones(stones):
    """
    Union stones sharing a row or column. Within each connected
    component of size k, you can remove k-1 stones (leaving one).
    Answer = total stones - number of components.
    Time: O(n * alpha(n))  Space: O(n)
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        if parent[x] != x:
            parent[x] = find(parent[x])
        return parent[x]

    def union(a, b):
        parent[find(a)] = find(b)

    for r, c in stones:
        # union the row-key and column-key namespaces
        union(("r", r), ("c", c))

    roots = {find(("r", r)) for r, c in stones}
    return len(stones) - len(roots)

# Example
print(remove_stones([[0,0],[0,1],[1,0],[1,2],[2,1],[2,2]]))   # 5

The component insight: each connected component of k stones collapses to a single survivor, so you remove k minus 1 from each. Summed across components, removable equals total stones minus component count. Modeling rows and columns as separate node namespaces is the clever step.

Complexity: Time O(n * alpha(n)), Space O(n)

Why Path Compression and Union by Rank Both Matter

Candidates often implement only one optimization and assume it is enough. Both are needed to reach the near-constant inverse-Ackermann bound, and an interviewer may probe why.

Union by rank (or by size) keeps trees shallow during merges by always attaching the shorter tree under the taller one. Without it, a sequence of unions can build a degenerate linked-list-shaped tree of height n, making each find O(n). Union by rank alone caps tree height at O(log n).

Path compression flattens the tree during find by repointing every node on the path directly to the root. Without it, even balanced trees pay O(log n) per find repeatedly. Path compression alone also gives strong amortized bounds.

Together, the two optimizations yield O(alpha(n)) amortized per operation, where alpha is the inverse Ackermann function, which is at most 4 for any input you will ever see. The takeaway to state: union by rank controls how trees grow, path compression repairs them as you query, and the combination is what makes union-find effectively constant-time.

How to Recognize a Union-Find Problem

The pattern has a tight signature. The strongest signal is a question about connectivity, grouping, or components: how many groups, are these two connected, which items belong together. Number of Provinces, Accounts Merge, and Most Stones Removed are all "group the related things and count or query the groups."

A second signal is a dynamic graph where edges arrive over time and you must answer connectivity after each addition. DFS and BFS would re-scan the whole graph per query, but union-find processes each new edge in near-constant time and maintains the component structure online. Number of Islands II is the canonical dynamic case.

A third signal is cycle detection in an undirected graph or any "find the redundant edge" framing. When you union two endpoints and discover they already share a root, you have found the edge that closes a cycle. Redundant Connection and Graph Valid Tree both reduce to this single check.

A fourth signal is a minimum spanning tree request, where Kruskal's algorithm sorts edges by weight and uses union-find to skip any edge that would form a cycle.

If the graph is directed and you need strongly connected components, union-find does not apply; reach for Tarjan's or Kosaraju's DFS instead. Union-find models undirected, symmetric "same group" relationships.

Union-Find vs DFS vs BFS for Connectivity

Use case	Best choice	Why
Static graph, single connectivity check	DFS or BFS	Simpler to implement
Dynamic graph (edges added over time)	Union-Find	O(alpha(n)) per edge, online
Cycle detection in undirected graph	Union-Find	Clean: if union returns false, cycle found
Minimum spanning tree	Union-Find (Kruskal's)	Pairs naturally with sorted edge list
Strongly connected components (directed)	DFS (Tarjan/Kosaraju)	Union-Find is for undirected only

Path Compression Without Recursion

def find_iterative(self, x):
    """
    Iterative path compression - avoids recursion limit on very deep trees.
    """
    root = x
    while self.parent[root] != root:
        root = self.parent[root]

    # Path compression: make all nodes point to root
    while self.parent[x] != root:
        next_x = self.parent[x]
        self.parent[x] = root
        x = next_x

    return root

Complexity Summary

Operation	Without optimization	With path compression + union by rank
find()	O(n) worst	O(alpha(n)) amortized
union()	O(n) worst	O(alpha(n)) amortized
n operations	O(n^2) worst	O(n * alpha(n)) effectively O(n)

alpha(n) is the inverse Ackermann function. For all practical purposes (n < 10^600), alpha(n) <= 4.

Frequently Asked Questions

What is Union-Find (Disjoint Set Union)?

Union-Find maintains a collection of disjoint sets with two operations: find(x) returns the representative (root) of x's set, and union(x, y) merges the sets containing x and y. With path compression and union by rank, both operations run in near-constant time O(alpha(n)) where alpha is the inverse Ackermann function.

When should I use Union-Find vs DFS for connectivity problems?

Use Union-Find when you need to process edges one by one and dynamically maintain connectivity (online queries). Use DFS/BFS when the full graph is given upfront and you need a single connectivity check. Union-Find is also preferred when detecting cycle during Kruskal's MST algorithm.

What are path compression and union by rank?

Path compression: when finding the root, make every node on the path point directly to the root, flattening the tree. Union by rank: always attach the shorter tree under the taller tree to keep trees balanced. Together, they give amortized O(alpha(n)) per operation, effectively O(1) in practice.

Sources and review notesreviewed 8 Jun 2026

Article-specific sources

Verification window

Page last edited 8 Jun 2026 by Aditya Sharma. A review date records an editorial edit, not a guarantee that every external fact is still current.

Evidence labels

Official notices, candidate reports, offer documents, and editorial practice questions carry different confidence levels. The visible source list lets you inspect the evidence instead of relying on a blanket verification badge.

Verification policy: /editorial-standards/. Found something incorrect? Submit a correction - we respond within 48 hours.

topic cluster

Sat this this year? Share your story, earn ₹500.

First-person experience reports help future candidates prep smarter. We pay verified contributors ₹500 via UPI per accepted story with byline.

Submit your story →

ready to practice?

Take a free timed mock test

Put what you learned into practice. Our mock tests match the 2026 pattern with timer, navigator, reveal, and score breakdown. No signup.

Start free mock test →

related guides

Topics & Practice

Share this guide

Twitter LinkedIn W WhatsApp

Union-Find (Disjoint Set Union) Questions 2026: 14+ Problems [Solved]

Why Union-Find is a Must-Know Pattern

The Data Structure

Practice Questions with Solutions

Problem 1: Number of Provinces (MEDIUM)

Problem 2: Number of Islands II (Dynamic) (HARD)

Problem 3: Redundant Connection (MEDIUM)

Problem 4: Accounts Merge (MEDIUM)

Problem 5: Smallest String With Swaps (MEDIUM)

Problem 6: Satisfiability of Equality Equations (MEDIUM)

Problem 7: Kruskal's Minimum Spanning Tree (MEDIUM)

Problem 8: Making a Large Island (HARD)

Problem 9: Graph Valid Tree (MEDIUM)

Problem 10: Number of Operations to Make Network Connected (MEDIUM)

Problem 11: Most Stones Removed with Same Row or Column (MEDIUM)

Why Path Compression and Union by Rank Both Matter

How to Recognize a Union-Find Problem

Union-Find vs DFS vs BFS for Connectivity

Path Compression Without Recursion

Complexity Summary

Frequently Asked Questions

What is Union-Find (Disjoint Set Union)?

When should I use Union-Find vs DFS for connectivity problems?

What are path compression and union by rank?

More resources in Topics & Practice

Sat this this year? Share your story, earn ₹500.

Take a free timed mock test

Binary Search Patterns 2026: 20+ Problems [Solved]

Bit Manipulation Questions 2026: 18+ Problems [Solved]

Merge Intervals Pattern 2026: 14+ Problems [Solved]

Monotonic Stack Questions 2026: 16+ Problems [Solved]

Prefix Sum Questions 2026: 16+ Problems [Solved]

Share this guide

Union-Find (Disjoint Set Union) Questions 2026: 14+ Problems [Solved]

Why Union-Find is a Must-Know Pattern

The Data Structure

Practice Questions with Solutions

Problem 1: Number of Provinces (MEDIUM)

Problem 2: Number of Islands II (Dynamic) (HARD)

Problem 3: Redundant Connection (MEDIUM)

Problem 4: Accounts Merge (MEDIUM)

Problem 5: Smallest String With Swaps (MEDIUM)

Problem 6: Satisfiability of Equality Equations (MEDIUM)

Problem 7: Kruskal's Minimum Spanning Tree (MEDIUM)

Problem 8: Making a Large Island (HARD)

Problem 9: Graph Valid Tree (MEDIUM)

Problem 10: Number of Operations to Make Network Connected (MEDIUM)

Problem 11: Most Stones Removed with Same Row or Column (MEDIUM)

Why Path Compression and Union by Rank Both Matter

How to Recognize a Union-Find Problem

Union-Find vs DFS vs BFS for Connectivity

Path Compression Without Recursion

Complexity Summary

Related Articles

Frequently Asked Questions

What is Union-Find (Disjoint Set Union)?

When should I use Union-Find vs DFS for connectivity problems?

What are path compression and union by rank?

More resources in Topics & Practice

Sat this this year? Share your story, earn ₹500.

Take a free timed mock test

Binary Search Patterns 2026: 20+ Problems [Solved]

Bit Manipulation Questions 2026: 18+ Problems [Solved]

Merge Intervals Pattern 2026: 14+ Problems [Solved]

Monotonic Stack Questions 2026: 16+ Problems [Solved]

Prefix Sum Questions 2026: 16+ Problems [Solved]

Share this guide