issue 117apr 27mmxxvi
est. 2017
Sun, 27 Apr 2026
vol. IX · no. 117
PapersAdda
placement intelligence, since 2017
640+ briefs · 24 campuses · by reservation
verified offers · sourced from r/developersIndia
razorpay₹65.00 LPA· iit-d · sde-1google₹54.00 LPA· iiit-h · swe-imicrosoft₹49.50 LPA· iit-b · sdeatlassian₹38.00 LPA· nit-w · sde-1amazon₹44.20 LPA· bits-p · sde-1uber₹42.00 LPA· iit-kgp · sde-1razorpay₹65.00 LPA· iit-d · sde-1google₹54.00 LPA· iiit-h · swe-imicrosoft₹49.50 LPA· iit-b · sdeatlassian₹38.00 LPA· nit-w · sde-1amazon₹44.20 LPA· bits-p · sde-1uber₹42.00 LPA· iit-kgp · sde-1

Union-Find (Disjoint Set Union) Questions 2026: 14+ Problems [Solved]

14 min read
Uncategorized
Updated: 8 Jun 2026
Aditya Sharma
Aditya's Edit

PapersAdda 2026 Placement Cycle

By Aditya Sharma·Founder & Editor, PapersAdda

What changed in 2026 drives

Mass-recruiter offer letters are flatter for 2026 batch - the 4-5 LPA ASE band has barely budged in three years while inflation eats real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry - it is the entire interview.

What I'd actually study for this

  • 01Two solid coding-round answers (1 medium-hard DSA each, with edge-case discussion) > five half-baked ones
  • 02One real project you can defend end-to-end - file paths, design decisions, and what you would change
  • 03One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
  • 04Three behavioural STAR stories: failure recovered, conflict handled, ownership taken

Where most candidates trip up

The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.

Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.

Last Updated: June 2026


Why Union-Find is a Must-Know Pattern

Candidates report Union-Find appearing in roughly 10% of FAANG graph rounds and more frequently in competitive programming contexts. Based on public preparation resources and candidate-reported interview threads, problems involving dynamic connectivity, cycle detection in undirected graphs, and Kruskal's MST require Union-Find for the optimal solution.

Union-Find is also the clean alternative to DFS for connected components when you need to process edges one by one (streaming or online setting).


The Data Structure

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))   # each node is its own parent
        self.rank = [0] * n            # tree height estimate

    def find(self, x):
        """
        Find root of x with path compression.
        Time: O(alpha(n)) amortized - effectively O(1)
        """
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        """
        Merge sets containing x and y. Union by rank.
        Returns True if they were in different sets (new connection).
        Time: O(alpha(n)) amortized
        """
        px, py = self.find(x), self.find(y)

        if px == py:
            return False  # already same set

        # Union by rank: attach smaller tree under larger
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1

        return True

    def connected(self, x, y):
        return self.find(x) == self.find(y)

Why path compression + union by rank? Without them, find() can be O(n) in the worst case (degenerate tree). Path compression flattens the tree after each traversal. Union by rank ensures the tree never gets deep. Together, they give amortized O(alpha(n)) per operation, which is practically constant for all real inputs.


Practice Questions with Solutions

Problem 1: Number of Provinces (MEDIUM)

Asked at: Amazon, Google, Microsoft - standard connected components

def find_circle_num_uf(is_connected):
    """
    Union all directly connected cities. Count distinct roots.
    Time: O(n^2 * alpha(n))  Space: O(n)
    """
    n = len(is_connected)
    uf = UnionFind(n)

    for i in range(n):
        for j in range(i + 1, n):
            if is_connected[i][j] == 1:
                uf.union(i, j)

    return len({uf.find(i) for i in range(n)})

# Example
print(find_circle_num_uf([[1,1,0],[1,1,0],[0,0,1]]))  # 2

Complexity: Time O(n^2), Space O(n)


Problem 2: Number of Islands II (Dynamic) (HARD)

Asked at: Google, Amazon, Facebook - dynamic version

Problem: Start with all water. For each land addition, return current island count.

def num_islands_ii(m, n, positions):
    """
    Union-Find: add land cell, union with adjacent land neighbors.
    Track count of components dynamically.
    Time: O(k * alpha(m*n)) where k = number of positions  Space: O(m*n)
    """
    parent = [-1] * (m * n)
    rank = [0] * (m * n)
    count = 0

    def find(x):
        if parent[x] != x:
            parent[x] = find(parent[x])
        return parent[x]

    def union(x, y):
        nonlocal count
        px, py = find(x), find(y)
        if px == py:
            return
        if rank[px] < rank[py]:
            px, py = py, px
        parent[py] = px
        if rank[px] == rank[py]:
            rank[px] += 1
        count -= 1

    result = []

    for r, c in positions:
        idx = r * n + c
        if parent[idx] != -1:
            result.append(count)
            continue

        parent[idx] = idx
        count += 1

        for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
            nr, nc = r + dr, c + dc
            nidx = nr * n + nc
            if 0 <= nr < m and 0 <= nc < n and parent[nidx] != -1:
                union(idx, nidx)

        result.append(count)

    return result

# Example
print(num_islands_ii(3, 3, [[0,0],[0,1],[1,2],[2,1],[1,1]]))
# [1, 1, 2, 3, 1]

Complexity: Time O(k * alpha(mn)), Space O(mn)


Problem 3: Redundant Connection (MEDIUM)

Asked at: Amazon, Google, Microsoft

Problem: Find the edge that, when removed, makes the graph a tree (no cycle). Return last such edge.

def find_redundant_connection(edges):
    """
    Process edges one by one with Union-Find.
    The first edge connecting already-connected nodes creates a cycle.
    Time: O(n * alpha(n))  Space: O(n)
    """
    n = len(edges)
    uf = UnionFind(n + 1)

    for u, v in edges:
        if not uf.union(u, v):
            return [u, v]

    return []

# Example
print(find_redundant_connection([[1,2],[1,3],[2,3]]))  # [2,3]
print(find_redundant_connection([[1,2],[2,3],[3,4],[1,4],[1,5]]))  # [1,4]

Complexity: Time O(n * alpha(n)), Space O(n)


Problem 4: Accounts Merge (MEDIUM)

Asked at: Amazon, Google, Facebook - string + Union-Find

Problem: Given list of accounts, each with name and emails, merge accounts sharing an email.

def accounts_merge(accounts):
    """
    Union all emails within the same account.
    Then group emails by root, sort, and attach name.
    Time: O(n * k * alpha(n*k)) where k = avg emails per account
    Space: O(n * k)
    """
    email_to_id = {}
    email_to_name = {}
    id_counter = [0]

    def get_id(email, name):
        if email not in email_to_id:
            email_to_id[email] = id_counter[0]
            email_to_name[email] = name
            id_counter[0] += 1
        return email_to_id[email]

    n = len(accounts) * 20  # upper bound on unique emails
    uf = UnionFind(n + 1)

    for account in accounts:
        name = account[0]
        first_id = get_id(account[1], name)
        for email in account[2:]:
            uf.union(first_id, get_id(email, name))

    from collections import defaultdict
    groups = defaultdict(list)
    for email, eid in email_to_id.items():
        groups[uf.find(eid)].append(email)

    result = []
    for root, emails in groups.items():
        # get name from any email in this group
        name = email_to_name[emails[0]]
        result.append([name] + sorted(emails))

    return result

Complexity: Time O(n * k * alpha(nk)), Space O(nk)


Problem 5: Smallest String With Swaps (MEDIUM)

Asked at: Amazon, Google

Problem: Given string and allowed swap pairs (indices), find lexicographically smallest string.

def smallest_string_with_swaps(s, pairs):
    """
    Union all swap pairs. Within each component, sort characters.
    Time: O(n log n)  Space: O(n)
    """
    from collections import defaultdict

    n = len(s)
    uf = UnionFind(n)

    for a, b in pairs:
        uf.union(a, b)

    # Group indices by root
    groups = defaultdict(list)
    for i in range(n):
        groups[uf.find(i)].append(i)

    result = list(s)

    for indices in groups.values():
        chars = sorted(s[i] for i in indices)
        for i, idx in enumerate(sorted(indices)):
            result[idx] = chars[i]

    return ''.join(result)

# Example
print(smallest_string_with_swaps("dcab", [[0,3],[1,2]]))   # "bacd"
print(smallest_string_with_swaps("dcab", [[0,3],[1,2],[0,2]]))  # "abcd"

Complexity: Time O(n log n), Space O(n)


Problem 6: Satisfiability of Equality Equations (MEDIUM)

Asked at: Amazon, Google

def equations_possible(equations):
    """
    First pass: union all == pairs.
    Second pass: for != pairs, if both variables have same root, contradiction.
    Time: O(n)  Space: O(26) = O(1)
    """
    uf = UnionFind(26)

    for eq in equations:
        if eq[1] == '=':
            uf.union(ord(eq[0]) - ord('a'), ord(eq[3]) - ord('a'))

    for eq in equations:
        if eq[1] == '!':
            if uf.connected(ord(eq[0]) - ord('a'), ord(eq[3]) - ord('a')):
                return False

    return True

# Example
print(equations_possible(["a==b","b!=a"]))  # False
print(equations_possible(["b==a","a==b"]))  # True

Complexity: Time O(n), Space O(1)


Problem 7: Kruskal's Minimum Spanning Tree (MEDIUM)

Asked at: Amazon, Google, Goldman Sachs - graph algorithm

def kruskal_mst(n, edges):
    """
    Sort edges by weight. Add edge if it connects two different components.
    Stop when n-1 edges added.
    Time: O(E log E)  Space: O(n)
    """
    edges.sort(key=lambda x: x[2])  # sort by weight
    uf = UnionFind(n)
    mst_weight = 0
    mst_edges = []

    for u, v, w in edges:
        if uf.union(u, v):
            mst_weight += w
            mst_edges.append((u, v, w))
            if len(mst_edges) == n - 1:
                break

    return mst_weight, mst_edges

# Example: n=4, edges: (0,1,10),(0,2,6),(0,3,5),(1,3,15),(2,3,4)
weight, mst = kruskal_mst(4, [[0,1,10],[0,2,6],[0,3,5],[1,3,15],[2,3,4]])
print(weight)  # 19 (edges: 2-3 weight 4, 0-3 weight 5, 0-1 weight 10)

Complexity: Time O(E log E), Space O(n)


Problem 8: Making a Large Island (HARD)

Asked at: Google, Amazon

def largest_island(grid):
    """
    Label each island with size using Union-Find.
    For each 0 cell, try connecting adjacent islands: sum their sizes + 1.
    Time: O(n^2)  Space: O(n^2)
    """
    n = len(grid)
    uf = UnionFind(n * n)

    for r in range(n):
        for c in range(n):
            if grid[r][c] == 1:
                for dr, dc in [(0,1),(1,0)]:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] == 1:
                        uf.union(r * n + c, nr * n + nc)

    # Compute size of each component
    from collections import defaultdict
    comp_size = defaultdict(int)
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 1:
                comp_size[uf.find(r * n + c)] += 1

    best = max(comp_size.values(), default=0)

    # Try flipping each 0
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 0:
                seen = set()
                size = 1
                for dr, dc in [(0,1),(0,-1),(1,0),(-1,0)]:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] == 1:
                        root = uf.find(nr * n + nc)
                        if root not in seen:
                            seen.add(root)
                            size += comp_size[root]
                best = max(best, size)

    return best

# Example
print(largest_island([[1,0],[0,1]]))  # 3
print(largest_island([[1,1],[1,0]]))  # 4

Complexity: Time O(n^2), Space O(n^2)


Problem 9: Graph Valid Tree (MEDIUM)

Asked at: Google, Amazon, Facebook

Problem: Given n nodes and a list of undirected edges, determine whether they form a valid tree (fully connected and acyclic).

def valid_tree(n, edges):
    """
    A valid tree on n nodes has exactly n-1 edges and no cycle.
    Union edges; a union that finds two nodes already connected
    means a cycle. End with exactly one component.
    Time: O(n * alpha(n))  Space: O(n)
    """
    if len(edges) != n - 1:
        return False                  # too many or too few edges
    uf = UnionFind(n)
    for u, v in edges:
        if not uf.union(u, v):        # already connected -> cycle
            return False
    return True                       # n-1 edges, no cycle -> connected tree

# Example
print(valid_tree(5, [[0,1],[0,2],[0,3],[1,4]]))   # True
print(valid_tree(5, [[0,1],[1,2],[2,3],[1,3],[1,4]]))  # False (cycle)

Key shortcut: the edge-count check (exactly n minus 1 edges) does most of the work. With that count guaranteed, a single cycle would imply a disconnected component elsewhere, so verifying no-cycle is sufficient for tree-ness.

Complexity: Time O(n * alpha(n)), Space O(n)


Problem 10: Number of Operations to Make Network Connected (MEDIUM)

Asked at: Amazon, Google

Problem: Given n computers and connections (cables), return the minimum cables to move to connect all computers, or -1 if impossible.

def make_connected(n, connections):
    """
    To connect c components you need c-1 cables. You can only move
    a cable that is redundant (connects already-connected nodes).
    If spare cables >= components - 1, answer is components - 1.
    Time: O(E * alpha(n))  Space: O(n)
    """
    if len(connections) < n - 1:
        return -1                     # not enough cables to ever connect n nodes

    uf = UnionFind(n)
    for u, v in connections:
        uf.union(u, v)

    components = len({uf.find(i) for i in range(n)})
    return components - 1

# Example
print(make_connected(4, [[0,1],[0,2],[1,2]]))        # 1
print(make_connected(6, [[0,1],[0,2],[0,3],[1,2],[1,3]]))  # 2

The counting argument: any spanning of c components needs at least c minus 1 connecting cables. As long as you have at least that many redundant cables (guaranteed when total cables is at least n minus 1), the answer is simply components minus 1.

Complexity: Time O(E * alpha(n)), Space O(n)


Problem 11: Most Stones Removed with Same Row or Column (MEDIUM)

Asked at: Amazon, Google

Problem: Stones on a grid. You may remove a stone if it shares a row or column with another remaining stone. Return the maximum stones removable.

def remove_stones(stones):
    """
    Union stones sharing a row or column. Within each connected
    component of size k, you can remove k-1 stones (leaving one).
    Answer = total stones - number of components.
    Time: O(n * alpha(n))  Space: O(n)
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        if parent[x] != x:
            parent[x] = find(parent[x])
        return parent[x]

    def union(a, b):
        parent[find(a)] = find(b)

    for r, c in stones:
        # union the row-key and column-key namespaces
        union(("r", r), ("c", c))

    roots = {find(("r", r)) for r, c in stones}
    return len(stones) - len(roots)

# Example
print(remove_stones([[0,0],[0,1],[1,0],[1,2],[2,1],[2,2]]))   # 5

The component insight: each connected component of k stones collapses to a single survivor, so you remove k minus 1 from each. Summed across components, removable equals total stones minus component count. Modeling rows and columns as separate node namespaces is the clever step.

Complexity: Time O(n * alpha(n)), Space O(n)


Why Path Compression and Union by Rank Both Matter

Candidates often implement only one optimization and assume it is enough. Both are needed to reach the near-constant inverse-Ackermann bound, and an interviewer may probe why.

Union by rank (or by size) keeps trees shallow during merges by always attaching the shorter tree under the taller one. Without it, a sequence of unions can build a degenerate linked-list-shaped tree of height n, making each find O(n). Union by rank alone caps tree height at O(log n).

Path compression flattens the tree during find by repointing every node on the path directly to the root. Without it, even balanced trees pay O(log n) per find repeatedly. Path compression alone also gives strong amortized bounds.

Together, the two optimizations yield O(alpha(n)) amortized per operation, where alpha is the inverse Ackermann function, which is at most 4 for any input you will ever see. The takeaway to state: union by rank controls how trees grow, path compression repairs them as you query, and the combination is what makes union-find effectively constant-time.


How to Recognize a Union-Find Problem

The pattern has a tight signature. The strongest signal is a question about connectivity, grouping, or components: how many groups, are these two connected, which items belong together. Number of Provinces, Accounts Merge, and Most Stones Removed are all "group the related things and count or query the groups."

A second signal is a dynamic graph where edges arrive over time and you must answer connectivity after each addition. DFS and BFS would re-scan the whole graph per query, but union-find processes each new edge in near-constant time and maintains the component structure online. Number of Islands II is the canonical dynamic case.

A third signal is cycle detection in an undirected graph or any "find the redundant edge" framing. When you union two endpoints and discover they already share a root, you have found the edge that closes a cycle. Redundant Connection and Graph Valid Tree both reduce to this single check.

A fourth signal is a minimum spanning tree request, where Kruskal's algorithm sorts edges by weight and uses union-find to skip any edge that would form a cycle.

If the graph is directed and you need strongly connected components, union-find does not apply; reach for Tarjan's or Kosaraju's DFS instead. Union-find models undirected, symmetric "same group" relationships.


Union-Find vs DFS vs BFS for Connectivity

Use caseBest choiceWhy
Static graph, single connectivity checkDFS or BFSSimpler to implement
Dynamic graph (edges added over time)Union-FindO(alpha(n)) per edge, online
Cycle detection in undirected graphUnion-FindClean: if union returns false, cycle found
Minimum spanning treeUnion-Find (Kruskal's)Pairs naturally with sorted edge list
Strongly connected components (directed)DFS (Tarjan/Kosaraju)Union-Find is for undirected only

Path Compression Without Recursion

def find_iterative(self, x):
    """
    Iterative path compression - avoids recursion limit on very deep trees.
    """
    root = x
    while self.parent[root] != root:
        root = self.parent[root]

    # Path compression: make all nodes point to root
    while self.parent[x] != root:
        next_x = self.parent[x]
        self.parent[x] = root
        x = next_x

    return root

Complexity Summary

OperationWithout optimizationWith path compression + union by rank
find()O(n) worstO(alpha(n)) amortized
union()O(n) worstO(alpha(n)) amortized
n operationsO(n^2) worstO(n * alpha(n)) effectively O(n)

alpha(n) is the inverse Ackermann function. For all practical purposes (n < 10^600), alpha(n) <= 4.


Methodology applied to this articlelast verified 8 Jun 2026
Sources used
Public exam-pattern documents, official recruiter pages, and verified candidate reports on r/developersIndia and LinkedIn.
Verification window
Page last edited 8 Jun 2026 by Aditya Sharma. Numbers and patterns sanity-checked against the most recent 2026 cycle drives we tracked.
What we did NOT do
  • No fabricated salary numbers or success rates. If we quote a range, it's sourced.
  • No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
  • No paid placements, sponsored coaching links, or affiliate-shilled course pushes.
Verification policy: /editorial-standards/. Found something incorrect? Submit a correction - we respond within 48 hours.

Explore this topic cluster

More resources in Uncategorized

Use the category hub to browse similar questions, exam patterns, salary guides, and preparation resources related to this topic.

Paid contributor programme

Sat this this year? Share your story, earn ₹500.

First-person experience reports help future candidates prep smarter. We pay verified contributors ₹500 via UPI per accepted story - with byline.

Submit your story →

Ready to practice?

Take a free timed mock test

Put what you learned into practice. Our mock tests match the 2026 pattern with timer, navigator, reveal, and score breakdown. No signup.

Start Free Mock Test →

More from PapersAdda

Share this guide: