Top K Elements Pattern 2026: 14+ Problems [Solved]
Candidates report top-K questions as appearing in roughly 15-20% of FAANG data structure rounds. Based on public preparation resources and candidate-reported...

What changed in 2026 drives
Mass-recruiter offer letters are flatter for 2026 batch - the 4-5 LPA ASE band has barely budged in three years while inflation eats real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry - it is the entire interview.
What I'd actually study for this
- 01Two solid coding-round answers (1 medium-hard DSA each, with edge-case discussion) > five half-baked ones
- 02One real project you can defend end-to-end - file paths, design decisions, and what you would change
- 03One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
- 04Three behavioural STAR stories: failure recovered, conflict handled, ownership taken
Where most candidates trip up
The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.
Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.
Last Updated: June 2026
Why Top-K Problems Appear Constantly
Candidates report top-K questions as appearing in roughly 15-20% of FAANG data structure rounds. Based on public preparation resources and candidate-reported interview threads, the pattern covers a wide surface area: streaming data, frequency analysis, proximity queries, and order statistics. Interviewers use it to test knowledge of heaps and the tradeoff between sort, heap, and quickselect.
Three Approaches at a Glance
| Approach | Time | Space | When to use |
|---|---|---|---|
| Sort | O(n log n) | O(1) | Simple, one-time query |
| Min-heap of size k | O(n log k) | O(k) | k << n, streaming data |
| Quickselect | O(n) average | O(1) | Maximum performance, no sort needed |
| Bucket sort | O(n) | O(n) | Bounded values (e.g., frequencies in [1..n]) |
Min-Heap of Size K Template
import heapq
def top_k_largest(nums, k):
"""
Maintain a min-heap of size k.
The smallest element in the heap is the kth largest.
Time: O(n log k) Space: O(k)
"""
heap = []
for num in nums:
heapq.heappush(heap, num)
if len(heap) > k:
heapq.heappop(heap) # remove the smallest
return list(heap) # the k largest elements
# Python's heapq is a MIN-heap by default.
# For max-heap: push -num.
Quickselect Template
import random
def quickselect(nums, k):
"""
Find the kth largest element. In-place.
Partition around pivot. Recurse on the side containing kth position.
Time: O(n) average, O(n^2) worst Space: O(1)
"""
def partition(left, right, pivot_idx):
pivot = nums[pivot_idx]
nums[pivot_idx], nums[right] = nums[right], nums[pivot_idx]
store = left
for i in range(left, right):
if nums[i] < pivot:
nums[i], nums[store] = nums[store], nums[i]
store += 1
nums[store], nums[right] = nums[right], nums[store]
return store
left, right = 0, len(nums) - 1
k_smallest = len(nums) - k # kth largest = (n-k)th smallest
while left <= right:
pivot_idx = random.randint(left, right)
pivot_idx = partition(left, right, pivot_idx)
if pivot_idx == k_smallest:
return nums[pivot_idx]
elif pivot_idx < k_smallest:
left = pivot_idx + 1
else:
right = pivot_idx - 1
Practice Questions with Solutions
Problem 1: Kth Largest Element in Array (MEDIUM)
Asked at: Amazon, Google, Facebook, Microsoft - very frequently asked
def find_kth_largest(nums, k):
"""
Min-heap of size k. Heap top = kth largest.
Time: O(n log k) Space: O(k)
"""
import heapq
heap = []
for num in nums:
heapq.heappush(heap, num)
if len(heap) > k:
heapq.heappop(heap)
return heap[0] # kth largest
# Example
print(find_kth_largest([3, 2, 1, 5, 6, 4], 2)) # 5
print(find_kth_largest([3, 2, 3, 1, 2, 4, 5, 5, 6], 4)) # 4
Complexity: Time O(n log k), Space O(k)
Problem 2: K Frequent Elements (MEDIUM)
Asked at: Amazon, Google, Facebook, Apple
def top_k_frequent(nums, k):
"""
Approach 1: Heap, O(n log k)
Approach 2: Bucket sort, O(n) - preferred when n is known
"""
from collections import Counter
# Approach 1: Min-heap
count = Counter(nums)
heap = []
for num, freq in count.items():
heapq.heappush(heap, (freq, num))
if len(heap) > k:
heapq.heappop(heap)
return [num for freq, num in heap]
def top_k_frequent_bucket(nums, k):
"""
Bucket sort: bucket[freq] = list of numbers with that frequency.
Frequencies range from 1 to n.
Time: O(n) Space: O(n)
"""
from collections import Counter
n = len(nums)
count = Counter(nums)
buckets = [[] for _ in range(n + 1)]
for num, freq in count.items():
buckets[freq].append(num)
result = []
for freq in range(n, 0, -1):
for num in buckets[freq]:
result.append(num)
if len(result) == k:
return result
return result
# Example
print(top_k_frequent([1, 1, 1, 2, 2, 3], 2)) # [1, 2]
print(top_k_frequent_bucket([1, 1, 1, 2, 2, 3], 2)) # [1, 2]
Complexity (bucket): Time O(n), Space O(n)
Problem 3: K Closest Points to Origin (MEDIUM)
Asked at: Amazon, Google, Facebook, Microsoft
def k_closest(points, k):
"""
Use a max-heap of size k (negate distances for max-heap behavior).
Time: O(n log k) Space: O(k)
"""
import heapq
heap = []
for x, y in points:
dist = -(x * x + y * y) # negate for max-heap
heapq.heappush(heap, (dist, x, y))
if len(heap) > k:
heapq.heappop(heap)
return [[x, y] for (d, x, y) in heap]
# Example
print(k_closest([[1, 3], [-2, 2]], 1)) # [[-2, 2]]
print(k_closest([[3, 3], [5, -1], [-2, 4]], 2)) # [[3,3],[-2,4]]
Complexity: Time O(n log k), Space O(k)
Problem 4: Top K Frequent Words (MEDIUM)
Asked at: Amazon, Google, Facebook
def top_k_frequent_words(words, k):
"""
Heap ordered by (-frequency, word) for lexicographic tie-breaking.
Time: O(n log k) Space: O(n)
"""
from collections import Counter
import heapq
count = Counter(words)
heap = [(-freq, word) for word, freq in count.items()]
heapq.heapify(heap) # O(n)
return [heapq.heappop(heap)[1] for _ in range(k)]
# Example
print(top_k_frequent_words(["i","love","leetcode","i","love","coding"], 2))
# ["i", "love"]
print(top_k_frequent_words(["the","day","is","sunny","the","the","the","sunny","is","is"], 4))
# ["the", "is", "sunny", "day"]
Complexity: Time O(n log k), Space O(n)
Problem 5: Kth Smallest Element in Sorted Matrix (MEDIUM)
Asked at: Amazon, Google, Goldman Sachs
def kth_smallest_matrix(matrix, k):
"""
Min-heap: push (val, row, col) starting with first column.
Pop k times; push (row, col+1) after each pop.
Time: O(k log n) Space: O(n)
"""
import heapq
n = len(matrix)
heap = [(matrix[r][0], r, 0) for r in range(n)]
heapq.heapify(heap)
for _ in range(k - 1):
val, r, c = heapq.heappop(heap)
if c + 1 < n:
heapq.heappush(heap, (matrix[r][c + 1], r, c + 1))
return heapq.heappop(heap)[0]
# Example
matrix = [[1, 5, 9], [10, 11, 13], [12, 13, 15]]
print(kth_smallest_matrix(matrix, 8)) # 13
Complexity: Time O(k log n), Space O(n)
Problem 6: Find Median from Data Stream (HARD)
Asked at: Amazon, Google, Microsoft, Facebook - two-heap trick
class MedianFinder:
"""
Max-heap for lower half, min-heap for upper half.
Keep sizes balanced (differ by at most 1).
Median = top of one or average of both tops.
Time per operation: O(log n) Space: O(n)
"""
def __init__(self):
self.lower = [] # max-heap (negate values)
self.upper = [] # min-heap
def add_num(self, num):
# Always push to lower first
heapq.heappush(self.lower, -num)
# Balance: lower top must be <= upper top
if self.upper and -self.lower[0] > self.upper[0]:
heapq.heappush(self.upper, -heapq.heappop(self.lower))
# Balance sizes: lower can have at most 1 more than upper
if len(self.lower) > len(self.upper) + 1:
heapq.heappush(self.upper, -heapq.heappop(self.lower))
elif len(self.upper) > len(self.lower):
heapq.heappush(self.lower, -heapq.heappop(self.upper))
def find_median(self):
if len(self.lower) > len(self.upper):
return -self.lower[0]
return (-self.lower[0] + self.upper[0]) / 2
# Example
mf = MedianFinder()
mf.add_num(1); mf.add_num(2)
print(mf.find_median()) # 1.5
mf.add_num(3)
print(mf.find_median()) # 2.0
Complexity: Add O(log n), FindMedian O(1), Space O(n)
Problem 7: Merge K Sorted Lists (HARD)
Asked at: Amazon, Google, Microsoft, Facebook
def merge_k_lists(lists):
"""
Min-heap: push (val, list_idx, node) from each list head.
Pop minimum, push next from that list.
Time: O(n log k) where n = total nodes, k = number of lists
Space: O(k)
"""
import heapq
class ListNode:
def __init__(self, val=0, next=None):
self.val = val
self.next = next
heap = []
for i, node in enumerate(lists):
if node:
heapq.heappush(heap, (node.val, i, node))
dummy = ListNode(0)
curr = dummy
while heap:
val, i, node = heapq.heappop(heap)
curr.next = node
curr = curr.next
if node.next:
heapq.heappush(heap, (node.next.val, i, node.next))
return dummy.next
Complexity: Time O(n log k), Space O(k)
Problem 8: Sliding Window Median (HARD)
Asked at: Google, Amazon
def sliding_window_median(nums, k):
"""
Two heaps, but with lazy deletion for the outgoing element.
Time: O(n log k) Space: O(k)
"""
from sortedcontainers import SortedList
sl = SortedList()
result = []
for i, num in enumerate(nums):
sl.add(num)
if i >= k:
sl.remove(nums[i - k])
if i >= k - 1:
if k % 2 == 1:
result.append(float(sl[k // 2]))
else:
result.append((sl[k // 2 - 1] + sl[k // 2]) / 2)
return result
# Example
print(sliding_window_median([1, 3, -1, -3, 5, 3, 6, 7], 3))
# [1.0, -1.0, -1.0, 3.0, 5.0, 6.0]
Complexity: Time O(n log k), Space O(k)
Heap Operations Quick Reference
import heapq
# Create min-heap
heap = []
heapq.heapify(arr) # in-place, O(n)
# Push
heapq.heappush(heap, val) # O(log n)
# Pop minimum
heapq.heappop(heap) # O(log n)
# Peek minimum (no pop)
heap[0] # O(1)
# Max-heap: negate values
heapq.heappush(heap, -val)
max_val = -heapq.heappop(heap)
# Heap of tuples: ordered by first element
heapq.heappush(heap, (priority, item))
Algorithm Selection Guide
n values, find k largest:
k << n: Min-heap of size k -- O(n log k)
k close to n: Sort in reverse -- O(n log n)
one-time only: Quickselect -- O(n) average
frequencies: Bucket sort -- O(n)
Streaming data (values arrive one by one):
Always: Min-heap of size k. O(log k) per element.
Need both min and max simultaneously (median):
Two heaps: max-heap for lower half + min-heap for upper half.
Why Min-Heap of Size K Works for Top-K Largest
The counterintuitive part: to find the k LARGEST, you use a MIN-heap, not a max-heap. The reasoning: the min-heap acts as a gate. Any new element that is larger than the smallest element in the heap is a better candidate for the top-k. By always popping the minimum (the weakest current candidate) when the heap exceeds size k, you ensure only the k strongest elements survive.
A max-heap would let you pop the largest first, which is useful when you want to process elements in order from largest to smallest, but does not help you maintain exactly the top k.
Quickselect vs Heap: When Does It Actually Matter?
For most interview settings, the O(n log k) heap solution is preferred because it is predictable and handles streaming data. Quickselect's O(n) average is theoretically better, but its O(n^2) worst case (bad pivot) is a concern. In practice, randomized pivot selection makes worst case extremely unlikely.
Quickselect modifies the input array in-place, which is a problem if the caller needs the original array preserved. When the interviewer asks "can you do better than O(n log k)?", that is the signal to describe quickselect and its tradeoffs.
The Two-Heap Median Pattern
The two-heap approach for streaming median is a high-signal answer because it demonstrates understanding of heap invariants. The lower half max-heap and upper half min-heap must satisfy two invariants at all times: the tops are ordered (max of lower <= min of upper), and sizes are balanced (differ by at most one). Every insertion requires at most two heap operations to restore both invariants.
Interviewers at Google and Facebook have been known to ask follow-ups: "how do you handle deletions?" and "what if the stream is infinite?" Both lead to the lazy deletion and two-heap rebalancing discussion.
Related Articles
Methodology applied to this articlelast verified 8 Jun 2026
- No fabricated salary numbers or success rates. If we quote a range, it's sourced.
- No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
- No paid placements, sponsored coaching links, or affiliate-shilled course pushes.
topic cluster
More resources in Uncategorized
Use the category hub to browse similar questions, exam patterns, salary guides, and preparation resources related to this topic.
paid contributor programme
Sat this this year? Share your story, earn ₹500.
First-person experience reports help future candidates prep smarter. We pay verified contributors ₹500 via UPI per accepted story with byline.
Submit your story →ready to practice?
Take a free timed mock test
Put what you learned into practice. Our mock tests match the 2026 pattern with timer, navigator, reveal, and score breakdown. No signup.