NumPy Interview Questions 2026: 25 Answers with Code

What changed in 2026 drives
Mass-recruiter offer letters are flatter for the 2026 batch: the 4-5 LPA ASE band has barely budged in three years while inflation eats real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry; it is the entire interview.
What I'd actually study for this
- Two solid coding-round answers (one medium-hard DSA problem each, with edge-case discussion) beat five half-baked ones
- One real project you can defend end-to-end: file paths, design decisions, and what you would change
- One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
- Three behavioural STAR stories: failure recovered, conflict handled, ownership taken
Where most candidates trip up
The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.
Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.
NumPy is the numerical computing foundation of the Python ML stack. Every data science, ML engineering, and quant role assumes NumPy fluency. Interviewers test array operations, broadcasting, linear algebra, and performance patterns -- not just syntax. This guide covers 25 NumPy interview questions with full answers and code examples.
PapersAdda's take: Candidates report that broadcasting and vectorization questions separate good candidates from great ones in NumPy-heavy rounds at tier-1 product companies. The most common live coding scenario: "rewrite this Python loop using NumPy". Confirm the specific interview format on the official company careers portal before you prepare.
EASY: Array Fundamentals (Questions 1-8)
Q1. What is a NumPy ndarray? How does it differ from a Python list?
import numpy as np
import sys
# Python list: heterogeneous, dynamic, pointers to objects
py_list = [1, 2.0, "three"]
print(f"List type: {type(py_list[0])}, {type(py_list[1])}") # int, float
# NumPy array: homogeneous, contiguous memory, typed
arr = np.array([1, 2, 3], dtype=np.float32)
print(arr.dtype) # float32
print(arr.shape) # (3,)
print(arr.ndim) # 1
print(arr.strides) # bytes between elements: (4,) for float32
# Memory comparison
py_list = list(range(1_000_000))
np_arr = np.arange(1_000_000)
print(f"Python list: {sys.getsizeof(py_list) / 1e6:.1f} MB") # ~8 MB for the pointer array alone; the int objects it points to are extra
print(f"NumPy array: {np_arr.nbytes / 1e6:.1f} MB") # ~8 MB (int64) or ~4 MB (int32)
| Property | Python list | NumPy ndarray |
|---|---|---|
| Dtype | Heterogeneous | Homogeneous |
| Memory | Scattered (pointers to objects) | Contiguous block |
| Speed | Slow (Python loop per element) | Fast (C/BLAS vectorized) |
| Operations | Loops required | Element-wise (+, *, /...) |
| Multidimensional | List of lists | True N-D with shape/strides |
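The memory row of the table can be checked directly. The sketch below is illustrative only: the variable names are made up for this example, and the list total is an estimate, since CPython caches small integers and object sizes vary by build.

```python
import sys
import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# sys.getsizeof(list) counts only the list's own pointer array,
# so the int objects it points to are added separately here.
list_pointers = sys.getsizeof(py_list)
list_objects = sum(sys.getsizeof(x) for x in py_list)
list_total = list_pointers + list_objects

print(f"list pointers only: {list_pointers / 1e6:.2f} MB")
print(f"list + int objects: {list_total / 1e6:.2f} MB")
print(f"ndarray data:       {np_arr.nbytes / 1e6:.2f} MB")
```

The ndarray stores raw 8-byte values in one block, while the list pays for a pointer plus a full Python object per element.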
Q2. What are NumPy dtypes? How do you choose the right one?
import numpy as np
# Integer types
a_int8 = np.array([1, 2, 3], dtype=np.int8) # -128 to 127, 1 byte
a_int16 = np.array([1, 2, 3], dtype=np.int16) # -32768 to 32767, 2 bytes
a_int32 = np.array([1, 2, 3], dtype=np.int32) # ~+/-2B, 4 bytes
a_int64 = np.array([1, 2, 3], dtype=np.int64) # default integer on most platforms (int32 on Windows before NumPy 2.0), 8 bytes
# Float types
a_f16 = np.array([1.0], dtype=np.float16) # half precision, 2 bytes
a_f32 = np.array([1.0], dtype=np.float32) # single precision, 4 bytes (ML default)
a_f64 = np.array([1.0], dtype=np.float64) # double precision (Python default), 8 bytes
# Boolean
a_bool = np.array([True, False, True], dtype=np.bool_) # 1 byte per element
# Check type limits
print(np.iinfo(np.int8)) # min=-128, max=127
print(np.finfo(np.float32)) # eps=1.19e-07, max=3.4e+38
# Type conversion
a_float = a_int32.astype(np.float32)
# Gotcha: float16 overflow
print(np.float16(65504)) # 65504 (max)
print(np.float16(65600)) # inf (overflow)
ML guideline: use float32 for model weights (halves memory vs float64 with negligible precision loss). Use float16/bfloat16 for mixed-precision training. Use int32 for indices.
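A quick way to see the float32 trade-off from the guideline is to cast a float64 weight matrix and measure both the memory saved and the rounding error. This is a minimal sketch with an arbitrary random matrix standing in for model weights; the names `w64`, `w32`, and `max_err` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
w64 = rng.standard_normal((1000, 1000))  # float64 by default
w32 = w64.astype(np.float32)             # cast down to single precision

# Memory: float32 uses 4 bytes per element instead of 8
print(f"float64: {w64.nbytes / 1e6:.1f} MB | float32: {w32.nbytes / 1e6:.1f} MB")

# Precision: largest absolute rounding error introduced by the cast
max_err = np.abs(w64 - w32.astype(np.float64)).max()
print(f"max abs rounding error: {max_err:.2e}")
```

For values of order 1, the error sits around 1e-7, which is why float32 is usually acceptable for weights.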
Q3. Explain NumPy array creation methods.
import numpy as np
# From data
arr = np.array([1, 2, 3, 4])
matrix = np.array([[1, 2], [3, 4]], dtype=np.float32)
# Zeros / ones / full
zeros = np.zeros((3, 4), dtype=np.float32)
ones = np.ones((2, 3))
full = np.full((2, 3), fill_value=7.0)
# Identity / diagonal
eye = np.eye(4) # 4x4 identity
diag = np.diag([1, 2, 3]) # diagonal matrix
# Range sequences
arange = np.arange(0, 10, 2) # [0, 2, 4, 6, 8]
linspace = np.linspace(0, 1, 11) # 11 evenly spaced points from 0 to 1
logspace = np.logspace(0, 3, 4) # [1, 10, 100, 1000]
# Random arrays
rng = np.random.default_rng(seed=42) # recommended modern API
uniform = rng.random((3, 3)) # Uniform [0, 1)
normal = rng.standard_normal((3, 3)) # N(0, 1)
integers = rng.integers(0, 100, (3, 3)) # Random ints
# From shape of another array
like_arr = np.zeros_like(matrix) # same shape and dtype, filled with 0
ones_like = np.ones_like(matrix)
Q4. What is array indexing and slicing in NumPy? What is fancy indexing?
import numpy as np
a = np.arange(12).reshape(3, 4)
# [[ 0 1 2 3]
# [ 4 5 6 7]
# [ 8 9 10 11]]
# Basic slicing (returns a VIEW -- shares memory)
row0 = a[0, :] # First row
col1 = a[:, 1] # Second column
submatrix = a[0:2, 1:3] # Rows 0-1, cols 1-2
every_other = a[::2, ::2] # Step slicing
# Boolean indexing (returns COPY)
mask = a > 5
print(a[mask]) # [6, 7, 8, 9, 10, 11]
a_masked = a.copy()
a_masked[mask] = -1 # In-place modification via boolean mask (done on a copy so `a` keeps its values for the examples below)
# Fancy indexing with integer arrays (returns COPY)
rows = np.array([0, 2])
cols = np.array([1, 3])
print(a[rows, cols]) # [a[0,1], a[2,3]] = [1, 11]
# Select entire rows
print(a[[0, 2], :]) # Rows 0 and 2
# np.ix_ for outer product indexing
print(a[np.ix_([0, 2], [1, 3])]) # 2x2 submatrix at rows [0,2] and cols [1,3]
# View vs copy verification
view = a[0:2, 0:2]
copy = a[[0, 1], :][:, [0, 1]]
print(np.shares_memory(a, view)) # True
print(np.shares_memory(a, copy)) # False
Q5. What is the difference between reshape, resize, and ravel/flatten?
import numpy as np
a = np.arange(12)
# reshape: returns VIEW (if possible), does not change data
b = a.reshape(3, 4) # (12,) -> (3, 4)
c = a.reshape(3, -1) # -1: NumPy infers the dimension -> (3, 4)
d = a.reshape(2, 2, 3) # 3D reshape
# Important: reshape returns view when possible
print(np.shares_memory(a, b)) # True
# Reshape fails if total size doesn't match
# a.reshape(3, 5) # ValueError: cannot reshape (12,) to (3, 5)
# ravel: return 1D VIEW (if contiguous), or copy
flat_view = b.ravel() # Fast -- returns view of contiguous array
print(np.shares_memory(b, flat_view)) # True
# flatten: always returns a COPY
flat_copy = b.flatten()
print(np.shares_memory(b, flat_copy)) # False
# resize: modifies array IN-PLACE, can change total size (fills with 0 or repeats)
a2 = np.array([1, 2, 3])
a2.resize(5) # [1, 2, 3, 0, 0] -- in-place, new size
a2.resize(2) # [1, 2] -- truncate
# np.resize (function, not method): creates new array, wraps values cyclically
resized = np.resize(np.array([1, 2, 3]), (2, 4))
# [[1, 2, 3, 1],
# [2, 3, 1, 2]]
Q6. Explain axis parameter in NumPy operations.
import numpy as np
a = np.array([[1, 2, 3],
[4, 5, 6]]) # shape (2, 3)
# axis=0: collapse along rows (down the columns)
print(a.sum(axis=0)) # [5, 7, 9] -- sum down each column
print(a.max(axis=0)) # [4, 5, 6] -- max of each column
# axis=1: collapse along columns (across the rows)
print(a.sum(axis=1)) # [6, 15] -- sum across each row
print(a.mean(axis=1)) # [2., 5.] -- mean of each row
# keepdims: preserve dimension for broadcasting
col_means = a.mean(axis=0, keepdims=True) # shape (1, 3)
row_means = a.mean(axis=1, keepdims=True) # shape (2, 1)
# Normalize each row (subtract row mean)
a_row_norm = a - a.mean(axis=1, keepdims=True)
# Normalize each column (subtract column mean)
a_col_norm = a - a.mean(axis=0, keepdims=True)
# 3D example
b = np.ones((2, 3, 4)) # (batch, height, width)
print(b.sum(axis=0).shape) # (3, 4) -- sum over batch dimension
print(b.sum(axis=(1, 2)).shape) # (2,) -- sum over spatial dimensions
Q7. What are strides in NumPy? How do they enable views?
import numpy as np
a = np.arange(12, dtype=np.int32) # 1D array, 4 bytes per element
print(a.strides) # (4,) -- 4 bytes to next element
b = a.reshape(3, 4)
print(b.strides) # (16, 4) -- 16 bytes to next row, 4 bytes to next column
# Transposing: just reverses strides (no data copy)
bT = b.T
print(bT.strides) # (4, 16) -- 4 bytes to next row, 16 bytes to next column
print(np.shares_memory(b, bT)) # True
# Practical: stride tricks for sliding windows (efficient)
from numpy.lib.stride_tricks import sliding_window_view
data = np.array([1, 2, 3, 4, 5, 6, 7], dtype=np.int32) # int32 so each element is 4 bytes
windows = sliding_window_view(data, window_shape=3)
# [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]]
print(windows)
# Custom stride: every-other-element view
every_other = data[::2] # 4 elements [1, 3, 5, 7], stride = 2*4 = 8 bytes
print(every_other.strides) # (8,)
Q8. What is np.where? How is it different from np.select?
import numpy as np
a = np.array([-3, -1, 0, 2, 5, -4, 7])
# np.where(condition, x, y): element-wise conditional
result = np.where(a > 0, a, 0) # ReLU activation
print(result) # [0, 0, 0, 2, 5, 0, 7]
# np.where with multiple arrays
b = np.array([10, 10, 10, 10, 10, 10, 10])
result2 = np.where(a > 0, a, b) # take from a if positive, else from b
# np.where(condition) -- returns a tuple of index arrays (equivalent to np.nonzero)
indices = np.where(a > 0)
print(indices) # (array([3, 4, 6]),) -- indices where condition is True
print(a[indices]) # [2, 5, 7]
# np.select: multiple conditions (like if-elif-else chain)
conditions = [a < -2, (a >= -2) & (a <= 2), a > 2]
choices = ["very_negative", "neutral", "positive"]
result3 = np.select(conditions, choices, default="unknown")
print(result3) # ['very_negative', 'neutral', 'neutral', 'neutral', 'positive', 'very_negative', 'positive']
MEDIUM: Broadcasting and Vectorization (Questions 9-18)
Q9. Explain NumPy broadcasting rules with examples.
Broadcasting allows operations on arrays with different shapes by "expanding" smaller arrays.
Rules (applied right-to-left):
- If arrays differ in rank, prepend 1s to the smaller array's shape.
- Sizes along each dimension must match OR one of them must be 1.
- Dimension of size 1 is stretched to match the other size.
import numpy as np
# Example 1: scalar broadcast
a = np.array([1, 2, 3])
print(a + 10) # [11, 12, 13] -- scalar treated as (1,) then (3,)
# Example 2: (3,) + (3, 1) -> (3, 3)
col = np.array([[1], [2], [3]]) # shape (3, 1)
row = np.array([10, 20, 30]) # shape (3,) -> treated as (1, 3) -> (3, 3)
print(col + row)
# [[11, 21, 31],
# [12, 22, 32],
# [13, 23, 33]]
# Example 3: batch normalization broadcasting
X = np.random.randn(100, 10) # (batch, features)
mean = X.mean(axis=0) # (10,)
std = X.std(axis=0) # (10,)
X_norm = (X - mean) / (std + 1e-8) # (100, 10) - (10,) broadcasts correctly
# Example 4: pairwise distance matrix
A = np.array([[1, 2], [3, 4], [5, 6]]) # (3, 2)
B = np.array([[1, 1], [2, 2]]) # (2, 2)
# A[:, np.newaxis] is (3, 1, 2); B is (2, 2)
# Broadcast: (3, 1, 2) and (2, 2) -> (3, 2, 2)
diff = A[:, np.newaxis, :] - B[np.newaxis, :, :] # (3, 2, 2)
dist = np.sqrt((diff**2).sum(axis=2)) # (3, 2)
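The three rules can also be checked without allocating any data. The sketch below uses `np.broadcast_shapes` (assumed available, i.e. NumPy 1.20 or newer) on the shapes from the examples above, plus one incompatible pair; the `incompatible` flag is just a helper for this illustration.

```python
import numpy as np

# Shapes are right-aligned, missing leading dims count as 1,
# and size-1 dims stretch to match the other operand.
print(np.broadcast_shapes((3, 1), (3,)))       # (3, 3)
print(np.broadcast_shapes((100, 10), (10,)))   # (100, 10)
print(np.broadcast_shapes((3, 1, 2), (2, 2)))  # (3, 2, 2)

# (3, 4) vs (3,): trailing dims are 4 and 3, neither is 1, so this fails
incompatible = False
try:
    np.broadcast_shapes((3, 4), (3,))
except ValueError as exc:
    incompatible = True
    print("incompatible:", exc)
```

The failing case is the classic interview trap: a `(3,)` vector lines up with the last axis, so subtracting per-row values from a `(3, 4)` matrix needs shape `(3, 1)`.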
Q10. How do you vectorize a Python loop with NumPy?
import numpy as np
import timeit
# Task: compute pairwise Euclidean distances between N points
# SLOW: Python loop
def pairwise_dist_loop(X):
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            D[i, j] = np.sqrt(((X[i] - X[j])**2).sum())
    return D
# FAST: broadcasting
def pairwise_dist_vec(X):
    diff = X[:, np.newaxis, :] - X[np.newaxis, :, :] # (N, N, D)
    return np.sqrt((diff**2).sum(axis=2))
# FASTER: expand via ||a-b||^2 = ||a||^2 - 2a.b + ||b||^2
def pairwise_dist_fast(X):
    sq_norms = (X**2).sum(axis=1) # (N,)
    sq_dists = sq_norms[:, np.newaxis] - 2 * (X @ X.T) + sq_norms[np.newaxis, :]
    # Rounding can leave tiny negative values (e.g. on the diagonal); clip before sqrt to avoid NaN
    return np.sqrt(np.maximum(sq_dists, 0))
X = np.random.randn(500, 10)
print(np.allclose(pairwise_dist_loop(X[:10]), pairwise_dist_vec(X[:10]))) # True
# Benchmark
t_loop = timeit.timeit(lambda: pairwise_dist_loop(X[:50]), number=3) / 3
t_vec = timeit.timeit(lambda: pairwise_dist_vec(X), number=10) / 10
t_fast = timeit.timeit(lambda: pairwise_dist_fast(X), number=10) / 10
print(f"Loop: {t_loop:.3f}s | Vectorized: {t_vec:.3f}s | Fast: {t_fast:.3f}s")
Q11. What is np.einsum? Give ML use cases.
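np.einsum expresses tensor contractions with Einstein summation notation. It is often clearer than chaining manual transposes and dot products for complex operations, though not automatically faster: for plain matrix products, @ is usually at least as quick, and optimize=True can help when contracting several operands.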
import numpy as np
# Notation: lowercase letters are indices; repeated = summed over (contracted)
A = np.random.randn(3, 4) # matrix (i, j)
B = np.random.randn(4, 5) # matrix (j, k)
C = np.random.randn(3, 4, 5) # 3D tensor (i, j, k)
# Matrix multiplication: A @ B
print(np.einsum('ij,jk->ik', A, B).shape) # (3, 5)
# Dot product of two vectors
a, b = np.array([1, 2, 3]), np.array([4, 5, 6])
print(np.einsum('i,i->', a, b)) # 32
# Outer product
print(np.einsum('i,j->ij', a, b).shape) # (3, 3)
# Batch matrix multiplication (B, M, K) x (B, K, N) -> (B, M, N)
X = np.random.randn(8, 10, 16) # (batch, seq, d_k)
W = np.random.randn(8, 16, 32) # (batch, d_k, d_v)
out = np.einsum('bij,bjk->bik', X, W) # (8, 10, 32)
# Multi-head attention score: (B, H, T, D) x (B, H, D, T) -> (B, H, T, T)
Q = np.random.randn(2, 4, 16, 64)
K = np.random.randn(2, 4, 16, 64)
scores = np.einsum('bhid,bhjd->bhij', Q, K) # (2, 4, 16, 16)
# Transpose: equivalent to .T
print(np.einsum('ij->ji', A).shape) # (4, 3)
# Trace (sum of diagonal)
print(np.einsum('ii->', np.eye(4))) # 4.0
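To make the attention-score subscripts concrete, the sketch below checks the einsum expression against an explicit transpose followed by `@`. The seed and the names `scores_einsum` and `scores_matmul` are illustrative choices, not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)
Q = rng.standard_normal((2, 4, 16, 64))  # (batch, heads, seq, d_k)
K = rng.standard_normal((2, 4, 16, 64))

# einsum: contract over the shared d axis, keep i (query pos) and j (key pos)
scores_einsum = np.einsum('bhid,bhjd->bhij', Q, K)

# Same computation spelled out: swap the last two axes of K, then batched matmul
scores_matmul = Q @ K.transpose(0, 1, 3, 2)

same = np.allclose(scores_einsum, scores_matmul)
print(scores_einsum.shape, same)
```

Reading the subscripts this way (which letters survive, which disappear) is usually the fastest route to explaining einsum in an interview.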
Q12. What is np.linalg? Cover the key operations used in ML.
import numpy as np
A = np.array([[2, 1], [5, 3]], dtype=float)
# Matrix inverse
A_inv = np.linalg.inv(A)
print(A @ A_inv) # Identity matrix (approximately)
# Determinant
det = np.linalg.det(A)
print(det) # 1.0
# Eigenvalues and eigenvectors (for PCA, covariance matrices)
eigenvalues, eigenvectors = np.linalg.eig(A)
# Symmetric matrices: eigh is faster and guaranteed real eigenvalues
cov = np.cov(np.random.randn(3, 100)) # 3x3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)
# SVD (Singular Value Decomposition) -- foundation of PCA, matrix factorization
M = np.random.randn(5, 4)
U, S, Vt = np.linalg.svd(M, full_matrices=False)
print(U.shape, S.shape, Vt.shape) # (5, 4), (4,), (4, 4)
# Reconstruct: M = U @ np.diag(S) @ Vt
M_reconstructed = U @ np.diag(S) @ Vt
print(np.allclose(M, M_reconstructed)) # True
# Rank-k approximation (dimensionality reduction)
k = 2
M_approx = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]
# Solve linear system Ax = b
b = np.array([4, 7], dtype=float)
x = np.linalg.solve(A, b)
print(A @ x) # should equal b
# Least squares (overdetermined systems)
X = np.random.randn(100, 5)
y = np.random.randn(100)
coeffs, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
# Matrix norm
frob_norm = np.linalg.norm(A, 'fro') # Frobenius norm
spec_norm = np.linalg.norm(A, 2) # Spectral norm (largest singular value)
Q13. How do you implement common ML operations from scratch using NumPy?
import numpy as np
# Softmax (numerically stable)
def softmax(x):
    x = x - x.max(axis=-1, keepdims=True) # subtract max for stability
    e_x = np.exp(x)
    return e_x / e_x.sum(axis=-1, keepdims=True)
# Cross-entropy loss
def cross_entropy(y_pred, y_true):
    """y_pred: (N, C) probabilities, y_true: (N,) integer class labels"""
    N = y_pred.shape[0]
    log_prob = -np.log(y_pred[np.arange(N), y_true] + 1e-8)
    return log_prob.mean()
# Sigmoid + Binary cross-entropy
def sigmoid(x):
    # np.where evaluates both branches, so use exp(-|x|), which never overflows
    z = np.exp(-np.abs(x))
    return np.where(x >= 0, 1 / (1 + z), z / (1 + z)) # numerically stable
def bce_loss(y_pred, y_true):
    return -np.mean(y_true * np.log(y_pred + 1e-8) + (1 - y_true) * np.log(1 - y_pred + 1e-8))
# Linear regression with gradient descent (full-batch, despite the "sgd" in the name)
def linear_regression_sgd(X, y, lr=0.01, epochs=100):
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        y_pred = X @ w + b
        grad_w = (1/n) * X.T @ (y_pred - y)
        grad_b = (1/n) * (y_pred - y).sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
# K-means clustering
def kmeans(X, k, n_iter=100, seed=42):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, np.newaxis] - centers[np.newaxis], axis=2) # (N, k)
        labels = dists.argmin(axis=1)
        # Keep the previous center if a cluster ends up empty (avoids NaN means)
        centers = np.array([X[labels == i].mean(axis=0) if np.any(labels == i) else centers[i]
                            for i in range(k)])
    return labels, centers
Q14. What is np.random? Explain the difference between the legacy API and Generator.
import numpy as np
# Legacy API (still works, but NOT recommended for new code)
np.random.seed(42)
a = np.random.randn(3, 3)
b = np.random.randint(0, 10, size=5)
# Modern API: np.random.default_rng (recommended since NumPy 1.17)
rng = np.random.default_rng(seed=42)
# Common distributions
uniform = rng.random((3, 3)) # Uniform [0, 1)
normal = rng.standard_normal((3, 3)) # N(0, 1)
integers = rng.integers(0, 10, (5,)) # Integer in [0, 10)
choice = rng.choice([1, 2, 3, 4, 5], size=3, replace=False) # Sample without replacement
binomial = rng.binomial(n=10, p=0.5, size=100)
# Shuffle (in-place)
arr = np.arange(10)
rng.shuffle(arr)
# Permutation (returns new array)
perm = rng.permutation(10)
# Reproducibility across processes (use SeedSequence)
ss = np.random.SeedSequence(42)
child_seeds = ss.spawn(4) # 4 independent streams for parallel workers
rngs = [np.random.default_rng(s) for s in child_seeds]
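# Why Generator over legacy:
# - No hidden global state: each Generator is an explicit object you pass around
# - Multiple independent streams for parallel code (SeedSequence.spawn)
# - Faster PCG64 algorithm vs Mersenne Twister
# Note: Generator streams are not guaranteed to be bit-identical across NumPy versions;
# the legacy RandomState stream is the one frozen for backward compatibility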
Q15. How do you use np.vectorize and when should you avoid it?
import numpy as np
import timeit
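# Custom Python function that operates on scalars
def process_scalar(x, threshold=5):
    if x > threshold:
        return x ** 2
    elif x > 0:
        return x
    else:
        return 0
# np.vectorize: wraps scalar function to accept arrays
# otypes fixes the output dtype; without it the dtype is inferred from the first
# element's result, so a leading int 0 would silently truncate later floats
vfunc = np.vectorize(process_scalar, excluded=["threshold"], otypes=[float])
arr = np.array([-2, 3, 7, -1, 10])
print(vfunc(arr, threshold=5)) # [0., 3., 49., 0., 100.]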
# IMPORTANT: np.vectorize is a convenience wrapper, NOT a performance tool
# It still loops in Python internally!
# Benchmark
arr_big = np.random.randn(100_000)
t_vec = timeit.timeit(lambda: vfunc(arr_big), number=10) / 10
# True vectorized equivalent
def process_vectorized(x, threshold=5):
    result = np.zeros_like(x)
    result = np.where(x > 0, x, result)
    result = np.where(x > threshold, x**2, result)
    return result
t_fast = timeit.timeit(lambda: process_vectorized(arr_big), number=10) / 10
print(f"np.vectorize: {t_vec:.3f}s | true vectorized: {t_fast:.4f}s")
# Typically 10-100x difference in favor of true vectorized
# Use np.vectorize ONLY for:
# - Quick prototyping
# - Complex logic that is hard to vectorize
# - When the function calls external C/Cython code (then overhead is negligible)
Q16. What is memory layout (C vs Fortran order)? When does it matter?
import numpy as np
import timeit
# C order (row-major): consecutive elements along last axis are contiguous in memory
# Fortran order (column-major): consecutive elements along first axis are contiguous
a_C = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32, order='C') # C order is the default
a_F = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32, order='F')
print(a_C.strides) # (12, 4) -- 4 bytes per int32 element, row stride = 3*4
print(a_F.strides) # (4, 8) -- column-major: row stride = 4, column stride = 2*4
# Performance impact: row-wise iteration is fast for C order
def sum_rows_C(a):
    return a.sum(axis=1) # sums across last axis -- contiguous for C order
def sum_cols_C(a):
    return a.sum(axis=0) # sums across first axis -- NOT contiguous for C order
# Large matrix
big = np.random.randn(1000, 1000)
t_row = timeit.timeit(lambda: sum_rows_C(big), number=1000) / 1000
t_col = timeit.timeit(lambda: sum_cols_C(big), number=1000) / 1000
print(f"Row sum: {t_row*1e6:.1f}us | Col sum: {t_col*1e6:.1f}us")
# Contiguous check
print(a_C.flags['C_CONTIGUOUS']) # True
print(a_C.flags['F_CONTIGUOUS']) # False
print(a_F.flags['F_CONTIGUOUS']) # True
# ascontiguousarray: force C order (needed for some BLAS/cuBLAS operations)
a_contiguous = np.ascontiguousarray(a_F)
Q17. How do you use np.apply_along_axis, np.apply_over_axes, and when to avoid them?
import numpy as np
a = np.random.randn(4, 5, 3)
# apply_along_axis: apply 1D function along a specific axis
def normalize_1d(x):
    return (x - x.mean()) / (x.std() + 1e-8)
# Normalize each row of 2D
b = np.random.randn(10, 5)
result = np.apply_along_axis(normalize_1d, axis=1, arr=b) # apply to each row
print(result.shape) # (10, 5)
# BETTER: vectorized equivalent (much faster)
result_fast = (b - b.mean(axis=1, keepdims=True)) / (b.std(axis=1, keepdims=True) + 1e-8)
# apply_over_axes: apply reduction over multiple axes
# e.g., sum over axes 0 and 2 of a (4, 5, 3) array
result_axes = np.apply_over_axes(np.sum, a, [0, 2]) # (1, 5, 1)
# Better: a.sum(axis=(0, 2), keepdims=True)
# Performance guideline:
# apply_along_axis calls a Python function per slice -- similar speed to np.vectorize
# Always prefer vectorized operations when possible
# apply_along_axis is useful when the function has complex logic not easily vectorized
# Real vectorized batch normalization
def batch_norm(X, axis=0):
    mean = X.mean(axis=axis, keepdims=True)
    std = X.std(axis=axis, keepdims=True)
    return (X - mean) / (std + 1e-8)
Q18. What is np.meshgrid? Give a machine learning use case.
import numpy as np
import matplotlib.pyplot as plt
# meshgrid: create coordinate matrices from 1D arrays
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y) # X and Y are both (100, 100)
# Use case 1: plot decision boundary of a classifier
from sklearn.datasets import make_moons
from sklearn.svm import SVC
data, labels = make_moons(n_samples=200, noise=0.1, random_state=42)
model = SVC(kernel='rbf', probability=True)
model.fit(data, labels)
# Create grid over feature space
x_min, x_max = data[:, 0].min() - 0.5, data[:, 0].max() + 0.5
y_min, y_max = data[:, 1].min() - 0.5, data[:, 1].max() + 0.5
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 200),
np.linspace(y_min, y_max, 200))
# Predict on every grid point
grid_points = np.c_[xx.ravel(), yy.ravel()]
Z = model.predict(grid_points).reshape(xx.shape)
# Use case 2: Gaussian function over 2D grid
def gaussian_2d(X, Y, sigma=1.0):
    return np.exp(-(X**2 + Y**2) / (2 * sigma**2))
Z_gauss = gaussian_2d(X, Y, sigma=1.5)
print(Z_gauss.shape) # (100, 100)
# Use case 3: compute all pairwise feature interactions
f1 = np.array([1, 2, 3, 4, 5])
f2 = np.array([10, 20, 30])
F1, F2 = np.meshgrid(f1, f2, indexing='ij') # (5, 3) interaction grid
interactions = F1 * F2 # all pairwise products
HARD: Performance and Advanced Patterns (Questions 19-25)
Q19. What is numba? When should you use it instead of NumPy?
import numpy as np
from numba import jit, njit, prange
import timeit
# Pure NumPy: fast for element-wise, slow for complex loops
def rolling_std_numpy(arr, window):
    result = np.empty(len(arr))
    result[:window-1] = np.nan
    for i in range(window-1, len(arr)):
        result[i] = arr[i-window+1:i+1].std()
    return result
# Numba JIT: compile Python loop to LLVM machine code
@njit(parallel=True) # no Python object overhead; prange runs on Numba's threading layer (TBB, OpenMP, or workqueue)
def rolling_std_numba(arr, window):
    result = np.empty(len(arr))
    for i in prange(len(arr)): # parallel range
        if i < window - 1:
            result[i] = np.nan
        else:
            result[i] = arr[i-window+1:i+1].std()
    return result
arr = np.random.randn(1_000_000)
# First call compiles (slow), subsequent calls fast
rolling_std_numba(arr, 20) # warmup
t_np = timeit.timeit(lambda: rolling_std_numpy(arr, 20), number=3) / 3
t_nb = timeit.timeit(lambda: rolling_std_numba(arr, 20), number=10) / 10
print(f"NumPy loop: {t_np:.3f}s | Numba: {t_nb:.4f}s")
# Use numba when:
# - Algorithm REQUIRES a loop (cannot be expressed as array operations)
# - Same loop runs millions of times
# - Custom numerical algorithms (Monte Carlo, custom optimizers)
# Avoid numba when:
# - Simple vectorized NumPy expression already exists
# - Code involves Python objects (strings, dicts, lists)
# - Startup compilation time matters (microservice cold start)
Q20. Implement a neural network forward pass with NumPy.
import numpy as np
class DenseLayer:
    def __init__(self, in_features: int, out_features: int):
        # He initialization for ReLU layers
        self.W = np.random.randn(in_features, out_features) * np.sqrt(2 / in_features)
        self.b = np.zeros(out_features)
    def forward(self, x: np.ndarray) -> np.ndarray:
        self.x = x # cache for backward
        return x @ self.W + self.b # (N, out_features)
    def backward(self, dout: np.ndarray) -> np.ndarray:
        self.dW = self.x.T @ dout
        self.db = dout.sum(axis=0)
        return dout @ self.W.T
class ReLU:
    def forward(self, x):
        self.mask = x > 0
        return x * self.mask
    def backward(self, dout):
        return dout * self.mask
class TwoLayerNet:
    def __init__(self, in_dim, hidden_dim, out_dim):
        self.l1 = DenseLayer(in_dim, hidden_dim)
        self.relu = ReLU()
        self.l2 = DenseLayer(hidden_dim, out_dim)
    def forward(self, x):
        return self.l2.forward(self.relu.forward(self.l1.forward(x)))
    def loss_and_backward(self, x, y):
        logits = self.forward(x)
        # Softmax + cross-entropy gradient
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        N = x.shape[0]
        loss = -np.log(probs[np.arange(N), y] + 1e-8).mean()
        dout = probs.copy()
        dout[np.arange(N), y] -= 1
        dout /= N
        # Backpropagate in reverse layer order: l2 -> relu -> l1
        self.l1.backward(self.relu.backward(self.l2.backward(dout)))
        return loss
# Test
net = TwoLayerNet(784, 128, 10)
X = np.random.randn(32, 784)
y = np.random.randint(0, 10, 32)
loss = net.loss_and_backward(X, y)
print(f"Loss: {loss:.4f}")
Q21. How do you use np.memmap for arrays larger than RAM?
import numpy as np
import os
import tempfile
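# Create a backing file for the memory map (mkstemp avoids the race condition in the deprecated mktemp)
fd, tmpfile = tempfile.mkstemp(suffix='.dat') # raw bytes, not the .npy format
os.close(fd) # np.memmap reopens the file by path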
# Write mode: create a large array on disk
fp = np.memmap(tmpfile, dtype=np.float32, mode='w+', shape=(100_000, 512))
fp[:] = np.random.randn(100_000, 512)
fp.flush() # Write to disk
del fp # Close the memmap
# Read mode: access without loading all into RAM
fp_read = np.memmap(tmpfile, dtype=np.float32, mode='r', shape=(100_000, 512))
print(fp_read.shape) # (100000, 512)
print(fp_read[0, :5]) # Access row 0 -- only loaded page is in RAM
# Slicing loads only the needed pages
batch = fp_read[0:1000] # Load 1000 rows
print(batch.shape) # (1000, 512)
# Use case: large embedding matrices or precomputed feature sets
# embeddings = np.memmap("embeddings.mmap", dtype=np.float32, mode='r', shape=(10_000_000, 768))
# Without memmap: 10M * 768 * 4 bytes = 30.7 GB RAM needed
# With memmap: only accessed pages in RAM (page = 4KB typically)
del fp_read, batch # release the mapping before deleting the file (required on Windows)
os.unlink(tmpfile)
Q22. What is np.testing? How do you write unit tests for numerical code?
import numpy as np
import pytest
def compute_softmax(x):
    x_shifted = x - x.max(axis=-1, keepdims=True)
    exp_x = np.exp(x_shifted)
    return exp_x / exp_x.sum(axis=-1, keepdims=True)
# Testing with np.testing (precision-aware comparisons)
class TestSoftmax:
    def test_output_sums_to_one(self):
        x = np.array([[1.0, 2.0, 3.0], [0.5, 0.1, -0.5]])
        out = compute_softmax(x)
        np.testing.assert_allclose(out.sum(axis=1), np.ones(2), rtol=1e-6)
    def test_output_range(self):
        x = np.random.randn(10, 5)
        out = compute_softmax(x)
        assert (out >= 0).all() and (out <= 1).all()
    def test_numerical_stability(self):
        # Large values should not cause overflow
        x = np.array([[1000.0, 1000.0, 1000.0]])
        out = compute_softmax(x)
        np.testing.assert_allclose(out, np.ones((1, 3)) / 3, rtol=1e-5)
    def test_small_values(self):
        x = np.array([[-1000.0, -1000.0, 0.0]])
        out = compute_softmax(x)
        np.testing.assert_allclose(out[0, 2], 1.0, rtol=1e-4)
    def test_shape_preserved(self):
        x = np.random.randn(4, 8, 10)
        out = compute_softmax(x)
        assert out.shape == x.shape
# np.testing key functions:
# assert_allclose(actual, desired, rtol, atol) -- floating point equality
# assert_array_equal(x, y) -- exact element-wise equality (shapes and values)
# assert_array_less(x, y) -- element-wise less than
# assert_approx_equal(actual, desired, significant) -- scalar comparison
Q23. Implement batch normalization forward and backward pass in NumPy.
import numpy as np
def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """
    x: (N, D) input
    gamma, beta: (D,) learnable scale and shift
    Returns normalized output and cache for backward pass.
    """
    mu = x.mean(axis=0) # (D,)
    var = x.var(axis=0) # (D,)
    x_hat = (x - mu) / np.sqrt(var + eps) # (N, D)
    out = gamma * x_hat + beta # (N, D)
    cache = (x, x_hat, mu, var, gamma, eps)
    return out, cache
def batch_norm_backward(dout, cache):
    """Backprop through batch normalization."""
    x, x_hat, mu, var, gamma, eps = cache
    N, D = x.shape
    dgamma = (dout * x_hat).sum(axis=0) # (D,)
    dbeta = dout.sum(axis=0) # (D,)
    dx_hat = dout * gamma # (N, D)
    dvar = (dx_hat * (x - mu) * -0.5 * (var + eps)**-1.5).sum(axis=0) # (D,)
    dmu = (dx_hat * (-1 / np.sqrt(var + eps))).sum(axis=0) + dvar * (-2 * (x - mu).mean(axis=0))
    dx = (dx_hat / np.sqrt(var + eps)) + (dvar * 2 * (x - mu) / N) + (dmu / N)
    return dx, dgamma, dbeta
# Test
np.random.seed(42)
x = np.random.randn(32, 64)
gamma = np.ones(64)
beta = np.zeros(64)
out, cache = batch_norm_forward(x, gamma, beta)
print(f"Output mean: {out.mean():.6f} (expected ~0)")
print(f"Output std: {out.std():.6f} (expected ~1)")
dout = np.random.randn(*out.shape)
dx, dgamma, dbeta = batch_norm_backward(dout, cache)
print(f"dx shape: {dx.shape}, dgamma shape: {dgamma.shape}")
Q24. How does np.fft work? Give a signal processing / ML use case.
import numpy as np
import matplotlib.pyplot as plt
# Generate a synthetic signal: 50Hz + 120Hz components
sample_rate = 1000 # Hz
t = np.linspace(0, 1, sample_rate, endpoint=False)
signal = (np.sin(2 * np.pi * 50 * t) + # 50 Hz component
0.5 * np.sin(2 * np.pi * 120 * t) + # 120 Hz component
0.1 * np.random.randn(len(t))) # noise
# FFT
fft_result = np.fft.fft(signal)
frequencies = np.fft.fftfreq(len(signal), d=1/sample_rate)
magnitude = np.abs(fft_result[:len(signal)//2]) # positive frequencies only
positive_freqs = frequencies[:len(signal)//2]
# Identify dominant frequencies
top_k_idx = magnitude.argsort()[-5:][::-1]
print("Top frequencies (Hz):", positive_freqs[top_k_idx]) # should show 50, 120
# ML use case: audio feature extraction (spectral features)
def extract_spectral_features(signal, sr=1000, n_features=20):
    """Extract FFT-based features for audio classification."""
    fft_mag = np.abs(np.fft.fft(signal))[:len(signal)//2]
    freqs = np.fft.fftfreq(len(signal), 1/sr)[:len(signal)//2]
    # Spectral centroid (center of mass of spectrum)
    centroid = np.sum(freqs * fft_mag) / np.sum(fft_mag)
    # Spectral rolloff (frequency below which 85% of the cumulative magnitude lies)
    cumsum = np.cumsum(fft_mag)
    rolloff_idx = np.searchsorted(cumsum, 0.85 * cumsum[-1])
    # Band energy features
    band_edges = np.linspace(0, sr/2, n_features + 1)
    band_energies = np.array([
        fft_mag[(freqs >= band_edges[i]) & (freqs < band_edges[i+1])].sum()
        for i in range(n_features)
    ])
    return np.concatenate([[centroid, freqs[rolloff_idx]], band_energies])
features = extract_spectral_features(signal)
print(f"Feature vector length: {len(features)}")
Q25. Design a NumPy pipeline for processing a large tabular dataset for ML.
import numpy as np
from typing import Optional
class NumPyPreprocessor:
    """Production-grade NumPy preprocessing pipeline."""
    def __init__(self):
        self.mean_ = None
        self.std_ = None
        self.feature_mask_ = None
    def fit(self, X: np.ndarray, y: Optional[np.ndarray] = None) -> 'NumPyPreprocessor':
        # Remove zero-variance features
        self.feature_mask_ = X.std(axis=0) > 1e-10
        X_filtered = X[:, self.feature_mask_]
        # Compute normalization stats (on training data only)
        self.mean_ = X_filtered.mean(axis=0)
        std = X_filtered.std(axis=0)
        self.std_ = np.where(std > 0, std, 1.0)  # guard against division by zero
        return self
    def transform(self, X: np.ndarray) -> np.ndarray:
        # Apply the mask and statistics learned in fit()
        X_filtered = X[:, self.feature_mask_]
        return (X_filtered - self.mean_) / self.std_
    def fit_transform(self, X: np.ndarray, y: Optional[np.ndarray] = None) -> np.ndarray:
        return self.fit(X, y).transform(X)
def train_val_test_split(X, y, val_frac=0.15, test_frac=0.15, seed=42):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]
    return (X[train_idx], y[train_idx],
            X[val_idx], y[val_idx],
            X[test_idx], y[test_idx])
# Full pipeline
np.random.seed(42)
X_raw = np.random.randn(10_000, 50).astype(np.float32)
X_raw[:, 5] = 0 # inject zero-variance feature
y = (X_raw[:, 0] + X_raw[:, 1] > 0).astype(np.int32)
X_tr, y_tr, X_val, y_val, X_te, y_te = train_val_test_split(X_raw, y)
pp = NumPyPreprocessor()
X_tr_scaled = pp.fit_transform(X_tr)
X_val_scaled = pp.transform(X_val)
X_te_scaled = pp.transform(X_te)
print(f"Features after removing zero-variance: {X_tr_scaled.shape[1]}") # 49
print(f"Train mean (should be ~0): {X_tr_scaled.mean():.4f}")
print(f"Train std (should be ~1): {X_tr_scaled.std():.4f}")
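When the dataset is too large to hold in memory, the normalization statistics can be accumulated chunk by chunk. The following is a rough sketch under that assumption. The helper name streaming_mean_std and the chunk size are hypothetical, and the sum-of-squares formula is a simple choice that can lose precision when feature means are large relative to their spread (a Welford-style update is more robust in that case).

```python
import numpy as np

def streaming_mean_std(chunks):
    """Accumulate per-feature mean and population std over 2D chunks.

    Hypothetical helper: `chunks` is any iterable of (rows, features) arrays.
    """
    n = 0
    total = None
    total_sq = None
    for chunk in chunks:
        block = np.asarray(chunk, dtype=np.float64)  # accumulate in float64
        if total is None:
            total = np.zeros(block.shape[1])
            total_sq = np.zeros(block.shape[1])
        n += block.shape[0]
        total += block.sum(axis=0)
        total_sq += np.square(block).sum(axis=0)
    mean = total / n
    var = np.maximum(total_sq / n - mean ** 2, 0.0)  # clip tiny negative values
    return mean, np.sqrt(var)

# Simulate chunked reads by slicing an in-memory array
rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 8)).astype(np.float32)
chunk_iter = (X[i:i + 1_000] for i in range(0, len(X), 1_000))
mean, std = streaming_mean_std(chunk_iter)

# Compare against the one-shot computation
X64 = X.astype(np.float64)
print(np.allclose(mean, X64.mean(axis=0)), np.allclose(std, X64.std(axis=0)))
```

In a real pipeline the generator would typically slice an np.memmap or read batches from disk; the resulting mean and std could then be assigned to a fitted preprocessor instead of being computed from a single in-memory array.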
FAQ
Q: Should I know NumPy internals deeply for data science interviews?
A: Broadcasting rules, dtype trade-offs, and vectorization patterns are tested at all levels. Memory layout and stride internals come up mainly at senior ML engineer and research scientist levels. Publicly shared candidate reports suggest that intermediate NumPy fluency (broadcasting, linalg, indexing) covers most DS interview scenarios.
Q: What is the difference between np.dot and np.matmul?
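A: For 2D arrays, they are equivalent. For N-D arrays, np.matmul treats the last two dimensions as matrices and broadcasts over the leading dimensions as batch dimensions, performing batched matrix multiplication. np.dot follows a different rule: it sums products over the last axis of the first array and the second-to-last axis of the second, so the remaining axes of both inputs all appear in the result. np.matmul also rejects scalar operands, while np.dot accepts them. Use the @ operator (which calls matmul) for clarity.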
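The difference is easiest to see from the output shapes. A small sketch, with array shapes chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4))
B = rng.standard_normal((2, 4, 5))

# matmul: leading axis is a batch axis, last two axes are matrices
batched = np.matmul(A, B)  # same as A @ B
print(batched.shape)       # (2, 3, 5)

# dot: contracts A's last axis with B's second-to-last axis,
# keeping every other axis of both inputs
dotted = np.dot(A, B)
print(dotted.shape)        # (2, 3, 2, 5)

# The batched result is the "diagonal" of the dot result over the batch axes
same = all(np.allclose(dotted[i, :, i, :], batched[i]) for i in range(2))
print(same)                # True

# For plain 2D inputs the two functions agree
M, N = A[0], B[0]
print(np.allclose(np.dot(M, N), M @ N))  # True
```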
Q: How important is NumPy for PyTorch or TensorFlow users?
A: Very important. PyTorch tensors and NumPy arrays follow the same memory model (contiguous layouts, strides, broadcasting). Debugging shape mismatches and following gradient flow through broadcast operations both rely on the mental model you build with NumPy. Confirm specific framework expectations on the official company careers portal before your interview round.
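The stride-based model can be inspected directly in NumPy. The sketch below uses NumPy only so that it stays self-contained; PyTorch exposes analogous information through Tensor.stride() and Tensor.contiguous(), though its strides are counted in elements rather than bytes.

```python
import numpy as np

x = np.arange(12, dtype=np.float32).reshape(3, 4)
print(x.flags["C_CONTIGUOUS"], x.strides)    # True (16, 4): 4 bytes per float32

# Transposing returns a view: same buffer, swapped strides, no copy
xt = x.T
print(xt.flags["C_CONTIGUOUS"], xt.strides)  # False (4, 16)
print(np.shares_memory(x, xt))               # True

# Writing through the view changes the original array
xt[0, 1] = -1.0
print(x[1, 0])                               # -1.0

# ascontiguousarray copies into a fresh C-ordered buffer
xc = np.ascontiguousarray(xt)
print(xc.flags["C_CONTIGUOUS"], xc.strides)  # True (12, 4)
print(np.shares_memory(x, xc))               # False
```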
Methodology applied to this article · last verified 8 Jun 2026
- No fabricated salary numbers or success rates. If we quote a range, it's sourced.
- No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
- No paid placements, sponsored coaching links, or affiliate-shilled course pushes.