NumPy Interview Questions 2026: 25 Answers with Code

Q: What is the difference between np.dot and np.matmul?

For 2D arrays, they are equivalent. For N-D arrays, np.matmul treats the first N-2 dimensions as batch dimensions and performs batched matrix multiplication. np.dot has different rules (sum product on last axis of first array and second-to-last of second). Use @ operator (which calls matmul) for clarity.
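
The shape difference described above can be checked directly. This is a minimal sketch; the array names and sizes are illustrative and not from the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3, 4))   # stack of two 3x4 matrices
B = rng.standard_normal((2, 4, 5))   # stack of two 4x5 matrices

batched = np.matmul(A, B)   # same as A @ B: one product per stack entry
dotted = np.dot(A, B)       # pairs every matrix in A with every matrix in B

print(batched.shape)  # (2, 3, 5)
print(dotted.shape)   # (2, 3, 2, 5)

# The batched result is the slice of the np.dot result where both stack indices agree
same = all(np.allclose(batched[i], dotted[i, :, i, :]) for i in range(2))
print(same)  # True
```

For stacked matrices, np.dot therefore does extra work and returns a larger array, which is one reason to prefer @ in batched code.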

25 NumPy interview questions with full code answers covering ndarray operations, broadcasting, vectorization, linear algebra, random sampling, and performance optimization for 2026.

By Aditya Sharma. Published 8 Jun 2026.

NumPy is the numerical computing foundation of the Python ML stack. Every data science, ML engineering, and quant role assumes NumPy fluency. Interviewers test array operations, broadcasting, linear algebra, and performance patterns -- not just syntax. This guide covers 25 NumPy interview questions with full answers and code examples.

PapersAdda's take: Candidates report that broadcasting and vectorization questions separate good candidates from great ones in NumPy-heavy rounds at tier-1 product companies. The most common live coding scenario: "rewrite this Python loop using NumPy". Confirm the specific interview format on the official company careers portal before you prepare.

EASY: Array Fundamentals (Questions 1-8)

Q1. What is a NumPy ndarray? How does it differ from a Python list?

import numpy as np
import sys

# Python list: heterogeneous, dynamic, pointers to objects
py_list = [1, 2.0, "three"]
print(f"List type: {type(py_list[0])}, {type(py_list[1])}")  # int, float

# NumPy array: homogeneous, contiguous memory, typed
arr = np.array([1, 2, 3], dtype=np.float32)
print(arr.dtype)   # float32
print(arr.shape)   # (3,)
print(arr.ndim)    # 1
print(arr.strides) # bytes between elements: (4,) for float32

# Memory comparison
py_list = list(range(1_000_000))
np_arr = np.arange(1_000_000)
print(f"Python list: {sys.getsizeof(py_list) / 1e6:.1f} MB")  # ~8 MB for the pointer array alone; the int objects are extra
print(f"NumPy array: {np_arr.nbytes / 1e6:.1f} MB")           # ~8 MB (int64) or ~4 MB (int32)

Property	Python list	NumPy ndarray
Dtype	Heterogeneous	Homogeneous
Memory	Scattered (pointers to objects)	Contiguous block
Speed	Slow (Python loop per element)	Fast (C/BLAS vectorized)
Operations	Loops required	Element-wise (+, *, /...)
Multidimensional	List of lists	True N-D with shape/strides

Q2. What are NumPy dtypes? How do you choose the right one?

import numpy as np

# Integer types
a_int8  = np.array([1, 2, 3], dtype=np.int8)    # -128 to 127, 1 byte
a_int16 = np.array([1, 2, 3], dtype=np.int16)   # -32768 to 32767, 2 bytes
a_int32 = np.array([1, 2, 3], dtype=np.int32)   # ~+/-2B, 4 bytes
a_int64 = np.array([1, 2, 3], dtype=np.int64)   # default integer, 8 bytes

# Float types
a_f16 = np.array([1.0], dtype=np.float16)  # half precision, 2 bytes
a_f32 = np.array([1.0], dtype=np.float32)  # single precision, 4 bytes (ML default)
a_f64 = np.array([1.0], dtype=np.float64)  # double precision (Python default), 8 bytes

# Boolean
a_bool = np.array([True, False, True], dtype=np.bool_)  # 1 byte per element

# Check type limits
print(np.iinfo(np.int8))   # min=-128, max=127
print(np.finfo(np.float32)) # eps=1.19e-07, max=3.4e+38

# Type conversion
a_float = a_int32.astype(np.float32)

# Gotcha: float16 overflow
print(np.float16(65504))   # 65504 (max)
print(np.float16(65600))   # inf (overflow)

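ML guideline: use float32 for model weights (halves memory vs float64, usually with negligible precision loss). Use float16/bfloat16 for mixed-precision training; note that NumPy itself has no built-in bfloat16 dtype. Use int32 for stored indices when the range fits; NumPy converts index arrays to its native intp type (usually int64) when indexing.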
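
The trade-offs in the guideline can be made concrete with a few one-liners. This is a small sketch; the array size and variable names are arbitrary choices for illustration.

```python
import numpy as np

weights64 = np.zeros(1_000_000, dtype=np.float64)
weights32 = weights64.astype(np.float32)
print(weights64.nbytes, weights32.nbytes)   # 8000000 4000000

# float16 keeps an 11-bit significand, so not every integer above 2048 is representable
print(np.float16(2049))   # 2048.0

# int32 covers indices up to roughly 2.1 billion
print(np.iinfo(np.int32).max)   # 2147483647
```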

Q3. Explain NumPy array creation methods.

import numpy as np

# From data
arr = np.array([1, 2, 3, 4])
matrix = np.array([[1, 2], [3, 4]], dtype=np.float32)

# Zeros / ones / full
zeros = np.zeros((3, 4), dtype=np.float32)
ones = np.ones((2, 3))
full = np.full((2, 3), fill_value=7.0)

# Identity / diagonal
eye = np.eye(4)             # 4x4 identity
diag = np.diag([1, 2, 3])   # diagonal matrix

# Range sequences
arange = np.arange(0, 10, 2)           # [0, 2, 4, 6, 8]
linspace = np.linspace(0, 1, 11)       # 11 evenly spaced points from 0 to 1
logspace = np.logspace(0, 3, 4)        # [1, 10, 100, 1000]

# Random arrays
rng = np.random.default_rng(seed=42)   # recommended modern API
uniform = rng.random((3, 3))           # Uniform [0, 1)
normal = rng.standard_normal((3, 3))   # N(0, 1)
integers = rng.integers(0, 100, (3, 3)) # Random ints

# From shape of another array
like_arr = np.zeros_like(matrix)       # same shape and dtype, filled with 0
ones_like = np.ones_like(matrix)

Q4. What is array indexing and slicing in NumPy? What is fancy indexing?

import numpy as np

a = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# Basic slicing (returns a VIEW -- shares memory)
row0 = a[0, :]           # First row
col1 = a[:, 1]           # Second column
submatrix = a[0:2, 1:3]  # Rows 0-1, cols 1-2
every_other = a[::2, ::2] # Step slicing

# Boolean indexing (reading returns a COPY)
mask = a > 5
print(a[mask])           # [6, 7, 8, 9, 10, 11]
masked = a.copy()
masked[mask] = -1        # assignment through a mask writes in place; done on a copy so `a` stays intact below

# Fancy indexing with integer arrays (returns COPY)
rows = np.array([0, 2])
cols = np.array([1, 3])
print(a[rows, cols])     # [a[0,1], a[2,3]] = [1, 11]

# Select entire rows
print(a[[0, 2], :])      # Rows 0 and 2

# np.ix_ for outer product indexing
print(a[np.ix_([0, 2], [1, 3])])  # 2x2 submatrix at rows [0,2] and cols [1,3]

# View vs copy verification
view = a[0:2, 0:2]
copy = a[[0, 1], :][:, [0, 1]]
print(np.shares_memory(a, view))   # True
print(np.shares_memory(a, copy))   # False
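
The practical consequence of view versus copy is what happens on write. A minimal sketch, with illustrative variable names:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

view = a[0:2, 0:2]      # basic slice: shares memory with a
view[0, 0] = 99
print(a[0, 0])          # 99 -- the write is visible through a

picked = a[[0, 1]]      # fancy indexing: independent copy
picked[0, 1] = -5
print(a[0, 1])          # 1 -- a is unchanged
```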

Q5. What is the difference between reshape, resize, and ravel/flatten?

import numpy as np

a = np.arange(12)

# reshape: returns VIEW (if possible), does not change data
b = a.reshape(3, 4)      # (12,) -> (3, 4)
c = a.reshape(3, -1)     # -1: NumPy infers the dimension -> (3, 4)
d = a.reshape(2, 2, 3)   # 3D reshape

# Important: reshape returns view when possible
print(np.shares_memory(a, b))  # True

# Reshape fails if total size doesn't match
# a.reshape(3, 5)  # ValueError: cannot reshape array of size 12 into shape (3,5)

# ravel: return 1D VIEW (if contiguous), or copy
flat_view = b.ravel()    # Fast -- returns view of contiguous array
print(np.shares_memory(b, flat_view))  # True

# flatten: always returns a COPY
flat_copy = b.flatten()
print(np.shares_memory(b, flat_copy))  # False

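# ndarray.resize: modifies the array IN-PLACE and can change total size (new slots are zero-filled)
# It raises ValueError if another array references the buffer; pass refcheck=False only when that is safe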
a2 = np.array([1, 2, 3])
a2.resize(5)             # [1, 2, 3, 0, 0] -- in-place, new size
a2.resize(2)             # [1, 2] -- truncate

# np.resize (function, not method): creates new array, wraps values cyclically
resized = np.resize(np.array([1, 2, 3]), (2, 4))
# [[1, 2, 3, 1],
#  [2, 3, 1, 2]]
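
The "view if possible" caveat matters for non-contiguous inputs such as a transpose. The sketch below, using made-up variable names, shows reshape silently falling back to a copy:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
t = a.T                          # transposed view, not C-contiguous

flat_from_a = a.reshape(-1)      # contiguous source: view
flat_from_t = t.reshape(-1)      # non-contiguous source: copy

print(np.shares_memory(a, flat_from_a))  # True
print(np.shares_memory(a, flat_from_t))  # False
print(flat_from_t.tolist())              # [0, 3, 1, 4, 2, 5]
```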

Q6. Explain axis parameter in NumPy operations.

import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])  # shape (2, 3)

# axis=0: collapse along rows (down the columns)
print(a.sum(axis=0))    # [5, 7, 9] -- sum down each column
print(a.max(axis=0))    # [4, 5, 6] -- max of each column

# axis=1: collapse along columns (across the rows)
print(a.sum(axis=1))    # [6, 15] -- sum across each row
print(a.mean(axis=1))   # [2., 5.] -- mean of each row

# keepdims: preserve dimension for broadcasting
col_means = a.mean(axis=0, keepdims=True)  # shape (1, 3)
row_means = a.mean(axis=1, keepdims=True)  # shape (2, 1)

# Center each row (subtract row mean)
a_row_norm = a - a.mean(axis=1, keepdims=True)

# Center each column (subtract column mean)
a_col_norm = a - a.mean(axis=0, keepdims=True)

# 3D example
b = np.ones((2, 3, 4))  # (batch, height, width)
print(b.sum(axis=0).shape)   # (3, 4) -- sum over batch dimension
print(b.sum(axis=(1, 2)).shape)  # (2,) -- sum over spatial dimensions

Q7. What are strides in NumPy? How do they enable views?

import numpy as np

a = np.arange(12, dtype=np.int32)   # 1D array, 4 bytes per element
print(a.strides)   # (4,) -- 4 bytes to next element

b = a.reshape(3, 4)
print(b.strides)   # (16, 4) -- 16 bytes to next row, 4 bytes to next column

# Transposing: just reverses strides (no data copy)
bT = b.T
print(bT.strides)  # (4, 16) -- 4 bytes to next row, 16 bytes to next column
print(np.shares_memory(b, bT))  # True

# Practical: stride tricks for sliding windows (efficient)
from numpy.lib.stride_tricks import sliding_window_view

data = np.array([1, 2, 3, 4, 5, 6, 7], dtype=np.int32)
windows = sliding_window_view(data, window_shape=3)
# [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]]
print(windows)

# Custom stride: every-other-element view
every_other = data[::2]   # 4 elements [1, 3, 5, 7], stride = 2*4 = 8 bytes
print(every_other.strides)  # (8,)
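
One common use of the sliding-window view is a moving average without a Python loop. A minimal sketch, assuming a window of 3; the names are illustrative:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data = np.array([1, 2, 3, 4, 5, 6, 7])
windows = sliding_window_view(data, window_shape=3)   # (5, 3) view, no data copied

moving_avg = windows.mean(axis=-1)
print(moving_avg.tolist())                 # [2.0, 3.0, 4.0, 5.0, 6.0]
print(np.shares_memory(data, windows))     # True
```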

Q8. What is np.where? How is it different from np.select?

import numpy as np

a = np.array([-3, -1, 0, 2, 5, -4, 7])

# np.where(condition, x, y): element-wise conditional
result = np.where(a > 0, a, 0)    # ReLU activation
print(result)  # [0, 0, 0, 2, 5, 0, 7]

# np.where with multiple arrays
b = np.array([10, 10, 10, 10, 10, 10, 10])
result2 = np.where(a > 0, a, b)   # take from a if positive, else from b

# np.where(condition) -- returns a tuple of index arrays (same as np.nonzero)
indices = np.where(a > 0)
print(indices)   # (array([3, 4, 6]),) -- indices where condition is True
print(a[indices])  # [2, 5, 7]

# np.select: multiple conditions (like if-elif-else chain; the first matching condition wins)
conditions = [a < -2, (a >= -2) & (a <= 2), a > 2]
choices    = ["very_negative", "neutral", "positive"]
result3 = np.select(conditions, choices, default="unknown")
print(result3)  # ['very_negative', 'neutral', 'neutral', 'neutral', 'positive', 'very_negative', 'positive']
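
For 2D input, the single-argument form of np.where returns one index array per axis, while np.argwhere returns one row per match. A short sketch on a made-up matrix:

```python
import numpy as np

m = np.array([[0, 5],
              [7, 0]])

rows, cols = np.where(m > 0)      # tuple of per-axis index arrays
print(rows.tolist(), cols.tolist())   # [0, 1] [1, 0]

coords = np.argwhere(m > 0)       # (n_matches, ndim) array of coordinates
print(coords.tolist())            # [[0, 1], [1, 0]]

# The tuple form can be used directly for indexing
print(m[rows, cols].tolist())     # [5, 7]
```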

MEDIUM: Broadcasting and Vectorization (Questions 9-18)

Q9. Explain NumPy broadcasting rules with examples.

Broadcasting allows operations on arrays with different shapes by "expanding" smaller arrays.

Rules (applied right-to-left):

If the arrays differ in rank, prepend 1s to the shape of the lower-rank array.
The sizes along each dimension must match, OR one of them must be 1.
A dimension of size 1 is stretched to match the other size.
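
The three rules above can be written out as a small helper and compared against NumPy's own np.broadcast_shapes. The function name broadcast_shape is hypothetical and used only for illustration.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Apply the broadcasting rules by hand to two shapes."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: left-pad the shorter shape with 1s
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    out = []
    for da, db in zip(a, b):
        if da == db:
            out.append(da)
        elif da == 1:        # Rule 3: stretch the size-1 dimension
            out.append(db)
        elif db == 1:
            out.append(da)
        else:                # Rule 2 violated
            raise ValueError(f"incompatible dimensions {da} and {db}")
    return tuple(out)

print(broadcast_shape((3, 1), (3,)))        # (3, 3)
print(broadcast_shape((3, 1, 2), (2, 2)))   # (3, 2, 2)
print(broadcast_shape((3, 1), (3,)) == np.broadcast_shapes((3, 1), (3,)))  # True
```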

import numpy as np

# Example 1: scalar broadcast
a = np.array([1, 2, 3])
print(a + 10)       # [11, 12, 13] -- scalar has shape (), stretched to (3,)

# Example 2: (3,) + (3, 1) -> (3, 3)
col = np.array([[1], [2], [3]])   # shape (3, 1)
row = np.array([10, 20, 30])      # shape (3,) -> treated as (1, 3) -> (3, 3)
print(col + row)
# [[11, 21, 31],
#  [12, 22, 32],
#  [13, 23, 33]]

# Example 3: batch normalization broadcasting
X = np.random.randn(100, 10)     # (batch, features)
mean = X.mean(axis=0)            # (10,)
std = X.std(axis=0)              # (10,)
X_norm = (X - mean) / (std + 1e-8)  # (100, 10) - (10,) broadcasts correctly

# Example 4: pairwise distance matrix
A = np.array([[1, 2], [3, 4], [5, 6]])  # (3, 2)
B = np.array([[1, 1], [2, 2]])          # (2, 2)
# A[:, np.newaxis] is (3, 1, 2); B is (2, 2)
# Broadcast: (3, 1, 2) and (2, 2) -> (3, 2, 2)
diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]  # (3, 2, 2)
dist = np.sqrt((diff**2).sum(axis=2))              # (3, 2)

Q10. How do you vectorize a Python loop with NumPy?

import numpy as np
import timeit

# Task: compute pairwise Euclidean distances between N points

# SLOW: Python loop
def pairwise_dist_loop(X):
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            D[i, j] = np.sqrt(((X[i] - X[j])**2).sum())
    return D

# FAST: broadcasting
def pairwise_dist_vec(X):
    diff = X[:, np.newaxis, :] - X[np.newaxis, :, :]  # (N, N, D)
    return np.sqrt((diff**2).sum(axis=2))

# FASTER: expand via ||a-b||^2 = ||a||^2 - 2a.b + ||b||^2
def pairwise_dist_fast(X):
    sq_norms = (X**2).sum(axis=1)                        # (N,)
    sq_dists = sq_norms[:, np.newaxis] - 2 * (X @ X.T) + sq_norms[np.newaxis, :]
    return np.sqrt(np.maximum(sq_dists, 0))  # clip tiny negatives from round-off before sqrt

X = np.random.randn(500, 10)
print(np.allclose(pairwise_dist_loop(X[:10]), pairwise_dist_vec(X[:10])))  # True

# Benchmark (all three on the same 100-point subset so the timings are comparable)
X_small = X[:100]
t_loop = timeit.timeit(lambda: pairwise_dist_loop(X_small), number=3) / 3
t_vec  = timeit.timeit(lambda: pairwise_dist_vec(X_small), number=10) / 10
t_fast = timeit.timeit(lambda: pairwise_dist_fast(X_small), number=10) / 10
print(f"Loop: {t_loop:.5f}s | Vectorized: {t_vec:.5f}s | Fast: {t_fast:.5f}s")

Q11. What is np.einsum? Give ML use cases.

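np.einsum expresses tensor contractions with Einstein summation notation. It is often clearer than chains of manual transposes and dot products, but it is not automatically faster: plain matrix products usually run quicker through @ (BLAS), and multi-operand contractions benefit from optimize=True.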

import numpy as np

# Notation: lowercase letters are indices; repeated = summed over (contracted)

A = np.random.randn(3, 4)   # matrix (i, j)
B = np.random.randn(4, 5)   # matrix (j, k)
C = np.random.randn(3, 4, 5) # 3D tensor (i, j, k)

# Matrix multiplication: A @ B
print(np.einsum('ij,jk->ik', A, B).shape)   # (3, 5)

# Dot product of two vectors
a, b = np.array([1, 2, 3]), np.array([4, 5, 6])
print(np.einsum('i,i->', a, b))   # 32

# Outer product
print(np.einsum('i,j->ij', a, b).shape)   # (3, 3)

# Batch matrix multiplication (B, M, K) x (B, K, N) -> (B, M, N)
X = np.random.randn(8, 10, 16)  # (batch, seq, d_k)
W = np.random.randn(8, 16, 32)  # (batch, d_k, d_v)
out = np.einsum('bij,bjk->bik', X, W)   # (8, 10, 32)

# Multi-head attention scores Q K^T: (B, H, T, D) x (B, H, T, D) -> (B, H, T, T), contracting over D
Q = np.random.randn(2, 4, 16, 64)
K = np.random.randn(2, 4, 16, 64)
scores = np.einsum('bhid,bhjd->bhij', Q, K)   # (2, 4, 16, 16)

# Transpose: equivalent to .T
print(np.einsum('ij->ji', A).shape)   # (4, 3)

# Trace (sum of diagonal)
print(np.einsum('ii->', np.eye(4)))   # 4.0
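
A useful habit in interviews is to confirm an einsum string against an explicit equivalent. The sketch below does that for two of the patterns above; the shapes are small arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Matrix multiplication: 'ij,jk->ik' should match A @ B
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
matmul_ok = np.allclose(np.einsum('ij,jk->ik', A, B), A @ B)

# Attention scores: 'bhid,bhjd->bhij' should match Q @ K^T over the last two axes
Q = rng.standard_normal((2, 4, 16, 8))
K = rng.standard_normal((2, 4, 16, 8))
scores = np.einsum('bhid,bhjd->bhij', Q, K)
reference = Q @ np.swapaxes(K, -1, -2)
attention_ok = np.allclose(scores, reference)

print(matmul_ok, attention_ok)   # True True
print(scores.shape)              # (2, 4, 16, 16)
```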

Q12. What is np.linalg? Cover the key operations used in ML.

import numpy as np

A = np.array([[2, 1], [5, 3]], dtype=float)

# Matrix inverse
A_inv = np.linalg.inv(A)
print(A @ A_inv)   # Identity matrix (approximately)

# Determinant
det = np.linalg.det(A)
print(det)   # 1.0

# Eigenvalues and eigenvectors (for PCA, covariance matrices)
eigenvalues, eigenvectors = np.linalg.eig(A)

# Symmetric matrices: eigh is faster and guaranteed real eigenvalues
cov = np.cov(np.random.randn(3, 100))  # 3x3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)

# SVD (Singular Value Decomposition) -- foundation of PCA, matrix factorization
M = np.random.randn(5, 4)
U, S, Vt = np.linalg.svd(M, full_matrices=False)
print(U.shape, S.shape, Vt.shape)  # (5, 4), (4,), (4, 4)

# Reconstruct: M = U @ np.diag(S) @ Vt
M_reconstructed = U @ np.diag(S) @ Vt
print(np.allclose(M, M_reconstructed))  # True

# Rank-k approximation (dimensionality reduction)
k = 2
M_approx = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

# Solve linear system Ax = b
b = np.array([4, 7], dtype=float)
x = np.linalg.solve(A, b)
print(A @ x)  # should equal b

# Least squares (overdetermined systems)
X = np.random.randn(100, 5)
y = np.random.randn(100)
coeffs, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)

# Matrix norm
frob_norm = np.linalg.norm(A, 'fro')     # Frobenius norm
spec_norm = np.linalg.norm(A, 2)         # Spectral norm (largest singular value)

Q13. How do you implement common ML operations from scratch using NumPy?

import numpy as np

# Softmax (numerically stable)
def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)   # subtract max for stability
    e_x = np.exp(x)
    return e_x / e_x.sum(axis=-1, keepdims=True)

# Cross-entropy loss
def cross_entropy(y_pred, y_true):
    """y_pred: (N, C) probabilities, y_true: (N,) integer class labels"""
    N = y_pred.shape[0]
    log_prob = -np.log(y_pred[np.arange(N), y_true] + 1e-8)
    return log_prob.mean()

# Sigmoid + Binary cross-entropy
def sigmoid(x):
    z = np.exp(-np.abs(x))   # argument is never positive, so exp cannot overflow
    return np.where(x >= 0, 1 / (1 + z), z / (1 + z))  # numerically stable

def bce_loss(y_pred, y_true):
    return -np.mean(y_true * np.log(y_pred + 1e-8) + (1 - y_true) * np.log(1 - y_pred + 1e-8))

# Linear regression with full-batch gradient descent (not stochastic, despite the function name)
def linear_regression_sgd(X, y, lr=0.01, epochs=100):
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        y_pred = X @ w + b
        grad_w = (1/n) * X.T @ (y_pred - y)
        grad_b = (1/n) * (y_pred - y).sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# K-means clustering
def kmeans(X, k, n_iter=100, seed=42):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, np.newaxis] - centers[np.newaxis], axis=2)  # (N, k)
        labels = dists.argmin(axis=1)
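        # keep the previous center when a cluster is empty (the mean of no points is NaN)
        centers = np.array([X[labels == i].mean(axis=0) if np.any(labels == i) else centers[i]
                            for i in range(k)])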
    return labels, centers

Q14. What is np.random? Explain the difference between the legacy API and Generator.

import numpy as np

# Legacy API (still works, but NOT recommended for new code)
np.random.seed(42)
a = np.random.randn(3, 3)
b = np.random.randint(0, 10, size=5)

# Modern API: np.random.default_rng (recommended since NumPy 1.17)
rng = np.random.default_rng(seed=42)

# Common distributions
uniform = rng.random((3, 3))               # Uniform [0, 1)
normal = rng.standard_normal((3, 3))       # N(0, 1)
integers = rng.integers(0, 10, (5,))       # Integer in [0, 10)
choice = rng.choice([1, 2, 3, 4, 5], size=3, replace=False)  # Sample without replacement
binomial = rng.binomial(n=10, p=0.5, size=100)

# Shuffle (in-place)
arr = np.arange(10)
rng.shuffle(arr)

# Permutation (returns new array)
perm = rng.permutation(10)

# Reproducibility across processes (use SeedSequence)
ss = np.random.SeedSequence(42)
child_seeds = ss.spawn(4)    # 4 independent streams for parallel workers
rngs = [np.random.default_rng(s) for s in child_seeds]

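# Why Generator over legacy:
# - No hidden global state: each Generator is an explicit, seedable object
# - Multiple independent streams for parallel code (SeedSequence.spawn)
# - Faster PCG64 algorithm vs Mersenne Twister
# Note: Generator output for a given seed may change between NumPy releases;
# the legacy RandomState stream is the one frozen for backward compatibility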
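
Two properties are worth being able to demonstrate: the same seed reproduces the same draws within one NumPy version, and spawned child seeds give distinct streams. A minimal sketch; the seed value 123 is arbitrary.

```python
import numpy as np

# Same seed, same NumPy version: identical draws
draw_a = np.random.default_rng(123).random(5)
draw_b = np.random.default_rng(123).random(5)
same_seed_matches = np.array_equal(draw_a, draw_b)

# Child seeds spawned from one SeedSequence: separate streams per worker
children = np.random.SeedSequence(123).spawn(2)
worker0 = np.random.default_rng(children[0]).random(5)
worker1 = np.random.default_rng(children[1]).random(5)
streams_differ = not np.array_equal(worker0, worker1)

print(same_seed_matches, streams_differ)   # True True
```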

Q15. How do you use np.vectorize and when should you avoid it?

import numpy as np
import timeit

# Custom Python function that operates on scalars
def process_scalar(x, threshold=5):
    if x > threshold:
        return x ** 2
    elif x > 0:
        return x
    else:
        return 0

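# np.vectorize: wraps scalar function to accept arrays
# otypes pins the output dtype; otherwise it is inferred from the first element only
vfunc = np.vectorize(process_scalar, otypes=[float], excluded=["threshold"])
arr = np.array([-2, 3, 7, -1, 10])
print(vfunc(arr, threshold=5))  # [  0.   3.  49.   0. 100.]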

# IMPORTANT: np.vectorize is a convenience wrapper, NOT a performance tool
# It still loops in Python internally!

# Benchmark
arr_big = np.random.randn(100_000)
t_vec = timeit.timeit(lambda: vfunc(arr_big), number=10) / 10

# True vectorized equivalent
def process_vectorized(x, threshold=5):
    result = np.zeros_like(x)
    result = np.where(x > 0, x, result)
    result = np.where(x > threshold, x**2, result)
    return result

t_fast = timeit.timeit(lambda: process_vectorized(arr_big), number=10) / 10
print(f"np.vectorize: {t_vec:.3f}s | true vectorized: {t_fast:.4f}s")
# Typically 10-100x difference in favor of true vectorized

# Use np.vectorize ONLY for:
# - Quick prototyping
# - Complex logic that is hard to vectorize
# - When the function calls external C/Cython code (then overhead is negligible)
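
Beyond speed, np.vectorize has a correctness trap: without otypes, the output dtype is inferred from the function's result on the first element. The sketch below uses a hypothetical function, half_if_positive, to show float results being truncated to integers.

```python
import numpy as np

def half_if_positive(x):
    # Returns a Python int for negatives and a float otherwise
    return 0 if x < 0 else x * 0.5

arr = np.array([-1.0, 3.0])

inferred = np.vectorize(half_if_positive)(arr)                   # dtype taken from the first result (int)
explicit = np.vectorize(half_if_positive, otypes=[float])(arr)   # dtype pinned to float

print(inferred.dtype.kind, inferred.tolist())   # i [0, 1]
print(explicit.tolist())                        # [0.0, 1.5]
```

Passing otypes whenever the scalar function can return mixed types avoids the silent truncation.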

Q16. What is memory layout (C vs Fortran order)? When does it matter?

import numpy as np
import timeit

# C order (row-major): consecutive elements along last axis are contiguous in memory
# Fortran order (column-major): consecutive elements along first axis are contiguous

a_C = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32, order='C')   # default order
a_F = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32, order='F')

print(a_C.strides)   # (12, 4) -- 4 bytes per int32 element, row stride = 3*4
print(a_F.strides)   # (4, 8) -- column-major: column stride = 2*4

# Performance impact: row-wise iteration is fast for C order
def sum_rows_C(a):
    return a.sum(axis=1)   # sums across last axis -- contiguous for C order

def sum_cols_C(a):
    return a.sum(axis=0)   # sums across first axis -- NOT contiguous for C order

# Large matrix
big = np.random.randn(1000, 1000)
t_row = timeit.timeit(lambda: sum_rows_C(big), number=1000) / 1000
t_col = timeit.timeit(lambda: sum_cols_C(big), number=1000) / 1000
print(f"Row sum: {t_row*1e6:.1f}us | Col sum: {t_col*1e6:.1f}us")

# Contiguous check
print(a_C.flags['C_CONTIGUOUS'])   # True
print(a_C.flags['F_CONTIGUOUS'])   # False
print(a_F.flags['F_CONTIGUOUS'])   # True

# ascontiguousarray: force C order (needed for some BLAS/cuBLAS operations)
a_contiguous = np.ascontiguousarray(a_F)
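
A transpose is the most common way to end up with a Fortran-ordered view by accident. This sketch, with illustrative names, shows the flags and strides before and after np.ascontiguousarray:

```python
import numpy as np

a = np.arange(6, dtype=np.int32).reshape(2, 3)   # C-contiguous, strides (12, 4)
t = a.T                                          # view with reversed strides

print(t.flags['C_CONTIGUOUS'], t.flags['F_CONTIGUOUS'])   # False True
print(t.strides)                                          # (4, 12)

c = np.ascontiguousarray(t)      # copies into row-major layout
print(c.strides)                 # (8, 4)
print(np.shares_memory(a, c))    # False
```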

Q17. How do you use np.apply_along_axis, np.apply_over_axes, and when to avoid them?

import numpy as np

a = np.random.randn(4, 5, 3)

# apply_along_axis: apply 1D function along a specific axis
def normalize_1d(x):
    return (x - x.mean()) / (x.std() + 1e-8)

# Normalize each row of 2D
b = np.random.randn(10, 5)
result = np.apply_along_axis(normalize_1d, axis=1, arr=b)  # apply to each row
print(result.shape)  # (10, 5)

# BETTER: vectorized equivalent (much faster)
result_fast = (b - b.mean(axis=1, keepdims=True)) / (b.std(axis=1, keepdims=True) + 1e-8)

# apply_over_axes: apply reduction over multiple axes
# e.g., sum over axes 0 and 2 of a (4, 5, 3) array
result_axes = np.apply_over_axes(np.sum, a, [0, 2])  # (1, 5, 1)
# Better: a.sum(axis=(0, 2), keepdims=True)

# Performance guideline:
# apply_along_axis calls a Python function per slice -- similar speed to np.vectorize
# Always prefer vectorized operations when possible
# apply_along_axis is useful when the function has complex logic not easily vectorized

# Real vectorized batch normalization
def batch_norm(X, axis=0):
    mean = X.mean(axis=axis, keepdims=True)
    std = X.std(axis=axis, keepdims=True)
    return (X - mean) / (std + 1e-8)

Q18. What is np.meshgrid? Give a machine learning use case.

import numpy as np
import matplotlib.pyplot as plt

# meshgrid: create coordinate matrices from 1D arrays
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)  # X and Y are both (100, 100)

# Use case 1: plot decision boundary of a classifier
from sklearn.datasets import make_moons
from sklearn.svm import SVC

data, labels = make_moons(n_samples=200, noise=0.1, random_state=42)
model = SVC(kernel='rbf', probability=True)
model.fit(data, labels)

# Create grid over feature space
x_min, x_max = data[:, 0].min() - 0.5, data[:, 0].max() + 0.5
y_min, y_max = data[:, 1].min() - 0.5, data[:, 1].max() + 0.5
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 200),
                      np.linspace(y_min, y_max, 200))

# Predict on every grid point
grid_points = np.c_[xx.ravel(), yy.ravel()]
Z = model.predict(grid_points).reshape(xx.shape)

# Use case 2: Gaussian function over 2D grid
def gaussian_2d(X, Y, sigma=1.0):
    return np.exp(-(X**2 + Y**2) / (2 * sigma**2))

Z_gauss = gaussian_2d(X, Y, sigma=1.5)
print(Z_gauss.shape)  # (100, 100)

# Use case 3: compute all pairwise feature interactions
f1 = np.array([1, 2, 3, 4, 5])
f2 = np.array([10, 20, 30])
F1, F2 = np.meshgrid(f1, f2, indexing='ij')  # (5, 3) interaction grid
interactions = F1 * F2   # all pairwise products

HARD: Performance and Advanced Patterns (Questions 19-25)

Q19. What is numba? When should you use it instead of NumPy?

import numpy as np
from numba import jit, njit, prange
import timeit

# Pure NumPy: fast for element-wise, slow for complex loops
def rolling_std_numpy(arr, window):
    result = np.empty(len(arr))
    result[:window-1] = np.nan
    for i in range(window-1, len(arr)):
        result[i] = arr[i-window+1:i+1].std()
    return result

# Numba JIT: compile Python loop to LLVM machine code
@njit(parallel=True)  # no Python object overhead; prange runs on Numba's threading layer (TBB, OpenMP, or workqueue)
def rolling_std_numba(arr, window):
    result = np.empty(len(arr))
    for i in prange(len(arr)):   # parallel range
        if i < window - 1:
            result[i] = np.nan
        else:
            result[i] = arr[i-window+1:i+1].std()
    return result

arr = np.random.randn(1_000_000)

# First call compiles (slow), subsequent calls fast
rolling_std_numba(arr, 20)  # warmup
t_np = timeit.timeit(lambda: rolling_std_numpy(arr, 20), number=3) / 3
t_nb = timeit.timeit(lambda: rolling_std_numba(arr, 20), number=10) / 10
print(f"NumPy loop: {t_np:.3f}s | Numba: {t_nb:.4f}s")

# Use numba when:
# - Algorithm REQUIRES a loop (cannot be expressed as array operations)
# - Same loop runs millions of times
# - Custom numerical algorithms (Monte Carlo, custom optimizers)

# Avoid numba when:
# - Simple vectorized NumPy expression already exists
# - Code involves Python objects (strings, dicts, lists)
# - Startup compilation time matters (microservice cold start)

Q20. Implement a neural network forward pass with NumPy.

import numpy as np

class DenseLayer:
    def __init__(self, in_features: int, out_features: int):
        # He initialization for ReLU layers
        self.W = np.random.randn(in_features, out_features) * np.sqrt(2 / in_features)
        self.b = np.zeros(out_features)

    def forward(self, x: np.ndarray) -> np.ndarray:
        self.x = x  # cache for backward
        return x @ self.W + self.b   # (N, out_features)

    def backward(self, dout: np.ndarray) -> np.ndarray:
        self.dW = self.x.T @ dout
        self.db = dout.sum(axis=0)
        return dout @ self.W.T

class ReLU:
    def forward(self, x):
        self.mask = x > 0
        return x * self.mask

    def backward(self, dout):
        return dout * self.mask

class TwoLayerNet:
    def __init__(self, in_dim, hidden_dim, out_dim):
        self.l1 = DenseLayer(in_dim, hidden_dim)
        self.relu = ReLU()
        self.l2 = DenseLayer(hidden_dim, out_dim)

    def forward(self, x):
        return self.l2.forward(self.relu.forward(self.l1.forward(x)))

    def loss_and_backward(self, x, y):
        logits = self.forward(x)
        # Softmax + cross-entropy gradient
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        N = x.shape[0]
        loss = -np.log(probs[np.arange(N), y] + 1e-8).mean()
        dout = probs.copy()
        dout[np.arange(N), y] -= 1
        dout /= N
        self.l1.backward(self.relu.backward(self.l2.backward(dout)))  # backprop runs output -> input
        return loss

# Test
net = TwoLayerNet(784, 128, 10)
X = np.random.randn(32, 784)
y = np.random.randint(0, 10, 32)
loss = net.loss_and_backward(X, y)
print(f"Loss: {loss:.4f}")
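
Hand-written backward passes are usually validated with a numerical gradient check. The sketch below applies one to the softmax cross-entropy gradient used above; softmax_ce and numerical_grad are hypothetical helper names, and the step size h=1e-5 is an assumed default.

```python
import numpy as np

def softmax_ce(logits, y):
    """Mean cross-entropy loss and its gradient with respect to the logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    n = logits.shape[0]
    loss = -np.log(probs[np.arange(n), y]).mean()
    grad = probs.copy()
    grad[np.arange(n), y] -= 1
    return loss, grad / n

def numerical_grad(f, x, h=1e-5):
    """Central-difference estimate of df/dx, one element at a time."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        old = x[idx]
        x[idx] = old + h
        f_plus = f(x)
        x[idx] = old - h
        f_minus = f(x)
        x[idx] = old                      # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

rng = np.random.default_rng(0)
logits = rng.standard_normal((4, 3))
labels = rng.integers(0, 3, size=4)

_, analytic = softmax_ce(logits, labels)
numeric = numerical_grad(lambda z: softmax_ce(z, labels)[0], logits)
max_err = np.abs(analytic - numeric).max()
print(max_err < 1e-6)   # True
```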

Q21. How do you use np.memmap for arrays larger than RAM?

import numpy as np
import os
import tempfile

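# Create a temporary file to back the memory map (raw bytes, not the .npy format)
fd, tmpfile = tempfile.mkstemp(suffix='.dat')  # mkstemp avoids the race condition of the deprecated mktemp
os.close(fd)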

# Write mode: create a large array on disk
fp = np.memmap(tmpfile, dtype=np.float32, mode='w+', shape=(100_000, 512))
fp[:] = np.random.randn(100_000, 512)
fp.flush()  # Write to disk
del fp      # Close the memmap

# Read mode: access without loading all into RAM
fp_read = np.memmap(tmpfile, dtype=np.float32, mode='r', shape=(100_000, 512))
print(fp_read.shape)     # (100000, 512)
print(fp_read[0, :5])    # Access row 0 -- only loaded page is in RAM

# Slicing loads only the needed pages
batch = fp_read[0:1000]   # Load 1000 rows
print(batch.shape)        # (1000, 512)

# Use case: large embedding matrices or precomputed feature sets
# embeddings = np.memmap("embeddings.mmap", dtype=np.float32, mode='r', shape=(10_000_000, 768))
# Without memmap: 10M * 768 * 4 bytes = 30.7 GB RAM needed
# With memmap: only accessed pages in RAM (page = 4KB typically)

del batch, fp_read   # release the mapping before deleting the file (required on Windows)
os.unlink(tmpfile)

Q22. What is np.testing? How do you write unit tests for numerical code?

import numpy as np
import pytest

def compute_softmax(x):
    x_shifted = x - x.max(axis=-1, keepdims=True)
    exp_x = np.exp(x_shifted)
    return exp_x / exp_x.sum(axis=-1, keepdims=True)

# Testing with np.testing (precision-aware comparisons)
class TestSoftmax:
    def test_output_sums_to_one(self):
        x = np.array([[1.0, 2.0, 3.0], [0.5, 0.1, -0.5]])
        out = compute_softmax(x)
        np.testing.assert_allclose(out.sum(axis=1), np.ones(2), rtol=1e-6)

    def test_output_range(self):
        x = np.random.randn(10, 5)
        out = compute_softmax(x)
        assert (out >= 0).all() and (out <= 1).all()

    def test_numerical_stability(self):
        # Large values should not cause overflow
        x = np.array([[1000.0, 1000.0, 1000.0]])
        out = compute_softmax(x)
        np.testing.assert_allclose(out, np.ones((1, 3)) / 3, rtol=1e-5)

    def test_small_values(self):
        x = np.array([[-1000.0, -1000.0, 0.0]])
        out = compute_softmax(x)
        np.testing.assert_allclose(out[0, 2], 1.0, rtol=1e-4)

    def test_shape_preserved(self):
        x = np.random.randn(4, 8, 10)
        out = compute_softmax(x)
        assert out.shape == x.shape

# np.testing key functions:
# assert_allclose(actual, desired, rtol, atol) -- floating point equality
# assert_array_equal(x, y) -- exact element-wise equality (any dtype)
# assert_array_less(x, y) -- element-wise less than
# assert_approx_equal(actual, desired, significant) -- scalar comparison
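
A frequent follow-up is why assert_allclose needs atol when the expected value is zero: the check is |actual - desired| <= atol + rtol * |desired|, so a purely relative tolerance leaves no slack at zero. A minimal sketch with made-up values:

```python
import numpy as np

actual = np.array([1e-10, 1.0])
desired = np.array([0.0, 1.0])

# Relative tolerance alone fails: rtol * |0.0| is 0, and atol defaults to 0
try:
    np.testing.assert_allclose(actual, desired, rtol=1e-6)
    relative_only_passed = True
except AssertionError:
    relative_only_passed = False

# Adding an absolute tolerance handles comparisons against zero
np.testing.assert_allclose(actual, desired, rtol=1e-6, atol=1e-8)

print(relative_only_passed)   # False
```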

Q23. Implement batch normalization forward and backward pass in NumPy.

import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """
    x: (N, D) input
    gamma, beta: (D,) learnable scale and shift
    Returns normalized output and cache for backward pass.
    """
    mu = x.mean(axis=0)                # (D,)
    var = x.var(axis=0)                # (D,)
    x_hat = (x - mu) / np.sqrt(var + eps)   # (N, D)
    out = gamma * x_hat + beta         # (N, D)
    cache = (x, x_hat, mu, var, gamma, eps)
    return out, cache

def batch_norm_backward(dout, cache):
    """Backprop through batch normalization."""
    x, x_hat, mu, var, gamma, eps = cache
    N, D = x.shape

    dgamma = (dout * x_hat).sum(axis=0)    # (D,)
    dbeta = dout.sum(axis=0)               # (D,)

    dx_hat = dout * gamma                  # (N, D)
    dvar = (dx_hat * (x - mu) * -0.5 * (var + eps)**-1.5).sum(axis=0)  # (D,)
    dmu = (dx_hat * (-1 / np.sqrt(var + eps))).sum(axis=0) + dvar * (-2 * (x - mu).mean(axis=0))
    dx = (dx_hat / np.sqrt(var + eps)) + (dvar * 2 * (x - mu) / N) + (dmu / N)

    return dx, dgamma, dbeta

# Test
np.random.seed(42)
x = np.random.randn(32, 64)
gamma = np.ones(64)
beta = np.zeros(64)

out, cache = batch_norm_forward(x, gamma, beta)
print(f"Output mean: {out.mean():.6f} (expected ~0)")
print(f"Output std:  {out.std():.6f} (expected ~1)")

dout = np.random.randn(*out.shape)
dx, dgamma, dbeta = batch_norm_backward(dout, cache)
print(f"dx shape: {dx.shape}, dgamma shape: {dgamma.shape}")

Q24. How does np.fft work? Give a signal processing / ML use case.

import numpy as np
import matplotlib.pyplot as plt

# Generate a synthetic signal: 50Hz + 120Hz components
sample_rate = 1000  # Hz
t = np.linspace(0, 1, sample_rate, endpoint=False)
signal = (np.sin(2 * np.pi * 50 * t) +       # 50 Hz component
          0.5 * np.sin(2 * np.pi * 120 * t) + # 120 Hz component
          0.1 * np.random.randn(len(t)))       # noise

# FFT
fft_result = np.fft.fft(signal)
frequencies = np.fft.fftfreq(len(signal), d=1/sample_rate)
magnitude = np.abs(fft_result[:len(signal)//2])  # positive frequencies only
positive_freqs = frequencies[:len(signal)//2]

# Identify dominant frequencies
top_k_idx = magnitude.argsort()[-5:][::-1]
print("Top frequencies (Hz):", positive_freqs[top_k_idx])  # 50 and 120 should come first; the rest are noise bins

# ML use case: audio feature extraction (spectral features)
def extract_spectral_features(signal, sr=1000, n_features=20):
    """Extract FFT-based features for audio classification."""
    fft_mag = np.abs(np.fft.fft(signal))[:len(signal)//2]
    freqs = np.fft.fftfreq(len(signal), 1/sr)[:len(signal)//2]

    # Spectral centroid (center of mass of spectrum)
    centroid = np.sum(freqs * fft_mag) / np.sum(fft_mag)

    # Spectral rolloff (frequency below which 85% of the cumulative spectral magnitude lies)
    cumsum = np.cumsum(fft_mag)
    rolloff_idx = np.searchsorted(cumsum, 0.85 * cumsum[-1])

    # Band energy features
    band_edges = np.linspace(0, sr/2, n_features + 1)
    band_energies = np.array([
        fft_mag[(freqs >= band_edges[i]) & (freqs < band_edges[i+1])].sum()
        for i in range(n_features)
    ])
    return np.concatenate([[centroid, freqs[rolloff_idx]], band_energies])

features = extract_spectral_features(signal)
print(f"Feature vector length: {len(features)}")
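
For real-valued signals like the one above, `np.fft.rfft` and `np.fft.rfftfreq` return only the non-negative frequency half, so the manual slicing by `len(signal)//2` is not needed. The sketch below is a minimal illustration with its own seeded signal; the variable names and the 80 Hz cutoff are assumptions chosen for the example, and zeroing bins is a crude filter, not a recommended filter design.

```python
import numpy as np

sample_rate = 1000                       # Hz
n = sample_rate                          # one second of samples
t = np.arange(n) / sample_rate
rng = np.random.default_rng(0)
sig = (np.sin(2 * np.pi * 50 * t)
       + 0.5 * np.sin(2 * np.pi * 120 * t)
       + 0.1 * rng.standard_normal(n))

spectrum = np.fft.rfft(sig)                       # n//2 + 1 complex bins
freqs = np.fft.rfftfreq(n, d=1 / sample_rate)     # 0 Hz up to sample_rate / 2
magnitude = np.abs(spectrum)

# Two strongest bins, strongest first
top2 = freqs[np.argsort(magnitude)[-2:][::-1]]
print("Two strongest bins (Hz):", top2)

# Crude low-pass filter: zero every bin above the cutoff, then invert
cutoff_hz = 80
filtered = np.fft.irfft(np.where(freqs <= cutoff_hz, spectrum, 0), n=n)

# Sanity check: the inverse transform recovers the original signal
roundtrip = np.fft.irfft(spectrum, n=n)
print("Round-trip max error:", np.max(np.abs(roundtrip - sig)))
```

Passing `n=n` to `irfft` matters for odd-length signals, where the output length cannot be inferred from the number of bins.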

Q25. Design a NumPy pipeline for processing a large tabular dataset for ML.

import numpy as np
from typing import Optional

class NumPyPreprocessor:
    """Minimal fit/transform pipeline: drop zero-variance features, then standardize."""

    def __init__(self):
        self.mean_ = None
        self.std_ = None
        self.feature_mask_ = None

    def fit(self, X: np.ndarray, y: Optional[np.ndarray] = None) -> 'NumPyPreprocessor':
        # Remove (near-)zero-variance features; compute the column std once
        std = X.std(axis=0)
        self.feature_mask_ = std > 1e-10
        X_filtered = X[:, self.feature_mask_]

        # Compute normalization stats (on training data only)
        self.mean_ = X_filtered.mean(axis=0)
        self.std_ = std[self.feature_mask_]  # strictly positive after the mask
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        # Reuse the mask and stats learned in fit(); never refit on val/test data
        X_filtered = X[:, self.feature_mask_]
        return (X_filtered - self.mean_) / self.std_

    def fit_transform(self, X: np.ndarray, y: Optional[np.ndarray] = None) -> np.ndarray:
        return self.fit(X, y).transform(X)

def train_val_test_split(X, y, val_frac=0.15, test_frac=0.15, seed=42):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]
    return (X[train_idx], y[train_idx],
            X[val_idx], y[val_idx],
            X[test_idx], y[test_idx])

# Full pipeline
rng = np.random.default_rng(42)
X_raw = rng.standard_normal((10_000, 50)).astype(np.float32)
X_raw[:, 5] = 0  # inject zero-variance feature
y = (X_raw[:, 0] + X_raw[:, 1] > 0).astype(np.int32)

X_tr, y_tr, X_val, y_val, X_te, y_te = train_val_test_split(X_raw, y)
pp = NumPyPreprocessor()
X_tr_scaled = pp.fit_transform(X_tr)
X_val_scaled = pp.transform(X_val)
X_te_scaled = pp.transform(X_te)

print(f"Features after removing zero-variance: {X_tr_scaled.shape[1]}")  # 49
print(f"Train mean (should be ~0): {X_tr_scaled.mean():.4f}")
print(f"Train std (should be ~1): {X_tr_scaled.std():.4f}")
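
The pipeline above assumes the whole training matrix fits in memory. When it does not, the normalization statistics can be accumulated chunk by chunk. The sketch below is one possible approach, assuming the data arrives as an iterable of 2D chunks (for example slices of an `np.memmap`). `streaming_mean_std` is a hypothetical helper name, and the merge step is the standard pairwise mean/variance update, used here in place of a single `X.mean()` / `X.std()` call.

```python
import numpy as np

def streaming_mean_std(chunks):
    """Column-wise mean and population std over an iterable of 2D chunks."""
    count = 0
    mean = None
    m2 = None                                # running sum of squared deviations
    for chunk in chunks:
        chunk = np.asarray(chunk, dtype=np.float64)   # accumulate in float64
        n_b = chunk.shape[0]
        if n_b == 0:
            continue
        mean_b = chunk.mean(axis=0)
        m2_b = ((chunk - mean_b) ** 2).sum(axis=0)
        if count == 0:
            count, mean, m2 = n_b, mean_b, m2_b
            continue
        # Merge the chunk statistics into the running statistics
        total = count + n_b
        delta = mean_b - mean
        mean = mean + delta * (n_b / total)
        m2 = m2 + m2_b + delta ** 2 * (count * n_b / total)
        count = total
    if count == 0:
        raise ValueError("no rows seen")
    return mean, np.sqrt(m2 / count)

rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 8)) * 3.0 + 5.0
chunk_iter = (X[i:i + 1_024] for i in range(0, len(X), 1_024))
mean, std = streaming_mean_std(chunk_iter)

print("max mean error:", np.max(np.abs(mean - X.mean(axis=0))))
print("max std error: ", np.max(np.abs(std - X.std(axis=0))))
```

Only one chunk is held in memory at a time, and accumulating in float64 limits the rounding error that float32 sums would build up over many rows.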

FAQ

Q: Should I know NumPy internals deeply for data science interviews?

A: Broadcasting rules, dtype trade-offs, and vectorization patterns are tested at all levels. Memory layout and stride internals come up mainly at senior ML engineer and research scientist levels. Candidate reports in public preparation resources suggest that intermediate NumPy fluency (broadcasting, linalg, indexing) covers most DS interview scenarios.

Q: What is the difference between np.dot and np.matmul?

A: For 2D arrays, they are equivalent. For N-D arrays, np.matmul treats the first N-2 dimensions as batch dimensions and performs batched matrix multiplication. np.dot has different rules (sum product on last axis of first array and second-to-last of second). Use @ operator (which calls matmul) for clarity.
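
The difference is easiest to see in the output shapes. The short sketch below uses arbitrary example shapes (an assumption for illustration) to show that the two functions agree in 2D and diverge for 3D inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# 2D case: np.dot and @ (np.matmul) give the same result
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
same_2d = np.allclose(np.dot(A, B), A @ B)

# 3D case: matmul batches over the leading axis, dot pairs every
# matrix in `a` with every matrix in `b`
a = rng.standard_normal((2, 3, 4))
b = rng.standard_normal((2, 4, 5))
batched = np.matmul(a, b)     # shape (2, 3, 5)
crossed = np.dot(a, b)        # shape (2, 3, 2, 5)

# crossed[i, :, k, :] equals a[i] @ b[k], so the matching batch
# indices recover the matmul result
diag_match = (np.allclose(crossed[0, :, 0, :], batched[0])
              and np.allclose(crossed[1, :, 1, :], batched[1]))

print(same_2d, batched.shape, crossed.shape, diag_match)
```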

Q: How important is NumPy for PyTorch or TensorFlow users?

A: Very important. PyTorch tensors use a memory model very close to NumPy's (row-major layout by default, strides, views), and both PyTorch and TensorFlow follow NumPy-style broadcasting. Debugging shape mismatches and understanding gradient flow require the same mental model as NumPy. Confirm specific framework expectations on the official company careers portal before your interview round.

Sources and review notes (reviewed 8 Jun 2026)

Verification window

Page last edited 8 Jun 2026 by Aditya Sharma. A review date records an editorial edit, not a guarantee that every external fact is still current.

Evidence labels

Official notices, candidate reports, offer documents, and editorial practice questions carry different confidence levels. The visible source list lets you inspect the evidence instead of relying on a blanket verification badge.

Verification policy: /editorial-standards/. Found something incorrect? Submit a correction - we respond within 48 hours.
