TensorFlow Interview Questions 2026: 28 Answers with Code

What changed in 2026 drives
Mass-recruiter offer letters are flatter for the 2026 batch: the 4-5 LPA ASE band has barely budged in three years while inflation eats into real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the pay differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry; it is the entire interview.
What I'd actually study for this
- Two solid coding-round answers (1 medium-hard DSA each, with edge-case discussion) > five half-baked ones
- One real project you can defend end-to-end - file paths, design decisions, and what you would change
- One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
- Three behavioural STAR stories: failure recovered, conflict handled, ownership taken
Where most candidates trip up
The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.
Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.
TensorFlow 2.x with Keras remains a widely used enterprise ML framework at Google, on Google Cloud, and across many production ML teams. While PyTorch leads in research, TensorFlow wins on deployment tooling: TFLite for mobile, TF Serving for production APIs, TFX for pipelines, and Google's TPU infrastructure. This guide covers 28 TensorFlow interview questions with complete code examples.
PapersAdda's take: If you're interviewing at a Google-adjacent company, a large bank, or any company with a legacy ML infrastructure, you will encounter TensorFlow. Know TF 2.x / Keras well, understand GradientTape for custom training, and be ready to explain TFLite deployment. Candidates report that GradientTape custom training loops and tf.data pipeline design are the two most frequently tested TensorFlow topics at Google and banking ML teams. According to candidate accounts from public preparation resources, TFLite deployment questions appear in interviews for mobile ML roles. Confirm the specific technology stack and interview format on the official careers portal before your round.
Which Companies Ask TensorFlow Questions?
| Company Type | Why They Use TF |
|---|---|
| Google and subsidiaries | Created TensorFlow; internal standard |
| Cloud ML services (GCP, AWS SageMaker) | Native TF integration |
| Enterprise finance and banking | Production reliability, TF Serving |
| Mobile-first companies | TFLite for on-device ML |
| Research at some labs | JAX (TF sibling) for TPU work |
EASY: TF 2.x and Keras Fundamentals (Questions 1-10)
Q1. What is the difference between TensorFlow 1.x and TensorFlow 2.x?
| Aspect | TF 1.x | TF 2.x |
|---|---|---|
| Execution | Deferred (build graph, then session.run) | Eager (immediate, like Python) |
| API | Low-level graph API | Keras-first, high-level |
| Debugging | Hard (graph is opaque) | Easy (standard Python debugging) |
| Performance | Optimized graphs | Eager by default; tf.function for graphs |
| Migration | N/A | tf.compat.v1 shim available |
import tensorflow as tf
print(tf.__version__) # 2.15 or higher in 2026
# TF 2.x: eager execution by default
a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])
c = tf.matmul(a, b)
print(c.numpy()) # immediate result, no session needed
# tf.function: compiles to graph for performance
@tf.function(jit_compile=True)  # XLA compilation for TPU/GPU speedup
def matrix_power(x, n):
    result = tf.eye(x.shape[0], dtype=x.dtype)
    for _ in range(n):
        result = tf.matmul(result, x)
    return result
Q2. How do you build a neural network with Keras? What are the three API styles?
| API | Use When | Flexibility |
|---|---|---|
| Sequential | Simple linear stacks | Low |
| Functional | Multiple inputs/outputs, branches | High |
| Subclassing (Model) | Custom forward pass logic | Highest |
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# Sequential API
seq_model = keras.Sequential([
layers.Dense(256, activation='relu', input_shape=(784,)),
layers.BatchNormalization(),
layers.Dropout(0.3),
layers.Dense(128, activation='relu'),
layers.Dense(10, activation='softmax')
])
# Functional API (handles multiple inputs/outputs, skip connections)
inputs = keras.Input(shape=(784,))
x = layers.Dense(256, activation='gelu')(inputs)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.3)(x)
x = layers.Dense(128, activation='gelu')(x)
outputs = layers.Dense(10, activation='softmax')(x)
func_model = keras.Model(inputs, outputs, name='mlp')
# Subclassing API (full PyTorch-like control)
class ResBlock(keras.Model):
    def __init__(self, units):
        super().__init__()
        self.dense1 = layers.Dense(units, activation='gelu')
        self.dense2 = layers.Dense(units)
        self.bn = layers.BatchNormalization()
        self.proj = layers.Dense(units)

    def call(self, x, training=False):
        residual = self.proj(x)
        out = self.dense1(x)
        out = self.dense2(out)
        out = self.bn(out, training=training)
        return keras.activations.gelu(out + residual)
Q3. What is GradientTape and how do you write a custom training loop?
import tensorflow as tf
# Build model and optimizer
model = tf.keras.Sequential([
tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
tf.keras.layers.Dense(10)
])
optimizer = tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-2)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
train_acc = tf.keras.metrics.SparseCategoricalAccuracy()
# Custom training step
@tf.function  # compile to graph for speed
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
        loss += sum(model.losses)  # regularization losses
    gradients = tape.gradient(loss, model.trainable_variables)
    # Gradient clipping
    gradients, _ = tf.clip_by_global_norm(gradients, 1.0)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_acc.update_state(y, logits)
    return loss

# Training loop (n_epochs and train_dataset are assumed to be defined elsewhere)
for epoch in range(n_epochs):
    train_acc.reset_state()
    for x_batch, y_batch in train_dataset:
        loss = train_step(x_batch, y_batch)
    # loss is the last batch's loss, not an epoch average
    print(f"Epoch {epoch+1}: loss={loss:.4f}, acc={train_acc.result():.4f}")
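The clipping step in the loop above rescales all gradients together rather than one tensor at a time. As a rough illustration of the arithmetic behind `tf.clip_by_global_norm`, here is a pure-Python sketch on toy lists; the function below is a stand-in written for this guide, not TensorFlow's implementation.

```python
import math

def clip_by_global_norm(grads, clip_norm):
    """Scale every gradient list by the same factor so the joint L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The factor is 1.0 when the joint norm is already within the limit
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, global_norm

# Toy gradients for two variables (illustrative values)
grads = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
print(norm)     # joint norm before clipping
print(clipped)  # same direction, rescaled to norm 1.0
```

Because one shared factor is applied, the direction of the overall update is preserved, which is the usual argument for global-norm clipping over per-tensor clipping.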
Q4. What is tf.data? How do you build an efficient input pipeline?
import tensorflow as tf
# From numpy arrays
dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
# From files (lazy loading)
file_dataset = tf.data.Dataset.list_files('data/*.tfrecord')
# Efficient pipeline
AUTOTUNE = tf.data.AUTOTUNE
BATCH_SIZE = 64
def augment(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, 0.2)
    image = tf.image.random_contrast(image, 0.8, 1.2)
    return image, label
train_pipeline = (
dataset
.shuffle(buffer_size=10000, seed=42) # shuffle before batch
.map(augment, num_parallel_calls=AUTOTUNE) # parallel preprocessing
.batch(BATCH_SIZE, drop_remainder=True)
.prefetch(AUTOTUNE) # overlap data loading with GPU
)
# TFRecord pipeline (most efficient for large datasets)
def parse_tfrecord(serialized_example):
    feature_spec = {
        'image': tf.io.FixedLenFeature([], tf.string),
        'label': tf.io.FixedLenFeature([], tf.int64)
    }
    parsed = tf.io.parse_single_example(serialized_example, feature_spec)
    image = tf.io.decode_jpeg(parsed['image'], channels=3)  # force RGB so batches have a fixed channel count
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, parsed['label']
tfrecord_dataset = (
file_dataset
.interleave(tf.data.TFRecordDataset,
cycle_length=4, num_parallel_calls=AUTOTUNE)
.map(parse_tfrecord, num_parallel_calls=AUTOTUNE)
.batch(BATCH_SIZE)
.prefetch(AUTOTUNE)
)
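Interviewers often ask what `buffer_size` in `shuffle` actually does. The sketch below imitates the documented behaviour in pure Python: fill a buffer, emit a random element from it, and refill the slot with the next input. The function name and the drain step at the end are choices made for this illustration, not tf.data internals.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)      # still filling the buffer
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]            # emit a random buffered element
        buffer[idx] = item           # refill the slot with the new item
    rng.shuffle(buffer)              # input exhausted: drain what is left
    yield from buffer

shuffled = list(buffered_shuffle(range(10), buffer_size=3, seed=42))
print(shuffled)
```

With `buffer_size=1` the order is unchanged, and an element can move forward by at most `buffer_size - 1` positions, which is why a buffer much smaller than the dataset gives only a weak shuffle.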
Q5. How do you implement regularization in Keras?
import tensorflow as tf  # needed for tf.keras.Sequential below
from tensorflow.keras import layers, regularizers
# L2 weight regularization
dense = layers.Dense(
256,
activation='relu',
kernel_regularizer=regularizers.L2(l2=0.001),
bias_regularizer=regularizers.L2(l2=0.001)
)
# Dropout
dropout = layers.Dropout(rate=0.3)
# Batch Normalization (also has regularization effect)
bn = layers.BatchNormalization(
momentum=0.99, # running average for mean/variance
epsilon=1e-3
)
# Spatial Dropout (for CNN: drop entire feature maps)
spatial_dropout = layers.SpatialDropout2D(rate=0.2)
# Full regularized CNN block
def regularized_conv_block(filters, kernel_size=3):
    return tf.keras.Sequential([
        layers.Conv2D(filters, kernel_size, padding='same', use_bias=False,
                      kernel_regularizer=regularizers.L2(1e-4)),
        layers.BatchNormalization(),
        layers.ReLU(),
        layers.SpatialDropout2D(0.1)
    ])
Q6. What are callbacks in Keras? What are the essential ones?
import tensorflow as tf
# Custom callback (defined first: a class statement cannot appear inside a list literal)
class LearningRateLogger(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        lr = float(self.model.optimizer.learning_rate)
        print(f"\nEpoch {epoch}: LR = {lr:.6f}")

callbacks = [
    # Stop training when val_loss stops improving
    tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True,
        verbose=1
    ),
    # Reduce LR when plateau detected
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,  # multiply LR by 0.5
        patience=3,
        min_lr=1e-7,
        verbose=1
    ),
    # Save best model checkpoint
    tf.keras.callbacks.ModelCheckpoint(
        filepath='checkpoints/model_{epoch:03d}_{val_accuracy:.4f}.keras',
        monitor='val_accuracy',
        save_best_only=True,
        save_weights_only=False
    ),
    # TensorBoard logging
    tf.keras.callbacks.TensorBoard(
        log_dir='./logs',
        histogram_freq=1,  # log weight histograms every epoch
        write_graph=True,
        update_freq='epoch'
    ),
    # Instance of the custom callback defined above
    LearningRateLogger(),
]
model.fit(train_data, validation_data=val_data,
epochs=100, callbacks=callbacks)
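A common follow-up is to explain what `patience` means without leaning on the API. The sketch below is a simplified pure-Python version of the patience rule; it is not the Keras source, and the `val_losses` history is made up for illustration.

```python
def early_stopping_epoch(val_losses, patience, min_delta=0.0):
    """Return (stop_epoch, best_epoch) as 0-based indices under a simple patience rule."""
    best = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improvement: remember it and reset the counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical validation-loss history
history = [0.90, 0.70, 0.65, 0.66, 0.67, 0.64, 0.68, 0.69, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(history, patience=3)
print(stop_epoch, best_epoch)
```

Restoring the weights from `best_epoch` rather than keeping those from `stop_epoch` is what `restore_best_weights=True` is for.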
Q7. How do you save and load models in TensorFlow?
import tensorflow as tf
# SavedModel format (recommended for serving; language-neutral)
model.save('saved_model/my_model') # directory with assets + variables (Keras 2, TF <= 2.15)
# Keras 3 (TF 2.16+): model.save() needs a .keras or .h5 path; use model.export('saved_model/my_model') for a SavedModel
loaded = tf.saved_model.load('saved_model/my_model')
# Keras format (.keras, new in TF 2.12)
model.save('model.keras') # single file, Keras only
loaded = tf.keras.models.load_model('model.keras')
# Weights only (for checkpoint/transfer learning)
model.save_weights('weights/checkpoint') # Keras 3 expects a filename ending in .weights.h5
model.load_weights('weights/checkpoint')
# HDF5 format (legacy)
model.save('model.h5')
loaded = tf.keras.models.load_model('model.h5')
# Inspect a SavedModel
print(tf.saved_model.load('saved_model/my_model').signatures)
# Load and run inference
infer = tf.saved_model.load('saved_model/my_model').signatures['serving_default']
# The keyword must match the model's input name; check infer.structured_input_signature
output = infer(input_tensor=tf.constant(X_test[:5]))
Q8. What is tf.function and when should you use it?
import tensorflow as tf
import time
model = tf.keras.Sequential([tf.keras.layers.Dense(1024, activation='relu'),
tf.keras.layers.Dense(10)])
# Without tf.function: eager, Python overhead
def eager_predict(x):
    return model(x, training=False)

# With tf.function: compiled graph
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 100], dtype=tf.float32)])
def fast_predict(x):
    return model(x, training=False)

x = tf.random.normal([1000, 100])
# Warmup (first call traces the graph)
_ = fast_predict(x)
t = time.time()
for _ in range(100):
    eager_predict(x)
print(f"Eager: {time.time()-t:.2f}s")
t = time.time()
for _ in range(100):
    fast_predict(x)
print(f"tf.function: {time.time()-t:.2f}s")  # often faster; the speedup depends on model size and hardware
# Gotcha: retracing
# If you call with different dtypes/shapes without input_signature,
# TF will retrace each time (expensive). Use input_signature to prevent this.
Q9. How do you implement custom layers and losses in Keras?
import tensorflow as tf
from tensorflow import keras
# Custom Layer
class ScaledDotProductAttention(keras.layers.Layer):
    def __init__(self, d_k, **kwargs):
        super().__init__(**kwargs)
        self.d_k = d_k
        self.scale = d_k ** -0.5

    def call(self, Q, K, V, mask=None):
        scores = tf.matmul(Q, K, transpose_b=True) * self.scale
        if mask is not None:
            scores += (1.0 - tf.cast(mask, tf.float32)) * (-1e9)
        weights = tf.nn.softmax(scores, axis=-1)
        return tf.matmul(weights, V), weights

    def get_config(self):  # needed for model.save()
        config = super().get_config()
        config.update({'d_k': self.d_k})
        return config

# Custom Loss (expects one-hot y_true and probability y_pred)
class FocalLoss(keras.losses.Loss):
    def __init__(self, gamma=2.0, alpha=0.25, **kwargs):
        super().__init__(**kwargs)
        self.gamma = gamma
        self.alpha = alpha

    def call(self, y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1 - 1e-7)
        ce = -y_true * tf.math.log(y_pred)
        pt = tf.where(y_true == 1, y_pred, 1 - y_pred)
        focal_weight = self.alpha * (1 - pt) ** self.gamma
        return tf.reduce_mean(tf.reduce_sum(focal_weight * ce, axis=-1))

# Custom Metric
class F1Score(keras.metrics.Metric):
    def __init__(self, threshold=0.5, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold
        self.tp = self.add_weight(name='tp', initializer='zeros')
        self.fp = self.add_weight(name='fp', initializer='zeros')
        self.fn = self.add_weight(name='fn', initializer='zeros')

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = tf.cast(y_true, tf.float32)  # labels often arrive as ints
        y_pred = tf.cast(y_pred >= self.threshold, tf.float32)
        self.tp.assign_add(tf.reduce_sum(y_true * y_pred))
        self.fp.assign_add(tf.reduce_sum((1 - y_true) * y_pred))
        self.fn.assign_add(tf.reduce_sum(y_true * (1 - y_pred)))

    def result(self):
        precision = self.tp / (self.tp + self.fp + 1e-7)
        recall = self.tp / (self.tp + self.fn + 1e-7)
        return 2 * precision * recall / (precision + recall + 1e-7)

    def reset_state(self):
        for v in self.variables:
            v.assign(tf.zeros_like(v))
Q10. What is mixed precision training in TensorFlow?
import tensorflow as tf
# Enable mixed precision (float16 compute, float32 weights)
policy = tf.keras.mixed_precision.Policy('mixed_float16')
tf.keras.mixed_precision.set_global_policy(policy)
# Build model (same code; weights auto-cast)
model = tf.keras.Sequential([
tf.keras.layers.Dense(1024, activation='relu'),
tf.keras.layers.Dense(10, dtype='float32') # keep final layer float32
])
# Loss scaling prevents gradient underflow in float16
optimizer = tf.keras.optimizers.AdamW(1e-3)
optimizer = tf.keras.mixed_precision.LossScaleOptimizer(optimizer)
# In custom loop:
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
        scaled_loss = optimizer.get_scaled_loss(loss)
    scaled_grads = tape.gradient(scaled_loss, model.trainable_variables)
    grads = optimizer.get_unscaled_gradients(scaled_grads)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
# bfloat16 (better for TPU, no loss scaling needed)
tpu_policy = tf.keras.mixed_precision.Policy('mixed_bfloat16')
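To see why loss scaling matters, it helps to watch a small value underflow in half precision. The sketch below uses only the standard library (`struct` supports IEEE half floats via the `e` format); the gradient magnitude and the fixed scale of 1024 are hypothetical numbers chosen for illustration, whereas `LossScaleOptimizer` adjusts its scale dynamically.

```python
import struct

def to_float16(value):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

tiny_grad = 1e-8      # hypothetical gradient magnitude
loss_scale = 1024.0   # hypothetical fixed loss scale

unscaled = to_float16(tiny_grad)             # too small for float16
scaled = to_float16(tiny_grad * loss_scale)  # large enough to be stored
recovered = scaled / loss_scale              # unscale in higher precision

print(unscaled, scaled, recovered)
```

Scaling the loss scales every gradient by the same factor through the chain rule, so dividing by the scale afterwards recovers the true gradient up to float16 rounding.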
MEDIUM: Advanced Keras and TF Ecosystem (Questions 11-20)
Q11. How do you implement transfer learning with TensorFlow Hub?
import tensorflow_hub as hub
import tensorflow as tf
# EfficientNetV2 from TF Hub
hub_url = 'https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet21k_ft1k_l/feature_vector/2'
feature_extractor = hub.KerasLayer(hub_url, trainable=False, input_shape=(384, 384, 3))
# Build classification model
model = tf.keras.Sequential([
feature_extractor,
tf.keras.layers.Dense(256, activation='gelu'),
tf.keras.layers.Dropout(0.3),
tf.keras.layers.Dense(num_classes, activation='softmax')
])
model.compile(
optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
# Two-phase fine-tuning
# Phase 1: train only head
feature_extractor.trainable = False
model.fit(train_ds, epochs=5, validation_data=val_ds)
# Phase 2: unfreeze backbone with small LR
feature_extractor.trainable = True
model.compile(
optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-5), # 100x smaller LR
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
model.fit(train_ds, epochs=15, validation_data=val_ds)
Q12. What is the TFRecord format and why is it used for large datasets?
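TFRecord is TensorFlow's binary record format: a file holds a sequence of serialized records, typically tf.train.Example protocol buffers. It suits large datasets because it offers:
- Sequential reads (no seeking): much faster than reading individual image files
- Lossless encoding of heterogeneous data (images, text, labels)
- Seamless integration with the tf.data pipeline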
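The "sequential reads" point is easier to explain with a toy container. The sketch below writes and reads a simplified length-prefixed record stream in pure Python. It is not the real TFRecord wire format, which additionally stores CRC checksums for the length and the payload; the function names and payloads are made up for illustration.

```python
import io
import struct

def write_records(stream, payloads):
    """Append each payload as: 8-byte little-endian length, then the raw bytes."""
    for payload in payloads:
        stream.write(struct.pack("<Q", len(payload)))
        stream.write(payload)

def read_records(stream):
    """Read records front to back; no index or seeking is needed."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return
        (length,) = struct.unpack("<Q", header)
        yield stream.read(length)

buffer = io.BytesIO()
write_records(buffer, [b"cat-image-bytes", b"dog-image-bytes", b"label:3"])
buffer.seek(0)
records = list(read_records(buffer))
print(records)
```

One large file read front to back avoids the per-file open and seek costs of thousands of small image files, which is the main reason record containers help on network and spinning-disk storage.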
import tensorflow as tf
import numpy as np
# Write TFRecords
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def serialize_example(image_array, label):
    encoded = tf.image.encode_jpeg(image_array).numpy()
    feature = {
        'image': _bytes_feature(encoded),
        'label': _int64_feature(label),
        'height': _int64_feature(image_array.shape[0]),
        'width': _int64_feature(image_array.shape[1])
    }
    proto = tf.train.Example(features=tf.train.Features(feature=feature))
    return proto.SerializeToString()

with tf.io.TFRecordWriter('train.tfrecord') as writer:
    for img, lbl in zip(images, labels):
        writer.write(serialize_example(img, lbl))
# Read TFRecords
feature_spec = {
'image': tf.io.FixedLenFeature([], tf.string),
'label': tf.io.FixedLenFeature([], tf.int64)
}
@tf.function
def parse_example(serialized):
    example = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(example['image'], channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, example['label']
dataset = (tf.data.TFRecordDataset('train.tfrecord')
.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
.batch(64).prefetch(tf.data.AUTOTUNE))
Q13. How do you use distributed training with tf.distribute?
import tensorflow as tf
# Multi-GPU on single machine
strategy = tf.distribute.MirroredStrategy()
# Multi-machine (parameter server)
ps_strategy = tf.distribute.experimental.ParameterServerStrategy(
cluster_resolver=tf.distribute.cluster_resolver.TFConfigClusterResolver()
)
# TPU strategy
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
tpu_strategy = tf.distribute.TPUStrategy(resolver)
# Usage pattern (same code works across strategies)
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(1024, activation='relu', input_shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(
        optimizer=tf.keras.optimizers.AdamW(1e-3),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
# Adjust batch size: GLOBAL_BATCH = per_replica_batch * num_replicas
GLOBAL_BATCH_SIZE = 64 * strategy.num_replicas_in_sync
train_dataset = train_dataset.rebatch(GLOBAL_BATCH_SIZE)
model.fit(train_dataset, epochs=20)
# Custom training loop with distribute
@tf.function
def distributed_train_step(dataset_inputs):
    per_replica_losses = strategy.run(train_step, args=(dataset_inputs,))
    return strategy.reduce(tf.distribute.ReduceOp.SUM,
                           per_replica_losses, axis=None)
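Because the custom loop above reduces with `ReduceOp.SUM`, each replica's loss has to be divided by the global batch size rather than its local one (this is what `tf.nn.compute_average_loss` does when given `global_batch_size`). The arithmetic can be checked in plain Python; the per-example losses below are toy values.

```python
def per_replica_loss(example_losses, global_batch_size):
    """Sum of per-example losses on one replica, divided by the GLOBAL batch size."""
    return sum(example_losses) / global_batch_size

# 2 replicas x 2 examples each (toy per-example losses)
replica_losses = [[0.2, 0.4], [0.6, 0.8]]
global_batch_size = sum(len(r) for r in replica_losses)

# Correct: scale by the global batch size, then SUM across replicas
summed = sum(per_replica_loss(r, global_batch_size) for r in replica_losses)

# Reference: mean loss as if all examples were on a single device
single_device_mean = sum(sum(r) for r in replica_losses) / global_batch_size

# Common mistake: local mean per replica, then SUM (too large by num_replicas)
naive = sum(sum(r) / len(r) for r in replica_losses)

print(summed, single_device_mean, naive)
```

Getting this wrong silently multiplies the effective learning rate by the number of replicas, which is a frequent cause of divergence when moving from one GPU to several.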
Q14. What is TFLite and how do you convert a model for mobile deployment?
import tensorflow as tf
# Convert Keras model to TFLite
model = tf.keras.models.load_model('trained_model.keras')
# Option 1: Float32 conversion (no quality loss)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
# Option 2: Dynamic range quantization (minimal accuracy loss)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant = converter.convert()
# Option 3: Full INT8 quantization (best performance on edge)
def representative_dataset():
    for x_batch, _ in val_dataset.take(100):
        yield [x_batch]
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_int8 = converter.convert()
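with open('model_int8.tflite', 'wb') as f:
    f.write(tflite_int8)
# Run inference with TFLite interpreter
import numpy as np
interpreter = tf.lite.Interpreter(model_path='model_int8.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# An int8 model expects quantized input: q = round(x / scale) + zero_point
# (a plain astype('int8') would truncate the floats instead of quantizing them)
scale, zero_point = input_details[0]['quantization']
x_int8 = np.clip(np.round(X_test[:1] / scale) + zero_point, -128, 127).astype(np.int8)
interpreter.set_tensor(input_details[0]['index'], x_int8)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])  # int8; dequantize with the output scale/zero_point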
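If asked how INT8 quantization maps floats to integers, the affine scheme is short enough to write on a whiteboard: `real = (q - zero_point) * scale`. The pure-Python sketch below uses a hypothetical calibration range of [0.0, 1.0]; in practice the converter derives `scale` and `zero_point` per tensor from the representative dataset.

```python
def quantize(values, scale, zero_point):
    """Map floats to int8 codes: q = round(x / scale) + zero_point, clamped to [-128, 127]."""
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]

def dequantize(codes, scale, zero_point):
    """Map int8 codes back to approximate floats."""
    return [(q - zero_point) * scale for q in codes]

# Hypothetical calibration: floats in [0.0, 1.0] spread over the 256 int8 levels
scale = 1.0 / 255.0
zero_point = -128

x = [0.0, 0.2, 0.6, 1.0]
q = quantize(x, scale, zero_point)
x_hat = dequantize(q, scale, zero_point)
print(q)
print(x_hat)
```

The round trip is exact only up to half a quantization step, and values outside the calibrated range saturate at the clamp, which is why the representative dataset should cover realistic inputs.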
Q15. How does TF Serving work? How do you deploy a model as a REST API?
# Save model in SavedModel format with versioning
# model/
# 1/ <- version number
# assets/
# variables/
# saved_model.pb
import tensorflow as tf
# Export for TF Serving with specific serving signature
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 224, 224, 3],
                                            dtype=tf.float32)])
def serving_fn(x):
    return {'predictions': model(x, training=False)}

tf.saved_model.save(model, 'tf_serving_model/1',
                    signatures={'serving_default': serving_fn})
# Start TF Serving (Docker)
docker run -p 8501:8501 \
--mount type=bind,source=$(pwd)/tf_serving_model,target=/models/my_model \
-e MODEL_NAME=my_model \
tensorflow/serving
# REST API call
curl -X POST http://localhost:8501/v1/models/my_model:predict \
-d '{"instances": [[0.1, 0.2, ..., 0.9]]}'
# Python client
import requests
import json
import numpy as np
data = json.dumps({'instances': X_test[:5].tolist()})
response = requests.post(
'http://localhost:8501/v1/models/my_model:predict',
data=data,
headers={'Content-Type': 'application/json'}
)
predictions = response.json()['predictions']
Q16. What is Keras Tuner? How do you use it for hyperparameter optimization?
import keras_tuner as kt
import tensorflow as tf
def build_model(hp):
    """Model builder function; hp is the HyperParameters object."""
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(784,)))
    # Search over number of layers and units
    for i in range(hp.Int('num_layers', min_value=1, max_value=5)):
        units = hp.Choice(f'units_{i}', values=[64, 128, 256, 512])
        model.add(tf.keras.layers.Dense(units, activation='gelu'))
        model.add(tf.keras.layers.BatchNormalization())
        if hp.Boolean(f'dropout_{i}'):
            model.add(tf.keras.layers.Dropout(
                hp.Float(f'dropout_rate_{i}', min_value=0.1, max_value=0.5, step=0.1)
            ))
    model.add(tf.keras.layers.Dense(10, activation='softmax'))
    lr = hp.Float('lr', min_value=1e-5, max_value=1e-2, sampling='log')
    model.compile(
        optimizer=tf.keras.optimizers.AdamW(lr),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    return model
# Bayesian optimization search
tuner = kt.BayesianOptimization(
build_model,
objective='val_accuracy',
max_trials=50,
directory='kt_search',
project_name='mlp_tuning'
)
tuner.search(X_train, y_train, epochs=20,
validation_split=0.2,
callbacks=[tf.keras.callbacks.EarlyStopping(patience=3)])
best_model = tuner.get_best_models(num_models=1)[0]
print(tuner.get_best_hyperparameters()[0].values)
Q17. How do you implement class activation maps (Grad-CAM) in TensorFlow?
import tensorflow as tf
import numpy as np
import cv2
def compute_gradcam(model, img_array, layer_name, class_idx=None):
    """
    Compute Grad-CAM for a given image and layer.
    """
    # Create a model that outputs the target layer + final output
    grad_model = tf.keras.models.Model(
        inputs=model.inputs,
        outputs=[model.get_layer(layer_name).output, model.output]
    )
    with tf.GradientTape() as tape:
        conv_outputs, predictions = grad_model(img_array)
        if class_idx is None:
            class_idx = tf.argmax(predictions[0])
        class_score = predictions[:, class_idx]
    # Gradients of class score w.r.t. conv layer output
    grads = tape.gradient(class_score, conv_outputs)  # [1, H, W, C]
    pooled_grads = tf.reduce_mean(grads, axis=[0, 1, 2])  # [C]
    # Weighted combination of feature maps
    conv_outputs = conv_outputs[0]
    heatmap = tf.reduce_sum(conv_outputs * pooled_grads, axis=-1)
    heatmap = tf.nn.relu(heatmap) / (tf.reduce_max(heatmap) + 1e-8)
    return heatmap.numpy(), int(class_idx)
# Example usage
img_array = tf.expand_dims(img_tensor, 0) # [1, H, W, 3]
heatmap, predicted_class = compute_gradcam(model, img_array, 'block5_conv3')
# Overlay heatmap on original image
heatmap_resized = cv2.resize(heatmap, (original_w, original_h))
heatmap_colored = cv2.applyColorMap(np.uint8(255 * heatmap_resized), cv2.COLORMAP_JET)
superimposed = cv2.addWeighted(original_img, 0.6, heatmap_colored, 0.4, 0)
Q18. What is TFX (TensorFlow Extended)? What are its components?
| Component | Purpose |
|---|---|
| ExampleGen | Ingest raw data; split into train/eval |
| StatisticsGen | Compute statistics over data |
| SchemaGen | Infer data schema (types, ranges) |
| ExampleValidator | Detect data anomalies vs schema |
| Transform | Feature engineering with tf.Transform |
| Trainer | Train model using Estimator or Keras |
| Evaluator | Compute metrics; slice-based evaluation |
| InfraValidator | Validate model can be loaded for serving |
| Pusher | Deploy to TF Serving or TFLite |
from tfx.components import (
CsvExampleGen, StatisticsGen, SchemaGen, ExampleValidator,
Transform, Trainer, Evaluator, Pusher
)
from tfx.proto import trainer_pb2  # provides TrainArgs / EvalArgs used by Trainer below
import tensorflow_model_analysis as tfma
# Pipeline definition
def create_pipeline(pipeline_root, data_root):
    example_gen = CsvExampleGen(input_base=data_root)
    stats_gen = StatisticsGen(examples=example_gen.outputs['examples'])
    schema_gen = SchemaGen(statistics=stats_gen.outputs['statistics'])
    validator = ExampleValidator(
        statistics=stats_gen.outputs['statistics'],
        schema=schema_gen.outputs['schema']
    )
    transform = Transform(
        examples=example_gen.outputs['examples'],
        schema=schema_gen.outputs['schema'],
        module_file='preprocessing.py'
    )
    trainer = Trainer(
        module_file='trainer.py',
        examples=transform.outputs['transformed_examples'],
        schema=schema_gen.outputs['schema'],
        train_args=trainer_pb2.TrainArgs(num_steps=1000),
        eval_args=trainer_pb2.EvalArgs(num_steps=200)
    )
    return [example_gen, stats_gen, schema_gen, validator, transform, trainer]
Q19. How does TensorFlow handle text preprocessing with Keras preprocessing layers?
import tensorflow as tf
from tensorflow.keras import layers
# TextVectorization: converts raw strings to integer sequences
vectorizer = layers.TextVectorization(
max_tokens=20000,
output_mode='int',
output_sequence_length=256,
ngrams=None
)
vectorizer.adapt(text_dataset) # fit vocabulary on training data
# Embedding layer
embedding = layers.Embedding(
input_dim=20000,
output_dim=128,
mask_zero=True # handle variable-length sequences
)
# Full text classification model (preprocessing inside model)
inputs = tf.keras.Input(shape=(1,), dtype=tf.string)
x = vectorizer(inputs)
x = embedding(x)
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dense(64, activation='gelu')(x)
outputs = layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(inputs, outputs)
# At inference: pass raw strings directly (shape [batch, 1] to match the Input above)
model.predict(tf.constant([["This is a test sentence."]])) # no external tokenizer needed
# Other output modes: bag-of-words and TF-IDF vectors instead of integer sequences
bag_of_words = layers.TextVectorization(
max_tokens=50000,
output_mode='multi_hot' # binary vector: 1 where a vocabulary token occurs in the text
)
tfidf = layers.TextVectorization(
max_tokens=50000,
output_mode='tf_idf'
)
Q20. What is the difference between eager execution and graph execution in TF 2.x?
| Aspect | Eager | Graph (tf.function) |
|---|---|---|
| Execution | Immediate, line by line | Deferred; compiled DAG |
| Debugging | Easy: print(), pdb, Python stack traces | Hard: print inside @tf.function needs tf.print |
| Speed | Slower (Python overhead per op) | Faster (graph optimizations, XLA) |
| Deployment | Cannot deploy without Python | Portable SavedModel |
| When to use | Prototyping, debugging | Production, training loops |
import tensorflow as tf
# Eager (default in TF 2.x)
a = tf.constant([1.0, 2.0])
print(a + 1) # immediate: tf.Tensor([2. 3.], ...)
# tf.function: tracing converts to graph
@tf.function
def compute(x, y):
    # Python print only runs at TRACE time (first call)
    print("Tracing!")  # printed only once
    tf.print("Value:", x)  # printed at every EXECUTION
    return x * y + tf.reduce_sum(y)
# First call: traces the function (runs Python print)
result = compute(tf.constant(2.0), tf.constant([1.0, 2.0, 3.0]))
# "Tracing!" printed once
# Second call: uses cached graph (Python print NOT called again)
result = compute(tf.constant(3.0), tf.constant([1.0, 2.0, 3.0]))
# "Tracing!" NOT printed; tf.print IS printed (graph execution)
# Gotcha: re-tracing happens if input shapes change (without input_signature)
result = compute(tf.constant([1.0, 2.0]), tf.constant([1.0, 2.0, 3.0]))
# "Tracing!" printed again (new shape for x)
HARD: Advanced TF Topics (Questions 21-28)
Q21. How do you implement a Transformer model from scratch in TensorFlow?
import tensorflow as tf
from tensorflow.keras import layers
class MultiHeadSelfAttention(layers.Layer):
    def __init__(self, d_model, num_heads, **kwargs):
        super().__init__(**kwargs)
        assert d_model % num_heads == 0
        self.d_model = d_model
        self.num_heads = num_heads
        self.d_k = d_model // num_heads
        self.W_q = layers.Dense(d_model)
        self.W_k = layers.Dense(d_model)
        self.W_v = layers.Dense(d_model)
        self.W_o = layers.Dense(d_model)

    def split_heads(self, x, batch_size):
        x = tf.reshape(x, [batch_size, -1, self.num_heads, self.d_k])
        return tf.transpose(x, [0, 2, 1, 3])  # [B, H, T, d_k]

    def call(self, x, mask=None, training=False):
        B = tf.shape(x)[0]
        Q = self.split_heads(self.W_q(x), B)
        K = self.split_heads(self.W_k(x), B)
        V = self.split_heads(self.W_v(x), B)
        scores = tf.matmul(Q, K, transpose_b=True) / tf.sqrt(float(self.d_k))
        if mask is not None:
            scores += (1 - tf.cast(mask, tf.float32)) * -1e9
        weights = tf.nn.softmax(scores, axis=-1)
        out = tf.matmul(weights, V)  # [B, H, T, d_k]
        out = tf.transpose(out, [0, 2, 1, 3])  # [B, T, H, d_k]
        out = tf.reshape(out, [B, -1, self.d_model])  # [B, T, d_model]
        return self.W_o(out)

class TransformerBlock(layers.Layer):
    def __init__(self, d_model, num_heads, dff, dropout=0.1, **kwargs):
        super().__init__(**kwargs)
        self.attn = MultiHeadSelfAttention(d_model, num_heads)
        self.ffn = tf.keras.Sequential([
            layers.Dense(dff, activation='gelu'),
            layers.Dense(d_model)
        ])
        self.norm1 = layers.LayerNormalization(epsilon=1e-6)
        self.norm2 = layers.LayerNormalization(epsilon=1e-6)
        self.drop1 = layers.Dropout(dropout)
        self.drop2 = layers.Dropout(dropout)

    def call(self, x, mask=None, training=False):
        attn_out = self.attn(self.norm1(x), mask=mask, training=training)
        x = x + self.drop1(attn_out, training=training)  # pre-norm residual
        ffn_out = self.ffn(self.norm2(x))
        return x + self.drop2(ffn_out, training=training)
Q22. What is model quantization aware training (QAT) in TensorFlow?
import tensorflow as tf
import tensorflow_model_optimization as tfmot
# Load trained model
model = tf.keras.models.load_model('trained_model.keras')
# Apply quantization aware training
# This inserts fake quantization nodes into the graph
qat_model = tfmot.quantization.keras.quantize_model(model)
qat_model.compile(
optimizer=tf.keras.optimizers.Adam(1e-5), # very small LR for QAT fine-tuning
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
# Fine-tune with fake quantization for a few epochs (not full training)
qat_model.fit(train_dataset, epochs=3, validation_data=val_dataset,
callbacks=[tf.keras.callbacks.EarlyStopping(patience=2)])
# Convert to TFLite INT8
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_qat = converter.convert()
# QAT typically recovers 0.5-2% accuracy vs PTQ
Q23. How does automatic differentiation work in TensorFlow?
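TensorFlow uses reverse-mode automatic differentiation. Inside a tf.GradientTape context, every operation on a watched tensor is recorded on a tape. For each operation the tape keeps:
- Its forward computation
- Its gradient function (the VJP: vector-Jacobian product)
During tape.gradient(loss, variables), TF traverses this tape in reverse, computing gradients via the chain rule.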
import tensorflow as tf
# Basic autodiff
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 3 + 2 * x  # y = x^3 + 2x
dy_dx = tape.gradient(y, x)
print(dy_dx.numpy())  # dy/dx = 3x^2 + 2 = 29 at x=3
# Second-order gradients via nested tapes
x = tf.Variable([1.0, 2.0])
with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        y = tf.reduce_sum(x ** 3)  # y = x1^3 + x2^3
    dy_dx = t1.gradient(y, x)  # [3*x1^2, 3*x2^2], computed inside t2 so it is recorded
d2y_dx2 = t2.gradient(dy_dx, x)  # [6*x1, 6*x2]: the Hessian diagonal (no cross terms here)
print(d2y_dx2.numpy())  # [6.0, 12.0]
# For the full Hessian matrix, use t2.jacobian(dy_dx, x) instead of t2.gradient
# Custom gradient (for non-differentiable ops or improved numerical stability)
@tf.custom_gradient
def log_softmax_stable(x):
    result = tf.nn.log_softmax(x)
    def grad(upstream):
        softmax = tf.nn.softmax(x)
        # Sum the upstream gradient over the class axis only, so batched inputs stay independent
        return upstream - tf.reduce_sum(upstream, axis=-1, keepdims=True) * softmax
    return result, grad
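To make the "record forward, replay in reverse" idea concrete, here is a toy tape in plain Python. It is a sketch of the mechanism only, not how TensorFlow is implemented: the `Tape`, `Var`, and `gradient` names are hypothetical, and only addition and multiplication are supported.

```python
class Tape:
    """Records one backward (VJP) closure per operation, in execution order."""
    def __init__(self):
        self.ops = []


class Var:
    def __init__(self, value, tape):
        self.value, self.tape, self.grad = value, tape, 0.0

    def _record(self, value, parents_and_local_grads):
        out = Var(value, self.tape)

        def vjp():
            # Chain rule: push out.grad to each parent, scaled by the local derivative
            for parent, local in parents_and_local_grads:
                parent.grad += out.grad * local

        self.tape.ops.append(vjp)
        return out

    def __add__(self, other):
        return self._record(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return self._record(self.value * other.value,
                            [(self, other.value), (other, self.value)])


def gradient(tape, output, source):
    output.grad = 1.0                # seed: d(output)/d(output) = 1
    for vjp in reversed(tape.ops):   # walk the tape backwards
        vjp()
    return source.grad


tape = Tape()
x = Var(3.0, tape)
two = Var(2.0, tape)
y = x * x * x + two * x              # y = x^3 + 2x
dy_dx = gradient(tape, y, x)         # 29.0, i.e. 3x^2 + 2 at x = 3
```

Because operations are appended in execution order, replaying the list in reverse guarantees that every node's gradient is complete before it is pushed to its parents.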
Q24. What is the difference between stateful and stateless layers in Keras?
| Type | State | Examples | Behavior Across Calls |
|---|---|---|---|
| Stateless | No trainable params or moving averages | ReLU, Softmax, Dropout | Same output for same input (Dropout is random only when training=True) |
| Stateful (learnable) | Trainable weights | Dense, Conv2D, Embedding | Weights updated during training |
| Stateful (running stats) | Non-trainable moving averages | BatchNormalization | Different behavior in train vs eval |
| Recurrent (sequence state) | Hidden state across time steps | LSTM, GRU | State persists between batches if stateful=True |
import tensorflow as tf
from tensorflow.keras import layers
# LSTM with stateful=True: state preserved across batches (for long sequences)
stateful_lstm = layers.LSTM(
    64,
    stateful=True,  # keep state across batches
    return_sequences=True
)
# Requires fixed batch size: model = Sequential([Input(batch_size=32, shape=(None, features))])
# Must call model.reset_states() between sequences
# BatchNormalization: the mode is chosen by the `training` argument at call time
bn = layers.BatchNormalization()
# bn(x, training=True)  -> normalizes with batch statistics and updates moving_mean/moving_variance
# bn(x, training=False) -> normalizes with the stored moving_mean/moving_variance (inference mode)
# Setting bn.trainable = False also forces inference mode in TF 2.x
# GRU (often preferred over LSTM in production: fewer params, similar performance)
gru_layer = layers.GRU(
    128,
    return_sequences=True,  # return output at every time step
    return_state=True,  # also return final hidden state
    dropout=0.1,
    recurrent_dropout=0.1,  # note: any recurrent_dropout > 0 disables the fused cuDNN kernel
    reset_after=True  # needed for the cuDNN implementation (which also requires recurrent_dropout=0)
)
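The "running stats" row of the table is easier to reason about with the update rule written out. Below is a minimal plain-Python sketch of a one-feature batch norm. The `TinyBatchNorm` class is hypothetical, it omits the trainable gamma/beta parameters, and the momentum (0.99) and epsilon (1e-3) values are assumed to mirror the Keras defaults.

```python
class TinyBatchNorm:
    """1-D batch norm over a list of floats, tracking moving statistics."""
    def __init__(self, momentum=0.99, epsilon=1e-3):
        self.momentum, self.epsilon = momentum, epsilon
        self.moving_mean, self.moving_var = 0.0, 1.0   # non-trainable state

    def __call__(self, batch, training=False):
        if training:
            mean = sum(batch) / len(batch)
            var = sum((v - mean) ** 2 for v in batch) / len(batch)
            # State update: exponential moving average of the batch statistics
            self.moving_mean = self.momentum * self.moving_mean + (1 - self.momentum) * mean
            self.moving_var = self.momentum * self.moving_var + (1 - self.momentum) * var
        else:
            mean, var = self.moving_mean, self.moving_var  # frozen statistics
        return [(v - mean) / (var + self.epsilon) ** 0.5 for v in batch]


bn = TinyBatchNorm()
batch = [2.0, 4.0, 6.0]
train_out = bn(batch, training=True)    # uses the batch mean and mutates bn's state
eval_out = bn(batch, training=False)    # uses moving_mean, leaves state untouched
```

The same input produces different outputs in the two modes, which is why forgetting `training=False` at inference time is a classic source of train/serve skew.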
Q25. How do you profile and optimize TensorFlow model training?
import tensorflow as tf
# TensorBoard Profiler (trace GPU ops)
tb_callback = tf.keras.callbacks.TensorBoard(
    log_dir='./logs',
    profile_batch='10,20'  # profile batches 10-20
)
model.fit(dataset, callbacks=[tb_callback])
# Then: tensorboard --logdir ./logs
# Look at: GPU utilization, memory bandwidth, op timeline
# tf.profiler API
tf.profiler.experimental.start('logdir')
model.fit(dataset, epochs=1)
tf.profiler.experimental.stop()
# Common bottlenecks and fixes:
# 1. Low GPU utilization: increase batch size, num_parallel_calls, prefetch
# 2. Memory fragmentation: use tf.data, avoid Python loops in training
# 3. Slow data loading: use TFRecords + multiple workers
# XLA (Accelerated Linear Algebra) compilation
@tf.function(jit_compile=True) # enables XLA fusion
def train_step_xla(x, y):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
Q26. What is model pruning in TensorFlow? How do you implement it?
import tensorflow as tf
import tensorflow_model_optimization as tfmot
# Magnitude-based weight pruning
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0,
        final_sparsity=0.5,  # 50% of weights pruned
        begin_step=0,
        end_step=1000,
        frequency=100  # prune every 100 steps
    )
}
model_for_pruning = tfmot.sparsity.keras.prune_low_magnitude(
    model, **pruning_params
)
model_for_pruning.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
# The UpdatePruningStep callback is required: it advances the pruning step so masks get updated
callbacks = [
    tfmot.sparsity.keras.UpdatePruningStep(),
    tfmot.sparsity.keras.PruningSummaries(log_dir='./pruning_logs')
]
model_for_pruning.fit(train_data, epochs=10, callbacks=callbacks)
# Strip pruning wrappers: the zeroed weights are baked into plain Keras layers
final_model = tfmot.sparsity.keras.strip_pruning(model_for_pruning)
# The tensors are still dense; size savings appear only after compression (e.g. gzip)
# or with a sparsity-aware runtime
for layer in final_model.layers:
    if hasattr(layer, 'kernel'):
        sparsity = 1 - (tf.math.count_nonzero(layer.kernel) /
                        tf.size(layer.kernel, out_type=tf.int64)).numpy()
        if sparsity > 0:
            print(f"{layer.name}: {sparsity:.1%} sparse")
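Interviewers often ask what the pruning schedule and the magnitude criterion actually compute. The plain-Python sketch below is an illustration under assumptions, not tfmot code: `polynomial_sparsity` and `magnitude_prune` are hypothetical helpers, and the cubic exponent (`power=3`) is assumed to match the PolynomialDecay default.

```python
def polynomial_sparsity(step, initial=0.0, final=0.5, begin=0, end=1000, power=3):
    """Target sparsity at a training step, ramping from `initial` to `final`."""
    progress = min(1.0, max(0.0, (step - begin) / (end - begin)))
    return final + (initial - final) * (1.0 - progress) ** power


def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of the weights."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Hypothetical layer weights
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2, -0.3, 0.08]
schedule = [polynomial_sparsity(s) for s in (0, 500, 1000)]   # [0.0, 0.4375, 0.5]
pruned = magnitude_prune(weights, schedule[-1])
```

The cubic ramp prunes aggressively early, when many weights are redundant, and slows down near the target so the network has time to recover accuracy.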
Q27. How do you export a model for serving with TF Serving in production?
import tensorflow as tf
# Save model with explicit serving signature
model = tf.keras.models.load_model('trained_model.keras')
@tf.function(input_signature=[
    tf.TensorSpec(shape=[None, 224, 224, 3], dtype=tf.float32, name='images')
])
def serve(images):
    # Preprocessing inside the serving function (expects pixel values in [0, 255])
    # The resize is a no-op for 224x224 input; relax the TensorSpec to
    # [None, None, None, 3] if clients may send other sizes
    images = tf.image.resize(images, [224, 224])
    images = tf.cast(images, tf.float32) / 255.0
    # Normalize with ImageNet stats
    mean = tf.constant([0.485, 0.456, 0.406])
    std = tf.constant([0.229, 0.224, 0.225])
    images = (images - mean) / std
    predictions = model(images, training=False)
    return {
        'class_ids': tf.argmax(predictions, axis=-1),
        'probabilities': predictions,
        'top_5_ids': tf.math.top_k(predictions, k=5).indices
    }
tf.saved_model.save(
    model,
    'production_model/1',  # the trailing "1" is the version directory TF Serving looks for
    signatures={'serving_default': serve}
)
# Batching (TF Serving can auto-batch requests)
# Start the server with: --enable_batching --batching_parameters_file=batching_parameters.txt
# The file uses protobuf text format:
# max_batch_size { value: 64 }
# batch_timeout_micros { value: 5000 }
# max_enqueued_batches { value: 1000 }
Q28. Design a TF pipeline for real-time image classification at scale.
Production pipeline design:
Serving layer:
  Client -> Load Balancer -> TF Serving fleet (N replicas on GPU)
  - Each TF Serving instance: 2 GPUs, auto-batching enabled
  - Batch size 32, batch timeout 10 ms (P99 latency target: 50 ms)
  - gRPC for internal traffic (lower overhead), REST for external clients
Model:
  - EfficientNetV2-S: a strong accuracy/latency trade-off at 384x384 input
  - Quantized INT8 (TFLite on edge) or FP16 (TF Serving on GPU)
  - Model versions: A/B test via the TF Serving model config
Input pipeline:
  - Preprocessing inside the saved model (resize, normalize)
  - No client-side preprocessing required
Monitoring:
  - Prometheus + Grafana for latency/throughput
  - Model performance: accuracy on labelled production samples
  - Drift detection: feature distribution vs training baseline
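The drift-detection bullet is the one interviewers tend to probe, so it helps to have one concrete metric in mind. The sketch below uses the Population Stability Index (PSI), which is my choice of illustration rather than something the design above prescribes. The `psi` helper, the bin count, and the synthetic feature values are all assumptions.

```python
import math


def psi(baseline, production, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0   # fall back to 1.0 if the baseline is constant

    def histogram(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1   # clamp out-of-range values
        return [max(c / len(values), eps) for c in counts]   # avoid log(0)

    p, q = histogram(baseline), histogram(production)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


baseline = [i / 100 for i in range(100)]        # training-time feature values
same = list(baseline)                           # production matches training
shifted = [0.5 + i / 200 for i in range(100)]   # production skewed to the upper half

no_drift = psi(baseline, same)
drift = psi(baseline, shifted)
```

A common rule of thumb treats PSI above roughly 0.25 as a significant shift, but the alert threshold should be tuned per feature on real traffic.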
# TF Serving gRPC client
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc
def predict_grpc(images, stub):
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'my_model'
    request.model_spec.signature_name = 'serving_default'
    request.inputs['images'].CopyFrom(
        tf.make_tensor_proto(images, dtype=tf.float32)
    )
    result = stub.Predict(request, timeout=10)  # timeout in seconds
    return tf.make_ndarray(result.outputs['class_ids'])
# 8500 is the default TF Serving gRPC port
channel = grpc.insecure_channel('localhost:8500')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
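For the external REST path mentioned in the design, the request is a JSON POST to the model's `:predict` endpoint. The sketch below only builds the URL and body, so it runs without a server. `build_predict_request` is a hypothetical helper, port 8501 is assumed to be the REST port, and the tiny 2x2 image stands in for a real 224x224 input.

```python
import json


def build_predict_request(images, host='localhost', port=8501,
                          model_name='my_model', signature='serving_default'):
    """Return (url, body) for a TF Serving REST predict call in columnar format."""
    url = f'http://{host}:{port}/v1/models/{model_name}:predict'
    body = json.dumps({
        'signature_name': signature,
        'inputs': {'images': images},   # key matches the serving signature's input name
    })
    return url, body


# One 2x2 RGB image as nested lists (batch of 1), standing in for real input
tiny_batch = [[[[0.0, 0.0, 0.0], [255.0, 255.0, 255.0]],
               [[128.0, 64.0, 32.0], [10.0, 20.0, 30.0]]]]
url, body = build_predict_request(tiny_batch)
```

Send `body` as a POST with a `Content-Type: application/json` header. JSON-encoding image tensors is far bulkier than the gRPC tensor proto, which is the main reason the design keeps gRPC for internal traffic.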
TensorFlow Ecosystem at a Glance
| Component | Purpose | Alternative |
|---|---|---|
| Keras | High-level model API | PyTorch nn.Module |
| tf.data | Input pipeline | PyTorch DataLoader |
| TFLite | Mobile/edge deployment | ONNX + ORT, CoreML |
| TF Serving | Production REST/gRPC serving | TorchServe, Triton |
| TFX | End-to-end ML pipelines | MLflow, Kubeflow |
| TF Hub | Pre-trained models | HuggingFace Hub |
| TensorBoard | Training visualization | W&B, MLflow UI |
| TFMA | Model evaluation | Custom evaluation |
FAQ
Q: Should I learn TensorFlow or PyTorch for ML interviews?
A: Both. PyTorch is the standard for research and many startups. TensorFlow remains common at Google-adjacent teams and many enterprises. Start with PyTorch for intuition; learn TF 2.x/Keras for deployment tooling.
Q: What is the difference between Keras and TensorFlow?
A: Keras is the high-level API that ships with TensorFlow 2.x, where you import it as tf.keras. As of 2024 it is also a standalone multi-backend library (Keras 3) that can run on top of TensorFlow, JAX, or PyTorch.
Q: What is JAX and how does it relate to TensorFlow?
A: JAX is a Python library from Google for numerical computing with automatic differentiation. It is the foundation for Google's internal research (replacing TF in many teams) and is used by DeepMind, Anthropic, and others. It uses NumPy syntax but runs on accelerators via XLA.
Methodology applied to this article · last verified 8 Jun 2026
- No fabricated salary numbers or success rates. If we quote a range, it's sourced.
- No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
- No paid placements, sponsored coaching links, or affiliate-shilled course pushes.