TensorFlow Interview Questions 2026: 28 Answers with Code

28 TensorFlow interview questions with full code answers covering TF 2.x Keras, custom training, tf.data pipelines, TFLite, TF Serving, and distributed training for 2026 interviews.

By Aditya Sharma. Published 8 Jun 2026. Last revised 8 Jun 2026.

TensorFlow 2.x with Keras is widely used for enterprise ML at Google, on Google Cloud, and across many production ML teams worldwide. While PyTorch leads in research, TensorFlow's strength is deployment tooling: TFLite for mobile, TF Serving for production APIs, TFX for pipelines, and Google's TPU infrastructure. This guide covers 28 TensorFlow interview questions with complete code examples.

PapersAdda's take: If you're interviewing at a Google-adjacent company, a large bank, or any company with legacy ML infrastructure, you will likely encounter TensorFlow. Know TF 2.x / Keras well, understand GradientTape for custom training, and be ready to explain TFLite deployment. Candidates report that GradientTape custom training loops and tf.data pipeline design are the two most frequently tested TensorFlow topics at Google and banking ML teams, and candidate accounts from public preparation resources indicate that TFLite deployment questions appear in interviews for mobile ML roles. Confirm the specific technology stack and interview format on the official careers portal before your round.

Which Companies Ask TensorFlow Questions?

Company Type	Why They Use TF
Google and subsidiaries	Created TensorFlow; internal standard
Cloud ML services (GCP, AWS SageMaker)	Native TF integration
Enterprise finance and banking	Production reliability, TF Serving
Mobile-first companies	TFLite for on-device ML
Research at some labs	JAX (TF sibling) for TPU work

EASY: TF 2.x and Keras Fundamentals (Questions 1-10)

Q1. What is the difference between TensorFlow 1.x and TensorFlow 2.x?

Aspect	TF 1.x	TF 2.x
Execution	Deferred (build graph, then session.run)	Eager (immediate, like Python)
API	Low-level graph API	Keras-first, high-level
Debugging	Hard (graph is opaque)	Easy (standard Python debugging)
Performance	Optimized graphs	Eager by default; tf.function for graphs
Migration	N/A	tf.compat.v1 shim available

import tensorflow as tf

print(tf.__version__)   # 2.15 or higher in 2026

# TF 2.x: eager execution by default
a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])
c = tf.matmul(a, b)
print(c.numpy())   # immediate result, no session needed

# tf.function: compiles to graph for performance
@tf.function(jit_compile=True)  # XLA compilation for TPU/GPU speedup
def matrix_power(x, n):
    result = tf.eye(x.shape[0], dtype=x.dtype)
    for _ in range(n):
        result = tf.matmul(result, x)
    return result

Q2. How do you build a neural network with Keras? What are the three API styles?

API	Use When	Flexibility
Sequential	Simple linear stacks	Low
Functional	Multiple inputs/outputs, branches	High
Subclassing (Model)	Custom forward pass logic	Highest

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Sequential API
seq_model = keras.Sequential([
    layers.Dense(256, activation='relu', input_shape=(784,)),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Functional API (handles multiple inputs/outputs, skip connections)
inputs = keras.Input(shape=(784,))
x = layers.Dense(256, activation='gelu')(inputs)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.3)(x)
x = layers.Dense(128, activation='gelu')(x)
outputs = layers.Dense(10, activation='softmax')(x)
func_model = keras.Model(inputs, outputs, name='mlp')

# Subclassing API (full PyTorch-like control)
class ResBlock(keras.Model):
    def __init__(self, units):
        super().__init__()
        self.dense1 = layers.Dense(units, activation='gelu')
        self.dense2 = layers.Dense(units)
        self.bn     = layers.BatchNormalization()
        self.proj   = layers.Dense(units)

    def call(self, x, training=False):
        residual = self.proj(x)
        out = self.dense1(x)
        out = self.dense2(out)
        out = self.bn(out, training=training)
        return keras.activations.gelu(out + residual)

Q3. What is GradientTape and how do you write a custom training loop?

import tensorflow as tf

# Build model and optimizer
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10)
])
optimizer = tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-2)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
train_acc = tf.keras.metrics.SparseCategoricalAccuracy()

# Custom training step
@tf.function   # compile to graph for speed
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss   = loss_fn(y, logits)
        loss  += sum(model.losses)   # regularization losses
    gradients = tape.gradient(loss, model.trainable_variables)
    # Gradient clipping
    gradients, _ = tf.clip_by_global_norm(gradients, 1.0)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_acc.update_state(y, logits)
    return loss

# Training loop (n_epochs and train_dataset are assumed to be defined elsewhere)
for epoch in range(n_epochs):
    train_acc.reset_state()
    for x_batch, y_batch in train_dataset:
        loss = train_step(x_batch, y_batch)
    print(f"Epoch {epoch+1}: loss={loss:.4f}, acc={train_acc.result():.4f}")

Q4. What is tf.data? How do you build an efficient input pipeline?

import tensorflow as tf

# From numpy arrays
dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))

# From files (lazy loading)
file_dataset = tf.data.Dataset.list_files('data/*.tfrecord')

# Efficient pipeline
AUTOTUNE = tf.data.AUTOTUNE
BATCH_SIZE = 64

def augment(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, 0.2)
    image = tf.image.random_contrast(image, 0.8, 1.2)
    return image, label

train_pipeline = (
    dataset
    .shuffle(buffer_size=10000, seed=42)    # shuffle before batch
    .map(augment, num_parallel_calls=AUTOTUNE)  # parallel preprocessing
    .batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(AUTOTUNE)                     # overlap data loading with GPU
)

# TFRecord pipeline (most efficient for large datasets)
def parse_tfrecord(serialized_example):
    feature_spec = {
        'image': tf.io.FixedLenFeature([], tf.string),
        'label': tf.io.FixedLenFeature([], tf.int64)
    }
    parsed = tf.io.parse_single_example(serialized_example, feature_spec)
    image = tf.io.decode_jpeg(parsed['image'], channels=3)  # force 3 channels so images batch cleanly
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, parsed['label']

tfrecord_dataset = (
    file_dataset
    .interleave(tf.data.TFRecordDataset,
                cycle_length=4, num_parallel_calls=AUTOTUNE)
    .map(parse_tfrecord, num_parallel_calls=AUTOTUNE)
    .batch(BATCH_SIZE)
    .prefetch(AUTOTUNE)
)
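
The order of operations in the pipelines above matters: shuffling acts on individual examples, batching groups them, and drop_remainder=True discards a final partial batch. A minimal sketch on a toy range dataset makes the resulting batch shapes easy to inspect. The dataset and the doubling map are stand-ins chosen for illustration, not part of the pipelines above.

```python
import tensorflow as tf

# Toy stand-in for a real dataset: the integers 0..9.
toy = tf.data.Dataset.range(10)

toy_pipeline = (
    toy
    .shuffle(buffer_size=10, seed=0)                           # shuffle examples first
    .map(lambda x: x * 2, num_parallel_calls=tf.data.AUTOTUNE) # stand-in for preprocessing
    .batch(4, drop_remainder=True)                             # 10 examples -> 2 full batches
    .prefetch(tf.data.AUTOTUNE)
)

batch_shapes = [tuple(batch.shape) for batch in toy_pipeline]
seen = [int(v) for batch in toy_pipeline for v in batch.numpy()]
print(batch_shapes, len(seen))
```

With 10 examples and a batch size of 4, two examples are dropped in each pass; without drop_remainder the last batch would have a different shape, which matters for TPUs and other fixed-shape consumers.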

Q5. How do you implement regularization in Keras?

from tensorflow.keras import layers, regularizers

# L2 weight regularization
dense = layers.Dense(
    256,
    activation='relu',
    kernel_regularizer=regularizers.L2(l2=0.001),
    bias_regularizer=regularizers.L2(l2=0.001)
)

# Dropout
dropout = layers.Dropout(rate=0.3)

# Batch Normalization (also has regularization effect)
bn = layers.BatchNormalization(
    momentum=0.99,   # running average for mean/variance
    epsilon=1e-3
)

# Spatial Dropout (for CNN: drop entire feature maps)
spatial_dropout = layers.SpatialDropout2D(rate=0.2)

# Full regularized CNN block
def regularized_conv_block(filters, kernel_size=3):
    return tf.keras.Sequential([
        layers.Conv2D(filters, kernel_size, padding='same', use_bias=False,
                       kernel_regularizer=regularizers.L2(1e-4)),
        layers.BatchNormalization(),
        layers.ReLU(),
        layers.SpatialDropout2D(0.1)
    ])

Q6. What are callbacks in Keras? What are the essential ones?

import tensorflow as tf

callbacks = [
    # Stop training when val_loss stops improving
    tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True,
        verbose=1
    ),

    # Reduce LR when plateau detected
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,     # multiply LR by 0.5
        patience=3,
        min_lr=1e-7,
        verbose=1
    ),

    # Save best model checkpoint
    tf.keras.callbacks.ModelCheckpoint(
        filepath='checkpoints/model_{epoch:03d}_{val_accuracy:.4f}.keras',
        monitor='val_accuracy',
        save_best_only=True,
        save_weights_only=False
    ),

    # TensorBoard logging
    tf.keras.callbacks.TensorBoard(
        log_dir='./logs',
        histogram_freq=1,       # log weight histograms every epoch
        write_graph=True,
        update_freq='epoch'
    ),

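]

# Custom callback: a class statement cannot appear inside a list literal,
# so define the class separately and append an instance
class LearningRateLogger(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        lr = float(self.model.optimizer.learning_rate)
        print(f"\nEpoch {epoch}: LR = {lr:.6f}")

callbacks.append(LearningRateLogger())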

model.fit(train_data, validation_data=val_data,
          epochs=100, callbacks=callbacks)
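
The patience logic behind EarlyStopping can be sketched without Keras. The function below is a simplified illustration, not the library's implementation: `stopping_epoch` and the sample loss values are hypothetical, and options such as min_delta and restore_best_weights are ignored.

```python
def stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0                      # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0  # improvement: remember it and reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch      # no improvement for `patience` epochs in a row
    return None

# Hypothetical validation losses: the best value so far is 0.80 at epoch 1.
losses = [1.00, 0.80, 0.90, 0.85, 0.81, 0.79]
stop_at = stopping_epoch(losses, patience=3)
print(stop_at)
```

In this run, training stops before the later improvement to 0.79 is ever seen, which is the usual trade-off when choosing a small patience.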

Q7. How do you save and load models in TensorFlow?

import tensorflow as tf

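# SavedModel format (language-neutral; the format TF Serving loads)
# Note: with Keras 3 (TF 2.16+), model.save() expects a .keras or .h5 path;
# use model.export('saved_model/my_model') to write a SavedModel there.
model.save('saved_model/my_model')      # directory with assets + variables
loaded = tf.saved_model.load('saved_model/my_model')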

# Keras format (.keras, new in TF 2.12)
model.save('model.keras')               # single file, Keras only
loaded = tf.keras.models.load_model('model.keras')

# Weights only (for checkpoint/transfer learning)
model.save_weights('weights/checkpoint')
model.load_weights('weights/checkpoint')

# HDF5 format (legacy)
model.save('model.h5')
loaded = tf.keras.models.load_model('model.h5')

# Inspect a SavedModel
print(tf.saved_model.load('saved_model/my_model').signatures)

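# Load and run inference. The keyword argument must match the input name in the
# signature (it depends on the model's input layer); check
# infer.structured_input_signature if 'input_tensor' is rejected.
infer = tf.saved_model.load('saved_model/my_model').signatures['serving_default']
output = infer(input_tensor=tf.constant(X_test[:5]))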

Q8. What is tf.function and when should you use it?

import tensorflow as tf
import time

model = tf.keras.Sequential([tf.keras.layers.Dense(1024, activation='relu'),
                               tf.keras.layers.Dense(10)])

# Without tf.function: eager, Python overhead
def eager_predict(x):
    return model(x, training=False)

# With tf.function: compiled graph
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 100], dtype=tf.float32)])
def fast_predict(x):
    return model(x, training=False)

x = tf.random.normal([1000, 100])

# Warmup (first call traces the graph)
_ = fast_predict(x)

t = time.time()
for _ in range(100):
    eager_predict(x)
print(f"Eager: {time.time()-t:.2f}s")

t = time.time()
for _ in range(100):
    fast_predict(x)
print(f"tf.function: {time.time()-t:.2f}s")  # usually faster; the gain depends on model size and hardware

# Gotcha: retracing
# If you call with different dtypes/shapes without input_signature,
# TF will retrace each time (expensive). Use input_signature to prevent this.
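
The retracing gotcha can be observed directly by counting traces. This is a small sketch: the names `square`, `square_signed`, and `trace_counts` are made up for illustration, and it relies on the fact that Python side effects inside a tf.function run only while tracing.

```python
import tensorflow as tf

trace_counts = {"unsigned": 0, "signed": 0}

@tf.function
def square(x):
    trace_counts["unsigned"] += 1   # Python side effect: runs only during tracing
    return x * x

@tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
def square_signed(x):
    trace_counts["signed"] += 1     # one trace covers every 1-D float32 input
    return x * x

# Two calls with shape [2], then one call with shape [3].
for values in ([1.0, 2.0], [3.0, 4.0], [1.0, 2.0, 3.0]):
    t = tf.constant(values)
    square(t)
    square_signed(t)

print(trace_counts)
```

The function without a signature should be traced once per distinct input shape, while the one with a `[None]` signature should be traced a single time.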

Q9. How do you implement custom layers and losses in Keras?

import tensorflow as tf
from tensorflow import keras

# Custom Layer
class ScaledDotProductAttention(keras.layers.Layer):
    def __init__(self, d_k, **kwargs):
        super().__init__(**kwargs)
        self.d_k = d_k
        self.scale = d_k ** -0.5

    def call(self, Q, K, V, mask=None):
        scores = tf.matmul(Q, K, transpose_b=True) * self.scale
        if mask is not None:
            scores += (1.0 - tf.cast(mask, tf.float32)) * (-1e9)
        weights = tf.nn.softmax(scores, axis=-1)
        return tf.matmul(weights, V), weights

    def get_config(self):   # needed for model.save()
        config = super().get_config()
        config.update({'d_k': self.d_k})
        return config

# Custom Loss
class FocalLoss(keras.losses.Loss):
    def __init__(self, gamma=2.0, alpha=0.25, **kwargs):
        super().__init__(**kwargs)
        self.gamma = gamma
        self.alpha = alpha

    def call(self, y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1 - 1e-7)
        ce = -y_true * tf.math.log(y_pred)
        pt = tf.where(y_true == 1, y_pred, 1 - y_pred)
        focal_weight = self.alpha * (1 - pt) ** self.gamma
        return tf.reduce_mean(tf.reduce_sum(focal_weight * ce, axis=-1))

# Custom Metric
class F1Score(keras.metrics.Metric):
    def __init__(self, threshold=0.5, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold
        self.tp = self.add_weight(name='tp', initializer='zeros')
        self.fp = self.add_weight(name='fp', initializer='zeros')
        self.fn = self.add_weight(name='fn', initializer='zeros')

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.cast(y_pred >= self.threshold, tf.float32)
        self.tp.assign_add(tf.reduce_sum(y_true * y_pred))
        self.fp.assign_add(tf.reduce_sum((1 - y_true) * y_pred))
        self.fn.assign_add(tf.reduce_sum(y_true * (1 - y_pred)))

    def result(self):
        precision = self.tp / (self.tp + self.fp + 1e-7)
        recall    = self.tp / (self.tp + self.fn + 1e-7)
        return 2 * precision * recall / (precision + recall + 1e-7)

    def reset_state(self):
        for v in self.variables:
            v.assign(tf.zeros_like(v))
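
To see what the focal weighting in FocalLoss does, it helps to evaluate the formula for a single example. The sketch below is plain Python rather than Keras: `focal_term` is a hypothetical helper that takes the predicted probability of the true class, which is what the one-hot form above reduces to.

```python
import math

def focal_term(p_true, gamma=2.0, alpha=0.25):
    """Focal loss for one example, given the predicted probability of its true class."""
    cross_entropy = -math.log(p_true)
    modulating = (1.0 - p_true) ** gamma   # close to 0 for confident, correct predictions
    return alpha * modulating * cross_entropy

easy = focal_term(0.9)   # well-classified example
hard = focal_term(0.1)   # badly misclassified example
print(easy, hard, hard / easy)
```

Plain cross-entropy separates these two examples by a factor of roughly 20; the modulating factor widens the gap considerably, which is why focal loss shifts training effort toward hard examples.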

Q10. What is mixed precision training in TensorFlow?

import tensorflow as tf

# Enable mixed precision (float16 compute, float32 weights)
policy = tf.keras.mixed_precision.Policy('mixed_float16')
tf.keras.mixed_precision.set_global_policy(policy)

# Build model (same code; weights auto-cast)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1024, activation='relu'),
    tf.keras.layers.Dense(10, dtype='float32')   # keep final layer float32
])

# Loss scaling prevents gradient underflow in float16
optimizer = tf.keras.optimizers.AdamW(1e-3)
optimizer = tf.keras.mixed_precision.LossScaleOptimizer(optimizer)

# In custom loop:
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
        scaled_loss = optimizer.get_scaled_loss(loss)
    scaled_grads = tape.gradient(scaled_loss, model.trainable_variables)
    grads = optimizer.get_unscaled_gradients(scaled_grads)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# bfloat16 (better for TPU, no loss scaling needed)
tpu_policy = tf.keras.mixed_precision.Policy('mixed_bfloat16')

MEDIUM: Advanced Keras and TF Ecosystem (Questions 11-20)

Q11. How do you implement transfer learning with TensorFlow Hub?

import tensorflow_hub as hub
import tensorflow as tf

# EfficientNetV2 from TF Hub
hub_url = 'https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet21k_ft1k_l/feature_vector/2'
feature_extractor = hub.KerasLayer(hub_url, trainable=False, input_shape=(384, 384, 3))

# Build classification model
model = tf.keras.Sequential([
    feature_extractor,
    tf.keras.layers.Dense(256, activation='gelu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(num_classes, activation='softmax')
])

model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Two-phase fine-tuning
# Phase 1: train only head
feature_extractor.trainable = False
model.fit(train_ds, epochs=5, validation_data=val_ds)

# Phase 2: unfreeze backbone with small LR
feature_extractor.trainable = True
model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-5),  # 100x smaller LR
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
model.fit(train_ds, epochs=15, validation_data=val_ds)

Q12. What is the TFRecord format and why is it used for large datasets?

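TFRecord is TensorFlow's binary container format: a file holds a sequence of serialized records, typically tf.train.Example protocol buffers. It suits large datasets for three reasons:

Sequential reads (no seeking): much faster than reading individual image files
Lossless encoding of heterogeneous data (images, text, labels)
Works seamlessly with tf.data pipeline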

import tensorflow as tf
import numpy as np

# Write TFRecords
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def serialize_example(image_array, label):
    encoded = tf.image.encode_jpeg(image_array).numpy()
    feature = {
        'image': _bytes_feature(encoded),
        'label': _int64_feature(label),
        'height': _int64_feature(image_array.shape[0]),
        'width':  _int64_feature(image_array.shape[1])
    }
    proto = tf.train.Example(features=tf.train.Features(feature=feature))
    return proto.SerializeToString()

with tf.io.TFRecordWriter('train.tfrecord') as writer:
    for img, lbl in zip(images, labels):
        writer.write(serialize_example(img, lbl))

# Read TFRecords
feature_spec = {
    'image': tf.io.FixedLenFeature([], tf.string),
    'label': tf.io.FixedLenFeature([], tf.int64)
}

@tf.function
def parse_example(serialized):
    example = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(example['image'], channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, example['label']

dataset = (tf.data.TFRecordDataset('train.tfrecord')
           .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(64).prefetch(tf.data.AUTOTUNE))

Q13. How do you use distributed training with tf.distribute?

import tensorflow as tf

# Multi-GPU on single machine
strategy = tf.distribute.MirroredStrategy()

# Multi-machine (parameter server)
ps_strategy = tf.distribute.experimental.ParameterServerStrategy(
    cluster_resolver=tf.distribute.cluster_resolver.TFConfigClusterResolver()
)

# TPU strategy
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
tpu_strategy = tf.distribute.TPUStrategy(resolver)

# Usage pattern (same code works across strategies)
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(1024, activation='relu', input_shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(
        optimizer=tf.keras.optimizers.AdamW(1e-3),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )

# Adjust batch size: GLOBAL_BATCH = per_replica_batch * num_replicas
GLOBAL_BATCH_SIZE = 64 * strategy.num_replicas_in_sync
train_dataset = train_dataset.rebatch(GLOBAL_BATCH_SIZE)

model.fit(train_dataset, epochs=20)

# Custom training loop with distribute
@tf.function
def distributed_train_step(dataset_inputs):
    per_replica_losses = strategy.run(train_step, args=(dataset_inputs,))
    return strategy.reduce(tf.distribute.ReduceOp.SUM,
                            per_replica_losses, axis=None)
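
Reducing with ReduceOp.SUM only gives the right loss if each replica has already divided its per-example losses by the global batch size (TensorFlow provides tf.nn.compute_average_loss for this). The arithmetic can be checked in plain Python; the loss values below are hypothetical.

```python
# Per-example losses on two replicas (hypothetical values).
replica_losses = [[0.5, 1.5], [1.0, 3.0]]
global_batch_size = sum(len(r) for r in replica_losses)   # 4 examples in total

# Each replica divides by the GLOBAL batch size, then the results are summed.
scaled = [sum(r) / global_batch_size for r in replica_losses]
summed = sum(scaled)

# Reference: the mean over every example in the global batch.
all_losses = [loss for r in replica_losses for loss in r]
global_mean = sum(all_losses) / len(all_losses)

# Naive alternative: per-replica mean followed by SUM overcounts by num_replicas.
naive = sum(sum(r) / len(r) for r in replica_losses)

print(summed, global_mean, naive)
```

The scaled-then-summed value matches the global mean, while the naive version is larger by a factor equal to the number of replicas, which would silently multiply the effective learning rate.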

Q14. What is TFLite and how do you convert a model for mobile deployment?

import tensorflow as tf

# Convert Keras model to TFLite
model = tf.keras.models.load_model('trained_model.keras')

# Option 1: Float32 conversion (no quality loss)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Option 2: Dynamic range quantization (minimal accuracy loss)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant = converter.convert()

# Option 3: Full INT8 quantization (best performance on edge)
def representative_dataset():
    for x_batch, _ in val_dataset.take(100):
        yield [x_batch]

converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type  = tf.int8
converter.inference_output_type = tf.int8
tflite_int8 = converter.convert()

with open('model_int8.tflite', 'wb') as f:
    f.write(tflite_int8)

# Run inference with TFLite interpreter
interpreter = tf.lite.Interpreter(model_path='model_int8.tflite')
interpreter.allocate_tensors()

input_details  = interpreter.get_input_details()
output_details = interpreter.get_output_details()

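# Quantize the float input with the model's input scale and zero point;
# a bare astype('int8') would truncate values instead of quantizing them
scale, zero_point = input_details[0]['quantization']
x_int8 = (X_test[:1] / scale + zero_point).round().clip(-128, 127).astype('int8')
interpreter.set_tensor(input_details[0]['index'], x_int8)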
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])
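
Full INT8 quantization stores each tensor as 8-bit integers plus a scale and a zero point, with real_value ≈ (int8_value - zero_point) * scale. The NumPy sketch below illustrates that mapping. The scale, zero point, and input values are made up; a converted model reports its own parameters through get_input_details() and get_output_details().

```python
import numpy as np

scale, zero_point = 0.05, -10          # hypothetical quantization parameters
x = np.array([-1.0, 0.013, 0.52, 2.0], dtype=np.float32)

# Quantize: divide by scale, shift by the zero point, round, clip to the int8 range.
q = np.clip(np.round(x / scale + zero_point), -128, 127).astype(np.int8)

# Dequantize: the inverse affine map recovers an approximation of x.
x_hat = (q.astype(np.float32) - zero_point) * scale

max_error = float(np.max(np.abs(x - x_hat)))
print(q, x_hat, max_error)
```

For values inside the representable range, the round-trip error is bounded by half the scale, so a well-chosen scale (estimated from the representative dataset) is what keeps accuracy loss small.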

Q15. How does TF Serving work? How do you deploy a model as a REST API?

# Save model in SavedModel format with versioning
# model/
#   1/         <- version number
#     assets/
#     variables/
#     saved_model.pb

import tensorflow as tf

# Export for TF Serving with specific serving signature
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 224, 224, 3],
                                              dtype=tf.float32)])
def serving_fn(x):
    return {'predictions': model(x, training=False)}

tf.saved_model.save(model, 'tf_serving_model/1',
                     signatures={'serving_default': serving_fn})

# Start TF Serving with Docker (shell command, run outside Python)
docker run -p 8501:8501 \
  --mount type=bind,source=$(pwd)/tf_serving_model,target=/models/my_model \
  -e MODEL_NAME=my_model \
  tensorflow/serving

# REST API call (shell command)
curl -X POST http://localhost:8501/v1/models/my_model:predict \
  -d '{"instances": [[0.1, 0.2, ..., 0.9]]}'

# Python client
import requests
import json
import numpy as np

data = json.dumps({'instances': X_test[:5].tolist()})
response = requests.post(
    'http://localhost:8501/v1/models/my_model:predict',
    data=data,
    headers={'Content-Type': 'application/json'}
)
predictions = response.json()['predictions']

Q16. What is Keras Tuner? How do you use it for hyperparameter optimization?

import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    """Model builder function; hp is the HyperParameters object."""
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(784,)))

    # Search over number of layers and units
    for i in range(hp.Int('num_layers', min_value=1, max_value=5)):
        units = hp.Choice(f'units_{i}', values=[64, 128, 256, 512])
        model.add(tf.keras.layers.Dense(units, activation='gelu'))
        model.add(tf.keras.layers.BatchNormalization())
        if hp.Boolean(f'dropout_{i}'):
            model.add(tf.keras.layers.Dropout(
                hp.Float(f'dropout_rate_{i}', min_value=0.1, max_value=0.5, step=0.1)
            ))

    model.add(tf.keras.layers.Dense(10, activation='softmax'))

    lr = hp.Float('lr', min_value=1e-5, max_value=1e-2, sampling='log')
    model.compile(
        optimizer=tf.keras.optimizers.AdamW(lr),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    return model

# Bayesian optimization search
tuner = kt.BayesianOptimization(
    build_model,
    objective='val_accuracy',
    max_trials=50,
    directory='kt_search',
    project_name='mlp_tuning'
)

tuner.search(X_train, y_train, epochs=20,
              validation_split=0.2,
              callbacks=[tf.keras.callbacks.EarlyStopping(patience=3)])

best_model = tuner.get_best_models(num_models=1)[0]
print(tuner.get_best_hyperparameters()[0].values)

Q17. How do you implement class activation maps (CAM) in TensorFlow?

import tensorflow as tf
import numpy as np
import cv2

def compute_gradcam(model, img_array, layer_name, class_idx=None):
    """
    Compute Grad-CAM for a given image and layer.
    """
    # Create a model that outputs the target layer + final output
    grad_model = tf.keras.models.Model(
        inputs=model.inputs,
        outputs=[model.get_layer(layer_name).output, model.output]
    )

    with tf.GradientTape() as tape:
        conv_outputs, predictions = grad_model(img_array)
        if class_idx is None:
            class_idx = tf.argmax(predictions[0])
        class_score = predictions[:, class_idx]

    # Gradients of class score w.r.t. conv layer output
    grads = tape.gradient(class_score, conv_outputs)   # [1, H, W, C]
    pooled_grads = tf.reduce_mean(grads, axis=[0, 1, 2])  # [C]

    # Weighted combination of feature maps
    conv_outputs = conv_outputs[0]
    heatmap = tf.reduce_sum(conv_outputs * pooled_grads, axis=-1)
    heatmap = tf.nn.relu(heatmap)                          # keep positive evidence only
    heatmap = heatmap / (tf.reduce_max(heatmap) + 1e-8)    # normalize to [0, 1]

    return heatmap.numpy(), int(class_idx)

# Example usage
img_array = tf.expand_dims(img_tensor, 0)  # [1, H, W, 3]
heatmap, predicted_class = compute_gradcam(model, img_array, 'block5_conv3')

# Overlay heatmap on original image
heatmap_resized = cv2.resize(heatmap, (original_w, original_h))
heatmap_colored = cv2.applyColorMap(np.uint8(255 * heatmap_resized), cv2.COLORMAP_JET)
superimposed = cv2.addWeighted(original_img, 0.6, heatmap_colored, 0.4, 0)

Q18. What is TFX (TensorFlow Extended)? What are its components?

Component	Purpose
ExampleGen	Ingest raw data; split into train/eval
StatisticsGen	Compute statistics over data
SchemaGen	Infer data schema (types, ranges)
ExampleValidator	Detect data anomalies vs schema
Transform	Feature engineering with tf.Transform
Trainer	Train model using Estimator or Keras
Evaluator	Compute metrics; slice-based evaluation
InfraValidator	Validate model can be loaded for serving
Pusher	Deploy to TF Serving or TFLite

from tfx.components import (
    CsvExampleGen, StatisticsGen, SchemaGen, ExampleValidator,
    Transform, Trainer, Evaluator, Pusher
)
from tfx.proto import trainer_pb2  # provides TrainArgs and EvalArgs used below
import tensorflow_model_analysis as tfma

# Pipeline definition
def create_pipeline(pipeline_root, data_root):
    example_gen = CsvExampleGen(input_base=data_root)
    stats_gen   = StatisticsGen(examples=example_gen.outputs['examples'])
    schema_gen  = SchemaGen(statistics=stats_gen.outputs['statistics'])
    validator   = ExampleValidator(
        statistics=stats_gen.outputs['statistics'],
        schema=schema_gen.outputs['schema']
    )
    transform = Transform(
        examples=example_gen.outputs['examples'],
        schema=schema_gen.outputs['schema'],
        module_file='preprocessing.py'
    )
    trainer = Trainer(
        module_file='trainer.py',
        examples=transform.outputs['transformed_examples'],
        schema=schema_gen.outputs['schema'],
        train_args=trainer_pb2.TrainArgs(num_steps=1000),
        eval_args=trainer_pb2.EvalArgs(num_steps=200)
    )
    return [example_gen, stats_gen, schema_gen, validator, transform, trainer]

Q19. How does TensorFlow handle text preprocessing with Keras preprocessing layers?

import tensorflow as tf
from tensorflow.keras import layers

# TextVectorization: converts raw strings to integer sequences
vectorizer = layers.TextVectorization(
    max_tokens=20000,
    output_mode='int',
    output_sequence_length=256,
    ngrams=None
)
vectorizer.adapt(text_dataset)   # fit vocabulary on training data

# Embedding layer
embedding = layers.Embedding(
    input_dim=20000,
    output_dim=128,
    mask_zero=True   # handle variable-length sequences
)

# Full text classification model (preprocessing inside model)
inputs = tf.keras.Input(shape=(1,), dtype=tf.string)
x = vectorizer(inputs)
x = embedding(x)
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dense(64, activation='gelu')(x)
outputs = layers.Dense(1, activation='sigmoid')(x)

model = tf.keras.Model(inputs, outputs)

# At inference: pass raw strings directly
model.predict(["This is a test sentence."])  # no external tokenizer needed

# Count-based encodings as alternatives to learned embeddings
# (call .adapt() on the training text before use, as above)
bag_of_words = layers.TextVectorization(
    max_tokens=50000,
    output_mode='multi_hot'   # binary vector of length max_tokens
)
tfidf = layers.TextVectorization(
    max_tokens=50000,
    output_mode='tf_idf'
)

Q20. What is the difference between eager execution and graph execution in TF 2.x?

Aspect	Eager	Graph (tf.function)
Execution	Immediate, line by line	Deferred; compiled DAG
Debugging	Easy: print(), pdb, Python stack traces	Hard: print inside @tf.function needs tf.print
Speed	Slower (Python overhead per op)	Faster (graph optimizations, XLA)
Deployment	Cannot deploy without Python	Portable SavedModel
When to use	Prototyping, debugging	Production, training loops

import tensorflow as tf

# Eager (default in TF 2.x)
a = tf.constant([1.0, 2.0])
print(a + 1)   # immediate: tf.Tensor([2. 3.], ...)

# tf.function: tracing converts to graph
@tf.function
def compute(x, y):
    # Python print only runs at TRACE time (first call)
    print("Tracing!")       # printed only once
    tf.print("Value:", x)   # printed at every EXECUTION
    return x * y + tf.reduce_sum(y)

# First call: traces the function (runs Python print)
result = compute(tf.constant(2.0), tf.constant([1.0, 2.0, 3.0]))
# "Tracing!" printed once

# Second call: uses cached graph (Python print NOT called again)
result = compute(tf.constant(3.0), tf.constant([1.0, 2.0, 3.0]))
# "Tracing!" NOT printed; tf.print IS printed (graph execution)

# Gotcha: re-tracing happens if input shapes change (without input_signature)
result = compute(tf.constant([1.0, 2.0, 3.0]), tf.constant([1.0, 2.0, 3.0]))
# "Tracing!" printed again (x is now a vector instead of a scalar)

HARD: Advanced TF Topics (Questions 21-28)

Q21. How do you implement a Transformer model from scratch in TensorFlow?

import tensorflow as tf
from tensorflow.keras import layers

class MultiHeadSelfAttention(layers.Layer):
    def __init__(self, d_model, num_heads, **kwargs):
        super().__init__(**kwargs)
        assert d_model % num_heads == 0
        self.d_model = d_model
        self.num_heads = num_heads
        self.d_k = d_model // num_heads

        self.W_q = layers.Dense(d_model)
        self.W_k = layers.Dense(d_model)
        self.W_v = layers.Dense(d_model)
        self.W_o = layers.Dense(d_model)

    def split_heads(self, x, batch_size):
        x = tf.reshape(x, [batch_size, -1, self.num_heads, self.d_k])
        return tf.transpose(x, [0, 2, 1, 3])  # [B, H, T, d_k]

    def call(self, x, mask=None, training=False):
        B = tf.shape(x)[0]
        Q = self.split_heads(self.W_q(x), B)
        K = self.split_heads(self.W_k(x), B)
        V = self.split_heads(self.W_v(x), B)

        scores = tf.matmul(Q, K, transpose_b=True) / tf.sqrt(float(self.d_k))
        if mask is not None:
            scores += (1 - tf.cast(mask, tf.float32)) * -1e9
        weights = tf.nn.softmax(scores, axis=-1)

        out = tf.matmul(weights, V)                     # [B, H, T, d_k]
        out = tf.transpose(out, [0, 2, 1, 3])           # [B, T, H, d_k]
        out = tf.reshape(out, [B, -1, self.d_model])    # [B, T, d_model]
        return self.W_o(out)

class TransformerBlock(layers.Layer):
    def __init__(self, d_model, num_heads, dff, dropout=0.1, **kwargs):
        super().__init__(**kwargs)
        self.attn  = MultiHeadSelfAttention(d_model, num_heads)
        self.ffn   = tf.keras.Sequential([
            layers.Dense(dff, activation='gelu'),
            layers.Dense(d_model)
        ])
        self.norm1 = layers.LayerNormalization(epsilon=1e-6)
        self.norm2 = layers.LayerNormalization(epsilon=1e-6)
        self.drop1 = layers.Dropout(dropout)
        self.drop2 = layers.Dropout(dropout)

    def call(self, x, mask=None, training=False):
        attn_out = self.attn(self.norm1(x), mask=mask, training=training)
        x = x + self.drop1(attn_out, training=training)   # pre-norm residual
        ffn_out = self.ffn(self.norm2(x))
        return x + self.drop2(ffn_out, training=training)

Q22. What is model quantization aware training (QAT) in TensorFlow?

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Load trained model
model = tf.keras.models.load_model('trained_model.keras')

# Apply quantization aware training
# This inserts fake quantization nodes into the graph
qat_model = tfmot.quantization.keras.quantize_model(model)

qat_model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-5),   # very small LR for QAT fine-tuning
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Fine-tune with fake quantization for a few epochs (not full training)
qat_model.fit(train_dataset, epochs=3, validation_data=val_dataset,
               callbacks=[tf.keras.callbacks.EarlyStopping(patience=2)])

# Convert to TFLite INT8
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_qat = converter.convert()

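# QAT usually recovers part of the accuracy lost to post-training quantization (PTQ).
# The 0.5-2% figure is a rule of thumb; the actual gain varies by model and dataset.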
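
The "fake quantization nodes" mentioned in the snippet above round values onto an integer grid during the forward pass and map them straight back to floats, so the network trains against the rounding error it will see after conversion. The sketch below illustrates that quantize-dequantize step in plain Python. It is an assumption-laden illustration (the function name fake_quantize, the min/max range choice, and the sample weights are hypothetical), not tfmot's actual implementation.

```python
def fake_quantize(values, num_bits=8):
    """Quantize to an unsigned integer grid and map straight back to floats."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)   # range must include 0
    scale = (hi - lo) / (qmax - qmin) or 1.0                # avoid a zero scale
    zero_point = round(qmin - lo / scale)
    out = []
    for v in values:
        q = round(v / scale) + zero_point                   # float -> integer code
        q = max(qmin, min(qmax, q))                         # clamp to the 8-bit range
        out.append((q - zero_point) * scale)                # integer code -> float
    return out, scale

weights = [-1.0, -0.337, 0.0, 0.42, 1.55]
dequantized, scale = fake_quantize(weights)
max_error = max(abs(w - d) for w, d in zip(weights, dequantized))
print(f"scale={scale:.5f}  max rounding error={max_error:.5f}")
```

For this sample the rounding error stays within half a quantization step, and zero is represented exactly, which matters for padding and ReLU outputs.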

Q23. How does automatic differentiation work in TensorFlow?

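Inside a tf.GradientTape context, TensorFlow records every operation applied to watched tensors (trainable variables are watched automatically) on a "tape". For each recorded operation it stores:

Its forward computation
Its gradient function (the VJP: vector-Jacobian product)

During tape.gradient(loss, variables), TF traverses this tape in reverse, computing gradients via the chain rule.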

import tensorflow as tf

# Basic autodiff
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 3 + 2 * x    # y = x^3 + 2x
dy_dx = tape.gradient(y, x)
print(dy_dx.numpy())   # dy/dx = 3x^2 + 2 = 29 at x=3

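# Second-order gradients with nested tapes
x = tf.Variable([1.0, 2.0])
with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        y = tf.reduce_sum(x ** 3)   # y = x1^3 + x2^3
    dy_dx = t1.gradient(y, x)       # [3*x1^2, 3*x2^2]
# gradient() of a vector target differentiates its sum, so this is the Hessian
# times a vector of ones. y is separable, so that equals the Hessian diagonal.
# Use t2.jacobian(dy_dx, x) for the full Hessian matrix.
d2y_dx2 = t2.gradient(dy_dx, x)    # [6*x1, 6*x2]
print(d2y_dx2.numpy())  # [6.0, 12.0]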

# Custom gradient (for non-differentiable ops or improved numerical stability)
@tf.custom_gradient
def log_softmax_stable(x):
    result = tf.nn.log_softmax(x)
    def grad(upstream):
        softmax = tf.nn.softmax(x)
        # Sum over the class axis only, so each row of a batch gets its own gradient
        return upstream - tf.reduce_sum(upstream, axis=-1, keepdims=True) * softmax
    return result, grad
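
The tape mechanism described above (record each op with its gradient function, then walk the record in reverse) can be sketched without TensorFlow. This is a toy illustration under stated assumptions: the Tape and Value classes and the add/mul helpers are hypothetical names, and real GradientTape handles tensors, watching, and many more ops.

```python
class Tape:
    """Records (inputs, output, vjp) entries in execution order."""
    def __init__(self):
        self.entries = []

    def record(self, inputs, output, vjp):
        self.entries.append((inputs, output, vjp))

    def gradient(self, target, source):
        grads = {id(target): 1.0}                 # d(target)/d(target) = 1
        for inputs, output, vjp in reversed(self.entries):
            upstream = grads.get(id(output))
            if upstream is None:                  # op did not contribute to target
                continue
            for inp, g in zip(inputs, vjp(upstream)):
                grads[id(inp)] = grads.get(id(inp), 0.0) + g   # sum over paths
        return grads.get(id(source), 0.0)

class Value:
    def __init__(self, data):
        self.data = data

def add(tape, a, b):
    out = Value(a.data + b.data)
    tape.record((a, b), out, lambda up: (up, up))
    return out

def mul(tape, a, b):
    out = Value(a.data * b.data)
    tape.record((a, b), out, lambda up: (up * b.data, up * a.data))
    return out

tape = Tape()
x = Value(3.0)
two = Value(2.0)
x_cubed = mul(tape, mul(tape, x, x), x)
y = add(tape, x_cubed, mul(tape, two, x))         # y = x^3 + 2x
print(y.data, tape.gradient(y, x))                # 33.0 29.0
```

The result matches the tf.GradientTape example: dy/dx = 3x^2 + 2 = 29 at x = 3. Each lambda plays the role of an op's VJP, and gradients from every path that reaches x are summed.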

Q24. What is the difference between stateful and stateless layers in Keras?

Type	State	Examples	Behavior Across Calls
Stateless	No trainable params or moving averages	ReLU, Dropout, Softmax	Output depends only on the input (Dropout is additionally random when training=True)
Stateful (learnable)	Trainable weights	Dense, Conv2D, Embedding	Weights updated during training
Stateful (running stats)	Non-trainable moving averages	BatchNormalization	Different behavior in train vs eval
Recurrent (sequence state)	Hidden state across time steps	LSTM, GRU	State persists between batches if stateful=True

import tensorflow as tf
from tensorflow.keras import layers

# LSTM with stateful=True: state preserved across batches (for long sequences)
stateful_lstm = layers.LSTM(
    64,
    stateful=True,      # keep state across batches
    return_sequences=True
)
# Requires fixed batch size: model = Sequential([Input(batch_size=32, shape=(None, features))])
# Must call model.reset_states() between sequences

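# BatchNormalization: stateful, keeps moving_mean / moving_variance as non-trainable weights
bn = layers.BatchNormalization()
# bn(x, training=True)  -> normalizes with batch statistics and updates the moving averages
# bn(x, training=False) -> normalizes with the stored moving averages (inference mode)
# Setting bn.trainable = False also makes the layer run in inference mode in TF 2.x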

# GRU (often preferred over LSTM in production: fewer params, similar performance)
gru_layer = layers.GRU(
    128,
    return_sequences=True,   # return output at every time step
    return_state=True,       # also return final hidden state
    dropout=0.1,
    recurrent_dropout=0.1,   # note: recurrent_dropout > 0 disables the fused cuDNN kernel
    reset_after=True         # needed for cuDNN, which also requires recurrent_dropout=0
)
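
The "running stats" row of the table above is the least intuitive kind of state, so here is a minimal sketch of how a batch-norm layer updates and then reads its moving averages. TinyBatchNorm is a hypothetical scalar-feature class written for illustration; the default momentum of 0.99 and epsilon of 1e-3 are assumed to mirror the Keras defaults, and the learnable scale and offset are omitted.

```python
class TinyBatchNorm:
    """Scalar-feature batch norm that tracks moving statistics."""
    def __init__(self, momentum=0.99, epsilon=1e-3):
        self.momentum, self.epsilon = momentum, epsilon
        self.moving_mean, self.moving_var = 0.0, 1.0      # non-trainable state

    def __call__(self, batch, training=False):
        if training:
            mean = sum(batch) / len(batch)
            var = sum((v - mean) ** 2 for v in batch) / len(batch)
            # State update: exponential moving average of the batch statistics.
            m = self.momentum
            self.moving_mean = m * self.moving_mean + (1 - m) * mean
            self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            mean, var = self.moving_mean, self.moving_var  # state is read, not updated
        return [(v - mean) / (var + self.epsilon) ** 0.5 for v in batch]

bn = TinyBatchNorm(momentum=0.9)
batch = [2.0, 4.0, 6.0]
train_out = bn(batch, training=True)    # uses batch mean 4.0 and updates the state
eval_out = bn(batch, training=False)    # uses moving_mean, now about 0.4
print(bn.moving_mean, train_out[1], eval_out[1])
```

The same input produces different outputs in the two modes, which is why forgetting training=False at inference time is a common source of unstable predictions.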

Q25. How do you profile and optimize TensorFlow model training?

import tensorflow as tf

# TensorBoard Profiler (trace GPU ops)
tb_callback = tf.keras.callbacks.TensorBoard(
    log_dir='./logs',
    profile_batch='10,20'   # profile batches 10-20
)
model.fit(dataset, callbacks=[tb_callback])
# Then: tensorboard --logdir ./logs
# Look at: GPU utilization, memory bandwidth, op timeline

# tf.profiler API
tf.profiler.experimental.start('logdir')
model.fit(dataset, epochs=1)
tf.profiler.experimental.stop()

# Common bottlenecks and fixes:
# 1. Low GPU utilization: increase batch size, set num_parallel_calls=tf.data.AUTOTUNE in map(), add prefetch(tf.data.AUTOTUNE)
# 2. Memory fragmentation: use tf.data, avoid Python loops in training
# 3. Slow data loading: use TFRecords read in parallel via interleave(..., num_parallel_calls=tf.data.AUTOTUNE)

# XLA (Accelerated Linear Algebra) compilation
@tf.function(jit_compile=True)  # enables XLA fusion
def train_step_xla(x, y):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
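
Before reaching for the profiler, a rough first check for the "low GPU utilization" case above is to measure how much wall-clock time the loop spends waiting for the next batch. The sketch below does this in plain Python; input_wait_fraction and slow_loader are hypothetical names, and the sleeps stand in for real loading and compute. On a GPU the training step may return before the kernels finish, so treat this as a coarse signal and the TensorBoard profiler as the authoritative tool.

```python
import time

def input_wait_fraction(batches, step_fn):
    """Fraction of wall-clock time spent waiting for the next batch."""
    wait = compute = 0.0
    iterator = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)            # time blocked on the input pipeline
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)                        # time spent in the training step
        t2 = time.perf_counter()
        wait += t1 - t0
        compute += t2 - t1
    return wait / (wait + compute)

def slow_loader(n):                           # hypothetical loader: 20 ms per batch
    for i in range(n):
        time.sleep(0.02)
        yield i

fraction = input_wait_fraction(slow_loader(5), step_fn=lambda b: time.sleep(0.005))
print(f"{fraction:.0%} of the time was spent waiting for data")
```

A high fraction points at the input pipeline (parallel map, prefetch, faster storage); a low one suggests the model step itself is the bottleneck, where XLA or mixed precision is more likely to help.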

Q26. What is model pruning in TensorFlow? How do you implement it?

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Magnitude-based weight pruning
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0,
        final_sparsity=0.5,        # 50% of weights pruned
        begin_step=0,
        end_step=1000,
        frequency=100              # prune every 100 steps
    )
}

model_for_pruning = tfmot.sparsity.keras.prune_low_magnitude(
    model, **pruning_params
)
model_for_pruning.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# The UpdatePruningStep callback is required so the pruning step advances and the masks are updated
callbacks = [
    tfmot.sparsity.keras.UpdatePruningStep(),
    tfmot.sparsity.keras.PruningSummaries(log_dir='./pruning_logs')
]
model_for_pruning.fit(train_data, epochs=10, callbacks=callbacks)

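# Strip pruning wrappers: removes the mask/step variables and leaves plain layers
# whose pruned weights are zero. The zeros are still stored densely, so file-size
# savings only show up after compression (e.g. gzip) or with a sparsity-aware runtime.
final_model = tfmot.sparsity.keras.strip_pruning(model_for_pruning)

# Report the fraction of zero weights per layer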
for layer in final_model.layers:
    if hasattr(layer, 'kernel'):
        sparsity = 1 - (tf.math.count_nonzero(layer.kernel) /
                         tf.size(layer.kernel, out_type=tf.int64)).numpy()
        if sparsity > 0:
            print(f"{layer.name}: {sparsity:.1%} sparse")
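
To make the schedule parameters above concrete, the sketch below computes the target sparsity at a few steps and applies a magnitude mask to a toy weight list. It is a framework-free illustration: the curve is assumed to mirror tfmot's PolynomialDecay with an exponent of 3 (believed to be its default), and polynomial_sparsity, magnitude_mask, and the sample weights are hypothetical.

```python
def polynomial_sparsity(step, initial=0.0, final=0.5, begin=0, end=1000, power=3):
    """Target sparsity at a training step, following a polynomial decay curve."""
    progress = min(max((step - begin) / (end - begin), 0.0), 1.0)
    return final + (initial - final) * (1.0 - progress) ** power

def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the smallest-magnitude weights."""
    n_prune = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]

for step in (0, 100, 500, 1000):
    print(step, round(polynomial_sparsity(step), 4))   # 0.0, 0.1355, 0.4375, 0.5

weights = [0.8, -0.05, 0.3, -0.9, 0.01, 0.2]
mask = magnitude_mask(weights, polynomial_sparsity(1000))
print([w if m else 0.0 for w, m in zip(weights, mask)])   # [0.8, 0.0, 0.3, -0.9, 0.0, 0.0]
```

Sparsity ramps up quickly at first and flattens near the end, which gives the network most of the training run to adapt to the weights it has already lost.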

Q27. How do you export a model for serving with TF Serving in production?

import tensorflow as tf

# Save model with explicit serving signature
model = tf.keras.models.load_model('trained_model.keras')

@tf.function(input_signature=[
    tf.TensorSpec(shape=[None, 224, 224, 3], dtype=tf.float32, name='images')
])
def serve(images):
    # Preprocessing inside the serving function
    images = tf.image.resize(images, [224, 224])   # no-op for 224x224 input; kept as a guard
    images = images / 255.0                        # input is already float32; assumes pixels in [0, 255]
    # Normalize with ImageNet stats
    mean = tf.constant([0.485, 0.456, 0.406])
    std  = tf.constant([0.229, 0.224, 0.225])
    images = (images - mean) / std
    predictions = model(images, training=False)
    return {
        'class_ids': tf.argmax(predictions, axis=-1),
        'probabilities': predictions,
        'top_5_ids': tf.math.top_k(predictions, k=5).indices
    }

tf.saved_model.save(
    model,
    'production_model/1',
    signatures={'serving_default': serve}
)

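# Batching (TF Serving can batch concurrent requests server-side)
# Start the server with: --enable_batching --batching_parameters_file=batching_parameters.txt
# The file uses protobuf text format:
# max_batch_size { value: 64 }
# batch_timeout_micros { value: 5000 }
# max_enqueued_batches { value: 1000 }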
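
Once the SavedModel is loaded by TF Serving, the exported serving_default signature can be called over REST as well as gRPC. The sketch below only builds the URL and JSON body (it sends nothing), so it runs without a server. The helper name build_predict_request, the model name my_model, and port 8501 are assumptions for illustration; the tiny 2x2 image is a stand-in, and real requests must match the 224x224x3 shape in the signature.

```python
import json

def build_predict_request(images, model_name="my_model", host="localhost", port=8501):
    """Return (url, body) for a TF Serving REST predict call in row format."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({
        "signature_name": "serving_default",
        "instances": images,          # one entry per image: nested H x W x 3 lists
    })
    return url, body

# A single 2x2 RGB "image" stands in for real pixel data.
tiny_image = [[[0.0, 0.0, 0.0], [255.0, 255.0, 255.0]],
              [[128.0, 64.0, 32.0], [10.0, 20.0, 30.0]]]
url, body = build_predict_request([tiny_image])
print(url)
print(len(json.loads(body)["instances"]), "instance(s)")
```

Because preprocessing lives inside the serving function, the client sends raw pixel values and does not need to know the normalization constants.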

Q28. Design a TF pipeline for real-time image classification at scale.

Production pipeline design:

Serving layer:
  Client -> Load Balancer -> TF Serving fleet (N replicas on GPU)
  - Each TF Serving instance: 2 GPUs, auto-batching enabled
  - Batch size 32, timeout 10ms (P99 latency target: 50ms)
  - gRPC for internal (faster), REST for external clients

Model:
  - EfficientNetV2-S: a strong accuracy/latency trade-off at 384x384 input
  - Quantized INT8 (TFLite on edge) or FP16 (TF Serving on GPU)
  - Model version: A/B test via TF Serving model config

Input pipeline:
  - Preprocessing inside the saved model (resize, normalize)
  - No client-side preprocessing required

Monitoring:
  - Prometheus + Grafana for latency/throughput
  - Model performance: accuracy on labelled production samples
  - Drift detection: feature distribution vs training baseline

# TF Serving gRPC client
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

def predict_grpc(images, stub):
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'my_model'
    request.model_spec.signature_name = 'serving_default'
    request.inputs['images'].CopyFrom(
        tf.make_tensor_proto(images, dtype=tf.float32)
    )
    result = stub.Predict(request, timeout=10)   # timeout is in seconds
    return tf.make_ndarray(result.outputs['class_ids'])

channel = grpc.insecure_channel('localhost:8500')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
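
The "drift detection: feature distribution vs training baseline" item in the monitoring list can be sketched with a simple statistic. The example below uses the Population Stability Index (PSI), which is one common choice and not something the design above prescribes. The function name, bin count, and sample values are assumptions for illustration, and alert thresholds are application-specific.

```python
import math

def population_stability_index(baseline, production, bins=4, eps=1e-6):
    """PSI between two samples of one feature, using equal-width baseline bins."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0

    def proportions(sample):
        counts = [0] * bins
        for v in sample:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1     # clamp out-of-range values
        return [max(c / len(sample), eps) for c in counts]

    p, q = proportions(baseline), proportions(production)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

train_feature = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
same = population_stability_index(train_feature, train_feature)
shifted = population_stability_index(train_feature, [v + 0.4 for v in train_feature])
print(round(same, 4), round(shifted, 4))
```

Identical distributions score 0, and the score grows as production data moves away from the training baseline. In practice this would run per feature on a sliding window of logged requests and feed the same Prometheus/Grafana dashboards as the latency metrics.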

TensorFlow Ecosystem at a Glance

Component	Purpose	Alternative
Keras	High-level model API	PyTorch nn.Module
tf.data	Input pipeline	PyTorch DataLoader
TFLite	Mobile/edge deployment	ONNX + ORT, CoreML
TF Serving	Production REST/gRPC serving	TorchServe, Triton
TFX	End-to-end ML pipelines	MLflow, Kubeflow
TF Hub	Pre-trained models	HuggingFace Hub
TensorBoard	Training visualization	W&B, MLflow UI
TFMA	Model evaluation	Custom evaluation

FAQ

Q: Should I learn TensorFlow or PyTorch for ML interviews?

A: Both. PyTorch is the standard for research and many startups. TensorFlow is required for Google roles and is standard at many enterprises. Start with PyTorch for intuition; learn TF 2.x/Keras for deployment tooling.

Q: What is the difference between Keras and TensorFlow?

A: Keras is TensorFlow's high-level API, imported as tf.keras in TF 2.x. Since Keras 3 (released in late 2023), Keras is also a standalone multi-backend library that can run on TensorFlow, JAX, or PyTorch.

Q: What is JAX and how does it relate to TensorFlow?

A: JAX is a Python library from Google for numerical computing with automatic differentiation. It is widely used for research at Google (where it has replaced TF in many teams) and by DeepMind, Anthropic, and others. It exposes a NumPy-like API but runs on accelerators via XLA.


Sources and review notes (reviewed 8 Jun 2026)

Article-specific sources

Verification window

Page last edited 8 Jun 2026 by Aditya Sharma. A review date records an editorial edit, not a guarantee that every external fact is still current.

Evidence labels

Official notices, candidate reports, offer documents, and editorial practice questions carry different confidence levels. The visible source list lets you inspect the evidence instead of relying on a blanket verification badge.

Verification policy: /editorial-standards/. Found something incorrect? Submit a correction - we respond within 48 hours.
