est. 2017
Sun, 27 Apr 2026
vol. IX · no. 117
PapersAdda
placement intelligence, since 2017

TensorFlow Interview Questions 2026: 28 Answers with Code

Updated: 8 Jun 2026
Aditya Sharma
Aditya's Edit

PapersAdda 2026 Placement Cycle

By Aditya Sharma·Founder & Editor, PapersAdda

What changed in 2026 drives

Mass-recruiter offer letters are flatter for the 2026 batch: the 4-5 LPA ASE band has barely budged in three years while inflation eats real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry; it is the entire interview.

What I'd actually study for this

  • 01 Two solid coding-round answers (1 medium-hard DSA each, with edge-case discussion) > five half-baked ones
  • 02 One real project you can defend end-to-end - file paths, design decisions, and what you would change
  • 03 One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
  • 04 Three behavioural STAR stories: failure recovered, conflict handled, ownership taken

Where most candidates trip up

The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.

Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.

TensorFlow 2.x with Keras is the dominant enterprise ML framework at Google, Google Cloud, and thousands of production ML teams worldwide. While PyTorch leads in research, TensorFlow wins on deployment tooling: TFLite for mobile, TF Serving for production APIs, TFX for pipelines, and Google's TPU infrastructure. This guide covers 28 TensorFlow interview questions with complete code examples.

PapersAdda's take: If you're interviewing at a Google-adjacent company, a large bank, or any company with a legacy ML infrastructure, you will encounter TensorFlow. Know TF 2.x / Keras well, understand GradientTape for custom training, and be ready to explain TFLite deployment. Candidates report that GradientTape custom training loops and tf.data pipeline design are the two most frequently tested TensorFlow topics at Google and banking ML teams. According to candidate accounts from public preparation resources, TFLite deployment questions appear in interviews for mobile ML roles. Confirm the specific technology stack and interview format on the official careers portal before your round.



Which Companies Ask TensorFlow Questions?

Company Type | Why They Use TF
Google and subsidiaries | Created TensorFlow; internal standard
Cloud ML services (GCP, AWS SageMaker) | Native TF integration
Enterprise finance and banking | Production reliability, TF Serving
Mobile-first companies | TFLite for on-device ML
Research at some labs | JAX (TF sibling) for TPU work

EASY: TF 2.x and Keras Fundamentals (Questions 1-10)

Q1. What is the difference between TensorFlow 1.x and TensorFlow 2.x?

Aspect | TF 1.x | TF 2.x
Execution | Deferred (build graph, then session.run) | Eager (immediate, like Python)
API | Low-level graph API | Keras-first, high-level
Debugging | Hard (graph is opaque) | Easy (standard Python debugging)
Performance | Optimized graphs | Eager by default; tf.function for graphs
Migration | N/A | tf.compat.v1 shim available

import tensorflow as tf

print(tf.__version__)   # 2.15 or higher in 2026

# TF 2.x: eager execution by default
a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])
c = tf.matmul(a, b)
print(c.numpy())   # immediate result, no session needed

# tf.function: compiles to graph for performance
@tf.function(jit_compile=True)  # XLA compilation for TPU/GPU speedup
def matrix_power(x, n):
    result = tf.eye(x.shape[0], dtype=x.dtype)
    for _ in range(n):
        result = tf.matmul(result, x)
    return result

Q2. How do you build a neural network with Keras? What are the three API styles?

API | Use When | Flexibility
Sequential | Simple linear stacks | Low
Functional | Multiple inputs/outputs, branches | High
Subclassing (Model) | Custom forward pass logic | Highest

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Sequential API
seq_model = keras.Sequential([
    layers.Dense(256, activation='relu', input_shape=(784,)),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Functional API (handles multiple inputs/outputs, skip connections)
inputs = keras.Input(shape=(784,))
x = layers.Dense(256, activation='gelu')(inputs)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.3)(x)
x = layers.Dense(128, activation='gelu')(x)
outputs = layers.Dense(10, activation='softmax')(x)
func_model = keras.Model(inputs, outputs, name='mlp')

# Subclassing API (full PyTorch-like control)
class ResBlock(keras.Model):
    def __init__(self, units):
        super().__init__()
        self.dense1 = layers.Dense(units, activation='gelu')
        self.dense2 = layers.Dense(units)
        self.bn     = layers.BatchNormalization()
        self.proj   = layers.Dense(units)

    def call(self, x, training=False):
        residual = self.proj(x)
        out = self.dense1(x)
        out = self.dense2(out)
        out = self.bn(out, training=training)
        return keras.activations.gelu(out + residual)

Q3. What is GradientTape and how do you write a custom training loop?

import tensorflow as tf

# Build model and optimizer
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10)
])
optimizer = tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-2)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
train_acc = tf.keras.metrics.SparseCategoricalAccuracy()

# Custom training step
@tf.function   # compile to graph for speed
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss   = loss_fn(y, logits)
        loss  += sum(model.losses)   # regularization losses
    gradients = tape.gradient(loss, model.trainable_variables)
    # Gradient clipping
    gradients, _ = tf.clip_by_global_norm(gradients, 1.0)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_acc.update_state(y, logits)
    return loss

# Training loop
for epoch in range(n_epochs):
    train_acc.reset_state()
    for x_batch, y_batch in train_dataset:
        loss = train_step(x_batch, y_batch)
    # Convert tensors to Python floats before applying a format spec
    print(f"Epoch {epoch+1}: loss={float(loss):.4f}, acc={float(train_acc.result()):.4f}")
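
The gradient-clipping step in train_step can be easier to explain with the arithmetic written out. The following is a plain-Python sketch of global-norm clipping, not TensorFlow's implementation; the function name and the toy gradients are assumptions made for illustration. Every gradient is multiplied by the same factor, so the update direction is preserved and only its length is capped.

```python
import math

def clip_by_global_norm(grads, clip_norm):
    """Rescale a list of gradient vectors so their joint L2 norm is at most clip_norm."""
    # The norm is taken over ALL gradients together, not per tensor
    global_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # If the norm is already within the limit, denom == global_norm and nothing changes
    denom = max(global_norm, clip_norm)
    clipped = [[g * clip_norm / denom for g in vec] for vec in grads]
    return clipped, global_norm

# Hypothetical gradients for two variables; joint norm = sqrt(9 + 16) = 5
grads = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
print(norm)      # 5.0
print(clipped)   # [[0.6, 0.0], [0.0, 0.8]]
```

A useful interview point: clipping each tensor separately would change the relative size of the updates across layers, while global-norm clipping does not.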

Q4. What is tf.data? How do you build an efficient input pipeline?

import tensorflow as tf

# From numpy arrays
dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))

# From files (lazy loading)
file_dataset = tf.data.Dataset.list_files('data/*.tfrecord')

# Efficient pipeline
AUTOTUNE = tf.data.AUTOTUNE
BATCH_SIZE = 64

def augment(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, 0.2)
    image = tf.image.random_contrast(image, 0.8, 1.2)
    return image, label

train_pipeline = (
    dataset
    .shuffle(buffer_size=10000, seed=42)    # shuffle before batch
    .map(augment, num_parallel_calls=AUTOTUNE)  # parallel preprocessing
    .batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(AUTOTUNE)                     # overlap data loading with GPU
)

# TFRecord pipeline (most efficient for large datasets)
def parse_tfrecord(serialized_example):
    feature_spec = {
        'image': tf.io.FixedLenFeature([], tf.string),
        'label': tf.io.FixedLenFeature([], tf.int64)
    }
    parsed = tf.io.parse_single_example(serialized_example, feature_spec)
    image = tf.io.decode_jpeg(parsed['image'], channels=3)   # fix channel count so images batch cleanly
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, parsed['label']

tfrecord_dataset = (
    file_dataset
    .interleave(tf.data.TFRecordDataset,
                cycle_length=4, num_parallel_calls=AUTOTUNE)
    .map(parse_tfrecord, num_parallel_calls=AUTOTUNE)
    .batch(BATCH_SIZE)
    .prefetch(AUTOTUNE)
)
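
Interviewers often ask why buffer_size matters in shuffle(). The sketch below imitates a fixed-size shuffle buffer in plain Python; it is an illustration of the idea under simplifying assumptions, not the tf.data implementation, and the function name is hypothetical. Each output is drawn from a buffer that holds only the next buffer_size elements, so a small buffer gives only a local shuffle.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in the order produced by a fixed-size shuffle buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)          # still filling the buffer
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]                # emit a random buffered element
        buffer[idx] = item               # replace it with the incoming one
    rng.shuffle(buffer)                  # input exhausted: drain the rest
    yield from buffer

# A buffer of 1 cannot reorder anything
print(list(buffered_shuffle(range(10), buffer_size=1)))
# A buffer as large as the dataset gives a full shuffle
print(list(buffered_shuffle(range(10), buffer_size=10, seed=42)))
```

This is why a buffer at least as large as the dataset is the usual advice when memory allows, and why data sorted by label shuffles poorly with a small buffer.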

Q5. How do you implement regularization in Keras?

import tensorflow as tf   # needed for tf.keras.Sequential below
from tensorflow.keras import layers, regularizers

# L2 weight regularization
dense = layers.Dense(
    256,
    activation='relu',
    kernel_regularizer=regularizers.L2(l2=0.001),
    bias_regularizer=regularizers.L2(l2=0.001)
)

# Dropout
dropout = layers.Dropout(rate=0.3)

# Batch Normalization (also has regularization effect)
bn = layers.BatchNormalization(
    momentum=0.99,   # running average for mean/variance
    epsilon=1e-3
)

# Spatial Dropout (for CNN: drop entire feature maps)
spatial_dropout = layers.SpatialDropout2D(rate=0.2)

# Full regularized CNN block
def regularized_conv_block(filters, kernel_size=3):
    return tf.keras.Sequential([
        layers.Conv2D(filters, kernel_size, padding='same', use_bias=False,
                       kernel_regularizer=regularizers.L2(1e-4)),
        layers.BatchNormalization(),
        layers.ReLU(),
        layers.SpatialDropout2D(0.1)
    ])

Q6. What are callbacks in Keras? What are the essential ones?

import tensorflow as tf

callbacks = [
    # Stop training when val_loss stops improving
    tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True,
        verbose=1
    ),

    # Reduce LR when plateau detected
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,     # multiply LR by 0.5
        patience=3,
        min_lr=1e-7,
        verbose=1
    ),

    # Save best model checkpoint
    tf.keras.callbacks.ModelCheckpoint(
        filepath='checkpoints/model_{epoch:03d}_{val_accuracy:.4f}.keras',
        monitor='val_accuracy',
        save_best_only=True,
        save_weights_only=False
    ),

    # TensorBoard logging
    tf.keras.callbacks.TensorBoard(
        log_dir='./logs',
        histogram_freq=1,       # log weight histograms every epoch
        write_graph=True,
        update_freq='epoch'
    ),

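]

# Custom callback: a class statement cannot sit inside a list literal,
# so define it at module level and append an instance
class LearningRateLogger(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        lr = float(self.model.optimizer.learning_rate)
        print(f"\nEpoch {epoch}: LR = {lr:.6f}")

callbacks.append(LearningRateLogger())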

model.fit(train_data, validation_data=val_data,
          epochs=100, callbacks=callbacks)
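
To explain what patience means in EarlyStopping, it can help to walk through the counter by hand. The following plain-Python sketch is a simplified model of that logic, written for illustration only: it ignores min_delta, baseline, and weight restoration, and the function name and loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 0-indexed, for per-epoch validation losses."""
    best = float("inf")
    best_epoch = 0
    wait = 0                              # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # training would stop after this epoch
    return len(val_losses) - 1, best_epoch

losses = [0.90, 0.70, 0.65, 0.66, 0.68, 0.67, 0.64]
stop, best = early_stopping_epoch(losses, patience=3)
print(stop, best)   # 5 2
```

Note that the run stops at epoch 5 and never sees the better loss at epoch 6, which is the trade-off a small patience value makes. With restore_best_weights=True, the weights from epoch 2 would be the ones kept.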

Q7. How do you save and load models in TensorFlow?

import tensorflow as tf

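# SavedModel format (recommended for serving; loadable without the original Python code)
# tf.saved_model.save works across TF 2.x; Keras 3 rejects model.save() paths without an extension
tf.saved_model.save(model, 'saved_model/my_model')   # directory with assets + variables
loaded = tf.saved_model.load('saved_model/my_model')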

# Keras format (.keras, new in TF 2.12)
model.save('model.keras')               # single file, Keras only
loaded = tf.keras.models.load_model('model.keras')

# Weights only (for checkpoint/transfer learning)
model.save_weights('model.weights.h5')   # Keras 3 requires the .weights.h5 suffix
model.load_weights('model.weights.h5')

# HDF5 format (legacy)
model.save('model.h5')
loaded = tf.keras.models.load_model('model.h5')

# Inspect a SavedModel
print(tf.saved_model.load('saved_model/my_model').signatures)

# Load and run inference
infer = tf.saved_model.load('saved_model/my_model').signatures['serving_default']
# The keyword must match the signature's input name; check infer.structured_input_signature
output = infer(input_tensor=tf.constant(X_test[:5]))

Q8. What is tf.function and when should you use it?

import tensorflow as tf
import time

model = tf.keras.Sequential([tf.keras.layers.Dense(1024, activation='relu'),
                               tf.keras.layers.Dense(10)])

# Without tf.function: eager, Python overhead
def eager_predict(x):
    return model(x, training=False)

# With tf.function: compiled graph
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 100], dtype=tf.float32)])
def fast_predict(x):
    return model(x, training=False)

x = tf.random.normal([1000, 100])

# Warmup (first call traces the graph)
_ = fast_predict(x)

t = time.time()
for _ in range(100):
    eager_predict(x)
print(f"Eager: {time.time()-t:.2f}s")

t = time.time()
for _ in range(100):
    fast_predict(x)
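print(f"tf.function: {time.time()-t:.2f}s")  # often faster; the gain depends on model size and hardware

# Gotcha: retracing
# Without input_signature, TF traces a new graph for every new tensor shape/dtype
# combination and for every distinct Python scalar argument (expensive).
# Use input_signature, or pass tensors instead of Python values, to prevent this.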

Q9. How do you implement custom layers and losses in Keras?

import tensorflow as tf
from tensorflow import keras

# Custom Layer
class ScaledDotProductAttention(keras.layers.Layer):
    def __init__(self, d_k, **kwargs):
        super().__init__(**kwargs)
        self.d_k = d_k
        self.scale = d_k ** -0.5

    def call(self, Q, K, V, mask=None):
        scores = tf.matmul(Q, K, transpose_b=True) * self.scale
        if mask is not None:
            scores += (1.0 - tf.cast(mask, tf.float32)) * (-1e9)
        weights = tf.nn.softmax(scores, axis=-1)
        return tf.matmul(weights, V), weights

    def get_config(self):   # needed for model.save()
        config = super().get_config()
        config.update({'d_k': self.d_k})
        return config

# Custom Loss
class FocalLoss(keras.losses.Loss):
    def __init__(self, gamma=2.0, alpha=0.25, **kwargs):
        super().__init__(**kwargs)
        self.gamma = gamma
        self.alpha = alpha

    def call(self, y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1 - 1e-7)
        ce = -y_true * tf.math.log(y_pred)
        pt = tf.where(y_true == 1, y_pred, 1 - y_pred)
        focal_weight = self.alpha * (1 - pt) ** self.gamma
        return tf.reduce_mean(tf.reduce_sum(focal_weight * ce, axis=-1))

# Custom Metric
class F1Score(keras.metrics.Metric):
    def __init__(self, threshold=0.5, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold
        self.tp = self.add_weight(name='tp', initializer='zeros')
        self.fp = self.add_weight(name='fp', initializer='zeros')
        self.fn = self.add_weight(name='fn', initializer='zeros')

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = tf.cast(y_true, tf.float32)   # labels often arrive as int
        y_pred = tf.cast(y_pred >= self.threshold, tf.float32)
        self.tp.assign_add(tf.reduce_sum(y_true * y_pred))
        self.fp.assign_add(tf.reduce_sum((1 - y_true) * y_pred))
        self.fn.assign_add(tf.reduce_sum(y_true * (1 - y_pred)))

    def result(self):
        precision = self.tp / (self.tp + self.fp + 1e-7)
        recall    = self.tp / (self.tp + self.fn + 1e-7)
        return 2 * precision * recall / (precision + recall + 1e-7)

    def reset_state(self):
        for v in self.variables:
            v.assign(tf.zeros_like(v))
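
A quick numeric check makes the purpose of the focal weight in FocalLoss concrete. This is a plain-Python sketch for a single example, using the same gamma and alpha defaults as the class above; the helper name and the two probabilities are assumptions chosen for illustration.

```python
import math

def focal_term(p_true, gamma=2.0, alpha=0.25):
    """Focal loss for one example, given the probability assigned to the true class."""
    cross_entropy = -math.log(p_true)
    modulating = alpha * (1.0 - p_true) ** gamma   # shrinks toward 0 for confident predictions
    return modulating * cross_entropy

easy = focal_term(0.9)    # well-classified example
hard = focal_term(0.1)    # badly misclassified example

ce_ratio = math.log(0.1) / math.log(0.9)   # hard/easy ratio under plain cross-entropy
focal_ratio = hard / easy                  # hard/easy ratio under focal loss
print(f"cross-entropy ratio: {ce_ratio:.1f}")
print(f"focal loss ratio:    {focal_ratio:.1f}")
```

With gamma=2 the hard example outweighs the easy one by a further factor of (0.9 / 0.1)^2 = 81 compared with plain cross-entropy, which is how focal loss keeps many easy negatives from dominating the gradient on imbalanced data.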

Q10. What is mixed precision training in TensorFlow?

import tensorflow as tf

# Enable mixed precision (float16 compute, float32 weights)
policy = tf.keras.mixed_precision.Policy('mixed_float16')
tf.keras.mixed_precision.set_global_policy(policy)

# Build model (same code; weights auto-cast)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1024, activation='relu'),
    tf.keras.layers.Dense(10, dtype='float32')   # keep final layer float32
])

# Loss scaling prevents gradient underflow in float16
optimizer = tf.keras.optimizers.AdamW(1e-3)
optimizer = tf.keras.mixed_precision.LossScaleOptimizer(optimizer)

# In custom loop:
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
        scaled_loss = optimizer.get_scaled_loss(loss)
    scaled_grads = tape.gradient(scaled_loss, model.trainable_variables)
    grads = optimizer.get_unscaled_gradients(scaled_grads)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# bfloat16 (better for TPU, no loss scaling needed)
tpu_policy = tf.keras.mixed_precision.Policy('mixed_bfloat16')

MEDIUM: Advanced Keras and TF Ecosystem (Questions 11-20)

Q11. How do you implement transfer learning with TensorFlow Hub?

import tensorflow_hub as hub
import tensorflow as tf

# EfficientNetV2 from TF Hub
hub_url = 'https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet21k_ft1k_l/feature_vector/2'
feature_extractor = hub.KerasLayer(hub_url, trainable=False, input_shape=(384, 384, 3))

# Build classification model
model = tf.keras.Sequential([
    feature_extractor,
    tf.keras.layers.Dense(256, activation='gelu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(num_classes, activation='softmax')
])

model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Two-phase fine-tuning
# Phase 1: train only head
feature_extractor.trainable = False
model.fit(train_ds, epochs=5, validation_data=val_ds)

# Phase 2: unfreeze backbone with small LR
feature_extractor.trainable = True
model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-5),  # 100x smaller LR
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
model.fit(train_ds, epochs=15, validation_data=val_ds)

Q12. What is the TFRecord format and why is it used for large datasets?

  • Sequential reads (no seeking): much faster than reading individual image files
  • Lossless encoding of heterogeneous data (images, text, labels)
  • Works seamlessly with tf.data pipeline
import tensorflow as tf
import numpy as np

# Write TFRecords
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def serialize_example(image_array, label):
    encoded = tf.image.encode_jpeg(image_array).numpy()
    feature = {
        'image': _bytes_feature(encoded),
        'label': _int64_feature(label),
        'height': _int64_feature(image_array.shape[0]),
        'width':  _int64_feature(image_array.shape[1])
    }
    proto = tf.train.Example(features=tf.train.Features(feature=feature))
    return proto.SerializeToString()

with tf.io.TFRecordWriter('train.tfrecord') as writer:
    for img, lbl in zip(images, labels):
        writer.write(serialize_example(img, lbl))

# Read TFRecords
feature_spec = {
    'image': tf.io.FixedLenFeature([], tf.string),
    'label': tf.io.FixedLenFeature([], tf.int64)
}

@tf.function
def parse_example(serialized):
    example = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_jpeg(example['image'], channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, example['label']

dataset = (tf.data.TFRecordDataset('train.tfrecord')
           .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(64).prefetch(tf.data.AUTOTUNE))

Q13. How do you use distributed training with tf.distribute?

import tensorflow as tf

# Multi-GPU on single machine
strategy = tf.distribute.MirroredStrategy()

# Multi-machine (parameter server)
ps_strategy = tf.distribute.experimental.ParameterServerStrategy(
    cluster_resolver=tf.distribute.cluster_resolver.TFConfigClusterResolver()
)

# TPU strategy
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
tpu_strategy = tf.distribute.TPUStrategy(resolver)

# Usage pattern (same code works across strategies)
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(1024, activation='relu', input_shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(
        optimizer=tf.keras.optimizers.AdamW(1e-3),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )

# Adjust batch size: GLOBAL_BATCH = per_replica_batch * num_replicas
GLOBAL_BATCH_SIZE = 64 * strategy.num_replicas_in_sync
train_dataset = train_dataset.rebatch(GLOBAL_BATCH_SIZE)

model.fit(train_dataset, epochs=20)

# Custom training loop with distribute
@tf.function
def distributed_train_step(dataset_inputs):
    per_replica_losses = strategy.run(train_step, args=(dataset_inputs,))
    return strategy.reduce(tf.distribute.ReduceOp.SUM,
                            per_replica_losses, axis=None)
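
In a custom distributed loop the per-replica loss is summed across replicas, so each replica should divide by the global batch size rather than its local one (TensorFlow provides tf.nn.compute_average_loss for this). The arithmetic can be checked in plain Python; the loss values and helper name below are hypothetical.

```python
def replica_loss(per_example_losses, global_batch_size):
    """Per-replica contribution: local sum divided by the GLOBAL batch size."""
    return sum(per_example_losses) / global_batch_size

# Hypothetical per-example losses, unevenly split across two replicas
replica_0 = [0.2]
replica_1 = [0.4, 0.6, 0.8]
global_batch_size = len(replica_0) + len(replica_1)

# Correct: scale by the global batch size, then SUM across replicas
summed = (replica_loss(replica_0, global_batch_size)
          + replica_loss(replica_1, global_batch_size))

# Reference: the mean a single device would compute over the whole batch
single_device_mean = sum(replica_0 + replica_1) / global_batch_size

# Incorrect: averaging the per-replica means weights the shards equally
mean_of_means = (sum(replica_0) / len(replica_0)
                 + sum(replica_1) / len(replica_1)) / 2

print(summed, single_device_mean, mean_of_means)
```

The summed value matches the single-device mean, while the mean of means drifts as soon as the shards differ in size, for example on the last partial batch.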

Q14. What is TFLite and how do you convert a model for mobile deployment?

import tensorflow as tf

# Convert Keras model to TFLite
model = tf.keras.models.load_model('trained_model.keras')

# Option 1: Float32 conversion (no quality loss)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Option 2: Dynamic range quantization (minimal accuracy loss)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant = converter.convert()

# Option 3: Full INT8 quantization (best performance on edge)
def representative_dataset():
    for x_batch, _ in val_dataset.take(100):
        yield [x_batch]

converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type  = tf.int8
converter.inference_output_type = tf.int8
tflite_int8 = converter.convert()

with open('model_int8.tflite', 'wb') as f:
    f.write(tflite_int8)

# Run inference with TFLite interpreter
interpreter = tf.lite.Interpreter(model_path='model_int8.tflite')
interpreter.allocate_tensors()

input_details  = interpreter.get_input_details()
output_details = interpreter.get_output_details()

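# Quantize the float input with the model's own scale and zero point;
# a bare astype('int8') would truncate the values instead of mapping them
scale, zero_point = input_details[0]['quantization']
x_int8 = (X_test[:1] / scale + zero_point).round().clip(-128, 127).astype('int8')
interpreter.set_tensor(input_details[0]['index'], x_int8)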
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])
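
Full INT8 quantization maps real values to 8-bit integers through an affine transform, real = scale * (q - zero_point). The plain-Python sketch below shows that mapping for a single value; the scale and zero point are hypothetical parameters for inputs in [0, 1], whereas a converted model stores its own values, calibrated from the representative dataset.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to a signed 8-bit integer, saturating at the range limits."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Map a quantized integer back to its approximate real value."""
    return scale * (q - zero_point)

# Hypothetical parameters covering inputs in [0, 1]
scale = 1.0 / 255.0
zero_point = -128

print(quantize(0.0, scale, zero_point))   # -128
print(quantize(1.0, scale, zero_point))   # 127
print(quantize(2.0, scale, zero_point))   # 127 (out of range, saturates)

x = 0.3
x_hat = dequantize(quantize(x, scale, zero_point), scale, zero_point)
print(abs(x - x_hat) <= scale / 2)        # True: error is at most half a step
```

Values outside the calibrated range saturate, which is why the representative dataset should cover the inputs the model will see in production.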

Q15. How does TF Serving work? How do you deploy a model as a REST API?

# Save model in SavedModel format with versioning
# model/
#   1/         <- version number
#     assets/
#     variables/
#     saved_model.pb
import tensorflow as tf

# Export for TF Serving with specific serving signature
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 224, 224, 3],
                                              dtype=tf.float32)])
def serving_fn(x):
    return {'predictions': model(x, training=False)}

tf.saved_model.save(model, 'tf_serving_model/1',
                     signatures={'serving_default': serving_fn})

# Start TF Serving (Docker); the commands below are shell, not Python
docker run -p 8501:8501 \
  --mount type=bind,source=$(pwd)/tf_serving_model,target=/models/my_model \
  -e MODEL_NAME=my_model \
  tensorflow/serving

# REST API call
curl -X POST http://localhost:8501/v1/models/my_model:predict \
  -d '{"instances": [[0.1, 0.2, ..., 0.9]]}'

# Python client
import requests
import json
import numpy as np

data = json.dumps({'instances': X_test[:5].tolist()})
response = requests.post(
    'http://localhost:8501/v1/models/my_model:predict',
    data=data,
    headers={'Content-Type': 'application/json'}
)
predictions = response.json()['predictions']

Q16. What is Keras Tuner? How do you use it for hyperparameter optimization?

import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    """Model builder function; hp is the HyperParameters object."""
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(784,)))

    # Search over number of layers and units
    for i in range(hp.Int('num_layers', min_value=1, max_value=5)):
        units = hp.Choice(f'units_{i}', values=[64, 128, 256, 512])
        model.add(tf.keras.layers.Dense(units, activation='gelu'))
        model.add(tf.keras.layers.BatchNormalization())
        if hp.Boolean(f'dropout_{i}'):
            model.add(tf.keras.layers.Dropout(
                hp.Float(f'dropout_rate_{i}', min_value=0.1, max_value=0.5, step=0.1)
            ))

    model.add(tf.keras.layers.Dense(10, activation='softmax'))

    lr = hp.Float('lr', min_value=1e-5, max_value=1e-2, sampling='log')
    model.compile(
        optimizer=tf.keras.optimizers.AdamW(lr),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    return model

# Bayesian optimization search
tuner = kt.BayesianOptimization(
    build_model,
    objective='val_accuracy',
    max_trials=50,
    directory='kt_search',
    project_name='mlp_tuning'
)

tuner.search(X_train, y_train, epochs=20,
              validation_split=0.2,
              callbacks=[tf.keras.callbacks.EarlyStopping(patience=3)])

best_model = tuner.get_best_models(num_models=1)[0]
print(tuner.get_best_hyperparameters()[0].values)

Q17. How do you implement class activation maps (CAM) in TensorFlow?

import tensorflow as tf
import numpy as np
import cv2

def compute_gradcam(model, img_array, layer_name, class_idx=None):
    """
    Compute Grad-CAM for a given image and layer.
    """
    # Create a model that outputs the target layer + final output
    grad_model = tf.keras.models.Model(
        inputs=model.inputs,
        outputs=[model.get_layer(layer_name).output, model.output]
    )

    with tf.GradientTape() as tape:
        conv_outputs, predictions = grad_model(img_array)
        if class_idx is None:
            class_idx = tf.argmax(predictions[0])
        class_score = predictions[:, class_idx]

    # Gradients of class score w.r.t. conv layer output
    grads = tape.gradient(class_score, conv_outputs)   # [1, H, W, C]
    pooled_grads = tf.reduce_mean(grads, axis=[0, 1, 2])  # [C]

    # Weighted combination of feature maps
    conv_outputs = conv_outputs[0]
    heatmap = tf.reduce_sum(conv_outputs * pooled_grads, axis=-1)
    heatmap = tf.nn.relu(heatmap)                          # keep only positive evidence
    heatmap = heatmap / (tf.reduce_max(heatmap) + 1e-8)    # normalize to [0, 1]

    return heatmap.numpy(), int(class_idx)

# Example usage
img_array = tf.expand_dims(img_tensor, 0)  # [1, H, W, 3]
heatmap, predicted_class = compute_gradcam(model, img_array, 'block5_conv3')

# Overlay heatmap on original image
heatmap_resized = cv2.resize(heatmap, (original_w, original_h))
heatmap_colored = cv2.applyColorMap(np.uint8(255 * heatmap_resized), cv2.COLORMAP_JET)
superimposed = cv2.addWeighted(original_img, 0.6, heatmap_colored, 0.4, 0)

Q18. What is TFX (TensorFlow Extended)? What are its components?

Component | Purpose
ExampleGen | Ingest raw data; split into train/eval
StatisticsGen | Compute statistics over data
SchemaGen | Infer data schema (types, ranges)
ExampleValidator | Detect data anomalies vs schema
Transform | Feature engineering with tf.Transform
Trainer | Train model using Estimator or Keras
Evaluator | Compute metrics; slice-based evaluation
InfraValidator | Validate model can be loaded for serving
Pusher | Deploy to TF Serving or TFLite

from tfx.components import (
    CsvExampleGen, StatisticsGen, SchemaGen, ExampleValidator,
    Transform, Trainer, Evaluator, Pusher
)
from tfx.dsl.component.experimental.decorators import component
import tensorflow_model_analysis as tfma
from tfx.proto import trainer_pb2   # provides TrainArgs / EvalArgs used below

# Pipeline definition
def create_pipeline(pipeline_root, data_root):
    example_gen = CsvExampleGen(input_base=data_root)
    stats_gen   = StatisticsGen(examples=example_gen.outputs['examples'])
    schema_gen  = SchemaGen(statistics=stats_gen.outputs['statistics'])
    validator   = ExampleValidator(
        statistics=stats_gen.outputs['statistics'],
        schema=schema_gen.outputs['schema']
    )
    transform = Transform(
        examples=example_gen.outputs['examples'],
        schema=schema_gen.outputs['schema'],
        module_file='preprocessing.py'
    )
    trainer = Trainer(
        module_file='trainer.py',
        examples=transform.outputs['transformed_examples'],
        schema=schema_gen.outputs['schema'],
        train_args=trainer_pb2.TrainArgs(num_steps=1000),
        eval_args=trainer_pb2.EvalArgs(num_steps=200)
    )
    return [example_gen, stats_gen, schema_gen, validator, transform, trainer]

Q19. How does TensorFlow handle text preprocessing with Keras preprocessing layers?

import tensorflow as tf
from tensorflow.keras import layers

# TextVectorization: converts raw strings to integer sequences
vectorizer = layers.TextVectorization(
    max_tokens=20000,
    output_mode='int',
    output_sequence_length=256,
    ngrams=None
)
vectorizer.adapt(text_dataset)   # fit vocabulary on training data

# Embedding layer
embedding = layers.Embedding(
    input_dim=20000,
    output_dim=128,
    mask_zero=True   # handle variable-length sequences
)

# Full text classification model (preprocessing inside model)
inputs = tf.keras.Input(shape=(1,), dtype=tf.string)
x = vectorizer(inputs)
x = embedding(x)
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dense(64, activation='gelu')(x)
outputs = layers.Dense(1, activation='sigmoid')(x)

model = tf.keras.Model(inputs, outputs)

# At inference: pass raw strings directly
model.predict(tf.constant([["This is a test sentence."]]))  # shape (1, 1) string tensor; no external tokenizer needed

# Other output modes: bag-of-words style vectors instead of integer sequences
# (each layer needs its own adapt() call before use)
bag_of_words = layers.TextVectorization(
    max_tokens=50000,
    output_mode='multi_hot'   # binary vector: 1 where a vocabulary token occurs
)
tfidf = layers.TextVectorization(
    max_tokens=50000,
    output_mode='tf_idf'
)

Q20. What is the difference between eager execution and graph execution in TF 2.x?

Aspect | Eager | Graph (tf.function)
Execution | Immediate, line by line | Deferred; compiled DAG
Debugging | Easy: print(), pdb, Python stack traces | Hard: print inside @tf.function needs tf.print
Speed | Slower (Python overhead per op) | Faster (graph optimizations, XLA)
Deployment | Cannot deploy without Python | Portable SavedModel
When to use | Prototyping, debugging | Production, training loops

import tensorflow as tf

# Eager (default in TF 2.x)
a = tf.constant([1.0, 2.0])
print(a + 1)   # immediate: tf.Tensor([2. 3.], ...)

# tf.function: tracing converts to graph
@tf.function
def compute(x, y):
    # Python print only runs at TRACE time (first call)
    print("Tracing!")       # printed only once
    tf.print("Value:", x)   # printed at every EXECUTION
    return x * y + tf.reduce_sum(y)

# First call: traces the function (runs Python print)
result = compute(tf.constant(2.0), tf.constant([1.0, 2.0, 3.0]))
# "Tracing!" printed once

# Second call: uses cached graph (Python print NOT called again)
result = compute(tf.constant(3.0), tf.constant([1.0, 2.0, 3.0]))
# "Tracing!" NOT printed; tf.print IS printed (graph execution)

# Gotcha: re-tracing happens if input shapes change (without input_signature)
result = compute(tf.constant([1.0, 2.0]), tf.constant([1.0, 2.0, 3.0]))
# "Tracing!" printed again (new shape for x)

HARD: Advanced TF Topics (Questions 21-28)

Q21. How do you implement a Transformer model from scratch in TensorFlow?

import tensorflow as tf
from tensorflow.keras import layers

class MultiHeadSelfAttention(layers.Layer):
    def __init__(self, d_model, num_heads, **kwargs):
        super().__init__(**kwargs)
        assert d_model % num_heads == 0
        self.d_model = d_model
        self.num_heads = num_heads
        self.d_k = d_model // num_heads

        self.W_q = layers.Dense(d_model)
        self.W_k = layers.Dense(d_model)
        self.W_v = layers.Dense(d_model)
        self.W_o = layers.Dense(d_model)

    def split_heads(self, x, batch_size):
        x = tf.reshape(x, [batch_size, -1, self.num_heads, self.d_k])
        return tf.transpose(x, [0, 2, 1, 3])  # [B, H, T, d_k]

    def call(self, x, mask=None, training=False):
        B = tf.shape(x)[0]
        Q = self.split_heads(self.W_q(x), B)
        K = self.split_heads(self.W_k(x), B)
        V = self.split_heads(self.W_v(x), B)

        scores = tf.matmul(Q, K, transpose_b=True) / tf.sqrt(float(self.d_k))
        if mask is not None:
            scores += (1 - tf.cast(mask, tf.float32)) * -1e9
        weights = tf.nn.softmax(scores, axis=-1)

        out = tf.matmul(weights, V)                     # [B, H, T, d_k]
        out = tf.transpose(out, [0, 2, 1, 3])           # [B, T, H, d_k]
        out = tf.reshape(out, [B, -1, self.d_model])    # [B, T, d_model]
        return self.W_o(out)

class TransformerBlock(layers.Layer):
    def __init__(self, d_model, num_heads, dff, dropout=0.1, **kwargs):
        super().__init__(**kwargs)
        self.attn  = MultiHeadSelfAttention(d_model, num_heads)
        self.ffn   = tf.keras.Sequential([
            layers.Dense(dff, activation='gelu'),
            layers.Dense(d_model)
        ])
        self.norm1 = layers.LayerNormalization(epsilon=1e-6)
        self.norm2 = layers.LayerNormalization(epsilon=1e-6)
        self.drop1 = layers.Dropout(dropout)
        self.drop2 = layers.Dropout(dropout)

    def call(self, x, mask=None, training=False):
        attn_out = self.attn(self.norm1(x), mask=mask, training=training)
        x = x + self.drop1(attn_out, training=training)   # pre-norm residual
        ffn_out = self.ffn(self.norm2(x))
        return x + self.drop2(ffn_out, training=training)
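
To check your understanding of the masking line (`scores += (1 - mask) * -1e9`), it helps to run the attention math on a tiny input by hand. The following is a stdlib-only sketch of single-head scaled dot-product attention on plain lists; the `attention` helper and the 3-token input are illustrative assumptions, not part of the Keras layers above.

```python
import math

def softmax(row):
    m = max(row)                              # subtract max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V, mask=None):
    """Single-head scaled dot-product attention on lists of vectors.

    mask[i][j] == 1 means query i may attend to key j."""
    d_k = len(K[0])
    out, weights = [], []
    for i, q in enumerate(Q):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        if mask is not None:
            # same trick as the layer above: push masked scores to -1e9
            scores = [s + (1 - mask[i][j]) * -1e9 for j, s in enumerate(scores)]
        w = softmax(scores)
        weights.append(w)
        out.append([sum(w[j] * V[j][c] for j in range(len(V)))
                    for c in range(len(V[0]))])
    return out, weights

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # 3 tokens, d_k = 2
causal = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]    # lower-triangular causal mask
out, w = attention(x, x, x, mask=causal)
print(w[0])   # [1.0, 0.0, 0.0]: the first token can only see itself
```

Each row of `w` still sums to 1 after masking, which is why the mask is added before the softmax rather than multiplied in afterwards.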

Q22. What is model quantization aware training (QAT) in TensorFlow?

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Load trained model
model = tf.keras.models.load_model('trained_model.keras')

# Apply quantization aware training
# This inserts fake quantization nodes into the graph
qat_model = tfmot.quantization.keras.quantize_model(model)

qat_model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-5),   # very small LR for QAT fine-tuning
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Fine-tune with fake quantization for a few epochs (not full training)
qat_model.fit(train_dataset, epochs=3, validation_data=val_dataset,
               callbacks=[tf.keras.callbacks.EarlyStopping(patience=2)])

# Convert to TFLite INT8
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_qat = converter.convert()

# Rule of thumb: QAT recovers roughly 0.5-2 points of accuracy over
# post-training quantization (PTQ); the actual gain varies by model
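
The "fake quantization nodes" mentioned above quantize a tensor and immediately dequantize it, so the forward pass sees the rounding error while the weights stay in float. Below is a stdlib-only sketch of that round trip. The per-tensor asymmetric int8 scheme and the `fake_quantize` helper are assumptions for illustration, not tfmot's actual implementation.

```python
def fake_quantize(values, num_bits=8):
    """Quantize to signed integers, then dequantize (asymmetric, per-tensor)."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # The representable range must include 0 so that zero maps to zero exactly
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    out = []
    for v in values:
        q = round(v / scale) + zero_point
        q = max(qmin, min(qmax, q))            # clamp to the int8 range
        out.append((q - zero_point) * scale)   # back to float
    return out, scale

weights = [-0.62, -0.05, 0.0, 0.31, 1.27]
deq, scale = fake_quantize(weights)
max_err = max(abs(a - b) for a, b in zip(weights, deq))
print(max_err <= scale / 2 + 1e-12)   # True: error is at most half a step
```

During QAT the network learns weights that stay accurate under this rounding, which is why only a short fine-tune at a small learning rate is needed.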

Q23. How does automatic differentiation work in TensorFlow?

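TensorFlow uses reverse-mode automatic differentiation. Inside a tf.GradientTape context, each operation on a watched tensor is recorded on the tape. For every recorded operation, the tape keeps:

  1. Its forward computation
  2. Its gradient function (the VJP: vector-Jacobian product)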

During tape.gradient(loss, variables), TF traverses this tape in reverse, computing gradients via the chain rule.

import tensorflow as tf

# Basic autodiff
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 3 + 2 * x    # y = x^3 + 2x
dy_dx = tape.gradient(y, x)
print(dy_dx.numpy())   # dy/dx = 3x^2 + 2 = 29 at x=3

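# Second-order gradients (here the Hessian is diagonal, so this returns its diagonal;
# use t2.jacobian(dy_dx, x) when you need the full Hessian matrix)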
x = tf.Variable([1.0, 2.0])
with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        y = tf.reduce_sum(x ** 3)   # y = x1^3 + x2^3
    dy_dx = t1.gradient(y, x)       # [3*x1^2, 3*x2^2]
d2y_dx2 = t2.gradient(dy_dx, x)    # [6*x1, 6*x2]
print(d2y_dx2.numpy())  # [6.0, 12.0]

# Custom gradient (for non-differentiable ops or improved numerical stability)
@tf.custom_gradient
def log_softmax_stable(x):
    result = tf.nn.log_softmax(x)
    def grad(upstream):
        softmax = tf.nn.softmax(x)
        # Sum over the class axis only, so batched inputs get per-row gradients
        return upstream - tf.reduce_sum(upstream, axis=-1, keepdims=True) * softmax
    return result, grad
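
If an interviewer asks you to explain the tape without TensorFlow, a toy version is a good whiteboard answer. This is a stdlib-only sketch: the `Tape` and `Var` classes and the `gradient` function are hypothetical names, and only `+` and `*` are supported.

```python
class Tape:
    """Records (output, [(input, local_gradient), ...]) for every op."""
    def __init__(self):
        self.records = []

class Var:
    def __init__(self, value, tape):
        self.value, self.tape = value, tape

    def _record(self, value, parents):
        out = Var(value, self.tape)
        self.tape.records.append((out, parents))
        return out

    def __add__(self, other):
        return self._record(self.value + other.value,
                            [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return self._record(self.value * other.value,
                            [(self, other.value), (other, self.value)])

def gradient(tape, target, source):
    """Walk the tape in reverse, accumulating upstream * local gradient."""
    grads = {id(target): 1.0}
    for out, parents in reversed(tape.records):
        upstream = grads.get(id(out), 0.0)
        for parent, local in parents:
            grads[id(parent)] = grads.get(id(parent), 0.0) + upstream * local
    return grads.get(id(source), 0.0)

tape = Tape()
x = Var(3.0, tape)
two = Var(2.0, tape)
y = x * x * x + two * x        # y = x^3 + 2x
print(gradient(tape, y, x))    # 29.0, matching the tf.GradientTape example
```

The accumulation step (`+=` into `grads`) is what handles a variable being used more than once, as `x` is here.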

Q24. What is the difference between stateful and stateless layers in Keras?

Type | State | Examples | Behavior Across Calls
Stateless | No trainable params or moving averages | ReLU, Softmax, Dropout | Same output for same input (Dropout only at inference; it is random during training)
Stateful (learnable) | Trainable weights | Dense, Conv2D, Embedding | Weights updated during training
Stateful (running stats) | Non-trainable moving averages | BatchNormalization | Different behavior in train vs eval
Recurrent (sequence state) | Hidden state across time steps | LSTM, GRU | State persists between batches if stateful=True

import tensorflow as tf
from tensorflow.keras import layers

# LSTM with stateful=True: state preserved across batches (for long sequences)
stateful_lstm = layers.LSTM(
    64,
    stateful=True,      # keep state across batches
    return_sequences=True
)
# Requires fixed batch size: model = Sequential([Input(batch_size=32, shape=(None, features))])
# Must call model.reset_states() between sequences

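# BatchNormalization: keeps non-trainable moving_mean / moving_variance
bn = layers.BatchNormalization()
# bn(x, training=True)  -> normalizes with batch statistics and updates the moving averages
# bn(x, training=False) -> normalizes with moving_mean / moving_variance (inference mode)
# Special case: bn.trainable = False also forces inference mode (useful when fine-tuning)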

# GRU (often preferred over LSTM in production: fewer params, similar performance)
gru_layer = layers.GRU(
    128,
    return_sequences=True,   # return output at every time step
    return_state=True,       # also return final hidden state
    dropout=0.1,
    recurrent_dropout=0.1,   # note: any recurrent_dropout > 0 disables the cuDNN kernel
    reset_after=True         # required for the cuDNN kernel (the TF 2.x default)
)
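
The "running stats" row of the table is the one candidates most often get wrong, so here is the mechanism in isolation. This is a stdlib-only sketch: `TinyBatchNorm` is a hypothetical class, it omits the learnable gamma/beta, and the momentum of 0.9 is chosen for readability rather than copied from Keras defaults.

```python
class TinyBatchNorm:
    """1-D batch norm over a list of floats, without gamma/beta."""
    def __init__(self, momentum=0.9, epsilon=1e-3):
        self.momentum, self.epsilon = momentum, epsilon
        self.moving_mean, self.moving_var = 0.0, 1.0   # non-trainable state

    def __call__(self, batch, training=False):
        if training:
            mean = sum(batch) / len(batch)
            var = sum((v - mean) ** 2 for v in batch) / len(batch)
            # State update: exponential moving average of the batch statistics
            self.moving_mean = (self.momentum * self.moving_mean
                                + (1 - self.momentum) * mean)
            self.moving_var = (self.momentum * self.moving_var
                               + (1 - self.momentum) * var)
        else:
            mean, var = self.moving_mean, self.moving_var
        return [(v - mean) / (var + self.epsilon) ** 0.5 for v in batch]

bn = TinyBatchNorm()
batch = [2.0, 4.0, 6.0]
train_out = bn(batch, training=True)    # uses batch mean 4.0, updates moving stats
eval_out = bn(batch, training=False)    # uses moving_mean (about 0.4)
print(train_out[1], round(bn.moving_mean, 6))
```

The same input gives different outputs in the two modes, which is exactly why forgetting `training=False` at inference causes silent accuracy drops.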

Q25. How do you profile and optimize TensorFlow model training?

import tensorflow as tf

# TensorBoard Profiler (trace GPU ops)
tb_callback = tf.keras.callbacks.TensorBoard(
    log_dir='./logs',
    profile_batch='10,20'   # profile batches 10-20
)
model.fit(dataset, callbacks=[tb_callback])
# Then: tensorboard --logdir ./logs
# Look at: GPU utilization, memory bandwidth, op timeline

# tf.profiler API
tf.profiler.experimental.start('logdir')
model.fit(dataset, epochs=1)
tf.profiler.experimental.stop()

# Common bottlenecks and fixes:
# 1. Low GPU utilization: increase batch size, num_parallel_calls, prefetch
# 2. Python overhead between steps: wrap the step in tf.function, avoid Python loops over tensors
# 3. Slow data loading: use TFRecords + multiple workers

# XLA (Accelerated Linear Algebra) compilation
# Assumes model, loss_fn and optimizer are already defined
@tf.function(jit_compile=True)  # enables XLA fusion
def train_step_xla(x, y):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
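
Before reaching for the profiler, a quick way to tell whether a loop is input-bound is to time the data fetch separately from the step. The sketch below is stdlib-only; `time_steps` and `slow_loader` are hypothetical helpers. With a real GPU model the step runs asynchronously, so treat these numbers as a rough hint and confirm with the TensorBoard Profiler.

```python
import time

def time_steps(batches, step_fn):
    """Split wall-clock time into 'waiting for data' and 'running the step'."""
    data_s = compute_s = 0.0
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)          # time spent here is input-pipeline time
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)                # time spent here is compute time
        t2 = time.perf_counter()
        data_s += t1 - t0
        compute_s += t2 - t1
    return data_s, compute_s

def slow_loader(n):
    """Hypothetical loader that takes about 10 ms per batch."""
    for i in range(n):
        time.sleep(0.01)
        yield i

data_s, compute_s = time_steps(slow_loader(5), lambda batch: sum(range(100)))
print(f"input-bound: {data_s > compute_s}")
```

If the data share dominates, the fixes from bottleneck 1 and 3 apply: parallel map, prefetch, and a faster storage format.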

Q26. What is model pruning in TensorFlow? How do you implement it?

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Magnitude-based weight pruning
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0,
        final_sparsity=0.5,        # 50% of weights pruned
        begin_step=0,
        end_step=1000,
        frequency=100              # prune every 100 steps
    )
}

model_for_pruning = tfmot.sparsity.keras.prune_low_magnitude(
    model, **pruning_params
)
model_for_pruning.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# UpdatePruningStep is required: it advances the pruning step so masks get updated
callbacks = [
    tfmot.sparsity.keras.UpdatePruningStep(),
    tfmot.sparsity.keras.PruningSummaries(log_dir='./pruning_logs')
]
model_for_pruning.fit(train_data, epochs=10, callbacks=callbacks)

# Strip pruning wrappers before export (pruned weights stay zero, still stored in dense tensors)
final_model = tfmot.sparsity.keras.strip_pruning(model_for_pruning)

# Check the achieved sparsity per layer (size savings need compression, e.g. gzip)
for layer in final_model.layers:
    if hasattr(layer, 'kernel'):
        sparsity = 1 - (tf.math.count_nonzero(layer.kernel) /
                         tf.size(layer.kernel, out_type=tf.int64)).numpy()
        if sparsity > 0:
            print(f"{layer.name}: {sparsity:.1%} sparse")
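
Two pieces of the pruning recipe are worth being able to write by hand: the sparsity schedule and the magnitude mask. The sketch below is stdlib-only. The cubic ramp (`power=3`) reflects my understanding of how PolynomialDecay behaves by default, so treat the exact formula as an assumption; both function names are illustrative.

```python
def polynomial_sparsity(step, initial=0.0, final=0.5, begin=0, end=1000, power=3):
    """Target sparsity at a training step: ramps fast early, then flattens."""
    progress = min(1.0, max(0.0, (step - begin) / (end - begin)))
    return final + (initial - final) * (1.0 - progress) ** power

def prune_smallest(weights, sparsity):
    """Return a copy with the smallest-|w| fraction of weights set to zero."""
    k = int(round(sparsity * len(weights)))
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, removed = [], 0
    for w in weights:
        if abs(w) <= threshold and removed < k:
            pruned.append(0.0)
            removed += 1
        else:
            pruned.append(w)
    return pruned

weights = [0.8, -0.05, 0.3, -0.9, 0.01, 0.4, -0.2, 0.07]
print(polynomial_sparsity(500))        # 0.4375
print(prune_smallest(weights, 0.5))    # [0.8, 0.0, 0.3, -0.9, 0.0, 0.4, 0.0, 0.0]
```

Halfway through the schedule the target is already 43.75%, not 25%: most pruning happens early, while the network still has many steps left to recover.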

Q27. How do you export a model for serving with TF Serving in production?

import tensorflow as tf

# Save model with explicit serving signature
model = tf.keras.models.load_model('trained_model.keras')

@tf.function(input_signature=[
    # Accept any image size; resizing happens inside the serving function
    tf.TensorSpec(shape=[None, None, None, 3], dtype=tf.float32, name='images')
])
def serve(images):
    # Preprocessing inside the serving function (expects pixel values in [0, 255])
    images = tf.image.resize(images, [224, 224])
    images = images / 255.0
    # Normalize with ImageNet stats
    mean = tf.constant([0.485, 0.456, 0.406])
    std  = tf.constant([0.229, 0.224, 0.225])
    images = (images - mean) / std
    predictions = model(images, training=False)
    return {
        'class_ids': tf.argmax(predictions, axis=-1),
        'probabilities': predictions,
        'top_5_ids': tf.math.top_k(predictions, k=5).indices
    }

tf.saved_model.save(
    model,
    'production_model/1',
    signatures={'serving_default': serve}
)

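# Batching (TF Serving can auto-batch requests)
# Start the server with --enable_batching --batching_parameters_file=<path>
# and put text-format protobuf entries in that file:
# max_batch_size { value: 64 }
# batch_timeout_micros { value: 5000 }
# max_enqueued_batches { value: 1000 }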
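
A common follow-up is how the batch size limit and the batch timeout interact. The sketch below simulates the policy on a list of request arrival times. It is stdlib-only and deliberately simplified: `form_batches` is a hypothetical helper, it ignores queue limits and batch threads, and the small `max_batch_size=4` is only there to keep the output readable.

```python
def form_batches(arrival_ms, max_batch_size=4, timeout_ms=5.0):
    """Group request arrival times (in ms) into batches.

    A batch is dispatched when it is full, or when its oldest request
    has waited timeout_ms, whichever happens first."""
    batches, current = [], []
    for t in sorted(arrival_ms):
        if current and t - current[0] >= timeout_ms:
            batches.append(current)      # timeout fired before this arrival
            current = []
        current.append(t)
        if len(current) == max_batch_size:
            batches.append(current)      # batch is full
            current = []
    if current:
        batches.append(current)          # flushed by the final timeout
    return batches

arrivals = [0.0, 1.0, 1.5, 2.0, 2.5, 9.0, 20.0]
print(form_batches(arrivals))
# [[0.0, 1.0, 1.5, 2.0], [2.5], [9.0], [20.0]]
```

Under bursty load the size limit dominates and throughput is high; under sparse load every request pays up to the full timeout as extra latency, which is the trade-off you tune.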

Q28. Design a TF pipeline for real-time image classification at scale.

Production pipeline design:

Serving layer:
  Client -> Load Balancer -> TF Serving fleet (N replicas on GPU)
  - Each TF Serving instance: 2 GPUs, auto-batching enabled
  - Batch size 32, timeout 10ms (P99 latency target: 50ms)
  - gRPC for internal (faster), REST for external clients

Model:
  - EfficientNetV2-S: best accuracy/latency for 384x384 input
  - Quantized INT8 (TFLite on edge) or FP16 (TF Serving on GPU)
  - Model version: A/B test via TF Serving model config

Input pipeline:
  - Preprocessing inside the saved model (resize, normalize)
  - No client-side preprocessing required

Monitoring:
  - Prometheus + Grafana for latency/throughput
  - Model performance: accuracy on labelled production samples
  - Drift detection: feature distribution vs training baseline
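
For the drift-detection bullet, one widely used statistic is the Population Stability Index (PSI). The design above does not name a metric, so PSI is my choice for illustration. The sketch is stdlib-only, works on a single numeric feature, and uses equal-width bins from the training baseline; the 0.25 alert threshold is a common rule of thumb, not a standard.

```python
import math

def psi(baseline, production, bins=4, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0

    def histogram(sample):
        counts = [0] * bins
        for v in sample:
            idx = int((v - lo) / width)
            counts[max(0, min(bins - 1, idx))] += 1   # clamp out-of-range values
        # eps avoids log(0) for empty bins
        return [max(c / len(sample), eps) for c in counts]

    p, q = histogram(baseline), histogram(production)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

train = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
same = list(train)
shifted = [v + 0.4 for v in train]
print(psi(train, same))             # 0.0
print(psi(train, shifted) > 0.25)   # True: would trigger a drift alert
```

In production you would compute this per feature on a sliding window of requests and export it to the same Prometheus/Grafana stack as the latency metrics.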
# TF Serving gRPC client
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

def predict_grpc(images, stub):
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'my_model'
    request.model_spec.signature_name = 'serving_default'
    request.inputs['images'].CopyFrom(
        tf.make_tensor_proto(images, dtype=tf.float32)
    )
    result = stub.Predict(request, timeout=10)
    return tf.make_ndarray(result.outputs['class_ids'])

channel = grpc.insecure_channel('localhost:8500')
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
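
For the external REST path in the design, the request is plain JSON. This stdlib-only sketch builds the URL and body for TF Serving's REST predict endpoint in row ("instances") format. The `build_predict_request` helper is hypothetical, port 8501 is the conventional REST port rather than something the article specifies, and the 2x2 image is a stand-in for a real input.

```python
import json

def build_predict_request(images, model_name="my_model",
                          host="localhost", port=8501):
    """Return (url, body) for a TF Serving REST predict call in row format."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({
        "signature_name": "serving_default",
        "instances": images,   # one entry per image: nested lists [H][W][3]
    })
    return url, body

# A 2x2 RGB dummy image stands in for a real input
dummy = [[[0.0, 0.0, 0.0], [255.0, 255.0, 255.0]],
         [[128.0, 64.0, 32.0], [10.0, 20.0, 30.0]]]
url, body = build_predict_request([dummy])
print(url)   # http://localhost:8501/v1/models/my_model:predict
```

The body can be POSTed with any HTTP client. JSON encoding of image tensors is bulky, which is one reason the design keeps gRPC for internal traffic.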

TensorFlow Ecosystem at a Glance

Component | Purpose | Alternative
Keras | High-level model API | PyTorch nn.Module
tf.data | Input pipeline | PyTorch DataLoader
TFLite | Mobile/edge deployment | ONNX + ORT, CoreML
TF Serving | Production REST/gRPC serving | TorchServe, Triton
TFX | End-to-end ML pipelines | MLflow, Kubeflow
TF Hub | Pre-trained models | HuggingFace Hub
TensorBoard | Training visualization | W&B, MLflow UI
TFMA | Model evaluation | Custom evaluation

FAQ

Q: Should I learn TensorFlow or PyTorch for ML interviews? A: Both. PyTorch is the standard for research and many startups. TensorFlow is required for Google roles and is standard at many enterprises. Start with PyTorch for intuition; learn TF 2.x/Keras for deployment tooling.

Q: What is the difference between Keras and TensorFlow? A: Keras is the high-level API that lives inside TensorFlow 2.x. You import it as tf.keras. Keras can also run on top of JAX or PyTorch as of 2024 (Keras 3 / multi-backend).

Q: What is JAX and how does it relate to TensorFlow? A: JAX is a Python library from Google for numerical computing with automatic differentiation. It underpins much of Google's research work (replacing TF in many teams) and is used by DeepMind, Anthropic, and others. It exposes a NumPy-like API (jax.numpy) and runs on accelerators via XLA, the same compiler TensorFlow uses for jit_compile=True.


Methodology applied to this articlelast verified 8 Jun 2026
Sources used
Public exam-pattern documents, official recruiter pages, and verified candidate reports on r/developersIndia and LinkedIn.
Verification window
Page last edited 8 Jun 2026 by Aditya Sharma. Numbers and patterns sanity-checked against the most recent 2026 cycle drives we tracked.
What we did NOT do
  • No fabricated salary numbers or success rates. If we quote a range, it's sourced.
  • No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
  • No paid placements, sponsored coaching links, or affiliate-shilled course pushes.
Verification policy: /editorial-standards/. Found something incorrect? Submit a correction - we respond within 48 hours.
