GCP Interview Questions 2026: Google Cloud Platform Core Services & Architecture

What changed in 2026 drives
Mass-recruiter offer letters are flatter for 2026 batch - the 4-5 LPA ASE band has barely budged in three years while inflation eats real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry - it is the entire interview.
What I'd actually study for this
- Two solid coding-round answers (1 medium-hard DSA each, with edge-case discussion) beat five half-baked ones
- One real project you can defend end-to-end: file paths, design decisions, and what you would change
- One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
- Three behavioural STAR stories: failure recovered, conflict handled, ownership taken
Where most candidates trip up
The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.
Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.
Candidates report that GCP interviews in 2026 focus heavily on BigQuery, GKE, Cloud Run, Vertex AI, and data engineering with Dataflow and Pub/Sub. Confirm current service names and features on the official Google Cloud documentation before your interview -- GCP service naming and features evolve frequently.
Google Cloud Platform interviews cover core infrastructure (Compute, Storage, Networking), managed data services (BigQuery, Bigtable, Spanner), analytics pipelines (Dataflow, Pub/Sub), and ML infrastructure (Vertex AI). This guide covers all major domains. Interviewers at Google Cloud, Airtel, Reliance Jio, Infosys, and TCS digital units ask about GCP regularly in 2026. Understanding when to use serverless versus container-based versus VM-based compute is the most commonly tested decision in senior cloud interviews.
Core Compute
Q1. Compare Compute Engine, App Engine, Cloud Run, GKE, and Cloud Functions.
| Service | Model | Use case | Cold start |
|---|---|---|---|
| Compute Engine | IaaS (VMs) | Full OS control, lift-and-shift, GPU workloads | None (always running) |
| App Engine Standard | PaaS (sandboxed) | Fast HTTP apps (Python, Java, Go, Node.js) | Seconds |
| App Engine Flexible | PaaS (Docker containers) | Custom runtimes, background processing | Minutes |
| Cloud Run | Serverless containers | Stateless HTTP services, any language via container | under 1 second |
| GKE (Autopilot) | Managed Kubernetes | Complex microservices, stateful workloads | None |
| Cloud Functions | Serverless functions | Event-driven triggers, glue code | Seconds |
| Batch | Managed job scheduler | HPC, batch compute with MPI, auto-provisioned VMs | N/A |
Cloud Run specifics:
# A Cloud Run service is just a container that handles HTTP requests
# Cloud Run handles: scaling, load balancing, HTTPS, domain mapping
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 main:app
# Deploy
# gcloud run deploy my-service --image gcr.io/myproject/my-app --region us-central1 --allow-unauthenticated
GKE Autopilot vs Standard:
- Standard: manage node pools (VM types, sizes, autoscaling config). Full control.
- Autopilot: Google manages nodes entirely. Pay per pod CPU/memory request. Simpler operations, slightly higher per-unit cost.
The choice between Autopilot and Standard GKE often comes down to operational maturity. Teams early in their Kubernetes journey benefit from Autopilot's simplified model. Teams with specific hardware requirements, GPU workloads, or complex scheduling rules generally choose Standard mode.
Q2. What are GCP machine types, and how do you choose the right one?
Machine type families:
| Family | Purpose | Examples |
|---|---|---|
| General Purpose | Balanced CPU/memory | e2 (cost), n2/n2d, c3 (performance) |
| Compute Optimized | High CPU-to-memory | c2, c2d, h3 |
| Memory Optimized | High memory | m1/m2/m3 (up to 12 TB) |
| Accelerator Optimized | GPU | a2 (A100), g2 (L4) |
| Storage Optimized | High local SSD | z3 |
Custom machine types: GCP unique feature -- define exact vCPU and memory without being locked to standard sizes.
# Create custom VM: 6 vCPUs, 20 GB memory (instead of forced n1-standard-8 with 30 GB)
gcloud compute instances create my-vm \
--custom-cpu=6 \
--custom-memory=20GB \
--zone=us-central1-a
Preemptible VMs / Spot VMs:
- Preemptible: up to 24 hours, 30-second eviction notice, 60-91% cheaper.
- Spot VMs (replacement for preemptible): no max lifetime, dynamic pricing, same eviction model.
- Use for: fault-tolerant batch, ML training checkpointed jobs, stateless workloads.
Committed Use Discounts (CUDs):
- 1-year: ~37% discount. 3-year: ~55% discount.
- Resource-based (vCPU/memory committed for a specific machine series and region) or spend-based flexible CUDs (a committed hourly spend, comparable to AWS Savings Plans).
- Automatically applied to eligible usage.
GCP also offers sustained use discounts that apply automatically when you run a VM for more than 25% of a month -- no commitment needed. This makes GCP attractive for workloads that run consistently without requiring upfront reserved capacity purchases.
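To make the discount tiers concrete, here is a rough sketch that applies the percentages quoted above to a single VM. The hourly rate and the 730-hour month are placeholder assumptions, not real GCP prices.

```python
# Rough monthly cost comparison using the discount figures quoted above.
# ON_DEMAND_HOURLY is a made-up placeholder rate, not a real GCP price.
ON_DEMAND_HOURLY = 0.20
HOURS_PER_MONTH = 730

DISCOUNTS = {
    "on_demand": 0.00,
    "cud_1yr": 0.37,    # ~37% from the text
    "cud_3yr": 0.55,    # ~55% from the text
    "spot_low": 0.60,   # lower end of the 60-91% range
    "spot_high": 0.91,  # upper end of the 60-91% range
}


def monthly_cost(hourly_rate: float, discount: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of one VM for `hours` at `hourly_rate` after a fractional discount."""
    return round(hourly_rate * hours * (1.0 - discount), 2)


costs = {name: monthly_cost(ON_DEMAND_HOURLY, d) for name, d in DISCOUNTS.items()}

for name, cost in costs.items():
    print(f"{name:10s} {cost:8.2f}")
```

The point to make in an interview is the ordering: Spot is cheapest but can be evicted, while CUDs trade flexibility for a guaranteed rate on baseline capacity.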
Storage
Q3. What are GCP's storage options, and when do you use each?
| Service | Type | Latency | Use case |
|---|---|---|---|
| Cloud Storage | Object | ms | Unstructured data, data lake, backups |
| Persistent Disk | Block (SSD/HDD) | ms | VM boot disks, databases |
| Filestore | NFS file system | ms | Shared file storage for GKE, HPC |
| Cloud Bigtable | NoSQL wide-column | <10ms | Time-series, IoT, HBase-compatible |
| Firestore | NoSQL document | ms | Mobile/web apps, real-time sync |
| Cloud Spanner | Relational (global) | ms | Global ACID transactions |
| BigQuery | Columnar OLAP | Seconds | Analytics, data warehouse |
| Memorystore | In-memory (Redis) | <1ms | Cache, session store |
Cloud Storage classes:
Standard: frequently accessed data
Nearline: accessed < once/month (backup)
Coldline: accessed < once/quarter (DR)
Archive: accessed < once/year (long-term)
Lifecycle rules: auto-transition between classes
gsutil lifecycle set lifecycle.json gs://my-bucket
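The lifecycle.json file passed to that command is a list of rules, each pairing an action with a condition. A minimal sketch that generates one is shown here; the age thresholds (30, 90, 365 days) are illustrative assumptions loosely matching the access frequencies above, not recommended values.

```python
import json


def lifecycle_config(nearline_days=30, coldline_days=90, archive_days=365):
    """Build a lifecycle config that steps objects down through storage classes by age."""
    steps = [
        ("NEARLINE", nearline_days),
        ("COLDLINE", coldline_days),
        ("ARCHIVE", archive_days),
    ]
    return {
        "rule": [
            {
                "action": {"type": "SetStorageClass", "storageClass": storage_class},
                "condition": {"age": days},  # age in days since object creation
            }
            for storage_class, days in steps
        ]
    }


config = lifecycle_config()
lifecycle_json = json.dumps(config, indent=2)
print(lifecycle_json)  # save this output as lifecycle.json
```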
Cloud Storage access control:
# Uniform bucket-level access (recommended, disables per-object ACLs)
gsutil uniformbucketlevelaccess set on gs://my-bucket
# IAM binding: grant read to specific service account
gsutil iam ch serviceAccount:[email protected]:objectViewer gs://my-bucket
# Signed URL (time-limited access without authentication)
gsutil signurl -d 1h key.json gs://my-bucket/private-file.csv
BigQuery
Q4. How does BigQuery work internally, and what makes it fast at scale?
Architecture:
Query -> Dremel (MPP query engine, tree of servers)
|
Colossus (distributed file system, stores ColumnIO/Capacitor format)
Storage and compute are FULLY SEPARATED:
- No cluster to size
- Slots (virtual CPUs) allocated per query
- Storage charged per GB/month
- Query charged per TB scanned (on-demand) or by reserved slot capacity (BigQuery editions, which replaced flat-rate pricing)
Why BigQuery is fast:
- Columnar storage: a query reads only the columns it references. Selecting 3 columns from a 100-column table reads roughly 3% of the data, assuming the columns are similar in size.
- Dremel execution: massively parallel. The query tree distributes work to thousands of leaf nodes reading from Colossus.
- In-memory shuffles: intermediate shuffle data is kept in a dedicated in-memory shuffle tier and moved between workers over Google's Jupiter network.
- Query results cache: identical queries return cached results at no cost (results are cached for about 24 hours).
BigQuery's serverless model means there is no cluster to size, no index to build, and no VACUUM to run. The engineering team's operational burden is dramatically lower compared to traditional data warehouses like Redshift or Snowflake clusters. This operational simplicity is one of the main reasons enterprises adopt BigQuery as their primary analytics platform.
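The columnar-storage point translates directly into on-demand cost. The sketch below estimates bytes scanned and cost for a narrow versus a full-width query; it assumes equally sized columns and a placeholder price per TB, so treat the numbers as illustrative only.

```python
TB = 10**12  # decimal terabyte, a simplification for this sketch


def scanned_bytes(table_bytes: int, total_columns: int, selected_columns: int) -> int:
    """Estimate bytes scanned, assuming every column holds an equal share of the data."""
    return table_bytes * selected_columns // total_columns


def on_demand_cost(bytes_scanned: int, price_per_tb: float) -> float:
    """Cost of a query billed purely on bytes scanned."""
    return bytes_scanned / TB * price_per_tb


PRICE_PER_TB = 5.0  # placeholder rate; check current pricing before quoting it

full = scanned_bytes(10 * TB, total_columns=100, selected_columns=100)
narrow = scanned_bytes(10 * TB, total_columns=100, selected_columns=3)

print(f"SELECT *      : {on_demand_cost(full, PRICE_PER_TB):.2f}")
print(f"SELECT 3 cols : {on_demand_cost(narrow, PRICE_PER_TB):.2f}")
```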
Partitioned tables:
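-- Partition by ingestion time (automatic). This form takes a column list and
-- cannot be combined with AS SELECT. Column names below are illustrative.
CREATE TABLE mydataset.events (
user_id STRING,
event_type STRING,
payload STRING
)
PARTITION BY _PARTITIONDATE
OPTIONS (partition_expiration_days = 90);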
-- Partition by column date
CREATE TABLE mydataset.sales
PARTITION BY DATE(order_date)
AS SELECT * FROM ...;
-- Query with partition pruning (only scans relevant partitions)
SELECT SUM(amount) FROM mydataset.sales
WHERE DATE(order_date) BETWEEN '2026-01-01' AND '2026-03-31';
-- Check partitions
SELECT * FROM mydataset.INFORMATION_SCHEMA.PARTITIONS
WHERE table_name = 'sales';
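Partition pruning can be pictured as skipping whole buckets of rows before any data is read. The toy model below is only an illustration of that idea, with made-up rows and byte sizes; it is not how BigQuery stores data.

```python
from datetime import date

# Toy table: one "partition" per day, each holding (amount, size_in_bytes) rows.
partitions = {
    date(2025, 12, 31): [(10.0, 100)],
    date(2026, 1, 15): [(20.0, 100), (5.0, 100)],
    date(2026, 3, 31): [(7.5, 100)],
    date(2026, 4, 1): [(99.0, 100)],
}


def sum_with_pruning(parts, start, end):
    """Sum amounts for start <= day <= end, reading only matching partitions."""
    total, bytes_read = 0.0, 0
    for day, rows in parts.items():
        if not (start <= day <= end):
            continue  # pruned: this partition is never read
        for amount, size in rows:
            total += amount
            bytes_read += size
    return total, bytes_read


total, bytes_read = sum_with_pruning(partitions, date(2026, 1, 1), date(2026, 3, 31))
print(total, bytes_read)
```

Only two of the four partitions fall in the first quarter of 2026, so three of the five rows are read; the filter on the partition column is what makes that possible.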
Clustered tables:
-- Cluster by frequently filtered/joined columns (within each partition)
CREATE TABLE mydataset.events
PARTITION BY DATE(event_date)
CLUSTER BY user_id, event_type
AS SELECT * FROM mydataset.raw_events;
-- Block-level sorting means BigQuery skips irrelevant blocks
SELECT * FROM mydataset.events
WHERE DATE(event_date) = '2026-06-08' AND user_id = '12345';
Q5. How do you optimize BigQuery queries for cost and performance?
Cost optimization:
-- 1. SELECT only needed columns (avoid SELECT *)
-- BAD: scans all columns (100 GB table -> 100 GB billed)
SELECT * FROM mydataset.events WHERE date = '2026-06-08';
-- GOOD: scans 2 columns only
SELECT user_id, event_type FROM mydataset.events WHERE date = '2026-06-08';
-- 2. Partition pruning (always filter on partition column)
-- BAD: full table scan
SELECT COUNT(*) FROM mydataset.events WHERE event_type = 'purchase';
-- GOOD: only scans the June 2026 partitions
SELECT COUNT(*) FROM mydataset.events
WHERE DATE(event_date) BETWEEN '2026-06-01' AND '2026-06-30' AND event_type = 'purchase';
-- 3. Preview costs before running
-- BigQuery UI: shows estimated bytes processed before execution
-- CLI: --dry_run flag
bq query --dry_run --use_legacy_sql=false \
"SELECT * FROM mydataset.large_table WHERE date = '2026-06-08'"
-- 4. Approximate functions for exploration
SELECT APPROX_COUNT_DISTINCT(user_id) FROM mydataset.events; -- same bytes scanned as COUNT(DISTINCT), but far less memory and slot time
SELECT APPROX_QUANTILES(amount, 100) FROM mydataset.orders; -- faster percentiles
Performance optimization:
-- 5. Avoid self-joins -- use window functions instead
-- BAD (self-join for running total):
SELECT a.date, SUM(b.revenue)
FROM sales a JOIN sales b ON b.date <= a.date
GROUP BY a.date;
-- GOOD (window function):
SELECT date, SUM(revenue) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM sales;
-- 6. Flatten repeated records before joining (UNNEST)
SELECT u.user_id, e.event_type
FROM users u
CROSS JOIN UNNEST(u.events) AS e; -- unnest STRUCT/ARRAY fields
-- 7. Use CTEs for readability (non-recursive CTEs are not materialized; a CTE referenced several times is re-executed each time)
WITH daily_sales AS (
SELECT DATE(order_date) AS day, SUM(amount) AS revenue
FROM orders GROUP BY 1
),
running_total AS (
SELECT day, revenue,
SUM(revenue) OVER (ORDER BY day) AS cumulative
FROM daily_sales
)
SELECT * FROM running_total;
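The self-join versus window-function advice in tip 5 comes down to quadratic versus linear work. A small Python analogy, with made-up daily revenue figures, shows the two shapes producing the same running total:

```python
from itertools import accumulate

# Illustrative data: one row per day.
daily = [("2026-06-01", 100), ("2026-06-02", 50), ("2026-06-03", 25)]


def running_total_self_join(rows):
    """O(n^2): for every day, re-sum all rows up to and including it."""
    return [(d, sum(r for d2, r in rows if d2 <= d)) for d, _ in rows]


def running_total_window(rows):
    """One ordered pass that carries the sum forward, like SUM() OVER (ORDER BY ...)."""
    ordered = sorted(rows)
    totals = accumulate(r for _, r in ordered)
    return [(d, t) for (d, _), t in zip(ordered, totals)]


print(running_total_self_join(daily))
print(running_total_window(daily))
```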
Q6. What is BigQuery ML, and what models does it support?
BigQuery ML lets you train and deploy ML models directly in BigQuery using SQL.
-- Train a logistic regression model
CREATE OR REPLACE MODEL mydataset.churn_model
OPTIONS(
model_type = 'logistic_reg',
input_label_cols = ['churned'],
max_iterations = 20,
learn_rate_strategy = 'constant',
learn_rate = 0.1
) AS
SELECT
tenure_months,
monthly_charges,
total_charges,
contract_type,
payment_method,
churned
FROM mydataset.customer_features
WHERE partition_date >= '2025-01-01';
-- Evaluate model
SELECT * FROM ML.EVALUATE(MODEL mydataset.churn_model,
(SELECT * FROM mydataset.customer_features WHERE partition_date = '2026-06-01')
);
-- Predict
SELECT customer_id, predicted_churned, predicted_churned_probs
FROM ML.PREDICT(MODEL mydataset.churn_model,
(SELECT * FROM mydataset.new_customers)
);
Supported model types:
| Model | SQL option |
|---|---|
| Linear regression | linear_reg |
| Logistic regression | logistic_reg |
| K-Means clustering | kmeans |
| Matrix factorization | matrix_factorization |
| Time series (ARIMA+) | arima_plus |
| XGBoost | boosted_tree_classifier/regressor |
| DNN | dnn_classifier/regressor |
| AutoML | automl_classifier/regressor |
| Imported models | tensorflow, onnx, xgboost |
| Remote model (Vertex AI endpoint) | REMOTE WITH CONNECTION clause |
Data Engineering
Q7. What is Cloud Pub/Sub, and how does it compare to Kafka?
Cloud Pub/Sub is Google's managed messaging service for asynchronous decoupled communication.
Architecture:
Publisher -> Topic -> Subscription -> Subscriber
Multiple subscriptions per topic:
- Subscription A: Analytics service
- Subscription B: Email service
- Subscription C: Audit log
Each subscription tracks acknowledgements independently, so every subscriber group receives its own copy of each message.
Key features:
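- At-least-once delivery: the default. A message may be redelivered, so subscribers should be idempotent. Exactly-once delivery can be enabled on pull subscriptions.
- Push vs Pull subscriptions: Pull (subscriber calls receive()), Push (Pub/Sub calls subscriber endpoint).
- Message retention: unacknowledged messages are kept 10 minutes to 7 days per subscription (configurable); topic-level retention can extend replay up to 31 days.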
- Ordering keys: Messages with same ordering key delivered in order within a region.
- Dead letter topics: Failed messages after N delivery attempts go to DLT.
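from google.cloud import pubsub_v1
import json

# Publisher: message ordering must be enabled on the client to use ordering keys
publisher = pubsub_v1.PublisherClient(
    publisher_options=pubsub_v1.types.PublisherOptions(enable_message_ordering=True)
)
topic_path = publisher.topic_path('my-project', 'orders')
order = {'order_id': '12345', 'amount': 99.99, 'user_id': 'u789'}
data = json.dumps(order).encode('utf-8')
# Ordering key: messages sharing a key are delivered in publish order
# (the subscription must also have message ordering enabled)
future = publisher.publish(topic_path, data, ordering_key='u789')
print(f'Published: {future.result()}')

# Subscriber (pull)
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path('my-project', 'orders-analytics')

def process_order(order):
    # Placeholder for application logic (not part of the Pub/Sub API)
    print(f"Processing order {order['order_id']}")

def callback(message):
    order = json.loads(message.data.decode('utf-8'))
    process_order(order)
    message.ack()  # ack only after processing succeeds

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
streaming_pull_future.result()  # blocks until cancelled or an error occurs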
Cloud Pub/Sub handles authentication, message durability, and global delivery automatically. Unlike Kafka, there are no brokers to size, no partition rebalancing to manage, and no ZooKeeper or KRaft configuration required. The trade-off is less control over retention and replay semantics compared to self-managed Kafka.
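Because delivery is at-least-once, a subscriber can see the same message twice. One common way to cope is an idempotent handler keyed on the message ID. The sketch below simulates that with an in-memory set; the message IDs and the `handle` function are hypothetical, and a real service would keep the processed IDs in a durable store.

```python
processed_ids = set()  # in production this would be a durable store
totals = {}


def handle(message_id: str, payload: dict) -> bool:
    """Apply the payload once; return False when the message is a redelivery."""
    if message_id in processed_ids:
        return False  # duplicate: safe to ack again without side effects
    user = payload["user_id"]
    totals[user] = totals.get(user, 0.0) + payload["amount"]
    processed_ids.add(message_id)
    return True


deliveries = [
    ("m1", {"user_id": "u789", "amount": 99.99}),
    ("m2", {"user_id": "u789", "amount": 10.00}),
    ("m1", {"user_id": "u789", "amount": 99.99}),  # redelivered duplicate
]
results = [handle(message_id, body) for message_id, body in deliveries]
print(results, totals)
```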
Pub/Sub vs Kafka:
| Aspect | Cloud Pub/Sub | Kafka |
|---|---|---|
| Management | Fully managed (no brokers to manage) | Self-managed or Confluent Cloud |
| Retention | Up to 7 days per subscription (up to 31 days with topic retention) | Configurable (unlimited with tiered storage) |
| Replay | Within retention window | Within retention window |
| Ordering | Per ordering key within region | Per partition (global ordering with 1 partition) |
| Scale | Auto (global) | Manual partition/broker scaling |
| Ecosystem | GCP-native, Dataflow integration | Universal (all major systems) |
| Exactly-once | Exactly-once delivery on pull subscriptions; end-to-end with Dataflow | With transactional producer |
Q8. What is Cloud Dataflow, and how does it implement the Apache Beam model?
Cloud Dataflow is Google's managed stream and batch processing service based on Apache Beam.
Apache Beam model:
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=DataflowRunner',
    '--project=my-project',
    '--region=us-central1',
    '--temp_location=gs://my-bucket/temp',
    '--streaming',  # required for an unbounded Pub/Sub source
])
with beam.Pipeline(options=options) as p:
    # Read from Pub/Sub (streaming); payloads arrive as bytes
    events = (
        p
        | 'Read from PubSub' >> beam.io.ReadFromPubSub(
            subscription='projects/my-project/subscriptions/events-sub')
        | 'Parse JSON' >> beam.Map(lambda x: json.loads(x.decode('utf-8')))
    )
    # Window: group events into 5-minute tumbling windows
    windowed = (
        events
        | 'Add timestamps' >> beam.Map(lambda e: beam.window.TimestampedValue(e, e['event_ts']))
        | 'Window into 5min' >> beam.WindowInto(beam.window.FixedWindows(300))
    )
    # Aggregate per window
    aggregated = (
        windowed
        | 'Extract (product, revenue)' >> beam.Map(lambda e: (e['product_id'], e['amount']))
        | 'Sum revenue' >> beam.CombinePerKey(sum)
        # WriteToBigQuery expects one dict per row, not a tuple
        | 'To row' >> beam.Map(lambda kv: {'product_id': kv[0], 'revenue': kv[1]})
    )
    # Write to BigQuery
    aggregated | 'Write to BQ' >> beam.io.WriteToBigQuery(
        table='my-project:analytics.product_revenue',
        schema='product_id:STRING,revenue:FLOAT',  # assumed column types
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED
    )
Dataflow unique features:
- Autoscaling: adjusts worker count based on backlog and throughput.
- Streaming + batch unified API: same Beam code runs in batch (GCS files) or streaming (Pub/Sub) mode.
- Exactly-once processing: built into the Dataflow runner.
- Flex Templates: package pipelines as containers for reproducible execution.
- Side inputs: broadcast small lookup tables to all workers without shuffle.
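The fixed (tumbling) windows used in the pipeline above assign each event to exactly one bucket based on its timestamp. A plain-Python sketch of that assignment, with no Beam dependency and made-up events, may help when explaining the model on a whiteboard:

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # 5-minute windows, matching FixedWindows(300)


def window_start(event_ts: float, size: int = WINDOW_SECONDS) -> int:
    """Start of the fixed window containing event_ts (epoch seconds)."""
    return int(event_ts // size) * size


def sum_per_window(events):
    """Sum amount per (window_start, product_id), like CombinePerKey(sum) per window."""
    totals = defaultdict(float)
    for e in events:
        totals[(window_start(e["event_ts"]), e["product_id"])] += e["amount"]
    return dict(totals)


events = [
    {"event_ts": 10, "product_id": "p1", "amount": 5.0},
    {"event_ts": 299, "product_id": "p1", "amount": 2.5},
    {"event_ts": 300, "product_id": "p1", "amount": 1.0},  # first second of the next window
    {"event_ts": 310, "product_id": "p2", "amount": 4.0},
]
result = sum_per_window(events)
print(result)
```

This ignores late data, watermarks, and triggers, which are the parts Dataflow actually manages for you.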
Q9. What is Cloud Composer, and how does it differ from self-hosted Airflow?
Cloud Composer is Google's managed Apache Airflow service built on GKE.
Architecture:
Cloud Composer environment:
- GKE cluster (worker pods run tasks)
- Cloud SQL (Airflow metadata DB)
- Cloud Storage (DAG files, logs)
- Redis (Celery broker)
- Airflow webserver, scheduler
DAG deployment:
gsutil cp my_dag.py gs://your-composer-bucket/dags/
Composer automatically picks up and parses new DAGs
Cloud Composer vs self-hosted Airflow:
| Aspect | Cloud Composer | Self-hosted Airflow |
|---|---|---|
| Management | Fully managed upgrades, patches | Manual |
| GCP integration | Native (Dataflow, BigQuery, Cloud Storage operators preinstalled) | Same operators via the apache-airflow-providers-google package, installed and authenticated manually |
| Cost | Higher per unit | Lower if managed well |
| Scaling | Autoscaling (Composer 2) | Manual |
| Version control | GCP-managed Airflow versions | Any version |
Composer 2 improvements (Composer 1 vs 2):
- Runs on GKE Autopilot: workers autoscale based on the task queue.
- Environment snapshots: backup and restore environments.
- Airflow 2.x: TaskFlow API, better performance.
Kubernetes Engine
Q10. What is GKE, and what are its key operational concepts?
GKE (Google Kubernetes Engine) is Google's managed Kubernetes service.
GKE cluster modes:
- Standard: manage node pools -- choose machine types, autoscaler settings, node locations.
- Autopilot: Google manages nodes entirely -- you only define Pods. Pay per Pod resource request.
Key GKE operational features:
# Create a GKE Autopilot cluster
gcloud container clusters create-auto my-cluster \
--region us-central1
# Create a Standard cluster with autoscaling
gcloud container clusters create my-cluster \
--num-nodes 3 \
--enable-autoscaling \
--min-nodes 1 \
--max-nodes 20 \
--machine-type e2-standard-4 \
--zone us-central1-a
# Get credentials
gcloud container clusters get-credentials my-cluster --zone us-central1-a
Workload Identity (replacing service account key files):
# Allow Kubernetes service account to impersonate GCP service account
# No key file needed -- IAM binding at identity level
gcloud iam service-accounts add-iam-policy-binding \
[email protected] \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:my-project.svc.id.goog[my-namespace/my-ksa]"
# Kubernetes ServiceAccount manifest: annotate the KSA to use Workload Identity
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-ksa
  namespace: my-namespace
  annotations:
    iam.gke.io/gcp-service-account: [email protected]
GKE Autopilot resource model:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:            # required for apps/v1 Deployments
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app    # must match the selector above
    spec:
      containers:
      - name: app
        image: gcr.io/my-project/my-app:v1
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1000m"
            memory: "1Gi"
# Autopilot provisions nodes to fit these requests
# Billing = sum of all Pod requests, not node capacity
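As a worked example of "billing = sum of Pod requests", the sketch below totals the requests for the Deployment above. The parsing helpers are simplified assumptions that handle only the units used here, and real Autopilot billing may adjust requests to its own minimums and ratios.

```python
def parse_cpu(value: str) -> float:
    """Convert a Kubernetes CPU quantity ('500m' or '2') to vCPUs."""
    return int(value[:-1]) / 1000 if value.endswith("m") else float(value)


def parse_memory_mib(value: str) -> float:
    """Convert 'Mi' or 'Gi' memory quantities to MiB (other suffixes not handled)."""
    if value.endswith("Gi"):
        return float(value[:-2]) * 1024
    if value.endswith("Mi"):
        return float(value[:-2])
    raise ValueError(f"unsupported memory unit: {value}")


def billable_requests(replicas: int, cpu: str, memory: str):
    """Total requested vCPUs and MiB across all replicas of one container."""
    return replicas * parse_cpu(cpu), replicas * parse_memory_mib(memory)


vcpus, mib = billable_requests(replicas=3, cpu="500m", memory="512Mi")
print(f"{vcpus} vCPU, {mib} MiB requested")
```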
IAM and Security
Q11. Explain GCP's IAM model: projects, service accounts, roles.
Security on GCP is enforced through a layered IAM model. Unlike traditional systems where access is defined per user per resource, GCP uses roles (bundles of permissions) assigned to principals (users, groups, service accounts) at a specific scope in the resource hierarchy. Understanding this model is essential for both the Google Cloud Professional Cloud Architect exam and practical security governance interviews.
Resource hierarchy:
Organization (root)
|-- Folders (departments, teams)
| |-- Projects (billing + resource boundary)
| |-- Resources (VMs, buckets, databases)
IAM policies inherited down the hierarchy:
- Grant at Organization -> inherited by all projects/resources
- Grant at Project -> inherited by resources within
- Grant at Resource -> applies to that resource only; allow policies are additive, so a lower level cannot revoke inherited access
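Inheritance means the effective access on a resource is the union of the bindings at every ancestor. A minimal sketch of that lookup is shown here; the hierarchy, principals, and bindings are hypothetical, and real IAM also has conditions and deny policies that this ignores.

```python
# child -> parent links for a hypothetical hierarchy
PARENT = {"vm-1": "my-project", "my-project": "team-folder", "team-folder": "org"}

# role bindings granted directly at each level (hypothetical)
BINDINGS = {
    "org": {"group:admins": {"roles/viewer"}},
    "my-project": {"user:alice": {"roles/bigquery.dataViewer"}},
    "vm-1": {"user:alice": {"roles/compute.instanceAdmin.v1"}},
}


def effective_roles(principal: str, resource: str) -> set:
    """Union of roles granted on the resource and on every ancestor above it."""
    roles = set()
    node = resource
    while node is not None:
        roles |= BINDINGS.get(node, {}).get(principal, set())
        node = PARENT.get(node)  # None once we pass the organization
    return roles


print(effective_roles("user:alice", "vm-1"))
print(effective_roles("group:admins", "vm-1"))
```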
IAM member types:
- user:[email protected] -- Google account
- serviceAccount:[email protected] -- non-human identity
- group:[email protected] -- Google Group
- domain:example.com -- all users in a Google Workspace (formerly G Suite) domain
- allUsers -- public internet (use with caution)
- allAuthenticatedUsers -- any Google-authenticated user
Role types:
| Type | Scope | Examples |
|---|---|---|
| Basic (formerly primitive) | Project-wide | Owner, Editor, Viewer (too broad for prod) |
| Predefined | Service-specific | roles/storage.objectViewer, roles/bigquery.dataEditor |
| Custom | User-defined | Bundle exactly the permissions needed |
Service account best practices:
# Create minimal service account for a Cloud Run service
gcloud iam service-accounts create my-run-sa \
--display-name "Cloud Run service account"
# Grant only needed permissions (not Editor!)
gcloud projects add-iam-policy-binding my-project \
--member "serviceAccount:[email protected]" \
--role "roles/bigquery.dataViewer"
gcloud projects add-iam-policy-binding my-project \
--member "serviceAccount:[email protected]" \
--role "roles/pubsub.publisher"
# Deploy Cloud Run with specific SA
gcloud run deploy my-service \
--image gcr.io/my-project/my-app \
--service-account [email protected]
Q12. What are VPC Service Controls, and when do you use them?
VPC Service Controls create a security perimeter around GCP services, preventing data exfiltration even from authorized users.
Problem they solve:
A compromised or malicious user with bigquery.dataViewer can copy data to an external project. Standard IAM cannot prevent this since IAM controls WHO can access, not WHERE data can go.
Service perimeter:
Perimeter: my-secure-perimeter
Protected services: bigquery.googleapis.com, storage.googleapis.com
Projects inside perimeter: my-prod-project, my-analytics-project
Rules:
- API calls to BigQuery/Storage ONLY succeed from within the perimeter
- External project calling your BigQuery API: DENIED
- Internal project copying to external bucket: DENIED
- Access levels (exceptions): specific users/IPs can cross the perimeter
Use cases:
- Regulated data (HIPAA, PCI, financial): prevent data leakage across project boundaries.
- Data exfiltration prevention: insider threat mitigation.
- Compliance: audit-friendly perimeter with Cloud Audit Logs.
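The perimeter rules above can be modelled as a second check that runs after IAM. The sketch below is a deliberately simplified illustration: the `user:auditor` access-level exception and the `request_allowed` function are hypothetical, and real VPC Service Controls evaluate ingress and egress rules in far more detail.

```python
PERIMETER = {"my-prod-project", "my-analytics-project"}
ACCESS_LEVEL_PRINCIPALS = {"user:auditor"}  # hypothetical access-level exception


def request_allowed(principal, source_project, target_project, has_iam_permission):
    """IAM decides WHO may act; the perimeter decides WHERE data may flow."""
    if not has_iam_permission:
        return False  # IAM is still checked first
    if principal in ACCESS_LEVEL_PRINCIPALS:
        return True  # access levels allow specific callers to cross
    crosses = (source_project in PERIMETER) != (target_project in PERIMETER)
    return not crosses


# Same IAM permission, different outcomes depending on where the call originates.
inside = request_allowed("user:bob", "my-prod-project", "my-analytics-project", True)
exfil = request_allowed("user:bob", "my-prod-project", "external-project", True)
inbound = request_allowed("user:bob", "external-project", "my-prod-project", True)
print(inside, exfil, inbound)
```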
Networking
Q13. Explain GCP networking: VPCs, subnets, and peering.
GCP networking differs from AWS and Azure in one important way: VPCs in GCP are global by default. A single VPC can span all GCP regions, and subnets within that VPC are regional. This simplifies network design for global applications and eliminates the need for VPC peering between regions of the same organization -- you simply add subnets in each region you need.
GCP VPC is global (unlike AWS VPC which is regional):
my-vpc (global)
Subnet us-central1: 10.0.0.0/20
Subnet us-east1: 10.1.0.0/20
Subnet europe-west1: 10.2.0.0/20
VMs in different regions on same VPC communicate over Google's private network
No VPN or peering needed within same VPC
Subnet modes:
- Auto mode: Google auto-creates subnets in every region (fixed /20 ranges).
- Custom mode: You define subnet regions and CIDR ranges. Use for production (control).
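When planning custom-mode subnets, it is worth checking that the CIDR ranges do not overlap and knowing how many addresses each provides. A short sketch using Python's standard `ipaddress` module and the example ranges from above:

```python
import ipaddress
from itertools import combinations

subnets = {
    "us-central1": ipaddress.ip_network("10.0.0.0/20"),
    "us-east1": ipaddress.ip_network("10.1.0.0/20"),
    "europe-west1": ipaddress.ip_network("10.2.0.0/20"),
}


def overlapping_pairs(nets):
    """Return region pairs whose CIDR ranges overlap."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


def region_for(ip, nets):
    """Find which regional subnet (if any) contains the address."""
    addr = ipaddress.ip_address(ip)
    for region, net in nets.items():
        if addr in net:
            return region
    return None


print(overlapping_pairs(subnets))              # empty list means the plan is clean
print(subnets["us-central1"].num_addresses)    # raw size of a /20
print(region_for("10.1.3.7", subnets))
```

The raw address count includes the few addresses that GCP reserves in each subnet, so the usable number is slightly lower.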
Firewall rules:
# Allow HTTP traffic to web servers (tagged)
gcloud compute firewall-rules create allow-http \
--network my-vpc \
--allow tcp:80 \
--target-tags web-server \
--source-ranges 0.0.0.0/0
# Allow internal traffic between app and database tier
gcloud compute firewall-rules create allow-app-to-db \
--network my-vpc \
--allow tcp:5432 \
--source-tags app-server \
--target-tags db-server
# Hierarchical firewall policies (organization-level, inherited by all projects)
gcloud compute firewall-policies create \
--short-name my-org-policy \
--organization 123456789
VPC Peering:
- Connect two VPC networks (same or different projects/organizations).
- Non-transitive: A-B + B-C does NOT give A-C access.
- No single point of failure or bandwidth bottleneck.
VPC Peering is free -- there are no hourly charges for the peering connection itself, though normal data transfer charges apply. This makes it cost-effective for connecting multiple projects within an organization without routing traffic through a NAT gateway or internet.
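Non-transitivity is easy to get wrong in design discussions. In the sketch below, reachability requires a direct peering; the VPC names are hypothetical.

```python
# Each peering is an unordered pair of VPC networks (hypothetical names).
PEERINGS = {frozenset({"vpc-a", "vpc-b"}), frozenset({"vpc-b", "vpc-c"})}


def can_reach(src: str, dst: str, peerings=PEERINGS) -> bool:
    """Traffic flows only within a VPC or across a direct peering, never through a middle VPC."""
    return src == dst or frozenset({src, dst}) in peerings


print(can_reach("vpc-a", "vpc-b"))
print(can_reach("vpc-b", "vpc-c"))
print(can_reach("vpc-a", "vpc-c"))
```

Giving A access to C needs a third, direct A-C peering, or a different design such as Shared VPC.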
Shared VPC:
- One host project has the VPC; service projects share it.
- Centralized network control, distributed resource creation.
- Common pattern: platform team owns networking, app teams own their resources in service projects.
Q14. What is Cloud Load Balancing on GCP?
GCP load balancers are global (Anycast IP) or regional:
| Type | Tier | Protocol | Scope |
|---|---|---|---|
| External Application (HTTP/S) | Global | HTTP, HTTPS, gRPC | Global, URL-based routing |
| Internal Application | Regional | HTTP, HTTPS | Internal VPC traffic |
| External Network (TCP/UDP) | Regional | TCP, UDP | Regional, preserve client IP |
| Internal TCP/UDP | Regional | TCP, UDP | Internal VPC |
| SSL Proxy | Global | SSL/TLS (non-HTTP) | Global |
| TCP Proxy | Global | TCP | Global |
Global external HTTP(S) LB:
User (Tokyo) -> Google's Anycast IP (nearest PoP)
-> Route to closest healthy backend (Tokyo/Singapore/US)
-> No geographic traffic penalty
-> CDN cache at edge (Cloud CDN)
URL Map (routing rules):
/api/* -> Backend Service: api-backend-group
/static/* -> Cloud Storage bucket (direct CDN origin)
/* -> Backend Service: web-frontend-group
Backend Service:
- Instance Group or NEG (Network Endpoint Group)
- Health check (HTTP /health)
- Session affinity (optional)
- Cloud CDN (enable/disable per backend)
Serverless NEGs: Route traffic directly to Cloud Run, App Engine, or Cloud Functions from the Global LB -- no intermediate VMs. This allows a single global load balancer with a single Anycast IP to front both container-based and serverless backends, with URL-map based routing distributing traffic to the appropriate backend service.
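The URL map shown above behaves like a prefix router with a default backend. A minimal sketch of that routing decision follows; `static-bucket` is a placeholder name for the Cloud Storage backend, and real URL maps support host rules and more matching modes than this longest-prefix approximation.

```python
URL_MAP = [
    ("/api/", "api-backend-group"),
    ("/static/", "static-bucket"),  # placeholder name for the bucket backend
]
DEFAULT_BACKEND = "web-frontend-group"


def route(path: str) -> str:
    """Pick the backend with the longest matching prefix, else the default."""
    matches = [(prefix, backend) for prefix, backend in URL_MAP if path.startswith(prefix)]
    if not matches:
        return DEFAULT_BACKEND
    return max(matches, key=lambda m: len(m[0]))[1]


for path in ("/api/users", "/static/logo.png", "/", "/apix"):
    print(path, "->", route(path))
```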
Real-World Architecture
Q15. Design a real-time analytics pipeline on GCP for 1M events/second.
Architecture design questions are common at senior GCP interviews. The pattern below is representative of production pipelines at scale -- candidates report seeing variations of this design at companies with large-scale event processing needs. Each component choice has specific trade-offs that interviewers probe.
Architecture:
Mobile/Web Apps -> Cloud Pub/Sub (ingestion)
                         |
            _____________|_____________
           |                           |
   Cloud Dataflow               Cloud Dataflow
   (streaming, 5-min            (streaming, raw
    aggregations)                data to BQ)
           |                           |
   BigQuery (aggregated         BigQuery (raw events
    metrics table,               partitioned by date,
    TTL 90 days)                 clustered by user_id)
           |
   Looker Studio / Looker
   (real-time dashboards,
    sub-minute latency)
Pub/Sub throughput:
1M events/sec at 500 bytes avg = 500 MB/s
Pub/Sub handles this with no configuration -- fully managed, auto-scales
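Interviewers often ask candidates to carry the sizing arithmetic one step further, for example to daily volume. A small sketch of that back-of-the-envelope calculation, using decimal units:

```python
def throughput(events_per_sec: int, avg_bytes: int) -> dict:
    """Convert an event rate and average size into MB/s and TB/day (decimal units)."""
    bytes_per_sec = events_per_sec * avg_bytes
    return {
        "mb_per_sec": bytes_per_sec / 1e6,
        "tb_per_day": bytes_per_sec * 86_400 / 1e12,
    }


stats = throughput(events_per_sec=1_000_000, avg_bytes=500)
print(stats)
```

The daily figure is what drives BigQuery storage and ingestion cost, so it is usually the number to quote when justifying partition expiration or aggregation.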
Dataflow pipeline (streaming aggregation):
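import json
import apache_beam as beam
from apache_beam.transforms import trigger, window

# `options` is a PipelineOptions object configured for streaming, as in Q8
with beam.Pipeline(options=options) as p:
    events = (
        p
        | beam.io.ReadFromPubSub(topic='projects/myproject/topics/app-events')
        | beam.Map(json.loads)
        | beam.Map(lambda e: window.TimestampedValue(e, e['ts'] / 1000))  # ts in epoch millis
        | beam.WindowInto(
            window.FixedWindows(300),  # 5-minute windows
            trigger=trigger.AfterWatermark(
                early=trigger.AfterCount(10000)  # early fire at 10K events
            ),
            accumulation_mode=trigger.AccumulationMode.ACCUMULATING
        )
        | beam.Map(lambda e: (e['event_type'], e['user_id']))
        | beam.combiners.Count.PerKey()  # events per event_type per window
        # WriteToBigQuery expects dict rows, not tuples
        | beam.Map(lambda kv: {'event_type': kv[0], 'event_count': kv[1]})
        | beam.io.WriteToBigQuery(
            'myproject:analytics.event_counts',
            schema='event_type:STRING,event_count:INTEGER')  # assumed schema
    )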
BigQuery write optimization:
- Use streaming inserts for real-time (millisecond latency, higher cost).
- Use Dataflow batch for aggregates (write every 5 min, much cheaper).
- Storage Write API: higher throughput, lower cost than streaming inserts.
For high-volume pipelines, the Storage Write API is preferred over the legacy streaming inserts API (tabledata.insertAll): it supports exactly-once semantics through stream offsets, offers higher throughput, and costs less per GB ingested. Dataflow integrates with the Storage Write API natively through the BigQueryIO connector.
Q16. How do you run ML training and serve predictions on GCP?
Vertex AI (unified ML platform):
from google.cloud import aiplatform
aiplatform.init(project='my-project', location='us-central1')
# Train a custom model on Vertex AI Training
job = aiplatform.CustomTrainingJob(
    display_name='xgboost-churn-v1',
    script_path='trainer/task.py',
    container_uri='us-docker.pkg.dev/vertex-ai/training/sklearn-cpu.1-0:latest',
    model_serving_container_image_uri=(
        'us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest'
    )
)
# `dataset` is a Vertex AI managed dataset created beforehand (not shown here)
model = job.run(
    dataset=dataset,
    model_display_name='churn-predictor-v1',
    args=['--max-depth=6', '--n-estimators=200'],
    machine_type='n1-standard-4',
    replica_count=1,
)
# Deploy to endpoint
endpoint = aiplatform.Endpoint.create(display_name='churn-endpoint')
model.deploy(
    endpoint=endpoint,
    machine_type='n1-standard-2',
    min_replica_count=1,
    max_replica_count=5,
    traffic_percentage=100,
)
# Online prediction (the instance format must match what the serving container expects)
prediction = endpoint.predict(instances=[
    {'tenure': 24, 'monthly_charges': 65.5, 'contract': 'Month-to-month'}
])
Vertex AI components:
| Component | Purpose |
|---|---|
| Vertex AI Workbench | Managed Jupyter notebooks |
| Vertex AI Training | Custom + AutoML training |
| Vertex AI Prediction | Online + batch prediction endpoints |
| Vertex AI Pipelines | Kubeflow Pipelines on managed infrastructure |
| Vertex AI Feature Store | Centralized feature serving (online + offline) |
| Vertex AI Experiments | Track and compare training runs |
| Vertex AI Model Registry | Versioned model management |
| Vertex AI Vector Search | ANN (approximate nearest neighbor) for embeddings |
Cost Management
Q17. How do you manage and optimize GCP costs?
Key tools:
Cloud Billing: view costs by project, service, SKU
Budget alerts: notify at 50%, 75%, 90%, 100% of budget
Cost allocation: labels to attribute costs to teams/products
Cloud Cost Management (BigQuery billing export):
Export billing data to BigQuery daily
Build custom cost dashboards in Looker Studio
Recommender:
Idle VM recommendations: VMs with < 8% CPU over 14 days
Right-sizing: suggest smaller instance type
Committed use discount recommendations
Unused reservations
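Budget alerts fire as spend crosses each configured percentage. The sketch below shows that threshold logic in isolation, using the 50/75/90/100% levels listed above; the function name is hypothetical, and in practice the notification would come from Cloud Billing rather than your own code.

```python
THRESHOLDS = (0.50, 0.75, 0.90, 1.00)  # alert levels from the list above


def crossed_thresholds(spend: float, budget: float, thresholds=THRESHOLDS) -> list:
    """Return every alert threshold that current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spend / budget
    return [t for t in thresholds if ratio >= t]


print(crossed_thresholds(800, 1000))
print(crossed_thresholds(1000, 1000))
print(crossed_thresholds(100, 1000))
```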
BigQuery cost control:
-- Create budget for project in BigQuery reservations
-- Capacity-based pricing (slot reservations under BigQuery editions, formerly flat-rate): fixed slot count, predictable costs
-- On-demand pricing controls:
-- 1. Custom quotas: limit daily bytes processed per project
-- 2. Table access controls: limit which tables users can scan
-- 3. Authorized views: limit columns visible to users
-- Query cost before running
bq query --dry_run "SELECT * FROM mydataset.large_table"
# Output: Query will process 250 GB when run.
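-- Column-level access control (policy tags) to restrict PII
-- Users without access to the tag get an access-denied error when selecting those columns; with a data masking rule attached they see masked values (for example NULL) instead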
Committed Use Discounts strategy:
Baseline compute (always needed) -> CUDs (1-yr or 3-yr)
Variable/burst compute -> On-demand or Spot VMs
Batch ML training -> Spot VMs (preemption acceptable)
Cost management is increasingly tested in GCP interviews at senior levels. Beyond CUDs, teams should export billing data to BigQuery and build dashboards that show spend by label (team, environment, service). The Recommender API provides programmatic access to cost-saving recommendations, allowing automated ticket creation or Slack notifications when optimization opportunities exceed a threshold.
FAQ
Q: What is the difference between Cloud SQL and Cloud Spanner? Cloud SQL is managed MySQL, PostgreSQL, or SQL Server. It is regional, scales vertically to roughly 96 vCPUs and tens of TB of storage per instance, and suits most OLTP workloads. Cloud Spanner is Google's globally distributed relational database with external consistency (a guarantee stronger than serializability, held across regions). It scales horizontally to petabytes, handles millions of transactions per second globally, and maintains ACID guarantees across regions. Spanner is significantly more expensive, so use it when global distribution is required, when load exceeds what a single Cloud SQL instance can serve (rough rules of thumb: > 30,000 QPS for reads, > 1,500 QPS for writes), or when strict cross-region consistency is needed. Candidates report Spanner questions appear frequently for senior cloud architect interviews.
Q: How does Cloud Storage object versioning work? When versioning is enabled on a bucket, Cloud Storage retains previous versions of objects when overwritten or deleted. Each version gets a unique generation number. You can list, restore, or permanently delete old versions. Common use case: accidental deletion protection, rollback to prior file version. Combine with Object Lifecycle Management to auto-delete versions older than N days (keeps costs controlled). Object Lock (retention policies) is available for compliance/WORM requirements.
Q: What are the key differences between BigQuery and Cloud Bigtable? BigQuery is an OLAP data warehouse for analytical SQL queries on large datasets (seconds latency, TB-PB scale). Bigtable is a NoSQL wide-column store for high-throughput low-latency operational workloads (single-digit millisecond reads/writes at millions of requests per second). Use BigQuery for analytics and reporting; use Bigtable for time-series data ingestion, operational read/write at scale, and AdTech event storage. Confirm current service capabilities and pricing on the official Google Cloud documentation.
Methodology applied to this article · last verified 8 Jun 2026
- No fabricated salary numbers or success rates. If we quote a range, it's sourced.
- No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
- No paid placements, sponsored coaching links, or affiliate-shilled course pushes.