
GCP Interview Questions 2026: Google Cloud Platform Core Services & Architecture

By Aditya Sharma. Published 8 Jun 2026; last revised 8 Jun 2026.

Candidates report that GCP interviews in 2026 focus heavily on BigQuery, GKE, Cloud Run, Vertex AI, and data engineering with Dataflow and Pub/Sub. Confirm current service names and features on the official Google Cloud documentation before your interview -- GCP service naming and features evolve frequently.

Google Cloud Platform interviews cover core infrastructure (Compute, Storage, Networking), managed data services (BigQuery, Bigtable, Spanner), analytics pipelines (Dataflow, Pub/Sub), and ML infrastructure (Vertex AI). This guide covers all major domains. Interviewers at Google Cloud, Airtel, Reliance Jio, Infosys, and TCS digital units ask about GCP regularly in 2026. Understanding when to use serverless versus container-based versus VM-based compute is the most commonly tested decision in senior cloud interviews.

Core Compute

Q1. Compare Compute Engine, App Engine, Cloud Run, GKE, and Cloud Functions.

Service	Model	Use case	Cold start
Compute Engine	IaaS (VMs)	Full OS control, lift-and-shift, GPU workloads	None (always running)
App Engine Standard	PaaS (sandboxed)	Fast HTTP apps (Python, Java, Go, Node.js)	Seconds
App Engine Flexible	PaaS (Docker containers)	Custom runtimes, background processing	Minutes
Cloud Run	Serverless containers	Stateless HTTP services, any language via container	under 1 second
GKE (Autopilot)	Managed Kubernetes	Complex microservices, stateful workloads	None
Cloud Functions	Serverless functions	Event-driven triggers, glue code	Seconds
Batch	Managed job scheduler	HPC, batch compute with MPI, auto-provisioned VMs	N/A

Cloud Run specifics:

# A Cloud Run service is just a container that handles HTTP requests
# Cloud Run handles: scaling, load balancing, HTTPS, domain mapping

# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 main:app

# Deploy
# gcloud run deploy my-service --image gcr.io/myproject/my-app --region us-central1 --allow-unauthenticated

GKE Autopilot vs Standard:

Standard: manage node pools (VM types, sizes, autoscaling config). Full control.
Autopilot: Google manages nodes entirely. Pay per pod CPU/memory request. Simpler operations, slightly higher per-unit cost.

The choice between Autopilot and Standard GKE often comes down to operational maturity. Teams early in their Kubernetes journey benefit from Autopilot's simplified model. Teams with specific hardware requirements, GPU workloads, or complex scheduling rules generally choose Standard mode.

Q2. What are GCP machine types, and how do you choose the right one?

Machine type families:

Family	Purpose	Examples
General Purpose	Balanced CPU/memory	e2 (cost), n2/n2d, c3 (performance)
Compute Optimized	High CPU-to-memory	c2, c2d, h3
Memory Optimized	High memory	m1/m2/m3 (up to 12 TB)
Accelerator Optimized	GPU	a2 (A100), g2 (L4)
Storage Optimized	High local SSD	z3

Custom machine types: a distinctive GCP feature -- define the exact vCPU count and memory instead of picking from fixed sizes.

# Create custom VM: 6 vCPUs, 20 GB memory (instead of forced n1-standard-8 with 30 GB)
gcloud compute instances create my-vm \
    --custom-cpu=6 \
    --custom-memory=20GB \
    --zone=us-central1-a

Preemptible VMs / Spot VMs:

Preemptible: up to 24 hours, 30-second eviction notice, 60-91% cheaper.
Spot VMs (replacement for preemptible): no max lifetime, dynamic pricing, same eviction model.
Use for: fault-tolerant batch, ML training checkpointed jobs, stateless workloads.

Committed Use Discounts (CUDs):

1-year: ~37% discount. 3-year: ~55% discount.
Resource-based (a commitment to vCPU/memory for a specific machine series and region) or spend-based flexible CUDs (closer to AWS Savings Plans).
Automatically applied to eligible usage.

GCP also offers sustained use discounts that apply automatically when you run a VM for more than 25% of a month -- no commitment needed. This makes GCP attractive for workloads that run consistently without requiring upfront reserved capacity purchases.
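
To make the discount tiers above concrete, the sketch below compares a month of on-demand usage against the approximate discounts quoted in this section. The hourly price and the 730-hour month are assumptions for illustration only, not real GCP rates.

```python
# Rough monthly cost comparison using the approximate discounts quoted above.
# HOURLY_ON_DEMAND is a hypothetical price, not a real GCP rate.
HOURLY_ON_DEMAND = 0.20
HOURS_PER_MONTH = 730

DISCOUNTS = {
    "on_demand": 0.00,
    "cud_1yr": 0.37,    # ~37% (1-year commitment)
    "cud_3yr": 0.55,    # ~55% (3-year commitment)
    "spot_low": 0.60,   # Spot/preemptible: 60% at the low end
    "spot_high": 0.91,  # ... up to 91% at the high end
}

def monthly_cost(discount: float) -> float:
    """Cost of one VM running all month at the given fractional discount."""
    return round(HOURLY_ON_DEMAND * HOURS_PER_MONTH * (1 - discount), 2)

costs = {name: monthly_cost(d) for name, d in DISCOUNTS.items()}
for name, cost in costs.items():
    print(f"{name:10s} ${cost:8.2f}")
```

In an interview, the useful point is the shape of the comparison: commitments fit the steady baseline, while Spot pricing only pays off for work that tolerates eviction.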

Storage

Q3. What are GCP's storage options, and when do you use each?

Service	Type	Latency	Use case
Cloud Storage	Object	ms	Unstructured data, data lake, backups
Persistent Disk	Block (SSD/HDD)	ms	VM boot disks, databases
Filestore	NFS file system	ms	Shared file storage for GKE, HPC
Cloud Bigtable	NoSQL wide-column	<10ms	Time-series, IoT, HBase-compatible
Firestore	NoSQL document	ms	Mobile/web apps, real-time sync
Cloud Spanner	Relational (global)	ms	Global ACID transactions
BigQuery	Columnar OLAP	Seconds	Analytics, data warehouse
Memorystore	In-memory (Redis)	<1ms	Cache, session store

Cloud Storage classes:

Standard: frequently accessed data
Nearline: accessed < once/month (backup)
Coldline: accessed < once/quarter (DR)
Archive: accessed < once/year (long-term)

Lifecycle rules: auto-transition between classes
gsutil lifecycle set lifecycle.json gs://my-bucket
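
The lifecycle.json file referenced above can be generated rather than hand-written. The following is a minimal sketch that builds a configuration in the JSON shape accepted by `gsutil lifecycle set`; the age thresholds and the final delete rule are illustrative assumptions, not recommendations from the source.

```python
import json

# Build a lifecycle configuration in the JSON shape that
# `gsutil lifecycle set` accepts. The age thresholds are illustrative.
def transition(storage_class: str, age_days: int) -> dict:
    return {
        "action": {"type": "SetStorageClass", "storageClass": storage_class},
        "condition": {"age": age_days},
    }

lifecycle = {
    "rule": [
        transition("NEARLINE", 30),
        transition("COLDLINE", 90),
        transition("ARCHIVE", 365),
        {"action": {"type": "Delete"}, "condition": {"age": 2555}},  # ~7 years
    ]
}

lifecycle_json = json.dumps(lifecycle, indent=2)
print(lifecycle_json)  # save this output as lifecycle.json
```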

Cloud Storage access control:

# Uniform bucket-level access (recommended, disables per-object ACLs)
gsutil uniformbucketlevelaccess set on gs://my-bucket

# IAM binding: grant read to a specific service account (replace SA_NAME and PROJECT_ID)
gsutil iam ch serviceAccount:SA_NAME@PROJECT_ID.iam.gserviceaccount.com:objectViewer gs://my-bucket

# Signed URL (time-limited access without authentication)
gsutil signurl -d 1h key.json gs://my-bucket/private-file.csv

BigQuery

Q4. How does BigQuery work internally, and what makes it fast at scale?

Architecture:

Query -> Dremel (MPP query engine, tree of servers)
              |
         Colossus (distributed file system, stores ColumnIO/Capacitor format)

Storage and compute are FULLY SEPARATED:
- No cluster to size
- Slots (virtual CPUs) allocated per query
- Storage charged per GB/month
- Query charged per TB scanned (on-demand) or by slot capacity (reservations under BigQuery editions, which replaced flat-rate plans)

Why BigQuery is fast:

Columnar storage: A query reads only the columns it references. Selecting 3 columns from a 100-column table reads roughly 3% of the data if the columns are of similar size.
Dremel execution: Massively parallel -- a query tree distributes work to thousands of leaf nodes reading from Colossus.
In-memory shuffle: Intermediate data between stages is held in a dedicated distributed in-memory shuffle tier and moved over Google's Jupiter network fabric.
Query results cache: Identical queries return cached results at no cost (24-hour cache).
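
The columnar-storage point is easiest to see with numbers. The sketch below estimates the data read by a query from the sizes of the columns it references; the column names and sizes are invented for illustration, and real billing also depends on partition and cluster pruning.

```python
# Estimate bytes scanned by a columnar engine: only referenced columns are read.
# Column sizes (in GB) are invented for illustration.
column_sizes_gb = {
    "user_id": 8.0,
    "event_type": 2.0,
    "event_ts": 8.0,
    "payload": 82.0,   # a wide column can dominate the table
}

def scanned_gb(selected: list[str]) -> float:
    """Sum the sizes of only the columns a query references."""
    return sum(column_sizes_gb[c] for c in selected)

total = sum(column_sizes_gb.values())
narrow = scanned_gb(["user_id", "event_type"])
print(f"SELECT *                   -> {total:.0f} GB")
print(f"SELECT user_id, event_type -> {narrow:.0f} GB ({narrow / total:.0%} of table)")
```

Note that the saving depends on which columns are skipped, not on how many: dropping one wide column can matter more than dropping dozens of narrow ones.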

BigQuery's serverless model means there is no cluster to size, no index to build, and no VACUUM to run. The engineering team's operational burden is dramatically lower compared to traditional data warehouses like Redshift or Snowflake clusters. This operational simplicity is one of the main reasons enterprises adopt BigQuery as their primary analytics platform.

Partitioned tables:

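-- Partition by ingestion time (needs an explicit schema; AS SELECT is not allowed with _PARTITIONDATE)
-- Column names below are illustrative
CREATE TABLE mydataset.events (
    event_id STRING,
    event_type STRING
)
PARTITION BY _PARTITIONDATE
OPTIONS (partition_expiration_days = 90);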

-- Partition by column date
CREATE TABLE mydataset.sales
PARTITION BY DATE(order_date)
AS SELECT * FROM ...;

-- Query with partition pruning (only scans relevant partitions)
SELECT SUM(amount) FROM mydataset.sales
WHERE DATE(order_date) BETWEEN '2026-01-01' AND '2026-03-31';

-- Check partitions
SELECT * FROM mydataset.INFORMATION_SCHEMA.PARTITIONS
WHERE table_name = 'sales';
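
Partition pruning can be pictured as skipping whole partitions before any rows are read. The following toy model, with made-up partition sizes, mirrors the Q1 2026 date filter from the query above; it is a sketch of the idea, not of BigQuery's actual planner.

```python
from datetime import date, timedelta

# Toy model of daily partitions: partition date -> size in GB (illustrative).
partitions = {date(2026, 1, 1) + timedelta(days=i): 1.5 for i in range(181)}

def scanned(start: date, end: date) -> float:
    """GB read when the query filters on the partition column."""
    return sum(size for day, size in partitions.items() if start <= day <= end)

full_scan = sum(partitions.values())
q1_scan = scanned(date(2026, 1, 1), date(2026, 3, 31))
print(f"no partition filter: {full_scan} GB, Q1 filter: {q1_scan} GB")
```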

Clustered tables:

-- Cluster by frequently filtered/joined columns (within each partition)
CREATE TABLE mydataset.events
PARTITION BY DATE(event_date)
CLUSTER BY user_id, event_type
AS SELECT * FROM mydataset.raw_events;

-- Clustering sorts data into blocks, so BigQuery can skip blocks that cannot match
SELECT * FROM mydataset.events
WHERE DATE(event_date) = '2026-06-08' AND user_id = '12345';

Q5. How do you optimize BigQuery queries for cost and performance?

Cost optimization:

-- 1. SELECT only needed columns (avoid SELECT *)
-- BAD: reads every column in the scanned partition, and on-demand billing covers all of it
SELECT * FROM mydataset.events WHERE DATE(event_date) = '2026-06-08';

-- GOOD: reads only the referenced columns
SELECT user_id, event_type FROM mydataset.events WHERE DATE(event_date) = '2026-06-08';

-- 2. Partition pruning (always filter on partition column)
-- BAD: full table scan
SELECT COUNT(*) FROM mydataset.events WHERE event_type = 'purchase';

-- GOOD: only scans partitions from 1 June 2026 onward
SELECT COUNT(*) FROM mydataset.events
WHERE DATE(event_date) >= '2026-06-01' AND event_type = 'purchase';

-- 3. Preview costs before running
-- BigQuery UI: shows estimated bytes processed before execution
-- CLI: --dry_run flag
bq query --dry_run --use_legacy_sql=false \
    "SELECT * FROM mydataset.large_table WHERE date = '2026-06-08'"

-- 4. Approximate functions for exploration
SELECT APPROX_COUNT_DISTINCT(user_id) FROM mydataset.events;  -- same bytes scanned, but far less memory and slot time than COUNT(DISTINCT)
SELECT APPROX_QUANTILES(amount, 100) FROM mydataset.orders;   -- faster percentiles

Performance optimization:

-- 5. Avoid self-joins -- use window functions instead
-- BAD (self-join for running total):
SELECT a.date, SUM(b.revenue)
FROM sales a JOIN sales b ON b.date <= a.date
GROUP BY a.date;

-- GOOD (window function):
SELECT date, SUM(revenue) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM sales;

-- 6. Flatten repeated records before joining (UNNEST)
SELECT u.user_id, e.event_type
FROM users u
CROSS JOIN UNNEST(u.events) AS e;  -- unnest STRUCT/ARRAY fields

-- 7. Use CTEs for readability (BigQuery does not materialize CTEs; use a temp table if a result is reused many times)
WITH daily_sales AS (
    SELECT DATE(order_date) AS day, SUM(amount) AS revenue
    FROM orders GROUP BY 1
),
running_total AS (
    SELECT day, revenue,
           SUM(revenue) OVER (ORDER BY day) AS cumulative
    FROM daily_sales
)
SELECT * FROM running_total;
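
The self-join versus window-function advice in item 5 comes down to how much work is done per row. The sketch below reproduces both approaches on a tiny in-memory dataset (values are illustrative) to show that they give the same running total while doing very different amounts of work.

```python
from itertools import accumulate

# Daily revenue, already sorted by date (illustrative values).
sales = [("2026-06-01", 100.0), ("2026-06-02", 50.0), ("2026-06-03", 25.0)]

# Self-join style: for each day, re-scan every earlier row -> O(n^2) work.
self_join = [
    (day, sum(rev for d, rev in sales if d <= day))
    for day, _ in sales
]

# Window-function style: one ordered pass with a running accumulator -> O(n).
running = list(zip((d for d, _ in sales), accumulate(rev for _, rev in sales)))

print(self_join == running, running)
```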

Q6. What is BigQuery ML, and what models does it support?

BigQuery ML lets you train and deploy ML models directly in BigQuery using SQL.

-- Train a logistic regression model
CREATE OR REPLACE MODEL mydataset.churn_model
OPTIONS(
    model_type = 'logistic_reg',
    input_label_cols = ['churned'],
    max_iterations = 20,
    learn_rate_strategy = 'constant',  -- needed for learn_rate to apply
    learn_rate = 0.1
) AS
SELECT
    tenure_months,
    monthly_charges,
    total_charges,
    contract_type,
    payment_method,
    churned
FROM mydataset.customer_features
WHERE partition_date >= '2025-01-01';

-- Evaluate model
SELECT * FROM ML.EVALUATE(MODEL mydataset.churn_model,
    (SELECT * FROM mydataset.customer_features WHERE partition_date = '2026-06-01')
);

-- Predict
SELECT customer_id, predicted_churned, predicted_churned_probs
FROM ML.PREDICT(MODEL mydataset.churn_model,
    (SELECT * FROM mydataset.new_customers)
);

Supported model types:

Model	SQL option
Linear regression	`linear_reg`
Logistic regression	`logistic_reg`
K-Means clustering	`kmeans`
Matrix factorization	`matrix_factorization`
Time series (ARIMA+)	`arima_plus`
XGBoost	`boosted_tree_classifier/regressor`
DNN	`dnn_classifier/regressor`
AutoML	`automl_classifier/regressor`
Remote model (Vertex AI endpoint)	Created with `REMOTE WITH CONNECTION`; distinct from imported models (e.g. TensorFlow, ONNX)

Data Engineering

Q7. What is Cloud Pub/Sub, and how does it compare to Kafka?

Cloud Pub/Sub is Google's managed messaging service for asynchronous decoupled communication.

Architecture:

Publisher -> Topic -> Subscription -> Subscriber

Multiple subscriptions per topic:
  - Subscription A: Analytics service
  - Subscription B: Email service
  - Subscription C: Audit log

Each subscription tracks acknowledgements independently, so every subscription receives its own copy of each message.

Key features:

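At-least-once delivery: Messages may be redelivered, so consumers should be idempotent; exactly-once delivery can be enabled on pull subscriptions.
Push vs Pull subscriptions: Pull (the subscriber requests messages, usually over a streaming pull), Push (Pub/Sub POSTs to a subscriber HTTPS endpoint).
Message retention: 10 minutes to 7 days per subscription (configurable); topic-level retention can extend this to 31 days.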
Ordering keys: Messages with same ordering key delivered in order within a region.
Dead letter topics: Failed messages after N delivery attempts go to DLT.

from google.cloud import pubsub_v1
import json

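# Publisher (message ordering must be enabled on the client to use ordering keys)
publisher = pubsub_v1.PublisherClient(
    publisher_options=pubsub_v1.types.PublisherOptions(enable_message_ordering=True)
)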
topic_path = publisher.topic_path('my-project', 'orders')

order = {'order_id': '12345', 'amount': 99.99, 'user_id': 'u789'}
data = json.dumps(order).encode('utf-8')

# Ordering key: messages sharing a key are delivered in publish order (the subscription must also enable ordering)
future = publisher.publish(topic_path, data, ordering_key='u789')
print(f'Published: {future.result()}')

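# Subscriber (pull)
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path('my-project', 'orders-analytics')

def process_order(order):
    # Placeholder for application logic
    print('Processing', order['order_id'])

def callback(message):
    order = json.loads(message.data.decode('utf-8'))
    process_order(order)
    message.ack()  # ack only after successful processing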

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
streaming_pull_future.result()
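
Because delivery is at-least-once, a callback like the one above can see the same message more than once. A common answer is to make the handler idempotent. The following is a minimal sketch in plain Python (no Pub/Sub client involved); the in-memory set is an assumption standing in for a durable store such as a database table, and the function and variable names are hypothetical.

```python
# At-least-once delivery means a handler can see the same message twice.
# Sketch of an idempotent handler that dedupes on a business key.
# The in-memory set stands in for a durable store (e.g. a database table).
processed_ids: set[str] = set()
totals: dict[str, float] = {}

def handle(order: dict) -> bool:
    """Apply the order once; return False for a redelivered duplicate."""
    if order["order_id"] in processed_ids:
        return False
    totals[order["user_id"]] = totals.get(order["user_id"], 0.0) + order["amount"]
    processed_ids.add(order["order_id"])
    return True

deliveries = [
    {"order_id": "12345", "amount": 99.99, "user_id": "u789"},
    {"order_id": "12345", "amount": 99.99, "user_id": "u789"},  # redelivery
    {"order_id": "12346", "amount": 10.00, "user_id": "u789"},
]
results = [handle(o) for o in deliveries]
print(results, totals)
```

In a real subscriber the duplicate would still be acknowledged, so that Pub/Sub stops redelivering it.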

Cloud Pub/Sub handles authentication, message durability, and global delivery automatically. Unlike Kafka, there are no brokers to size, no partition rebalancing to manage, and no ZooKeeper or KRaft configuration required. The trade-off is less control over retention and replay semantics compared to self-managed Kafka.

Pub/Sub vs Kafka:

Aspect	Cloud Pub/Sub	Kafka
Management	Fully managed (no brokers to manage)	Self-managed or Confluent Cloud
Retention	Up to 7 days per subscription (31 days with topic retention)	Configurable (unlimited with tiered storage)
Replay	Seek to a timestamp or snapshot within the retention window	Reset consumer offsets within the retention window
Ordering	Per ordering key within region	Per partition (global ordering with 1 partition)
Scale	Auto (global)	Manual partition/broker scaling
Ecosystem	GCP-native, Dataflow integration	Universal (all major systems)
Exactly-once	Exactly-once delivery on pull subscriptions; end-to-end with Cloud Dataflow	With idempotent/transactional producer

Q8. What is Cloud Dataflow, and how does it implement the Apache Beam model?

Cloud Dataflow is Google's managed stream and batch processing service based on Apache Beam.

Apache Beam model:

import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=DataflowRunner',
    '--project=my-project',
    '--region=us-central1',
    '--temp_location=gs://my-bucket/temp',
    '--streaming',  # required because the source is an unbounded Pub/Sub subscription
])

with beam.Pipeline(options=options) as p:
    # Read from Pub/Sub (streaming)
    events = (
        p
        | 'Read from PubSub' >> beam.io.ReadFromPubSub(
            subscription='projects/my-project/subscriptions/events-sub')
        | 'Parse JSON' >> beam.Map(lambda x: json.loads(x.decode('utf-8')))
    )

    # Window: group events into 5-minute tumbling windows
    windowed = (
        events
        | 'Add timestamps' >> beam.Map(lambda e: beam.window.TimestampedValue(e, e['event_ts']))
        | 'Window into 5min' >> beam.WindowInto(beam.window.FixedWindows(300))
    )

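    # Aggregate per window, then shape each result as a BigQuery row (a dict)
    aggregated = (
        windowed
        | 'Extract (product, revenue)' >> beam.Map(lambda e: (e['product_id'], e['amount']))
        | 'Sum revenue' >> beam.CombinePerKey(sum)
        | 'To row' >> beam.Map(lambda kv: {'product_id': kv[0], 'revenue': kv[1]})
    )

    # Write to BigQuery (a schema is needed because the table may be created)
    aggregated | 'Write to BQ' >> beam.io.WriteToBigQuery(
        table='my-project:analytics.product_revenue',
        schema='product_id:STRING,revenue:FLOAT',
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED
    )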

Dataflow unique features:

Autoscaling: adjusts worker count based on backlog and throughput.
Streaming + batch unified API: same Beam code runs in batch (GCS files) or streaming (Pub/Sub) mode.
Exactly-once processing: built into the Dataflow runner.
Flex Templates: package pipelines as containers for reproducible execution.
Side inputs: broadcast small lookup tables to all workers without shuffle.
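
The fixed-window aggregation in the pipeline above can be reasoned about without Beam. The sketch below assigns events to 5-minute tumbling windows by event time and sums revenue per product in plain Python; the event data is invented, and watermarks, triggers, and late data are deliberately left out.

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # 5-minute tumbling windows, as in FixedWindows(300)

def window_start(event_ts: float) -> int:
    """Start of the fixed window that contains event_ts (epoch seconds)."""
    return int(event_ts // WINDOW_SECONDS) * WINDOW_SECONDS

# Illustrative events: (event time in seconds, product_id, amount)
events = [
    (10, "p1", 5.0),
    (299, "p1", 7.0),
    (300, "p1", 1.0),   # falls in the next window
    (310, "p2", 4.0),
]

revenue = defaultdict(float)  # (window_start, product_id) -> summed amount
for ts, product, amount in events:
    revenue[(window_start(ts), product)] += amount

for key in sorted(revenue):
    print(key, revenue[key])
```

Windows are half-open intervals, so an event stamped exactly at 300 seconds belongs to the window starting at 300, not the one before it.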

Q9. What is Cloud Composer, and how does it differ from self-hosted Airflow?

Cloud Composer is Google's managed Apache Airflow service built on GKE.

Architecture:

Cloud Composer environment:
  - GKE cluster (worker pods run tasks)
  - Cloud SQL (Airflow metadata DB)
  - Cloud Storage (DAG files, logs)
  - Redis (Celery broker)
  - Airflow webserver, scheduler

DAG deployment:
  gsutil cp my_dag.py gs://your-composer-bucket/dags/
  Composer automatically picks up and parses new DAGs

Cloud Composer vs self-hosted Airflow:

Aspect	Cloud Composer	Self-hosted Airflow
Management	Fully managed upgrades, patches	Manual
GCP integration	Native (Dataflow, BigQuery, Cloud Storage operators)	Requires custom operators
Cost	Higher per unit	Lower if managed well
Scaling	Autoscaling (Composer 2)	Manual
Version control	GCP-managed Airflow versions	Any version

Composer 2 improvements (Composer 1 vs 2):

GKE Autopilot-based workloads: workers autoscale with the task queue.
Environment snapshots: backup and restore environments.
Airflow 2.x: TaskFlow API, better performance.

Kubernetes Engine

Q10. What is GKE, and what are its key operational concepts?

GKE (Google Kubernetes Engine) is Google's managed Kubernetes service.

GKE cluster modes:

Standard: manage node pools -- choose machine types, autoscaler settings, node locations.
Autopilot: Google manages nodes entirely -- you only define Pods. Pay per Pod resource request.

Key GKE operational features:

# Create a GKE Autopilot cluster
gcloud container clusters create-auto my-cluster \
    --region us-central1

# Create a Standard cluster with autoscaling
gcloud container clusters create my-cluster \
    --num-nodes 3 \
    --enable-autoscaling \
    --min-nodes 1 \
    --max-nodes 20 \
    --machine-type e2-standard-4 \
    --zone us-central1-a

# Get credentials
gcloud container clusters get-credentials my-cluster --zone us-central1-a

Workload Identity (replacing service account key files):

# Allow Kubernetes service account to impersonate GCP service account
# No key file needed -- IAM binding at identity level

gcloud iam service-accounts add-iam-policy-binding \
    GSA_NAME@my-project.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:my-project.svc.id.goog[my-namespace/my-ksa]"

# ServiceAccount manifest: annotate the KSA so its Pods use Workload Identity
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-ksa
  namespace: my-namespace
  annotations:
    iam.gke.io/gcp-service-account: GSA_NAME@my-project.iam.gserviceaccount.com

GKE Autopilot resource model:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:              # required: must match the Pod template labels
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: gcr.io/my-project/my-app:v1
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1000m"
            memory: "1Gi"
# Autopilot provisions nodes to fit these requests
# Billing = sum of all Pod requests, not node capacity

IAM and Security

Q11. Explain GCP's IAM model: projects, service accounts, roles.

Security on GCP is enforced through a layered IAM model. Unlike traditional systems where access is defined per user per resource, GCP uses roles (bundles of permissions) assigned to principals (users, groups, service accounts) at a specific scope in the resource hierarchy. Understanding this model is essential for both the Google Cloud Professional Cloud Architect exam and practical security governance interviews.

Resource hierarchy:

Organization (root)
  |-- Folders (departments, teams)
  |     |-- Projects (billing + resource boundary)
  |           |-- Resources (VMs, buckets, databases)

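IAM policies inherited down the hierarchy:
  - Grant at Organization -> inherited by all folders/projects/resources
  - Grant at Project -> inherited by resources within
  - Grant at Resource -> applies to that resource only
  - Allow policies are additive: a lower level cannot revoke access granted higher up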
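
Policy inheritance can be modelled as a union of bindings collected while walking up the hierarchy. The following sketch uses an invented hierarchy and simplified principal names (real members use full email addresses); it illustrates the inheritance rule only and ignores deny policies and IAM conditions.

```python
# Effective access = union of role bindings on the resource and all ancestors.
# Hierarchy and bindings below are invented for illustration.
PARENT = {
    "bucket:logs": "project:prod",
    "project:prod": "folder:data",
    "folder:data": "org:example",
    "org:example": None,
}

BINDINGS = {
    "org:example": {("group:sec-team", "roles/viewer")},
    "project:prod": {("serviceAccount:etl", "roles/bigquery.dataEditor")},
    "bucket:logs": {("serviceAccount:etl", "roles/storage.objectViewer")},
}

def effective_bindings(resource: str) -> set[tuple[str, str]]:
    """Walk up the hierarchy and collect every inherited binding."""
    result: set[tuple[str, str]] = set()
    node = resource
    while node is not None:
        result |= BINDINGS.get(node, set())
        node = PARENT[node]
    return result

print(sorted(effective_bindings("bucket:logs")))
```

The group bound at the organization shows up on the bucket even though nothing was granted there, which is why broad roles high in the hierarchy deserve extra scrutiny.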

IAM member types:

user:USER@example.com -- Google account
serviceAccount:SA_NAME@PROJECT_ID.iam.gserviceaccount.com -- non-human identity
group:GROUP@example.com -- Google Group
domain:example.com -- all users in a Google Workspace (formerly G Suite) or Cloud Identity domain
allUsers -- public internet (use with caution)
allAuthenticatedUsers -- any Google-authenticated user

Role types:

Type	Scope	Examples
Basic (formerly primitive)	Project-wide	Owner, Editor, Viewer (too broad for prod)
Predefined	Service-specific	`roles/storage.objectViewer`, `roles/bigquery.dataEditor`
Custom	User-defined	Bundle exactly the permissions needed

Service account best practices:

# Create minimal service account for a Cloud Run service
gcloud iam service-accounts create my-run-sa \
    --display-name "Cloud Run service account"

# Grant only needed permissions (not Editor!)
gcloud projects add-iam-policy-binding my-project \
    --member "serviceAccount:my-run-sa@my-project.iam.gserviceaccount.com" \
    --role "roles/bigquery.dataViewer"

gcloud projects add-iam-policy-binding my-project \
    --member "serviceAccount:my-run-sa@my-project.iam.gserviceaccount.com" \
    --role "roles/pubsub.publisher"

# Deploy Cloud Run with specific SA
gcloud run deploy my-service \
    --image gcr.io/my-project/my-app \
    --service-account my-run-sa@my-project.iam.gserviceaccount.com

Q12. What are VPC Service Controls, and when do you use them?

VPC Service Controls create a security perimeter around GCP services, preventing data exfiltration even from authorized users.

Problem they solve: A compromised or malicious user with bigquery.dataViewer can copy data to an external project. Standard IAM cannot prevent this since IAM controls WHO can access, not WHERE data can go.

Service perimeter:

Perimeter: my-secure-perimeter
  Protected services: bigquery.googleapis.com, storage.googleapis.com
  Projects inside perimeter: my-prod-project, my-analytics-project

Rules:
  - API calls to BigQuery/Storage ONLY succeed from within the perimeter
  - External project calling your BigQuery API: DENIED
  - Internal project copying to external bucket: DENIED
  - Access levels (exceptions): specific users/IPs can cross the perimeter

Use cases:

Regulated data (HIPAA, PCI, financial): prevent data leakage across project boundaries.
Data exfiltration prevention: insider threat mitigation.
Compliance: audit-friendly perimeter with Cloud Audit Logs.

Networking

Q13. Explain GCP networking: VPCs, subnets, and peering.

GCP networking differs from AWS and Azure in one important way: VPC networks in GCP are global. A single VPC can span all GCP regions, and subnets within that VPC are regional. This simplifies network design for global applications and removes the need for peering between regions within the same network -- you simply add subnets in each region you need.

GCP VPC is global (unlike AWS VPC which is regional):

my-vpc (global)
  Subnet us-central1: 10.0.0.0/20
  Subnet us-east1: 10.1.0.0/20
  Subnet europe-west1: 10.2.0.0/20

VMs in different regions on same VPC communicate over Google's private network
No VPN or peering needed within same VPC

Subnet modes:

Auto mode: Google auto-creates subnets in every region (fixed /20 ranges).
Custom mode: You define subnet regions and CIDR ranges. Use for production (control).

Firewall rules:

# Allow HTTP traffic to web servers (tagged)
gcloud compute firewall-rules create allow-http \
    --network my-vpc \
    --allow tcp:80 \
    --target-tags web-server \
    --source-ranges 0.0.0.0/0

# Allow internal traffic between app and database tier
gcloud compute firewall-rules create allow-app-to-db \
    --network my-vpc \
    --allow tcp:5432 \
    --source-tags app-server \
    --target-tags db-server

# Hierarchical firewall policies (organization-level; inherited by projects once associated)
gcloud compute firewall-policies create \
    --short-name my-org-policy \
    --organization 123456789

VPC Peering:

Connect two VPC networks (same or different projects/organizations).
Non-transitive: A-B + B-C does NOT give A-C access.
No single point of failure or bandwidth bottleneck.

VPC Peering is free -- there are no hourly charges for the peering connection itself, though normal data transfer charges apply. This makes it cost-effective for connecting multiple projects within an organization without routing traffic through a NAT gateway or internet.
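
The non-transitive rule is a frequent follow-up question. A minimal sketch, using made-up network names, shows that reachability through peering depends only on direct peerings and never on chains of them:

```python
# VPC peering is non-transitive: only directly peered networks can talk.
# Network names are illustrative.
peerings = {frozenset({"vpc-a", "vpc-b"}), frozenset({"vpc-b", "vpc-c"})}

def can_reach(src: str, dst: str) -> bool:
    """True only if src and dst are the same network or directly peered."""
    return src == dst or frozenset({src, dst}) in peerings

print(can_reach("vpc-a", "vpc-b"))
print(can_reach("vpc-a", "vpc-c"))
```

To connect vpc-a and vpc-c in this model, a third peering would have to be added explicitly; routing through vpc-b is not an option.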

Shared VPC:

One host project has the VPC; service projects share it.
Centralized network control, distributed resource creation.
Common pattern: platform team owns networking, app teams own their resources in service projects.

Q14. What is Cloud Load Balancing on GCP?

GCP load balancers are global (Anycast IP) or regional:

Type	Scope	Protocol	Notes
External Application (HTTP/S)	Global	HTTP, HTTPS, gRPC	Global, URL-based routing
Internal Application	Regional	HTTP, HTTPS	Internal VPC traffic
External Network (TCP/UDP)	Regional	TCP, UDP	Regional, preserve client IP
Internal TCP/UDP	Regional	TCP, UDP	Internal VPC
SSL Proxy	Global	SSL/TLS (non-HTTP)	Global
TCP Proxy	Global	TCP	Global

Global external HTTP(S) LB:

User (Tokyo) -> Google's Anycast IP (nearest PoP)
                  -> Route to closest healthy backend (Tokyo/Singapore/US)
                  -> No geographic traffic penalty
                  -> CDN cache at edge (Cloud CDN)

URL Map (routing rules):
  /api/* -> Backend Service: api-backend-group
  /static/* -> Cloud Storage bucket (direct CDN origin)
  /* -> Backend Service: web-frontend-group

Backend Service:
  - Instance Group or NEG (Network Endpoint Group)
  - Health check (HTTP /health)
  - Session affinity (optional)
  - Cloud CDN (enable/disable per backend)

Serverless NEGs: Route traffic directly to Cloud Run, App Engine, or Cloud Functions from the Global LB -- no intermediate VMs. This allows a single global load balancer with a single Anycast IP to front both container-based and serverless backends, with URL-map based routing distributing traffic to the appropriate backend service.
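
The URL map shown earlier behaves like a longest-prefix match with a default route. The sketch below imitates that behaviour in plain Python; it is a simplification of real path matchers, and the name static-bucket is a hypothetical stand-in for the Cloud Storage backend.

```python
# Sketch of URL-map style routing: the longest matching path prefix wins,
# and "/" acts as the default. Backend names follow the example above;
# "static-bucket" is a hypothetical name for the Cloud Storage backend.
ROUTES = {
    "/api/": "api-backend-group",
    "/static/": "static-bucket",
    "/": "web-frontend-group",
}

def route(path: str) -> str:
    """Pick the backend whose prefix is the longest match for the path."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    return ROUTES[max(matches, key=len, default="/")]

for p in ["/api/v1/users", "/static/app.css", "/index.html"]:
    print(p, "->", route(p))
```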

Real-World Architecture

Q15. Design a real-time analytics pipeline on GCP for 1M events/second.

Architecture design questions are common at senior GCP interviews. The pattern below is representative of production pipelines at scale -- candidates report seeing variations of this design at companies with large-scale event processing needs. Each component choice has specific trade-offs that interviewers probe.

Architecture:

Mobile/Web Apps -> Cloud Pub/Sub (ingestion)
                        |
            ____________|____________
           |                         |
    Cloud Dataflow              Cloud Dataflow
    (streaming, 5-min           (streaming, raw
     aggregations)               data to BQ)
           |                         |
    BigQuery (aggregated        BigQuery (raw events
     metrics table,              partitioned by date,
     TTL 90 days)                clustered by user_id)
           |
    Looker Studio / Looker
    (real-time dashboards,
     sub-minute latency)

Pub/Sub throughput:

1M events/sec at 500 bytes avg = 500 MB/s
Pub/Sub scales to this automatically with no brokers or partitions to size, though regional throughput quotas should be checked (and raised if needed) at this volume.
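
Interviewers often ask candidates to extend this back-of-the-envelope sizing to daily volume. A short worked calculation, using decimal units and the event rate and size stated above:

```python
EVENTS_PER_SECOND = 1_000_000
AVG_EVENT_BYTES = 500

bytes_per_second = EVENTS_PER_SECOND * AVG_EVENT_BYTES
mb_per_second = bytes_per_second / 1_000_000     # decimal megabytes
tb_per_day = bytes_per_second * 86_400 / 1e12    # decimal terabytes

print(f"{mb_per_second:.0f} MB/s, {tb_per_day:.1f} TB/day")
```

The daily figure is what drives downstream storage and BigQuery ingestion cost, so it is worth stating alongside the per-second rate.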

Dataflow pipeline (streaming aggregation):

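# Assumes `options` is a streaming PipelineOptions like the one in Q8, with json and beam imported
with beam.Pipeline(options=options) as p:
    (
        p
        | beam.io.ReadFromPubSub(topic='projects/myproject/topics/app-events')
        | beam.Map(json.loads)
        | beam.Map(lambda e: beam.window.TimestampedValue(e, e['ts'] / 1000))  # ts is in milliseconds
        | beam.WindowInto(
            beam.window.FixedWindows(300),  # 5-minute windows
            trigger=beam.trigger.AfterWatermark(
                early=beam.trigger.AfterCount(10000)  # early fire at 10K events
            ),
            # ACCUMULATING: each firing re-emits the running count for the window
            accumulation_mode=beam.trigger.AccumulationMode.ACCUMULATING
          )
        | beam.Map(lambda e: (e['event_type'], e['user_id']))
        | beam.combiners.Count.PerKey()
        # WriteToBigQuery expects dict rows, not tuples
        | beam.Map(lambda kv: {'event_type': kv[0], 'event_count': kv[1]})
        | beam.io.WriteToBigQuery(
            'myproject:analytics.event_counts',
            schema='event_type:STRING,event_count:INTEGER')
    )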

BigQuery write optimization:

Legacy streaming inserts: rows become queryable within seconds, at a higher cost.
Batch load jobs from Dataflow for aggregates (for example every 5 minutes): much cheaper, with higher latency.
Storage Write API: higher throughput, lower cost than streaming inserts.

For high-volume pipelines, the Storage Write API is preferred over the legacy streaming inserts API because it supports exactly-once semantics through stream offsets and is priced lower per GB ingested. Dataflow has native integration with the Storage Write API through the BigQueryIO connector.

Q16. How do you run ML training and serve predictions on GCP?

Vertex AI (unified ML platform):

from google.cloud import aiplatform

aiplatform.init(
    project='my-project',
    location='us-central1',
    staging_bucket='gs://my-bucket',  # needed to stage the packaged training script
)

# Train a custom model on Vertex AI Training
job = aiplatform.CustomTrainingJob(
    display_name='xgboost-churn-v1',
    script_path='trainer/task.py',
    container_uri='us-docker.pkg.dev/vertex-ai/training/sklearn-cpu.1-0:latest',
    model_serving_container_image_uri=(
        'us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest'
    )
)

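# `dataset` is assumed to be an existing Vertex AI managed dataset
# (for example an aiplatform.TabularDataset); its creation is not shown here
model = job.run(
    dataset=dataset,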
    model_display_name='churn-predictor-v1',
    args=['--max-depth=6', '--n-estimators=200'],
    machine_type='n1-standard-4',
    replica_count=1,
)

# Deploy to endpoint
endpoint = aiplatform.Endpoint.create(display_name='churn-endpoint')
model.deploy(
    endpoint=endpoint,
    machine_type='n1-standard-2',
    min_replica_count=1,
    max_replica_count=5,
    traffic_percentage=100,
)

# Online prediction
prediction = endpoint.predict(instances=[
    {'tenure': 24, 'monthly_charges': 65.5, 'contract': 'Month-to-month'}
])

Vertex AI components:

Component	Purpose
Vertex AI Workbench	Managed Jupyter notebooks
Vertex AI Training	Custom + AutoML training
Vertex AI Prediction	Online + batch prediction endpoints
Vertex AI Pipelines	Kubeflow Pipelines on managed infrastructure
Vertex AI Feature Store	Centralized feature serving (online + offline)
Vertex AI Experiments	Track and compare training runs
Vertex AI Model Registry	Versioned model management
Vertex AI Vector Search	ANN (approximate nearest neighbor) for embeddings

Cost Management

Q17. How do you manage and optimize GCP costs?

Key tools:

Cloud Billing: view costs by project, service, SKU
  Budget alerts: notify at 50%, 75%, 90%, 100% of budget
  Cost allocation: labels to attribute costs to teams/products
  
Cloud Cost Management (BigQuery billing export):
  Export billing data to BigQuery daily
  Build custom cost dashboards in Looker Studio
  
Recommender:
  Idle VM recommendations: VMs with < 8% CPU over 14 days
  Right-sizing: suggest smaller instance type
  Committed use discount recommendations
  Unused reservations

BigQuery cost control:

-- Capacity pricing (BigQuery editions): reserve slots for predictable costs
-- (the older flat-rate plans were replaced by editions)

-- On-demand pricing controls:
-- 1. Custom quotas: limit daily bytes processed per project
-- 2. Table access controls: limit which tables users can scan
-- 3. Authorized views: limit columns visible to users

-- Query cost before running
bq query --dry_run --use_legacy_sql=false "SELECT * FROM mydataset.large_table"
# Output reports the bytes the query would process (for example, about 250 GB)

-- Column-level access control (policy tags) to restrict PII
-- Users without access to a tagged column get an access-denied error unless a data masking rule applies

Committed Use Discounts strategy:

Baseline compute (always needed) -> CUDs (1-yr or 3-yr)
Variable/burst compute -> On-demand or Spot VMs
Batch ML training -> Spot VMs (preemption acceptable)

Cost management is increasingly tested in GCP interviews at senior levels. Beyond CUDs, teams should export billing data to BigQuery and build dashboards that show spend by label (team, environment, service). The Recommender API provides programmatic access to cost-saving recommendations, allowing automated ticket creation or Slack notifications when optimization opportunities exceed a threshold.
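
The label-based reporting described above amounts to a group-by over billing rows. The following sketch shows that aggregation on invented data shaped loosely like a billing export; the row fields, label keys, and alert threshold are all assumptions for illustration, and a real setup would run the equivalent query in BigQuery.

```python
from collections import defaultdict

# Rows shaped loosely like a billing export: cost plus resource labels.
# All values are invented for illustration.
rows = [
    {"service": "BigQuery", "cost": 120.0, "labels": {"team": "analytics", "env": "prod"}},
    {"service": "Compute Engine", "cost": 300.0, "labels": {"team": "platform", "env": "prod"}},
    {"service": "Compute Engine", "cost": 80.0, "labels": {"team": "analytics", "env": "dev"}},
    {"service": "Cloud Storage", "cost": 15.0, "labels": {}},  # unlabeled spend
]

def spend_by(label_key: str) -> dict[str, float]:
    """Total cost grouped by the value of one label key."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row["labels"].get(label_key, "(unlabeled)")] += row["cost"]
    return dict(totals)

THRESHOLD = 150.0  # hypothetical alert threshold per team
by_team = spend_by("team")
over = sorted(team for team, cost in by_team.items() if cost > THRESHOLD)
print(by_team, over)
```

Keeping an explicit "(unlabeled)" bucket makes gaps in labelling visible instead of silently dropping that spend from the report.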

FAQ

Q: What is the difference between Cloud SQL and Cloud Spanner?

Cloud SQL is managed MySQL, PostgreSQL, or SQL Server. Instances are regional (cross-region read replicas are optional), scale vertically to roughly 96 vCPUs, and support up to 64 TB of storage, which suits most OLTP workloads. Cloud Spanner is Google's globally distributed relational database. It provides external consistency, a guarantee stronger than serializable isolation, and maintains ACID guarantees across regions. It scales horizontally by adding compute capacity, to petabytes of data and very high transaction rates. Spanner is significantly more expensive, so use it when global distribution is required, when strict cross-region consistency is needed, or when load outgrows a single Cloud SQL primary; this guide's rough thresholds are > 30,000 QPS for reads or > 1,500 QPS for writes. Candidates report Spanner questions appear frequently for senior cloud architect interviews.

Q: How does Cloud Storage object versioning work?

When versioning is enabled on a bucket, Cloud Storage keeps the previous version of an object as a noncurrent version whenever the object is overwritten or deleted. Each version gets a unique generation number. You can list, restore, or permanently delete old versions. Common use cases are accidental deletion protection and rollback to a prior file version. Combine versioning with Object Lifecycle Management to auto-delete noncurrent versions older than N days, which keeps storage costs controlled. For compliance/WORM requirements, use retention policies, which can be locked with Bucket Lock.
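
The lifecycle rule mentioned above can be written as a small JSON configuration. The sketch below builds one in Python; the helper name is hypothetical, and the `daysSinceNoncurrentTime` condition reflects the lifecycle configuration format as I understand it, so confirm the field names against the current Cloud Storage documentation.

```python
import json


def noncurrent_cleanup_lifecycle(days):
    """Build a lifecycle config that deletes versions noncurrent for `days` days or more.

    Hypothetical helper; live (current) object versions are not affected,
    because the condition only matches noncurrent versions.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "rule": [
            {
                "action": {"type": "Delete"},
                "condition": {"daysSinceNoncurrentTime": days},
            }
        ]
    }


if __name__ == "__main__":
    # Write the JSON to a file and apply it to the bucket with your usual tooling.
    print(json.dumps(noncurrent_cleanup_lifecycle(30), indent=2))
```

Choosing N is a trade-off: a longer window gives more time to notice an accidental overwrite, at the price of storing more noncurrent data.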

Q: What are the key differences between BigQuery and Cloud Bigtable?

BigQuery is an OLAP data warehouse for analytical SQL queries on large datasets (seconds latency, TB-PB scale). Bigtable is a NoSQL wide-column store for high-throughput low-latency operational workloads (single-digit millisecond reads/writes at millions of requests per second). Use BigQuery for analytics and reporting; use Bigtable for time-series data ingestion, operational read/write at scale, and AdTech event storage. Confirm current service capabilities and pricing on the official Google Cloud documentation.

Sources and review notes (reviewed 8 Jun 2026)

Article-specific sources

Verification window

Page last edited 8 Jun 2026 by Aditya Sharma. A review date records an editorial edit, not a guarantee that every external fact is still current.



