
AWS Solutions Architect Interview Questions 2026: SAA-C03 & Design Patterns

By Aditya Sharma. Published 8 Jun 2026. 7 min read, last revised 8 Jun 2026.

Candidates report that AWS Solutions Architect interviews in 2026 combine service knowledge with real-world design questions: how would you architect a multi-region, fault-tolerant application? Confirm current exam guide and service features on the official AWS documentation and exam pages.

AWS Solutions Architect is one of the most in-demand cloud certifications. Interviews test both the SAA-C03 exam syllabus and practical architecture judgment: choosing between services, designing for failure, and optimizing cost.

Core Compute

Q1. What is the difference between EC2 instance types, and how do you choose the right one?

EC2 instances are grouped into families by workload type:

Family	Purpose	Examples
General Purpose	Balanced CPU/memory	t3, m6i, m7g
Compute Optimized	High CPU-to-memory	c6i, c7g
Memory Optimized	High memory-to-CPU	r6i, x2idn, u-*
Storage Optimized	High disk I/O (NVMe SSD)	i3en, d3en
Accelerated Computing	GPU / FPGA	p4d, g5, inf2, trn1
HPC Optimized	Low-latency networking (EFA)	hpc6a

Selection process:

Identify bottleneck: CPU-bound (compute optimized), memory-bound (memory optimized), I/O-bound (storage optimized).
Graviton (ARM) instances (m7g, c7g, r7g): cost less per performance unit, good for containerized workloads.
T-family (burstable): use only for variable workloads with CPU spikes -- sustained high CPU depletes credits.
Spot instances: up to 90% cost savings for fault-tolerant, stateless workloads. Use with Auto Scaling and Spot interruption handling.

Pricing models:

Model	Discount	Commitment	Use case
On-Demand	0%	None	Dev/test, unpredictable
Reserved (1yr)	~40%	1 year	Steady-state production
Reserved (3yr)	~60%	3 years	Committed long-term
Savings Plans	~60%	1-3 years	Flexible (EC2+Fargate+Lambda)
Spot	~70-90%	None	Stateless, fault-tolerant
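
To make the table concrete, here is a minimal Python sketch that applies the approximate discounts above to an On-Demand hourly rate. The $0.10/hour rate, the 730 hours/month figure, and the 80% Spot midpoint are illustrative assumptions, not AWS prices.

```python
# Approximate discounts from the table above, as fractions of the On-Demand price.
# Spot uses 0.80 as an assumed midpoint of the ~70-90% range.
DISCOUNTS = {
    "on_demand": 0.00,
    "reserved_1yr": 0.40,
    "reserved_3yr": 0.60,
    "savings_plan": 0.60,
    "spot": 0.80,
}

HOURS_PER_MONTH = 730  # common billing approximation (assumption)


def monthly_cost(on_demand_hourly: float, model: str, instances: int = 1) -> float:
    """Estimate the monthly cost for a pricing model from an On-Demand hourly rate."""
    rate = on_demand_hourly * (1.0 - DISCOUNTS[model])
    return round(rate * HOURS_PER_MONTH * instances, 2)


if __name__ == "__main__":
    # Hypothetical fleet: 10 instances at $0.10/hour On-Demand.
    for model in DISCOUNTS:
        print(f"{model:>13}: ${monthly_cost(0.10, model, instances=10):,.2f}/month")
```

In an interview, the useful point is the shape of the comparison (commitment buys discount, Spot buys discount with interruption risk) rather than the exact numbers.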

Q2. What is Auto Scaling, and what are the different scaling policies?

Auto Scaling automatically adjusts EC2 capacity based on demand.

Components:

Auto Scaling Group (ASG): Defines min/max/desired capacity, launch template, AZ placement.
Launch Template: AMI, instance type, security groups, key pair, user data.
Scaling policies: Rules that trigger capacity changes.

Scaling policy types:

Target Tracking (recommended)

# Maintain average CPU utilization at 60%
aws autoscaling put-scaling-policy \
    --policy-name cpu-target-tracking \
    --auto-scaling-group-name my-asg \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        }
    }'

Step Scaling

# Scale out by 2 if CPU > 70%, by 4 if CPU > 85%
# Scale in by 1 if CPU < 40%
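
The step thresholds above can be modelled in a few lines of Python. This is a sketch of the decision logic only, not the AWS API: in a real setup CloudWatch alarms trigger the policy, and the function names here are hypothetical.

```python
def step_adjustment(cpu_percent: float) -> int:
    """Return the capacity change for a CPU reading, using the thresholds above."""
    if cpu_percent > 85:
        return 4   # large breach: add 4 instances
    if cpu_percent > 70:
        return 2   # moderate breach: add 2 instances
    if cpu_percent < 40:
        return -1  # under-utilised: remove 1 instance
    return 0       # inside the 40-70% band: no change


def next_capacity(current: int, cpu_percent: float, minimum: int, maximum: int) -> int:
    """Apply the adjustment and clamp the result to the ASG min/max bounds."""
    return max(minimum, min(maximum, current + step_adjustment(cpu_percent)))
```

The clamp matters in practice: a step policy can never push the group outside the min/max capacity set on the Auto Scaling Group.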

Scheduled Scaling

# Scale to 20 instances every weekday at 8 AM
aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name my-asg \
    --scheduled-action-name scale-up-morning \
    --recurrence "0 8 * * 1-5" \
    --desired-capacity 20

Predictive Scaling

Uses ML to forecast traffic and scale proactively based on historical patterns.

Key settings:

Cooldown period: Wait time after scale event before next scale (default 300 seconds). Prevents thrashing.
Warm-up period: Time for new instances to start serving traffic (excluded from scale-in until warm).
Termination policies: OldestInstance, NewestInstance, ClosestToNextInstanceHour (cost optimization).

Q3. Explain ELB types and when to use each.

AWS has four load balancer types:

Type	Layer	Protocol	Use case
Application Load Balancer (ALB)	7 (HTTP/S)	HTTP, HTTPS, gRPC	Web apps, microservices, content-based routing
Network Load Balancer (NLB)	4 (TCP/UDP)	TCP, UDP, TLS	Ultra-low latency, static IP, non-HTTP protocols
Gateway Load Balancer (GWLB)	3	IP packets	Inline network appliances (firewalls, IDS)
Classic Load Balancer (CLB)	4/7	HTTP/S, TCP	Legacy only (do not use for new deployments)

ALB routing rules (interview favorite):

# Path-based routing
/api/* -> Target Group: API servers
/static/* -> Target Group: CDN origin / S3

# Host-based routing
api.example.com -> TG: API
www.example.com -> TG: Frontend

# Header-based routing
Header "X-Mobile: true" -> TG: Mobile-optimized backend

# Weighted routing (blue/green, canary)
TG-Blue: weight 90, TG-Green: weight 10
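
As a rough illustration of how listener rules are evaluated (lowest priority number first, default action last), here is a small Python model. The Rule class, the target group names, and the case-sensitive header lookup are simplifying assumptions for the sketch, not the ALB API.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass
class Rule:
    priority: int
    target_group: str
    path: Optional[str] = None      # e.g. "/api/*"
    host: Optional[str] = None      # e.g. "api.example.com"
    header: Optional[tuple] = None  # e.g. ("X-Mobile", "true")

    def matches(self, req_host: str, req_path: str, req_headers: dict) -> bool:
        # Every condition that is set on the rule must match the request.
        if self.path is not None and not fnmatchcase(req_path, self.path):
            return False
        if self.host is not None and req_host.lower() != self.host.lower():
            return False
        if self.header is not None:
            name, value = self.header
            if req_headers.get(name) != value:
                return False
        return True


def route(rules, default_tg, host, path, headers=None):
    """Return the target group of the first matching rule in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(host, path, headers or {}):
            return rule.target_group
    return default_tg  # default action when no rule matches
```

Because evaluation stops at the first match, more specific rules should carry lower priority numbers than broad ones.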

NLB characteristics:

Preserves client source IP (an ALB terminates the connection and passes the client address in the X-Forwarded-For header instead).
Static Elastic IP addresses -- useful for IP whitelisting at client firewalls.
Handles millions of requests per second with microsecond latency.
Supports PrivateLink for cross-VPC service endpoints.

Storage

Q4. Compare S3 storage classes and their use cases.

Storage Class	Retrieval	Min Duration	Use case
Standard	Immediate	None	Frequently accessed data
Intelligent-Tiering	Immediate (frequent/infrequent tiers)	None	Unknown or changing access patterns
Standard-IA	Immediate	30 days	Infrequent access (backup, disaster recovery)
One Zone-IA	Immediate	30 days	Infrequent, non-critical (re-creatable data)
Glacier Instant Retrieval	Milliseconds	90 days	Archival, quarterly access
Glacier Flexible Retrieval	Minutes to hours	90 days	Archival, annual access
Glacier Deep Archive	12-48 hours	180 days	Long-term compliance, rarely accessed

S3 Lifecycle policies:

{
  "Rules": [{
    "ID": "archive-old-logs",
    "Status": "Enabled",
    "Filter": {"Prefix": "logs/"},
    "Transitions": [
      {"Days": 30, "StorageClass": "STANDARD_IA"},
      {"Days": 90, "StorageClass": "GLACIER"},
      {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
    ],
    "Expiration": {"Days": 2555}
  }]
}
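
To check your reading of the rule above, this Python sketch maps an object's age to the storage class the rule would place it in. It mirrors the thresholds in the JSON; the function name is hypothetical, and real lifecycle actions run asynchronously, so actual transitions can lag the configured day.

```python
from typing import Optional

# (minimum age in days, storage class), mirroring the Transitions in the rule above.
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER"), (365, "DEEP_ARCHIVE")]
EXPIRATION_DAYS = 2555  # roughly 7 years


def storage_class_at(age_days: int) -> Optional[str]:
    """Return the storage class at a given object age, or None once expired."""
    if age_days >= EXPIRATION_DAYS:
        return None
    current = "STANDARD"
    for min_age, storage_class in TRANSITIONS:
        if age_days >= min_age:
            current = storage_class
    return current
```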

S3 performance:

Standard: 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix.
Multipart upload: recommended for objects > 100 MB, required for > 5 GB.
Transfer Acceleration: CloudFront edge locations for faster global uploads.
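S3 Select: SQL queries on object content (CSV, JSON, Parquet) -- reduce data transfer. AWS has closed S3 Select to new customers, so confirm availability; Athena is the usual alternative.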

Q5. What is the difference between EBS, EFS, and instance store?

Feature	EBS	EFS	Instance Store
Type	Block storage	NFS file system	Ephemeral block
Persistence	Yes (independent of instance)	Yes (regional)	No (lost on stop/terminate)
Attach to	Single AZ, single EC2 (usually)	Multiple EC2, multiple AZs	Specific instance type
Throughput	Up to 4,000 MB/s per volume (io2 Block Express)	Burst or provisioned	Very high (NVMe SSD)
Use case	Boot volumes, databases	Shared content, CMS, ECS	High I/O temp storage, caches
Cost	Per GB + IOPS	Per GB used	Included in instance price

EBS volume types:

Type	IOPS	Use case
gp3 (default)	Up to 16,000	General purpose (boot, dev)
io2 Block Express	Up to 256,000	Latency-sensitive DBs (SAP HANA, Oracle)
st1 (throughput)	500 IOPS	Big data, log processing (sequential read)
sc1 (cold)	250 IOPS	Infrequent access cold storage

EFS performance modes:

General Purpose: low-latency operations (web servers, CMS).
Max I/O: high aggregate throughput (big data, media processing) -- higher latency.

Databases

Q6. When do you use RDS vs DynamoDB vs Aurora vs ElastiCache?

Service	Type	Use case	When NOT to use
RDS (MySQL/PostgreSQL)	Relational	Structured data, ACID transactions, reporting	High-scale OLTP (>100K TPS)
Aurora	Managed relational (MySQL/PG compat)	High-scale relational, global apps	Simple/small workloads (cost)
DynamoDB	NoSQL key-value/document	Single-digit ms at any scale, serverless	Complex queries, JOINs, ACID multi-table
ElastiCache (Redis)	In-memory cache	Session store, leaderboards, pub/sub, hot data	Durable primary storage
ElastiCache (Memcached)	In-memory cache	Simple key-value cache, horizontal scaling	Complex data structures, persistence
Redshift	Columnar OLAP	Data warehouse, analytics (TB-PB)	OLTP, row-level updates
DocumentDB	MongoDB-compatible	JSON documents, content management	High-volume time series
Neptune	Graph	Social graphs, fraud detection	Non-graph data

Aurora advantages over RDS:

Up to 15 low-latency read replicas on shared storage (RDS MySQL/PG was historically limited to 5; verify the current limit).
Automatic storage growth in 10 GB increments (no pre-provisioning).
Aurora Global Database: sub-second cross-region replication (RPO seconds, RTO < 1 minute).
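Aurora Serverless v2: auto-scales in seconds between a configured minimum and maximum (0.5 to 128 ACUs at launch; newer engine versions extend the range, including scale to 0, so check current limits).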

Q7. How does DynamoDB achieve single-digit millisecond performance at scale?

DynamoDB architecture for scale:

1. Partition key distribution

Data is distributed across partitions by hashing the partition key. Each partition handles up to 3,000 RCUs and 1,000 WCUs.

import random

# Hot partition problem: all writes to same partition key
# BAD key: status (only "active"/"inactive" -- two partitions take all traffic)
# GOOD key: user_id (high cardinality, uniform distribution)

# For write-heavy hot keys: add random suffix
user_id = "user123"
shard_key = f"{user_id}#{random.randint(0, 9)}"  # 10 shards per user
# Read: query all 10 shards, merge results
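
The read side of write sharding is a scatter-gather over every suffix. The sketch below uses an in-memory dictionary as a stand-in for the table (an assumption for illustration; a real implementation would issue one Query per sharded key), and the helper names are hypothetical.

```python
import random
from collections import defaultdict

NUM_SHARDS = 10  # matches the 10-way suffix used above


def write_key(user_id: str, rng: random.Random) -> str:
    """Pick one of the sharded partition keys at random for a write."""
    return f"{user_id}#{rng.randint(0, NUM_SHARDS - 1)}"


def read_keys(user_id: str) -> list:
    """List every sharded key that a read must query to see all items."""
    return [f"{user_id}#{i}" for i in range(NUM_SHARDS)]


# In-memory stand-in for a table: partition key -> list of items.
table = defaultdict(list)
rng = random.Random(42)
for n in range(1000):
    table[write_key("user123", rng)].append({"event": n})

# Scatter-gather read: query each shard and merge the results.
merged = [item for key in read_keys("user123") for item in table[key]]
```

The trade-off is explicit here: writes spread across 10 partition keys, but every read costs 10 queries, so sharding is worth it only for genuinely hot keys.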

2. Eventually consistent vs strongly consistent reads

import boto3

dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('Orders')

# Eventually consistent (default) -- 50% cost, may lag milliseconds
response = table.get_item(Key={'order_id': '12345'})

# Strongly consistent -- full cost, guaranteed latest data
response = table.get_item(
    Key={'order_id': '12345'},
    ConsistentRead=True
)
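
The "50% cost" comment can be worked through with the capacity-unit rule: one RCU covers one strongly consistent read per second of an item up to 4 KB, and an eventually consistent read costs half that. A minimal sketch (the function name is hypothetical):

```python
import math


def read_capacity_units(item_size_bytes: int, consistent_read: bool = False) -> float:
    """RCUs consumed by one read: the item size is rounded up to the next 4 KB
    block, and an eventually consistent read costs half of a strongly consistent one."""
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    return float(blocks) if consistent_read else blocks / 2
```

So a 6 KB item costs 2 RCUs for a strongly consistent read and 1 RCU for an eventually consistent one, which is why item size is worth watching on read-heavy tables.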

3. Global Secondary Indexes (GSI)

from boto3.dynamodb.conditions import Key

# Table: Orders (PK=user_id, SK=order_id)
# GSI: status-created_at-index (PK=status, SK=created_at)

# Query all pending orders created since a cutoff timestamp
# (fixed here for illustration; compute "now minus 24 hours" in real code)
response = table.query(
    IndexName='status-created_at-index',
    KeyConditionExpression=Key('status').eq('PENDING') &
                           Key('created_at').gte('2026-06-07T00:00:00Z')
)

4. DynamoDB Streams + Lambda

Write to DynamoDB -> Stream event -> Lambda processes change
Use case: invalidate ElastiCache on DynamoDB write
         replicate to OpenSearch for full-text search
         trigger downstream workflows

5. DAX (DynamoDB Accelerator)

In-memory cache in front of DynamoDB. Microsecond reads for hot data. Same DynamoDB API -- DAX SDK is a drop-in replacement.

Networking

Q8. Explain VPC architecture: subnets, route tables, NACLs, and security groups.

VPC anatomy:

VPC (10.0.0.0/16)
  Public Subnet (10.0.1.0/24) -- AZ-1a
    Internet Gateway -> Route table: 0.0.0.0/0 -> IGW
    EC2 instances with public IPs (web servers)
    
  Public Subnet (10.0.2.0/24) -- AZ-1b  
    NAT Gateway (for private subnet outbound)
    
  Private Subnet (10.0.10.0/24) -- AZ-1a
    Route table: 0.0.0.0/0 -> NAT Gateway
    EC2 instances (app servers, no public IP)
    
  Private Subnet (10.0.20.0/24) -- AZ-1b
    Route table: 0.0.0.0/0 -> NAT Gateway
    RDS instance (no internet access)

Security Groups vs NACLs:

Aspect	Security Group	NACL
Level	Instance (ENI)	Subnet
State	Stateful (return traffic auto-allowed)	Stateless (must allow inbound AND outbound)
Rules	Allow only	Allow + Deny
Evaluation	All rules evaluated	Rules evaluated in order (lowest number first)
Default	Deny all inbound, allow all outbound	Allow all in/out

# Security Group: web server
Inbound: TCP 443 from 0.0.0.0/0
Inbound: TCP 80 from 0.0.0.0/0
Outbound: All traffic

# Security Group: app server (only web tier can talk to it)
Inbound: TCP 8080 from sg-web-server (SG reference, not IP)
Outbound: TCP 5432 to sg-database

# Security Group: RDS
Inbound: TCP 5432 from sg-app-server
Outbound: (empty -- stateful, return traffic allowed)
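
The evaluation difference in the table (ordered first-match for NACLs, any-match allow-only for security groups) can be sketched in Python. The rule tuples and addresses below are hypothetical, and the model ignores protocols, port ranges, and statefulness.

```python
from ipaddress import ip_address, ip_network


def nacl_allows(rules, src_ip: str, port: int) -> bool:
    """NACL: evaluate rules in ascending rule number; the first match decides."""
    for _number, cidr, rule_port, action in sorted(rules, key=lambda r: r[0]):
        if ip_address(src_ip) in ip_network(cidr) and port == rule_port:
            return action == "allow"
    return False  # the implicit final "*" rule denies anything unmatched


def sg_allows(rules, src_ip: str, port: int) -> bool:
    """Security group: allow-only rules; traffic passes if any rule matches."""
    return any(
        ip_address(src_ip) in ip_network(cidr) and port == rule_port
        for cidr, rule_port in rules
    )


# Rule 100 denies one range before rule 200 allows everyone else on 443.
nacl = [(200, "0.0.0.0/0", 443, "allow"), (100, "203.0.113.0/24", 443, "deny")]
sg = [("0.0.0.0/0", 443), ("0.0.0.0/0", 80)]
```

This is why blocking a specific IP range is a NACL job: a security group has no way to express a deny.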

VPC Peering vs Transit Gateway:

VPC Peering: Direct 1-to-1 connection, non-transitive (A-B + B-C does NOT give A-C access).
Transit Gateway: Hub-and-spoke for 100s of VPCs + on-premise connections. Transitive routing. Centralized network policy.

Q9. How does CloudFront work, and when should you use it?

CloudFront is AWS's CDN: globally distributed edge locations cache content close to end users.

Architecture:

User (Mumbai) -> CloudFront Edge (Mumbai)
                      |-- Cache HIT -> return cached content (<5ms)
                      |-- Cache MISS -> fetch from Origin (S3/ALB/custom)
                                          cache, serve to user

Origin types:

S3 bucket (with OAC -- Origin Access Control, replaces legacy OAI).
ALB / EC2 (dynamic content).
API Gateway.
Custom HTTP origin (on-premise, third-party).

Behaviors (routing rules):

Cache Behavior 1: Path /api/* -> Origin: ALB, TTL=0 (no cache)
Cache Behavior 2: Path /static/* -> Origin: S3, TTL=86400 (24 hours)
Cache Behavior 3: Default (*) -> Origin: ALB, TTL=300

Cache invalidation:

# Invalidate specific path (charged after 1,000 free/month)
aws cloudfront create-invalidation \
    --distribution-id E1234ABCDE \
    --paths "/images/*" "/css/main.css"

# Better practice: use versioned filenames (/css/main.v2.css) to avoid invalidation cost

Security features:

HTTPS enforcement + HTTP-to-HTTPS redirect.
AWS WAF integration (rate limiting, SQL injection protection, IP blocking).
Signed URLs / Signed Cookies for private content (paid video, premium downloads).
Origin Shield: additional caching layer between edge and origin (reduces origin load).
Geo-restriction: block or allow by country.

High Availability and Disaster Recovery

Q10. What are the four DR strategies in AWS, and when do you use each?

AWS defines four DR strategies on the reliability/cost spectrum:

Strategy	RTO	RPO	Cost	Setup
Backup and Restore	Hours	Hours	Lowest	S3 backups, restore on disaster
Pilot Light	Minutes-hours	Minutes	Low	Core services always on (DB), rest off
Warm Standby	Minutes	Seconds-minutes	Medium	Scaled-down running replica in DR region
Multi-Site Active/Active	Seconds	Near-zero	Highest	Full capacity in both regions

Backup and Restore:

# Automated S3 cross-region replication
aws s3api put-bucket-replication \
    --bucket source-bucket \
    --replication-configuration file://replication.json
# replication.json: Rules with Destination.Bucket = arn:aws:s3:::dr-bucket

Pilot Light:

RDS cross-region read replica kept running in the DR region, promoted on disaster.
AMIs copied to DR region.
Route 53 health checks switch DNS on failure.

Warm Standby:

Primary: Auto Scaling Group min=10, running full load
DR: Auto Scaling Group min=2, running at low scale
On disaster: update DR ASG desired=10, update Route 53 failover record

Active/Active:

Route 53 Latency-Based Routing:
  us-east-1: full capacity, handles east coast traffic
  eu-west-1: full capacity, handles EU traffic
DynamoDB Global Tables: multi-active replication, last-writer-wins conflict resolution
Aurora Global Database: writes to primary region, sub-second replication to secondary

Q11. How do Route 53 routing policies work?

Policy	Use case
Simple	Single resource, no health checks
Weighted	A/B testing, blue/green canary (90/10 split)
Latency-Based	Route to lowest-latency region
Failover	Active/passive DR -- switch to standby on health check failure
Geolocation	Route by user's geographic location (country/continent)
Geoproximity	Route by proximity with adjustable bias
Multivalue Answer	Return up to 8 healthy records (client-side load balancing)

Failover with health checks:

# Create health check
aws route53 create-health-check \
    --caller-reference unique-ref-1 \
    --health-check-config '{
        "Type": "HTTPS",
        "ResourcePath": "/health",
        "FullyQualifiedDomainName": "api.primary.example.com",
        "RequestInterval": 30,
        "FailureThreshold": 3
    }'

# Primary A record with failover type PRIMARY
# Secondary A record with failover type SECONDARY (DR region)
# Route 53 switches automatically when health check fails

Weighted routing for canary deployment:

Record A: api.example.com -> ALB-v1 (weight=90)
Record B: api.example.com -> ALB-v2 (weight=10)
# Gradually shift: 90/10 -> 70/30 -> 50/50 -> 0/100 -> delete Record A
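
Weighted routing sends each record a share of traffic equal to its weight divided by the sum of all weights. A quick simulation with Python's standard library shows the 90/10 split; the seed and sample size are arbitrary choices for the sketch.

```python
import random
from collections import Counter


def pick_record(weights: dict, rng: random.Random) -> str:
    """Choose one record with probability weight / sum(weights)."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


rng = random.Random(7)  # fixed seed so the run is repeatable
weights = {"ALB-v1": 90, "ALB-v2": 10}
counts = Counter(pick_record(weights, rng) for _ in range(10_000))
share_v2 = counts["ALB-v2"] / 10_000  # close to 0.10
```

Note that DNS caching means the observed split at the load balancers only approximates the weights, so keep record TTLs short during a canary.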

Security

Q12. Explain the AWS shared responsibility model.

AWS Responsibility (Security OF the cloud):
  - Physical infrastructure (data centers, hardware, networking)
  - Hypervisor and host OS
  - Managed service internals (S3 durability, RDS engine)
  - Global infrastructure (regions, AZs, edge locations)

Customer Responsibility (Security IN the cloud):
  - Guest OS on EC2 instances (patching, hardening)
  - Applications and runtime
  - Data encryption (at-rest and in-transit)
  - IAM (identity, access, MFA)
  - Network configuration (VPC, security groups, NACLs)
  - S3 bucket policies, encryption settings
  - Compliance for regulated data (HIPAA, PCI, SOC2)

Shared controls (both AWS and customer):

Patch management: AWS patches hypervisor, customer patches guest OS.
Configuration management: AWS manages service config, customer configures their usage.
Awareness and training.

Interview follow-up: Which services require more customer security attention?

EC2: Full OS responsibility. Patch, harden, configure firewall.
S3: Default private, but customer must set bucket policies, block public access, enable encryption.
Lambda: AWS manages runtime, customer secures code and IAM execution role.
DynamoDB: AWS manages infrastructure, customer manages IAM policies, encryption, VPC endpoints.

Q13. What is AWS IAM, and what are best practices for access control?

IAM (Identity and Access Management) controls who can do what with AWS resources.

Core concepts:

Users: Human identities with long-term credentials. Prefer SSO/federation over IAM users.
Groups: Collections of users with shared policies.
Roles: Temporary credentials for services, cross-account access, federated identities.
Policies: JSON documents specifying Allow/Deny on Actions + Resources.

Principle of least privilege:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-app-bucket/uploads/*"
    }
  ]
}

Best practices:

1. Root account: enable MFA, never use for daily operations
2. IAM Users: enforce MFA, use access keys for programmatic access sparingly
3. Roles > Users: EC2/Lambda/ECS should always use IAM roles (instance profile, execution role, task role)
4. Policy: deny-by-default, grant minimum required permissions
5. Rotate credentials: audit unused access keys, set 90-day rotation policy
6. AWS Organizations + SCPs: account-level guardrails (e.g., deny all non-approved regions)
7. CloudTrail: log all API calls for audit
8. Access Analyzer: identify external access to S3, IAM roles, KMS keys
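
The "deny-by-default" rule follows from how policy statements combine: an explicit Deny always wins, otherwise at least one Allow is required, otherwise the request is implicitly denied. The Python sketch below models only that core; it deliberately ignores conditions, SCPs, permission boundaries, and resource policies, and its function names are hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return value if isinstance(value, list) else [value]


def _statement_matches(stmt: dict, action: str, resource: str) -> bool:
    # Action names are matched case-insensitively; "*" wildcards are supported.
    action_ok = any(fnmatchcase(action.lower(), a.lower()) for a in _as_list(stmt["Action"]))
    resource_ok = any(fnmatchcase(resource, r) for r in _as_list(stmt["Resource"]))
    return action_ok and resource_ok


def is_allowed(policies: list, action: str, resource: str) -> bool:
    """Explicit Deny wins; otherwise an Allow is required; otherwise implicit deny."""
    statements = [s for p in policies for s in _as_list(p["Statement"])]
    matching = [s for s in statements if _statement_matches(s, action, resource)]
    if any(s["Effect"] == "Deny" for s in matching):
        return False
    return any(s["Effect"] == "Allow" for s in matching)
```

A guardrail such as "deny everything under uploads/secret/" therefore cannot be overridden by a broader Allow elsewhere, which is the property SCP-style guardrails rely on.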

Cross-account access (common pattern):

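# Trust policy on Role in Account B (allows Account A to assume it)
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::ACCOUNT-A-ID:root"},
    "Action": "sts:AssumeRole",
    "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}}
  }]
}

# From Account A: assume role in Account B.
# The MFA condition above is met only when the call carries the MFA device and code;
# the device ARN and 123456 below are placeholders.
aws sts assume-role \
    --role-arn arn:aws:iam::ACCOUNT-B-ID:role/CrossAccountRole \
    --role-session-name my-session \
    --serial-number arn:aws:iam::ACCOUNT-A-ID:mfa/my-user \
    --token-code 123456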

Architecture Design

Q14. How would you design a highly available, scalable web application on AWS?

Architecture (3-tier web app, multi-AZ, multi-region capable):

CloudFront (global CDN + WAF)
       |
       | HTTPS
       v
Route 53 (health-check based failover, latency routing)
       |
       v
ALB (multi-AZ, sticky sessions for stateful apps)
       |
       v
ASG + EC2 (m7g.large, min=3, max=30, across 3 AZs)
  - Stateless application tier
  - Session data in ElastiCache Redis
  - App config in SSM Parameter Store + Secrets Manager
       |
       v
Aurora (Multi-AZ, 2 read replicas, automated backups)
       |
       v
ElastiCache Redis (session store, application cache)

Data layer:

Static assets -> S3 + CloudFront (origin: S3 with OAC)
User uploads -> S3 (pre-signed URLs for direct upload, avoid EC2 proxy)
Search -> OpenSearch Service (sync from Aurora via AWS DMS or application change events handled by Lambda)

CI/CD:

GitHub -> CodePipeline
  -> CodeBuild (unit tests, build Docker image, push to ECR)
  -> CodeDeploy blue/green to ASG (or ECS rolling update)
  -> Route 53 weighted routing for canary (10% new, 90% old)

Monitoring:

CloudWatch: EC2 metrics, custom app metrics, dashboards, alarms
CloudWatch Logs: aggregated from all instances (CloudWatch Agent)
X-Ray: distributed tracing across ALB -> EC2 -> RDS -> ElastiCache
SNS: alert routing to PagerDuty/Slack on alarm state change

Q15. What is the difference between SQS and SNS?

Aspect	SQS	SNS
Model	Message queue (pull)	Pub/Sub (push)
Consumers	Each message processed by one consumer	Multiple subscribers (fan-out)
Retention	Up to 14 days	No retention (fire and forget)
Delivery guarantee	At-least-once (standard), exactly-once (FIFO)	At-least-once
Use case	Decoupled async processing, task queue	Broadcast to multiple subscribers, fan-out

SQS queue types:

Type	Ordering	Deduplication	Max TPS
Standard	Best-effort	Possible duplicates	Nearly unlimited
FIFO	Strict	Exactly-once processing	3,000 messages/s with batching

SNS + SQS fan-out pattern (interview favorite):

Order Service publishes to SNS topic "order-events"
  -> SQS Queue: inventory-updates (Inventory Service subscribes)
  -> SQS Queue: email-notifications (Email Service subscribes)
  -> SQS Queue: analytics-events (Analytics Service subscribes)
  -> Lambda: real-time fraud-check

Benefit: Order Service decoupled from all downstream services
         Each service processes at its own pace
         New subscriber = add new SQS queue, no Order Service change

SQS visibility timeout:

# Consumer gets message, starts processing
# Message becomes invisible (visibility timeout, default 30s)
# If processing fails (no delete), message reappears after timeout
# Configure timeout = expected max processing time + buffer

# Dead Letter Queue (DLQ): after N failed attempts, move to DLQ
aws sqs create-queue --queue-name my-dlq
aws sqs set-queue-attributes \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue \
    --attributes '{
        "RedrivePolicy": "{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:my-dlq\",\"maxReceiveCount\":5}"
    }'
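
How visibility timeout and maxReceiveCount interact is easier to see in a small in-memory model. The class below is a teaching sketch with a caller-supplied clock, not the SQS API; its names are hypothetical.

```python
class SimulatedQueue:
    """In-memory model of visibility timeout and redrive to a dead-letter queue."""

    def __init__(self, visibility_timeout: int = 30, max_receive_count: int = 5):
        self.visibility_timeout = visibility_timeout
        self.max_receive_count = max_receive_count
        self.messages = {}      # id -> {"body", "visible_at", "receives"}
        self.dead_letters = []  # bodies moved to the DLQ
        self._next_id = 0

    def send(self, body: str) -> int:
        self._next_id += 1
        self.messages[self._next_id] = {"body": body, "visible_at": 0, "receives": 0}
        return self._next_id

    def receive(self, now: int):
        """Return (id, body) of a visible message and hide it, or None."""
        for msg_id, msg in list(self.messages.items()):
            if msg["visible_at"] > now:
                continue  # still in flight with another consumer
            if msg["receives"] >= self.max_receive_count:
                # Received max_receive_count times without a delete: redrive.
                self.dead_letters.append(msg["body"])
                del self.messages[msg_id]
                continue
            msg["receives"] += 1
            msg["visible_at"] = now + self.visibility_timeout
            return msg_id, msg["body"]
        return None

    def delete(self, msg_id: int) -> None:
        """Called by the consumer after successful processing."""
        self.messages.pop(msg_id, None)
```

A consumer that crashes before calling delete lets the message reappear after the timeout; once that has happened max_receive_count times, the next receive moves it to the DLQ instead of delivering it again.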

Q16. When would you use Lambda instead of EC2/ECS?

Lambda advantages:

No server management (AWS handles patching, scaling, availability).
Per-invocation billing (no cost when idle).
Auto-scales to thousands of concurrent executions without configuration.
Event-driven integrations with 200+ AWS services.

Lambda limitations:

Max execution time: 15 minutes.
Ephemeral: no persistent local state (use S3/DynamoDB/EFS).
Cold start latency: 100ms-3s for first invocation (SnapStart for Java, provisioned concurrency mitigates).
Memory: 128 MB to 10 GB (CPU scales with memory).

Use Lambda when:

- Event-driven processing: S3 upload triggers image resize
- Webhooks / API callbacks: Stripe, GitHub, Slack events
- Scheduled jobs: EventBridge schedule, formerly CloudWatch Events (daily report)
- Stream processing: Kinesis/DynamoDB Streams consumers
- Glue code: orchestrate other AWS services
- APIs with highly variable traffic (zero to spiky)

Use EC2/ECS when:

- Long-running processes (> 15 min)
- High sustained throughput (cost advantage over per-invocation)
- WebSocket connections (Lambda does not hold persistent connections)
- Specific runtime versions or system dependencies not in Lambda
- Legacy applications requiring OS-level access

# Lambda function triggered by S3 upload
import json
from datetime import datetime, timezone
from urllib.parse import unquote_plus

import boto3

# Create clients once per execution environment so warm invocations reuse them
rekognition = boto3.client('rekognition')
table = boto3.resource('dynamodb').Table('ImageLabels')

def lambda_handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Object keys arrive URL-encoded in S3 event notifications
        key = unquote_plus(record['s3']['object']['key'])

        # Detect labels in uploaded image
        response = rekognition.detect_labels(
            Image={'S3Object': {'Bucket': bucket, 'Name': key}},
            MaxLabels=10,
            MinConfidence=80
        )

        labels = [label['Name'] for label in response['Labels']]

        # Store results in DynamoDB
        table.put_item(Item={
            'image_key': key,
            'labels': labels,
            'request_id': context.aws_request_id,
            'processed_at': datetime.now(timezone.utc).isoformat()
        })

    return {'statusCode': 200, 'body': json.dumps('Processed successfully')}

Cost Optimization

Q17. What are the key strategies for reducing AWS costs?

Compute:

1. Right-sizing: CloudWatch metrics -> CPU/memory < 20% -> downsize
   AWS Compute Optimizer: ML-based recommendations
   
2. Spot instances: 70-90% savings for stateless, fault-tolerant workloads
   Spot Fleet: mix instance types to improve availability
   
3. Savings Plans / Reserved Instances: 40-60% for predictable baseline
   Savings Plans > RIs: more flexible (covers Fargate + Lambda too)
   
4. Auto Scaling: scale to zero nights/weekends for dev/test
   Instance Scheduler: automated start/stop via Lambda + DynamoDB

Storage:

1. S3 Lifecycle policies: move to IA -> Glacier -> expire
2. EBS: delete unattached volumes (common waste)
   Migrate gp2 to gp3 (comparable baseline performance, about 20% cheaper per GB)
3. EBS snapshot cleanup: delete old snapshots, use Data Lifecycle Manager
4. S3 Intelligent-Tiering: automatic tier moves (no retrieval fee)

Database:

1. Aurora Serverless v2: scale to 0 for dev/test databases
2. RDS: stop dev/test instances nights/weekends (stop charges storage only)
3. Redshift: pause cluster when not in use (managed pause/resume)
4. DynamoDB on-demand vs provisioned: on-demand for unpredictable workloads

Network:

1. NAT Gateway: often top cost for private subnets
   Consolidate to fewer NAT GWs, use VPC endpoints for S3/DynamoDB (no NAT fee)
2. Data transfer: same-AZ traffic over private IPs is free, cross-AZ costs about $0.01/GB in each direction
   Keep components in same AZ when possible
3. CloudFront: reduces data transfer out (CloudFront -> internet cheaper than EC2 -> internet)

Real-World Design Scenarios

Q18. Design a serverless data pipeline for real-time analytics on AWS.

Requirements: 100K events/second from IoT devices, store raw events, compute rolling aggregations, power real-time dashboard.

Architecture:

IoT Devices -> Kinesis Data Streams (100 shards, ~100K rec/s)
                    |
                    |-- Lambda (stream consumer, batch=100, window=30s)
                    |       -> DynamoDB: rolling 5-min aggregations (TTL=7d)
                    |
                    |-- Kinesis Firehose -> S3 (raw events, Parquet format)
                                               |
                                      Glue Crawler (auto-discover schema)
                                               |
                                           Athena (ad-hoc SQL on S3)
                                               |
                                    QuickSight (dashboard on Athena)

DynamoDB -> API Gateway -> Lambda (read aggregations)
                                |
                          CloudFront (cache API responses 30s)
                                |
                          React dashboard (WebSocket for live updates)

Kinesis shard calculation:

Write throughput: 100K records/s, avg 200 bytes/record = 20 MB/s
Kinesis shard capacity: 1,000 records/s or 1 MB/s per shard
Required shards: MAX(100K/1000, 20/1) = MAX(100, 20) = 100 shards

Read throughput: 2 MB/s per shard (shared by standard consumers, dedicated per consumer with enhanced fan-out)
With Lambda consumer: 100 shards * 2 MB/s = 200 MB/s aggregate read capacity
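
The shard arithmetic generalises to a small helper: take the larger of the record-count requirement and the byte-rate requirement, using the per-shard ingest limits stated above. This sketch assumes provisioned mode and adds no headroom for spikes; the function name is hypothetical.

```python
import math


def required_shards(records_per_second: int, avg_record_bytes: int) -> int:
    """Shards needed for ingest: the larger of the record-count and byte-rate limits."""
    by_records = math.ceil(records_per_second / 1_000)  # 1,000 records/s per shard
    by_bytes = math.ceil(records_per_second * avg_record_bytes / 1_000_000)  # 1 MB/s per shard
    return max(by_records, by_bytes)
```

For the workload in this question, required_shards(100_000, 200) is driven by the record-count limit (100 shards), not the 20 MB/s byte rate. Small records make the records-per-second limit the binding constraint, which is a common reason to aggregate records on the producer side.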

Lambda aggregation:

import base64
import json
import time
from decimal import Decimal

import boto3

table = boto3.resource('dynamodb').Table('DeviceMetrics')

def lambda_handler(event, context):
    aggregations = {}
    for record in event['Records']:
        payload = json.loads(base64.b64decode(record['kinesis']['data']))
        device_id = payload['device_id']
        # boto3's DynamoDB resource rejects Python floats, so work in Decimal
        metric = Decimal(str(payload['value']))

        if device_id not in aggregations:
            aggregations[device_id] = {'sum': Decimal(0), 'count': 0, 'max': metric}

        aggregations[device_id]['sum'] += metric
        aggregations[device_id]['count'] += 1
        aggregations[device_id]['max'] = max(aggregations[device_id]['max'], metric)

    now = int(time.time())
    # Batch write to DynamoDB. put_item overwrites any item already stored for the
    # same device and window, so this keeps only the latest batch's figures.
    with table.batch_writer() as batch:
        for device_id, stats in aggregations.items():
            batch.put_item(Item={
                'device_id': device_id,
                'window': now // 300 * 300,  # start of the 5-min window
                'avg': stats['sum'] / stats['count'],
                'max': stats['max'],
                'count': stats['count'],
                'ttl': now + 7 * 86400
            })

FAQ

Q: What is a VPC endpoint, and when should you use it?

A VPC endpoint lets EC2/Lambda in a private subnet reach S3, DynamoDB, or other AWS services without traffic leaving AWS's network (no NAT Gateway, no Internet Gateway). Two types: Gateway endpoints (S3, DynamoDB -- free, added to the route table) and Interface endpoints (most other services -- billed per hour plus per GB, each creates an ENI in the subnet). Use gateway endpoints for S3/DynamoDB to cut NAT Gateway data-processing costs and keep traffic on the private network.

Q: What is the difference between CloudWatch and CloudTrail?

CloudWatch is for performance monitoring and operational visibility: metrics (CPU, memory, custom), logs (application, access, flow logs), alarms, dashboards, and anomaly detection. CloudTrail is for audit and compliance: records every API call made in the AWS account (who did what, when, from where). CloudTrail logs go to S3 and can be analyzed with Athena. Use CloudWatch to know IF something is wrong; use CloudTrail to know WHO changed something.

Q: How do you secure data at rest and in transit on AWS?

At rest: S3 Server-Side Encryption (SSE-S3 managed keys, SSE-KMS for audit trail, SSE-C for customer-provided keys held outside AWS), EBS encryption (KMS), RDS encryption (KMS, enabled at creation), DynamoDB encryption (always on, KMS key choice). In transit: enforce HTTPS on ALB/CloudFront/API Gateway, use TLS 1.2+, ACM (AWS Certificate Manager) for free managed certificates, TLS between RDS client and DB, VPC PrivateLink / VPN / Direct Connect for on-premises-to-AWS. Confirm current AWS encryption defaults on the official AWS security documentation before your interview.

Sources and review notes (reviewed 8 Jun 2026)

Verification window

Page last edited 8 Jun 2026 by Aditya Sharma. A review date records an editorial edit, not a guarantee that every external fact is still current.
