Sun, 27 Apr 2026
vol. IX · no. 117
PapersAdda

AWS Solutions Architect Interview Questions 2026: SAA-C03 & Design Patterns

Updated: 8 Jun 2026

PapersAdda 2026 Placement Cycle

By Aditya Sharma·Founder & Editor, PapersAdda

What changed in 2026 drives

Mass-recruiter offer letters are flatter for the 2026 batch: the 4-5 LPA ASE band has barely budged in three years while inflation eats real wages. Premium tracks (Digital, Pro, Elite, Specialist) are still where the differential lives, and they are entirely test-driven. If you are aiming higher than the default offer, the coding round is not optional pageantry; it is the entire interview.

What I'd actually study for this

  • 01 Two solid coding-round answers (1 medium-hard DSA each, with edge-case discussion) > five half-baked ones
  • 02 One real project you can defend end-to-end - file paths, design decisions, and what you would change
  • 03 One DBMS schema you actually built (not a textbook ER diagram), with at least 3 join-heavy queries written from memory
  • 04 Three behavioural STAR stories: failure recovered, conflict handled, ownership taken

Where most candidates trip up

The single biggest mistake is treating company-specific guides as primary prep and DSA as secondary. It is the opposite. Mass recruiters use the test as a filter, but premium tracks at every IT services company use coding to allocate offer band. Spend 70% of prep time on DSA + system fundamentals, 20% on company-specific patterns, 10% on HR rehearsal. Reverse that ratio and you collect the default offer.

Editorial commentary by Aditya Sharma · written for PapersAdda · not generated, not aggregated.

Candidates report that AWS Solutions Architect interviews in 2026 combine service knowledge with real-world design questions: how would you architect a multi-region, fault-tolerant application? Confirm current exam guide and service features on the official AWS documentation and exam pages.

AWS Solutions Architect is one of the most in-demand cloud certifications. Interviews test both the SAA-C03 exam syllabus and practical architecture judgment: choosing between services, designing for failure, and optimizing cost.


Core Compute

Q1. What is the difference between EC2 instance types, and how do you choose the right one?

EC2 instances are grouped into families by workload type:

| Family | Purpose | Examples |
|---|---|---|
| General Purpose | Balanced CPU/memory | t3, m6i, m7g |
| Compute Optimized | High CPU-to-memory | c6i, c7g |
| Memory Optimized | High memory-to-CPU | r6i, x2idn, u-* |
| Storage Optimized | High disk I/O (NVMe SSD) | i3en, d3en |
| Accelerated Computing | GPU / FPGA | p4d, g5, inf2, trn1 |
| HPC Optimized | Low-latency networking (EFA) | hpc6a |

Selection process:

  1. Identify bottleneck: CPU-bound (compute optimized), memory-bound (memory optimized), I/O-bound (storage optimized).
  2. Graviton (ARM) instances (m7g, c7g, r7g): cost less per performance unit, good for containerized workloads.
  3. T-family (burstable): use only for variable workloads with CPU spikes -- sustained high CPU depletes credits.
  4. Spot instances: up to 90% cost savings for fault-tolerant, stateless workloads. Use with Auto Scaling and Spot interruption handling.

Pricing models:

| Model | Discount | Commitment | Use case |
|---|---|---|---|
| On-Demand | 0% | None | Dev/test, unpredictable |
| Reserved (1yr) | ~40% | 1 year | Steady-state production |
| Reserved (3yr) | ~60% | 3 years | Committed long-term |
| Savings Plans | ~60% | 1-3 years | Flexible (EC2+Fargate+Lambda) |
| Spot | ~70-90% | None | Stateless, fault-tolerant |
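
To make the discount column concrete, the sketch below turns the approximate discounts from the table into a monthly figure. The $0.10/hour on-demand rate and the 730-hour month are illustrative assumptions, not AWS prices, and Spot uses the low end of its range.

```python
# Approximate discounts taken from the pricing table (fraction off on-demand).
DISCOUNTS = {
    "on_demand": 0.0,
    "reserved_1yr": 0.40,
    "reserved_3yr": 0.60,
    "spot": 0.70,  # low end of the 70-90% range
}

HOURS_PER_MONTH = 730  # assumption: average hours in a month


def monthly_cost(on_demand_hourly: float, model: str, instances: int = 1) -> float:
    """Estimated monthly cost for always-on instances under a pricing model."""
    rate = on_demand_hourly * (1 - DISCOUNTS[model])
    return round(rate * HOURS_PER_MONTH * instances, 2)


# Hypothetical instance priced at $0.10/hour on-demand
for model in DISCOUNTS:
    print(model, monthly_cost(0.10, model))
```

In an interview, the point to draw out is that commitment-based discounts only pay off for capacity that runs most of the month.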

Q2. What is Auto Scaling, and what are the different scaling policies?

Auto Scaling automatically adjusts EC2 capacity based on demand.

Components:

  • Auto Scaling Group (ASG): Defines min/max/desired capacity, launch template, AZ placement.
  • Launch Template: AMI, instance type, security groups, key pair, user data.
  • Scaling policies: Rules that trigger capacity changes.

Scaling policy types:

Target Tracking (recommended)

# Maintain average CPU utilization at 60%
aws autoscaling put-scaling-policy \
    --policy-name cpu-target-tracking \
    --auto-scaling-group-name my-asg \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        }
    }'

Step Scaling

# Scale out by 2 if CPU > 70%, by 4 if CPU > 85%
# Scale in by 1 if CPU < 40%
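
The thresholds in the comments above can be modelled locally as a small decision function. This is only a sketch of the step logic; in AWS the steps are attached to CloudWatch alarms, and the min/max bounds used in the usage line are hypothetical.

```python
def step_adjustment(cpu_percent: float) -> int:
    """Change in instance count for the step thresholds described above."""
    if cpu_percent > 85:
        return 4    # large breach: scale out by 4
    if cpu_percent > 70:
        return 2    # moderate breach: scale out by 2
    if cpu_percent < 40:
        return -1   # low utilisation: scale in by 1
    return 0        # inside the 40-70% band: no change


def next_capacity(current: int, cpu_percent: float, minimum: int, maximum: int) -> int:
    """Apply the adjustment, then clamp to the ASG min/max bounds."""
    return max(minimum, min(maximum, current + step_adjustment(cpu_percent)))


print(next_capacity(current=9, cpu_percent=90, minimum=2, maximum=10))
```

The clamp matters: an ASG never scales beyond its configured maximum or below its minimum, whatever the policy asks for.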

Scheduled Scaling

# Scale to 20 instances every weekday at 8 AM
aws autoscaling put-scheduled-update-group-action \
    --auto-scaling-group-name my-asg \
    --scheduled-action-name scale-up-morning \
    --recurrence "0 8 * * 1-5" \
    --desired-capacity 20

Predictive Scaling

Uses ML to forecast traffic and scale proactively based on historical patterns.

Key settings:

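  • Cooldown period: Wait time after a scale event before the next one (default 300 seconds). Prevents thrashing. Applies to simple scaling policies; target tracking and step scaling rely on the instance warm-up time instead.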
  • Warm-up period: Time for new instances to start serving traffic (excluded from scale-in until warm).
  • Termination policies: OldestInstance, NewestInstance, ClosestToNextInstanceHour (cost optimization).

Q3. Explain ELB types and when to use each.

AWS has four load balancer types:

| Type | Layer | Protocol | Use case |
|---|---|---|---|
| Application Load Balancer (ALB) | 7 (HTTP/S) | HTTP, HTTPS, gRPC | Web apps, microservices, content-based routing |
| Network Load Balancer (NLB) | 4 (TCP/UDP) | TCP, UDP, TLS | Ultra-low latency, static IP, non-HTTP protocols |
| Gateway Load Balancer (GWLB) | 3 | IP packets | Inline network appliances (firewalls, IDS) |
| Classic Load Balancer (CLB) | 4/7 | HTTP/S, TCP | Legacy only (do not use for new deployments) |

ALB routing rules (interview favorite):

# Path-based routing
/api/* -> Target Group: API servers
/static/* -> Target Group: CDN origin / S3

# Host-based routing
api.example.com -> TG: API
www.example.com -> TG: Frontend

# Header-based routing
Header "X-Mobile: true" -> TG: Mobile-optimized backend

# Weighted routing (blue/green, canary)
TG-Blue: weight 90, TG-Green: weight 10
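
A rough way to reason about rule precedence is to model the listener as an ordered rule list where the lowest priority number that matches wins and a default action catches the rest. The sketch below assumes hypothetical target group names and uses case-sensitive glob matching, which is a simplification of how an ALB compares hosts and paths.

```python
from fnmatch import fnmatchcase

# (priority, host pattern, path pattern, target group); lower priority wins.
# Target group names are hypothetical.
RULES = [
    (10, "api.example.com", "*", "tg-api"),
    (20, "*", "/api/*", "tg-api"),
    (30, "*", "/static/*", "tg-static"),
]
DEFAULT_TARGET = "tg-frontend"


def route(host: str, path: str) -> str:
    """Return the target group of the first matching rule, else the default."""
    for _priority, host_pattern, path_pattern, target in sorted(RULES):
        if fnmatchcase(host, host_pattern) and fnmatchcase(path, path_pattern):
            return target
    return DEFAULT_TARGET


print(route("www.example.com", "/static/logo.png"))
```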

NLB characteristics:

  • Can preserve the client source IP at the network layer. An ALB terminates the connection and passes the client IP in the X-Forwarded-For header instead; Proxy Protocol is an NLB option, not an ALB one.
  • Static Elastic IP addresses -- useful for IP allowlisting at client firewalls.
  • Handles millions of requests per second at very low latency.
  • Supports PrivateLink for cross-VPC service endpoints.

Storage

Q4. Compare S3 storage classes and their use cases.

| Storage Class | Retrieval | Min Duration | Use case |
|---|---|---|---|
| Standard | Immediate | None | Frequently accessed data |
| Intelligent-Tiering | Immediate (frequent/infrequent tiers) | None | Unknown or changing access patterns |
| Standard-IA | Immediate | 30 days | Infrequent access (backup, disaster recovery) |
| One Zone-IA | Immediate | 30 days | Infrequent, non-critical (re-creatable data) |
| Glacier Instant Retrieval | Milliseconds | 90 days | Archival, quarterly access |
| Glacier Flexible Retrieval | Minutes to hours | 90 days | Archival, annual access |
| Glacier Deep Archive | 12-48 hours | 180 days | Long-term compliance, rarely accessed |

S3 Lifecycle policies:

{
  "Rules": [{
    "ID": "archive-old-logs",
    "Status": "Enabled",
    "Filter": {"Prefix": "logs/"},
    "Transitions": [
      {"Days": 30, "StorageClass": "STANDARD_IA"},
      {"Days": 90, "StorageClass": "GLACIER"},
      {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
    ],
    "Expiration": {"Days": 2555}
  }]
}
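
To check your reading of the rule above, the following sketch maps an object's age to the storage class the rule would place it in. It models only the day thresholds in this rule; S3 applies lifecycle actions asynchronously, so real transitions can lag the exact day.

```python
from typing import Optional

# Transitions from the lifecycle rule above: (minimum age in days, storage class)
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER"), (365, "DEEP_ARCHIVE")]
EXPIRATION_DAYS = 2555  # roughly 7 years


def storage_class_at(age_days: int) -> Optional[str]:
    """Storage class for an object of the given age, or None once expired."""
    if age_days >= EXPIRATION_DAYS:
        return None
    current = "STANDARD"
    for min_age, storage_class in TRANSITIONS:
        if age_days >= min_age:
            current = storage_class
    return current


for age in (0, 45, 120, 400, 3000):
    print(age, storage_class_at(age))
```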

S3 performance:

  • Standard: 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix.
  • Multipart upload: recommended for objects > 100 MB, required for > 5 GB.
  • Transfer Acceleration: CloudFront edge locations for faster global uploads.
  • S3 Select: SQL queries on object content (CSV, JSON, Parquet) -- reduce data transfer. AWS has closed S3 Select to new customers, so be ready to name Athena as the alternative.

Q5. What is the difference between EBS, EFS, and instance store?

| Feature | EBS | EFS | Instance Store |
|---|---|---|---|
| Type | Block storage | NFS file system | Ephemeral block |
| Persistence | Yes (independent of instance) | Yes (regional) | No (lost on stop/terminate) |
| Attach to | Single AZ, single EC2 (usually) | Multiple EC2, multiple AZs | Specific instance type |
| Throughput | Up to 4,000 MB/s per volume (io2 Block Express) | Burst or provisioned | Very high (NVMe SSD) |
| Use case | Boot volumes, databases | Shared content, CMS, ECS | High I/O temp storage, caches |
| Cost | Per GB + IOPS | Per GB used | Included in instance price |

EBS volume types:

| Type | Max IOPS | Use case |
|---|---|---|
| gp3 (default) | Up to 16,000 | General purpose (boot, dev) |
| io2 Block Express | Up to 256,000 | Latency-sensitive DBs (SAP HANA, Oracle) |
| st1 (throughput) | 500 | Big data, log processing (sequential read) |
| sc1 (cold) | 250 | Infrequent access cold storage |

EFS performance modes:

  • General Purpose: low-latency operations (web servers, CMS).
  • Max I/O: high aggregate throughput (big data, media processing) -- higher latency.

Databases

Q6. When do you use RDS vs DynamoDB vs Aurora vs ElastiCache?

| Service | Type | Use case | When NOT to use |
|---|---|---|---|
| RDS (MySQL/PostgreSQL) | Relational | Structured data, ACID transactions, reporting | High-scale OLTP (>100K TPS) |
| Aurora | Managed relational (MySQL/PG compat) | High-scale relational, global apps | Simple/small workloads (cost) |
| DynamoDB | NoSQL key-value/document | Single-digit ms at any scale, serverless | Complex queries, JOINs, ACID multi-table |
| ElastiCache (Redis) | In-memory cache | Session store, leaderboards, pub/sub, hot data | Durable primary storage |
| ElastiCache (Memcached) | In-memory cache | Simple key-value cache, horizontal scaling | Complex data structures, persistence |
| Redshift | Columnar OLAP | Data warehouse, analytics (TB-PB) | OLTP, row-level updates |
| DocumentDB | MongoDB-compatible | JSON documents, content management | High-volume time series |
| Neptune | Graph | Social graphs, fraud detection | Non-graph data |

Aurora advantages over RDS:

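  • Up to 15 read replicas that share the cluster storage volume, so replica lag is typically far lower than with RDS engine-level replication.
  • Automatic storage growth in 10 GB increments (no pre-provisioning).
  • Aurora Global Database: sub-second cross-region replication (RPO seconds, RTO < 1 minute).
  • Aurora Serverless v2: auto-scales capacity in seconds, from a 0.5 ACU minimum (0 ACUs with auto-pause on newer engine versions) up to 128 ACUs or more depending on engine version.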

Q7. How does DynamoDB achieve single-digit millisecond performance at scale?

DynamoDB architecture for scale:

1. Partition key distribution

Data is distributed across partitions by hashing the partition key. Each partition handles up to 3,000 RCUs and 1,000 WCUs.

import random

# Hot partition problem: all writes to same partition key
# BAD key: status (only "active"/"inactive" -- two partitions take all traffic)
# GOOD key: user_id (high cardinality, uniform distribution)

# For write-heavy hot keys: add random suffix
user_id = "user123"
shard_key = f"{user_id}#{random.randint(0, 9)}"  # 10 shards per user
# Read: query all 10 shards, merge results
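
The read side of write sharding is the part candidates tend to skip. The sketch below shows one way to enumerate the shard keys and merge per-shard results; the `created_at` field used for ordering is an assumption, and the actual DynamoDB queries (one per shard key) are left out.

```python
import random

SHARD_COUNT = 10  # matches the 10 shards per user in the snippet above


def write_key(user_id: str) -> str:
    """Pick a random shard suffix for a write."""
    return f"{user_id}#{random.randint(0, SHARD_COUNT - 1)}"


def read_keys(user_id: str) -> list:
    """Every shard key that must be queried to read one user's items."""
    return [f"{user_id}#{i}" for i in range(SHARD_COUNT)]


def merge(results_per_shard: list) -> list:
    """Flatten per-shard results, newest first (assumes a 'created_at' field)."""
    merged = [item for shard in results_per_shard for item in shard]
    return sorted(merged, key=lambda item: item["created_at"], reverse=True)


print(read_keys("user123"))
```

The trade-off to state out loud: writes spread across 10 partition keys, but every read now costs 10 queries.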

2. Eventually consistent vs strongly consistent reads

import boto3

dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('Orders')  # PK=user_id, SK=order_id

# get_item needs the full primary key (partition key + sort key)
key = {'user_id': 'user123', 'order_id': '12345'}

# Eventually consistent (default) -- 50% cost, may lag milliseconds
response = table.get_item(Key=key)

# Strongly consistent -- full cost, guaranteed latest data
response = table.get_item(Key=key, ConsistentRead=True)

3. Global Secondary Indexes (GSI)

from boto3.dynamodb.conditions import Key

# Table: Orders (PK=user_id, SK=order_id)
# GSI: status-created_at-index (PK=status, SK=created_at)

# Query all pending orders created since the cutoff timestamp
response = table.query(
    IndexName='status-created_at-index',
    KeyConditionExpression=Key('status').eq('PENDING') &
                           Key('created_at').gte('2026-06-07T00:00:00Z')
)

4. DynamoDB Streams + Lambda

Write to DynamoDB -> Stream event -> Lambda processes change
Use case: invalidate ElastiCache on DynamoDB write
         replicate to OpenSearch for full-text search
         trigger downstream workflows

5. DAX (DynamoDB Accelerator)

In-memory cache in front of DynamoDB. Microsecond reads for hot data. Same DynamoDB API -- DAX SDK is a drop-in replacement.


Networking

Q8. Explain VPC architecture: subnets, route tables, NACLs, and security groups.

VPC anatomy:

VPC (10.0.0.0/16)
  Public Subnet (10.0.1.0/24) -- AZ-1a
    Internet Gateway -> Route table: 0.0.0.0/0 -> IGW
    EC2 instances with public IPs (web servers)
    
  Public Subnet (10.0.2.0/24) -- AZ-1b  
    NAT Gateway (for private subnet outbound)
    
  Private Subnet (10.0.10.0/24) -- AZ-1a
    Route table: 0.0.0.0/0 -> NAT Gateway
    EC2 instances (app servers, no public IP)
    
  Private Subnet (10.0.20.0/24) -- AZ-1b
    Route table: 0.0.0.0/0 -> NAT Gateway
    RDS instance (no internet access)

Security Groups vs NACLs:

| Aspect | Security Group | NACL |
|---|---|---|
| Level | Instance (ENI) | Subnet |
| State | Stateful (return traffic auto-allowed) | Stateless (must allow inbound AND outbound) |
| Rules | Allow only | Allow + Deny |
| Evaluation | All rules evaluated | Rules evaluated in order (lowest number first) |
| Default | Deny all inbound, allow all outbound | Allow all in/out |

# Security Group: web server
Inbound: TCP 443 from 0.0.0.0/0
Inbound: TCP 80 from 0.0.0.0/0
Outbound: All traffic

# Security Group: app server (only web tier can talk to it)
Inbound: TCP 8080 from sg-web-server (SG reference, not IP)
Outbound: TCP 5432 to sg-database

# Security Group: RDS
Inbound: TCP 5432 from sg-app-server
Outbound: (empty -- stateful, return traffic allowed)
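
The evaluation difference between the two is easy to get wrong under pressure, so here is a small local model of it. The NACL entries and IP ranges below are hypothetical; the sketch covers inbound rules only and ignores protocol and port ranges.

```python
from ipaddress import ip_address, ip_network

# (rule number, action, CIDR, port): hypothetical inbound NACL entries
NACL_INBOUND = [
    (100, "deny", "203.0.113.0/24", 443),
    (200, "allow", "0.0.0.0/0", 443),
]


def nacl_allows(source_ip: str, port: int) -> bool:
    """NACL: rules are checked in ascending number order; first match wins."""
    for _number, action, cidr, rule_port in sorted(NACL_INBOUND):
        if port == rule_port and ip_address(source_ip) in ip_network(cidr):
            return action == "allow"
    return False  # nothing matched: the implicit final deny applies


# Security group rules are allow-only: (CIDR, port)
SG_INBOUND = [("0.0.0.0/0", 443), ("0.0.0.0/0", 80)]


def sg_allows(source_ip: str, port: int) -> bool:
    """Security group: allowed if any rule matches; rule order is irrelevant."""
    return any(
        port == rule_port and ip_address(source_ip) in ip_network(cidr)
        for cidr, rule_port in SG_INBOUND
    )


print(nacl_allows("203.0.113.5", 443), sg_allows("203.0.113.5", 443))
```

The blocked range is denied by the NACL even though the security group would allow it, which is why NACLs are the place for explicit IP blocks.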

VPC Peering vs Transit Gateway:

  • VPC Peering: Direct 1-to-1 connection, non-transitive (A-B + B-C does NOT give A-C access).
  • Transit Gateway: Hub-and-spoke for 100s of VPCs + on-premise connections. Transitive routing. Centralized network policy.

Q9. How does CloudFront work, and when should you use it?

CloudFront is AWS's CDN: globally distributed edge locations cache content close to end users.

Architecture:

User (Mumbai) -> CloudFront Edge (Mumbai)
                      |-- Cache HIT -> return cached content (<5ms)
                      |-- Cache MISS -> fetch from Origin (S3/ALB/custom)
                                          cache, serve to user

Origin types:

  • S3 bucket (with OAC -- Origin Access Control, replaces legacy OAI).
  • ALB / EC2 (dynamic content).
  • API Gateway.
  • Custom HTTP origin (on-premise, third-party).

Behaviors (routing rules):

Cache Behavior 1: Path /api/* -> Origin: ALB, TTL=0 (no cache)
Cache Behavior 2: Path /static/* -> Origin: S3, TTL=86400 (24 hours)
Cache Behavior 3: Default (*) -> Origin: ALB, TTL=300

Cache invalidation:

# Invalidate specific path (charged after 1,000 free/month)
aws cloudfront create-invalidation \
    --distribution-id E1234ABCDE \
    --paths "/images/*" "/css/main.css"

# Better practice: use versioned filenames (/css/main.v2.css) to avoid invalidation cost

Security features:

  • HTTPS enforcement + HTTP-to-HTTPS redirect.
  • AWS WAF integration (rate limiting, SQL injection protection, IP blocking).
  • Signed URLs / Signed Cookies for private content (paid video, premium downloads).
  • Origin Shield: additional caching layer between edge and origin (reduces origin load).
  • Geo-restriction: block or allow by country.

High Availability and Disaster Recovery

Q10. What are the four DR strategies in AWS, and when do you use each?

AWS defines four DR strategies on the reliability/cost spectrum:

| Strategy | RTO | RPO | Cost | Setup |
|---|---|---|---|---|
| Backup and Restore | Hours | Hours | Lowest | S3 backups, restore on disaster |
| Pilot Light | Minutes-hours | Minutes | Low | Core services always on (DB), rest off |
| Warm Standby | Minutes | Seconds-minutes | Medium | Scaled-down running replica in DR region |
| Multi-Site Active/Active | Seconds | Near-zero | Highest | Full capacity in both regions |

Backup and Restore:

# Automated S3 cross-region replication
aws s3api put-bucket-replication \
    --bucket source-bucket \
    --replication-configuration file://replication.json
# replication.json: Rules with Destination.Bucket = arn:aws:s3:::dr-bucket

Pilot Light:

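  • Database kept live in the DR region through continuous replication (for example an RDS cross-Region read replica), promoted on disaster. Restoring from a snapshot would be Backup and Restore, not Pilot Light.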
  • AMIs copied to DR region.
  • Route 53 health checks switch DNS on failure.

Warm Standby:

Primary: Auto Scaling Group min=10, running full load
DR: Auto Scaling Group min=2, running at low scale
On disaster: update DR ASG desired=10, update Route 53 failover record

Active/Active:

Route 53 Latency-Based Routing:
  us-east-1: full capacity, handles east coast traffic
  eu-west-1: full capacity, handles EU traffic
DynamoDB Global Tables: multi-master, automatic conflict resolution
Aurora Global Database: writes to primary region, sub-second replication to secondary

Q11. How do Route 53 routing policies work?

| Policy | Use case |
|---|---|
| Simple | Single resource, no health checks |
| Weighted | A/B testing, blue/green canary (90/10 split) |
| Latency-Based | Route to lowest-latency region |
| Failover | Active/passive DR -- switch to standby on health check failure |
| Geolocation | Route by user's geographic location (country/continent) |
| Geoproximity | Route by proximity with adjustable bias |
| Multivalue Answer | Return up to 8 healthy records (client-side load balancing) |

Failover with health checks:

# Create health check
aws route53 create-health-check \
    --caller-reference unique-ref-1 \
    --health-check-config '{
        "Type": "HTTPS",
        "ResourcePath": "/health",
        "FullyQualifiedDomainName": "api.primary.example.com",
        "RequestInterval": 30,
        "FailureThreshold": 3
    }'

# Primary A record with failover type PRIMARY
# Secondary A record with failover type SECONDARY (DR region)
# Route 53 switches automatically when health check fails
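
A common follow-up is "how long does failover actually take?". A back-of-the-envelope sketch, using the RequestInterval and FailureThreshold from the health check above: detection time is roughly interval times threshold, and clients may keep using a cached answer for up to the record TTL. The 60-second TTL is an assumption, not a value from the config.

```python
def detection_seconds(request_interval: int, failure_threshold: int) -> int:
    """Approximate time for a health check to be marked unhealthy."""
    return request_interval * failure_threshold


def worst_case_failover_seconds(
    request_interval: int, failure_threshold: int, record_ttl: int
) -> int:
    """Detection time plus the DNS TTL that resolvers may still be caching."""
    return detection_seconds(request_interval, failure_threshold) + record_ttl


# Values from the health check above, plus an assumed 60-second record TTL
print(detection_seconds(30, 3))
print(worst_case_failover_seconds(30, 3, record_ttl=60))
```

This is why DNS failover gives an RTO in minutes rather than seconds, and why low TTLs matter on failover records.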

Weighted routing for canary deployment:

Record A: api.example.com -> ALB-v1 (weight=90)
Record B: api.example.com -> ALB-v2 (weight=10)
# Gradually shift: 90/10 -> 70/30 -> 50/50 -> 0/100 -> delete Record A
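
Weighted routing sends each record a share equal to its weight divided by the sum of all weights. The sketch below computes that share for the rollout stages in the snippet above; treating an all-zero group as an even split reflects documented Route 53 behaviour as I understand it, so verify it before relying on it.

```python
def traffic_share(weights: dict) -> dict:
    """Expected fraction of DNS responses per record: weight / sum of weights."""
    total = sum(weights.values())
    if total == 0:
        # All weights zero: treated here as an even split across records
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


# The rollout stages from the snippet above
for v1, v2 in [(90, 10), (70, 30), (50, 50), (0, 100)]:
    print(traffic_share({"ALB-v1": v1, "ALB-v2": v2}))
```

Because the split happens at DNS resolution, resolver caching makes the observed ratio approximate rather than exact.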

Security

Q12. Explain the AWS shared responsibility model.

AWS Responsibility (Security OF the cloud):
  - Physical infrastructure (data centers, hardware, networking)
  - Hypervisor and host OS
  - Managed service internals (S3 durability, RDS engine)
  - Global infrastructure (regions, AZs, edge locations)

Customer Responsibility (Security IN the cloud):
  - Guest OS on EC2 instances (patching, hardening)
  - Applications and runtime
  - Data encryption (at-rest and in-transit)
  - IAM (identity, access, MFA)
  - Network configuration (VPC, security groups, NACLs)
  - S3 bucket policies, encryption settings
  - Compliance for regulated data (HIPAA, PCI, SOC2)

Shared controls (both AWS and customer):

  • Patch management: AWS patches hypervisor, customer patches guest OS.
  • Configuration management: AWS manages service config, customer configures their usage.
  • Awareness and training.

Interview follow-up: Which services require more customer security attention?

  • EC2: Full OS responsibility. Patch, harden, configure firewall.
  • S3: Default private, but customer must set bucket policies, block public access, enable encryption.
  • Lambda: AWS manages runtime, customer secures code and IAM execution role.
  • DynamoDB: AWS manages infrastructure, customer manages IAM policies, encryption, VPC endpoints.

Q13. What is AWS IAM, and what are best practices for access control?

IAM (Identity and Access Management) controls who can do what with AWS resources.

Core concepts:

  • Users: Human identities with long-term credentials. Prefer SSO/federation over IAM users.
  • Groups: Collections of users with shared policies.
  • Roles: Temporary credentials for services, cross-account access, federated identities.
  • Policies: JSON documents specifying Allow/Deny on Actions + Resources.

Principle of least privilege:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-app-bucket/uploads/*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "true"
        }
      }
    }
  ]
}
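
Interviewers often ask how IAM decides between overlapping statements. The sketch below is a deliberately simplified model: an explicit Deny wins, then any matching Allow, otherwise the request is implicitly denied. It ignores conditions, principals, SCPs and permission boundaries, and it matches case-sensitively, which real IAM action matching does not.

```python
from fnmatch import fnmatchcase


def matches(patterns, value: str) -> bool:
    """Wildcard match against one pattern or a list of patterns."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def is_allowed(statements: list, action: str, resource: str) -> bool:
    """Explicit Deny wins, then explicit Allow, otherwise implicit deny."""
    allowed = False
    for stmt in statements:
        if matches(stmt["Action"], action) and matches(stmt["Resource"], resource):
            if stmt["Effect"] == "Deny":
                return False
            allowed = True
    return allowed


# Action and Resource from the policy above
POLICY = [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:PutObject"],
    "Resource": "arn:aws:s3:::my-app-bucket/uploads/*",
}]

print(is_allowed(POLICY, "s3:GetObject", "arn:aws:s3:::my-app-bucket/uploads/a.png"))
print(is_allowed(POLICY, "s3:DeleteObject", "arn:aws:s3:::my-app-bucket/uploads/a.png"))
```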

Best practices:

1. Root account: enable MFA, never use for daily operations
2. IAM Users: enforce MFA, use access keys for programmatic access sparingly
3. Roles > Users: EC2/Lambda/ECS should always use IAM roles (instance profiles, execution roles, task roles), never embedded access keys
4. Policy: deny-by-default, grant minimum required permissions
5. Rotate credentials: audit unused access keys, set 90-day rotation policy
6. AWS Organizations + SCPs: account-level guardrails (e.g., deny all non-approved regions)
7. CloudTrail: log all API calls for audit
8. Access Analyzer: identify external access to S3, IAM roles, KMS keys

Cross-account access (common pattern):

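# Trust policy on the role in Account B (allows principals in Account A to assume it)
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::ACCOUNT-A-ID:root"},
    "Action": "sts:AssumeRole",
    "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}}
  }]
}

# From Account A: assume the role in Account B.
# The trust policy requires MFA, so pass the MFA device ARN and a current code
# (USER-NAME and 123456 are placeholders).
aws sts assume-role \
    --role-arn arn:aws:iam::ACCOUNT-B-ID:role/CrossAccountRole \
    --role-session-name my-session \
    --serial-number arn:aws:iam::ACCOUNT-A-ID:mfa/USER-NAME \
    --token-code 123456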

Architecture Design

Q14. How would you design a highly available, scalable web application on AWS?

Architecture (3-tier web app, multi-AZ, multi-region capable):

CloudFront (global CDN + WAF)
       |
       | HTTPS
       v
Route 53 (health-check based failover, latency routing)
       |
       v
ALB (multi-AZ, sticky sessions for stateful apps)
       |
       v
ASG + EC2 (m7g.large, min=3, max=30, across 3 AZs)
  - Stateless application tier
  - Session data in ElastiCache Redis
  - App config in SSM Parameter Store + Secrets Manager
       |
       v
Aurora (Multi-AZ, 2 read replicas, automated backups)
       |
       v
ElastiCache Redis (session store, application cache)

Data layer:

Static assets -> S3 + CloudFront (origin: S3 with OAC)
User uploads -> S3 (pre-signed URLs for direct upload, avoid EC2 proxy)
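Search -> OpenSearch Service (sync from Aurora via AWS DMS change data capture, or via application events consumed by Lambda; DynamoDB Streams only applies if the source table is in DynamoDB)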

CI/CD:

GitHub -> CodePipeline
  -> CodeBuild (unit tests, build Docker image, push to ECR)
  -> CodeDeploy blue/green to ASG (or ECS rolling update)
  -> Route 53 weighted routing for canary (10% new, 90% old)

Monitoring:

CloudWatch: EC2 metrics, custom app metrics, dashboards, alarms
CloudWatch Logs: aggregated from all instances (CloudWatch Agent)
X-Ray: distributed tracing across ALB -> EC2 -> RDS -> ElastiCache
SNS: alert routing to PagerDuty/Slack on alarm state change

Q15. What is the difference between SQS and SNS, and when do you use each?

| Aspect | SQS | SNS |
|---|---|---|
| Model | Message queue (pull) | Pub/Sub (push) |
| Consumers | One consumer per message (competing consumers share the queue) | Multiple subscribers (fan-out) |
| Retention | Up to 14 days | No retention (fire and forget) |
| Delivery guarantee | At-least-once (standard), exactly-once processing (FIFO) | At-least-once |
| Use case | Decoupled async processing, task queue | Broadcast to multiple subscribers, fan-out |

SQS queue types:

| Type | Ordering | Deduplication | Max TPS |
|---|---|---|---|
| Standard | Best-effort | Possible duplicates | Nearly unlimited |
| FIFO | Strict | Exactly-once processing | 3,000 messages/s with batching |

SNS + SQS fan-out pattern (interview favorite):

Order Service publishes to SNS topic "order-events"
  -> SQS Queue: inventory-updates (Inventory Service subscribes)
  -> SQS Queue: email-notifications (Email Service subscribes)
  -> SQS Queue: analytics-events (Analytics Service subscribes)
  -> Lambda: real-time fraud-check

Benefit: Order Service decoupled from all downstream services
         Each service processes at its own pace
         New subscriber = add new SQS queue, no Order Service change

SQS visibility timeout:

# Consumer gets message, starts processing
# Message becomes invisible (visibility timeout, default 30s)
# If processing fails (no delete), message reappears after timeout
# Configure timeout = expected max processing time + buffer

# Dead Letter Queue (DLQ): after N failed attempts, move to DLQ
aws sqs create-queue --queue-name my-dlq
aws sqs set-queue-attributes \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/my-queue \
    --attributes '{
        "RedrivePolicy": "{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:my-dlq\",\"maxReceiveCount\":5}"
    }'
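
The interaction between the visibility timeout and the redrive policy can be simulated in memory. In this sketch a failed handler means the message is not deleted and becomes receivable again; once it has been received maxReceiveCount times without a delete, it ends up in the DLQ. The queues are plain lists and the timeout itself is not modelled.

```python
MAX_RECEIVE_COUNT = 5  # matches maxReceiveCount in the redrive policy above


def deliver(message: dict, handler, main_queue: list, dlq: list) -> None:
    """One receive attempt: delete on success, otherwise retry or dead-letter."""
    message["receive_count"] = message.get("receive_count", 0) + 1
    try:
        handler(message["body"])
    except Exception:
        # No delete happened, so the message would reappear after the timeout
        if message["receive_count"] >= MAX_RECEIVE_COUNT:
            dlq.append(message)         # retries exhausted: dead-letter it
        else:
            main_queue.append(message)  # visible again for another attempt
    # On success the consumer deletes the message, so nothing is re-queued.


def drain(main_queue: list, dlq: list, handler) -> None:
    """Process until the main queue is empty."""
    while main_queue:
        deliver(main_queue.pop(0), handler, main_queue, dlq)


def always_fails(body):
    raise RuntimeError(f"cannot process {body}")


queue, dead_letters = [{"body": "order-1"}], []
drain(queue, dead_letters, always_fails)
print(len(queue), len(dead_letters), dead_letters[0]["receive_count"])
```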

Q16. When would you use Lambda instead of EC2/ECS?

Lambda advantages:

  • No server management (AWS handles patching, scaling, availability).
  • Per-invocation billing (no cost when idle).
  • Auto-scales to thousands of concurrent executions without configuration.
  • Event-driven integrations with 200+ AWS services.

Lambda limitations:

  • Max execution time: 15 minutes.
  • Ephemeral: no persistent local state (use S3/DynamoDB/EFS).
  • Cold start latency: 100ms-3s for first invocation (SnapStart for Java, provisioned concurrency mitigates).
  • Memory: 128 MB to 10 GB (CPU scales with memory).

Use Lambda when:

- Event-driven processing: S3 upload triggers image resize
- Webhooks / API callbacks: Stripe, GitHub, Slack events
- Scheduled jobs: EventBridge (formerly CloudWatch Events) cron rules (daily report)
- Stream processing: Kinesis/DynamoDB Streams consumers
- Glue code: orchestrate other AWS services
- APIs with highly variable traffic (zero to spiky)

Use EC2/ECS when:

- Long-running processes (> 15 min)
- High sustained throughput (cost advantage over per-invocation)
- Long-lived connections held by the process itself (API Gateway WebSocket APIs can front Lambda, but the function cannot keep a socket open between invocations)
- Specific runtime versions or system dependencies not in Lambda
- Legacy applications requiring OS-level access

# Lambda function triggered by S3 upload
import json
from datetime import datetime, timezone
from urllib.parse import unquote_plus

import boto3

# Create clients once per execution environment so warm invocations reuse them
rekognition = boto3.client('rekognition')
table = boto3.resource('dynamodb').Table('ImageLabels')

def lambda_handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Object keys arrive URL-encoded in S3 event notifications
        key = unquote_plus(record['s3']['object']['key'])

        # Detect labels in uploaded image
        response = rekognition.detect_labels(
            Image={'S3Object': {'Bucket': bucket, 'Name': key}},
            MaxLabels=10,
            MinConfidence=80
        )

        labels = [label['Name'] for label in response['Labels']]

        # Store results in DynamoDB
        table.put_item(Item={
            'image_key': key,
            'labels': labels,
            'request_id': context.aws_request_id,
            'processed_at': datetime.now(timezone.utc).isoformat()
        })

    return {'statusCode': 200, 'body': json.dumps('Processed successfully')}

Cost Optimization

Q17. What are the key strategies for reducing AWS costs?

Compute:

1. Right-sizing: CloudWatch metrics -> CPU/memory < 20% -> downsize
   AWS Compute Optimizer: ML-based recommendations
   
2. Spot instances: 70-90% savings for stateless, fault-tolerant workloads
   Spot Fleet: mix instance types to improve availability
   
3. Savings Plans / Reserved Instances: 40-60% for predictable baseline
   Savings Plans > RIs: more flexible (covers Fargate + Lambda too)
   
4. Auto Scaling: scale to zero nights/weekends for dev/test
   Instance Scheduler: automated start/stop via Lambda + DynamoDB

Storage:

1. S3 Lifecycle policies: move to IA -> Glacier -> expire
2. EBS: delete unattached volumes (common waste)
   Migrate gp2 to gp3 (lower per-GB price, IOPS and throughput provisioned independently of size)
3. EBS snapshot cleanup: delete old snapshots, use Data Lifecycle Manager
4. S3 Intelligent-Tiering: automatic tier moves (no retrieval fee, but a small per-object monitoring charge)

Database:

1. Aurora Serverless v2: scale to 0 for dev/test databases
2. RDS: stop dev/test instances nights/weekends (storage is still billed; a stopped instance restarts automatically after 7 days)
3. Redshift: pause cluster when not in use (managed pause/resume)
4. DynamoDB on-demand vs provisioned: on-demand for unpredictable workloads

Network:

1. NAT Gateway: often top cost for private subnets
   Consolidate to fewer NAT GWs, use VPC endpoints for S3/DynamoDB (no NAT fee)
2. Data transfer: same-AZ traffic over private IPs is free, cross-AZ costs $0.01/GB in each direction
   Keep chatty components in the same AZ when availability requirements allow
3. CloudFront: reduces data transfer out (CloudFront -> internet cheaper than EC2 -> internet)

Real-World Design Scenarios

Q18. Design a serverless data pipeline for real-time analytics on AWS.

Requirements: 100K events/second from IoT devices, store raw events, compute rolling aggregations, power real-time dashboard.

Architecture:

IoT Devices -> Kinesis Data Streams (100 shards, ~100K rec/s)
                    |
                    |-- Lambda (stream consumer, batch=100, window=30s)
                    |       -> DynamoDB: rolling 5-min aggregations (TTL=7d)
                    |
                    |-- Kinesis Firehose -> S3 (raw events, Parquet format)
                                               |
                                      Glue Crawler (auto-discover schema)
                                               |
                                           Athena (ad-hoc SQL on S3)
                                               |
                                    QuickSight (dashboard on Athena)

DynamoDB -> API Gateway -> Lambda (read aggregations)
                                |
                          CloudFront (cache API responses 30s)
                                |
                          React dashboard (WebSocket for live updates)

Kinesis shard calculation:

Write throughput: 100K records/s, avg 200 bytes/record = 20 MB/s
Kinesis shard capacity: 1,000 records/s or 1 MB/s per shard
Required shards: MAX(100K/1000, 20/1) = MAX(100, 20) = 100 shards

Read throughput: 2 MB/s per shard, shared by standard consumers
With Lambda consumer: 100 shards * 2 MB/s = 200 MB/s aggregate read capacity
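
Shard sizing is worth being able to do on a whiteboard: take whichever per-shard write limit is hit first. The sketch below applies both limits to the stated workload, where the record-count limit (100K / 1,000 = 100) dominates the byte limit (20 MB/s / 1 MB/s = 20). Treating 1 MB as 1,000,000 bytes is an assumption made to keep the arithmetic simple.

```python
import math

RECORDS_PER_SHARD = 1_000     # write limit: records per second per shard
BYTES_PER_SHARD = 1_000_000   # write limit: ~1 MB per second per shard


def shards_needed(records_per_sec: int, avg_record_bytes: int) -> int:
    """Shards required so that neither per-shard write limit is exceeded."""
    by_count = math.ceil(records_per_sec / RECORDS_PER_SHARD)
    by_bytes = math.ceil(records_per_sec * avg_record_bytes / BYTES_PER_SHARD)
    return max(by_count, by_bytes)


print(shards_needed(100_000, 200))    # small records: record count is the limit
print(shards_needed(100_000, 5_000))  # larger records: bytes become the limit
```

Batching several device readings into one record on the producer side is the usual way to bring the record-count limit down.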

Lambda aggregation:

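import base64
import json
import time
from decimal import Decimal

import boto3

# Created once per execution environment and reused across invocations
table = boto3.resource('dynamodb').Table('DeviceMetrics')

def lambda_handler(event, context):
    aggregations = {}
    for record in event['Records']:
        # Kinesis data is base64-encoded; parse numbers as Decimal because
        # the boto3 DynamoDB resource rejects Python floats
        payload = json.loads(base64.b64decode(record['kinesis']['data']),
                             parse_float=Decimal)
        device_id = payload['device_id']
        metric = Decimal(payload['value'])

        if device_id not in aggregations:
            aggregations[device_id] = {'sum': Decimal(0), 'count': 0, 'max': metric}

        aggregations[device_id]['sum'] += metric
        aggregations[device_id]['count'] += 1
        aggregations[device_id]['max'] = max(aggregations[device_id]['max'], metric)

    # Batch write to DynamoDB.
    # Note: put_item replaces any item already stored for the same device and
    # window, so later batches in a window overwrite earlier partial aggregates.
    now = int(time.time())
    with table.batch_writer() as batch:
        for device_id, stats in aggregations.items():
            batch.put_item(Item={
                'device_id': device_id,
                'window': now // 300 * 300,  # start of the 5-min window
                'avg': stats['sum'] / stats['count'],
                'max': stats['max'],
                'count': stats['count'],
                'ttl': now + 7 * 86400
            })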
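
Since the handler runs once per batch, a single 5-minute window is normally covered by several invocations, each producing only a partial aggregate. One way to reason about combining them is sketched below. It assumes the stored item also keeps the running `sum` (the handler above stores only avg, max and count), and it leaves out the DynamoDB read and write around the merge.

```python
def window_start(epoch_seconds: float, size: int = 300) -> int:
    """Start of the fixed window (default 5 minutes) containing a timestamp."""
    return int(epoch_seconds // size) * size


def merge_stats(existing: dict, new: dict) -> dict:
    """Combine two partial aggregates for the same device and window."""
    count = existing["count"] + new["count"]
    total = existing["sum"] + new["sum"]
    return {
        "sum": total,
        "count": count,
        "max": max(existing["max"], new["max"]),
        "avg": total / count,  # recomputed from totals, never averaged twice
    }


first_batch = {"sum": 10, "count": 2, "max": 7}
second_batch = {"sum": 20, "count": 3, "max": 9}
print(window_start(1_000_000_123), merge_stats(first_batch, second_batch))
```

Averaging the two batch averages directly would be wrong whenever the batches have different counts, which is why the sum and count are carried instead.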

FAQ

Q: What is a VPC endpoint, and when should you use it?
A VPC endpoint lets EC2/Lambda in a private subnet reach S3, DynamoDB, or other AWS services without traffic leaving AWS's network (no NAT Gateway, no Internet Gateway). Two types: Gateway endpoints (S3, DynamoDB -- free, added to the route table) and Interface endpoints (most other services -- cost per hour + per GB, create an ENI in the subnet). Use gateway endpoints for S3/DynamoDB to eliminate NAT Gateway costs and keep traffic on the private network.

Q: What is the difference between CloudWatch and CloudTrail?
CloudWatch is for performance monitoring and operational visibility: metrics (CPU, memory, custom), logs (application, access, flow logs), alarms, dashboards, and anomaly detection. CloudTrail is for audit and compliance: it records API calls made in the AWS account (who did what, when, from where). CloudTrail logs go to S3 and can be analyzed with Athena. Use CloudWatch to know IF something is wrong; use CloudTrail to know WHO changed something.

Q: How do you secure data at rest and in transit on AWS?
At rest: S3 Server-Side Encryption (SSE-S3 with S3-managed keys, SSE-KMS for an audit trail, SSE-C for customer-provided keys that AWS does not store), EBS encryption (KMS), RDS encryption (KMS, enabled at creation), DynamoDB encryption (always on, KMS key choice). In transit: enforce HTTPS on ALB/CloudFront/API Gateway, use TLS 1.2+, ACM (AWS Certificate Manager) for free managed certificates, TLS between RDS client and DB, VPC PrivateLink / VPN / Direct Connect for on-premises-to-AWS. Confirm current AWS encryption defaults on the official AWS security documentation before your interview.


Methodology applied to this article · last verified 8 Jun 2026
Sources used
Public exam-pattern documents, official recruiter pages, and verified candidate reports on r/developersIndia and LinkedIn.
Verification window
Page last edited 8 Jun 2026 by Aditya Sharma. Numbers and patterns sanity-checked against the most recent 2026 cycle drives we tracked.
What we did NOT do
  • No fabricated salary numbers or success rates. If we quote a range, it's sourced.
  • No noun-substituted templates. This article was not generated by swapping company names in a stock prompt.
  • No paid placements, sponsored coaching links, or affiliate-shilled course pushes.
Verification policy: /editorial-standards/. Found something incorrect? Submit a correction - we respond within 48 hours.
