AWS Interview Questions 2026 — Top 50 with Expert Answers

AWS certifications command a 25-30% salary premium in India, and AWS skills appear in 74% of all cloud job postings. AWS powers over 32% of the global cloud market and remains the most in-demand platform for engineers worldwide. Whether you're interviewing at Amazon itself, a fintech like Razorpay, or a startup scaling on cloud-native infrastructure, AWS knowledge is non-negotiable in 2026. This guide compiles 50 real questions from interviews at Amazon, Flipkart, Razorpay, PhonePe, Zerodha, and Swiggy — with the authoritative answers that get offers, organized from beginner to advanced.

Cloud roles are the fastest path to high-paying tech careers in India. AWS Cloud Architect roles command Rs 40-80 LPA at product companies. This guide is your roadmap to getting there.

Beginner-Level AWS Questions (Q1-Q15)

These questions are asked at every AWS interview from Wipro to Amazon. Get them right with confidence, and the interviewer immediately takes you seriously for the harder questions.

Q1. What is the difference between a Region, Availability Zone, and Edge Location in AWS?

Concept	Definition	Example
Region	A geographic area with 3+ isolated AZs	ap-south-1 (Mumbai)
Availability Zone (AZ)	One or more discrete data centers with redundant power/networking	ap-south-1a, ap-south-1b
Edge Location	CDN node used by CloudFront and Route 53	Mumbai, Chennai
Local Zone	Extension of a Region closer to users	Delhi (ap-south-1-del-1)

Regions are completely isolated from each other for fault tolerance. AZs within a region are connected by low-latency fiber links. Edge Locations are not full Regions — they only serve cached content and DNS requests.

Asked by: Amazon, Wipro, Infosys L2 interviews

Q2. What is EC2? Explain instance types and when to use each.

Family	Optimized For	Example Use Case
t3/t4g	Burstable CPU (dev/test)	Development servers
m6i/m7g	General purpose	Application servers
c6i/c7g	Compute intensive	Video encoding, ML inference
r6i/r7g	Memory intensive	In-memory caches, SAP HANA
p3/p4	GPU	Deep learning training
i3/i4i	High I/O NVMe	Databases, Hadoop
d2/d3	Dense HDD storage	Data warehousing

The "g" suffix (e.g., m7g) indicates AWS Graviton (ARM-based) processors. AWS positions Graviton instances as offering up to 40% better price-performance than comparable x86 instances; actual savings depend on the workload.

Asked by: Flipkart, Myntra, Amazon SDE-2

Q3. What is S3 and what are its storage classes?

Storage classes compared:

Class	Use Case	Retrieval Time	Min Storage Duration	Cost (approx., per GB-month)
S3 Standard	Frequently accessed data	Milliseconds	None	$0.023/GB
S3 Intelligent-Tiering	Unknown access patterns	Milliseconds	None	$0.023/GB + monitoring
S3 Standard-IA	Infrequent access	Milliseconds	30 days	$0.0125/GB
S3 One Zone-IA	Non-critical infrequent	Milliseconds	30 days	$0.01/GB
S3 Glacier Instant Retrieval	Archive with fast retrieval	Milliseconds	90 days	$0.004/GB
S3 Glacier Flexible Retrieval	Archives with occasional retrieval	Minutes (expedited) to hours (standard)	90 days	$0.0036/GB
S3 Glacier Deep Archive	Long-term compliance	Within 12 hours	180 days	$0.00099/GB

S3 buckets are Region-specific, but bucket names must be globally unique.
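To make the price gap between classes concrete, here is a rough sketch that estimates monthly storage cost from the approximate per-GB prices in the table above. It ignores request, retrieval, and monitoring fees as well as minimum-duration charges, and the function and dictionary names are illustrative only, not part of any AWS SDK.

```python
# Approximate storage prices in USD per GB-month, copied from the table above.
# Real prices vary by Region and change over time.
PRICE_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "ONEZONE_IA": 0.01,
    "GLACIER_IR": 0.004,
    "GLACIER": 0.0036,
    "DEEP_ARCHIVE": 0.00099,
}


def monthly_storage_cost(size_gb: float, storage_class: str) -> float:
    """Estimate storage-only cost; request and retrieval fees are ignored."""
    return round(size_gb * PRICE_PER_GB_MONTH[storage_class], 2)


if __name__ == "__main__":
    # Compare what 10 TB (10,240 GB) would cost in each class.
    for cls in PRICE_PER_GB_MONTH:
        print(f"{cls:13s} ${monthly_storage_cost(10_240, cls):,.2f}/month")
```

In an interview, pairing a number like this with the retrieval-time column is usually enough to justify a lifecycle policy choice.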

Q4. What is IAM? Explain users, groups, roles, and policies.

User: A permanent identity for a human or application (has long-term credentials — access key + secret)
Group: Collection of users sharing the same permissions (e.g., "developers" group)
Role: Temporary identity assumed by services, EC2 instances, Lambda, or cross-account entities (no long-term credentials — uses STS)
Policy: JSON document defining permissions. Two main types:
- Identity-based: Attached to users/groups/roles
- Resource-based: Attached to resources (S3 bucket policy, SQS queue policy)

Best practice: Never use root account for daily operations. Follow the principle of least privilege. Prefer roles over long-term access keys for EC2 and Lambda.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}
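As an illustration of how a policy like the one above is evaluated, the following is a deliberately simplified sketch: an explicit Deny wins, then any matching Allow, otherwise the request is implicitly denied. It is not the real IAM engine. It ignores conditions, principals, NotAction/NotResource, permission boundaries, and SCPs, and it matches actions case-sensitively, which real IAM does not. The helper names are hypothetical.

```python
from fnmatch import fnmatchcase

# The identity-based policy shown above, as a Python dict.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::my-bucket/*",
        }
    ],
}


def _as_list(value):
    """Policy fields may be a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def is_allowed(policy: dict, action: str, resource: str) -> bool:
    """Simplified evaluation: explicit Deny > Allow > implicit deny."""
    allowed = False
    for stmt in policy["Statement"]:
        action_match = any(fnmatchcase(action, a) for a in _as_list(stmt["Action"]))
        resource_match = any(fnmatchcase(resource, r) for r in _as_list(stmt["Resource"]))
        if action_match and resource_match:
            if stmt["Effect"] == "Deny":
                return False  # an explicit Deny always wins
            allowed = True
    return allowed
```

With the policy above, reads and writes on objects in my-bucket are allowed, while anything else (such as s3:DeleteObject) falls through to the implicit deny.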

Asked by: Amazon, Razorpay, PhonePe

Q5. What is a VPC? What components does it contain?

Core components:

VPC (10.0.0.0/16)
├── Public Subnet (10.0.1.0/24)
│   ├── EC2 instances with public IPs
│   └── NAT Gateway
├── Private Subnet (10.0.2.0/24)
│   ├── RDS databases
│   └── Application servers
├── Internet Gateway (IGW) — entry/exit for public traffic
├── Route Tables — define traffic routing per subnet
├── Security Groups — stateful instance-level firewall
├── Network ACLs — stateless subnet-level firewall
└── VPC Endpoints — private access to AWS services

Key difference: Security Groups are stateful (return traffic automatically allowed), NACLs are stateless (you must explicitly allow inbound AND outbound). SGs operate at instance level; NACLs at subnet level.
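A quick way to sanity-check a subnet layout like the one above is to verify that every subnet falls inside the VPC CIDR and that no two subnets overlap. The sketch below uses only Python's standard ipaddress module; the function names are illustrative. The usable-address count reflects the five addresses AWS reserves in every subnet.

```python
import ipaddress


def validate_subnets(vpc_cidr: str, subnet_cidrs: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the layout is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = []
    for net in subnets:
        if not net.subnet_of(vpc):
            problems.append(f"{net} is outside the VPC range {vpc}")
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")
    return problems


def usable_addresses(subnet_cidr: str) -> int:
    """AWS reserves 5 addresses per subnet (first four and the last)."""
    return ipaddress.ip_network(subnet_cidr).num_addresses - 5


if __name__ == "__main__":
    print(validate_subnets("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"]))
    print(usable_addresses("10.0.1.0/24"))
```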

Q6. What is the difference between EBS, EFS, and S3?

Feature	EBS	EFS	S3
Type	Block storage	File storage (NFS)	Object storage
Attached to	Single EC2 (mostly)	Multiple EC2s simultaneously	Not attached — accessed via API/URL
Protocol	Block device	NFS v4	HTTP REST
Performance	Up to 256,000 IOPS (io2 BE)	Scales automatically	—
Use case	OS disk, databases	Shared file system	Backups, static assets, data lake
Pricing	Per GB provisioned	Per GB stored	Per GB stored + requests
Multi-AZ	Replicated within AZ	Yes (Regional)	Yes (multiple AZs)

Asked by: Infosys, TCS Digital, Wipro Elite

Q7. What is Auto Scaling and how does it work?

Target Tracking: Maintain a metric at a target value (e.g., keep CPU at 50%)
Step Scaling: Scale by different amounts based on breach thresholds
Scheduled Scaling: Scale at specific times (e.g., scale up before market open at 9 AM IST)

Components: Launch Template (defines AMI, instance type, security groups) + Auto Scaling Group (defines min/max/desired capacity + VPC subnets) + Scaling Policy.

Cooldown period (default 300 seconds) prevents rapid scale in/out oscillation.
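Target tracking can be reasoned about as proportional scaling: if the metric is above target, capacity grows by roughly the same ratio. The sketch below is an approximation for intuition only; AWS does not publish the exact algorithm, and the function name and parameters are assumptions made for this example.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Approximate target tracking: scale capacity by metric/target, then clamp."""
    raw = math.ceil(current * metric / target)
    return max(min_size, min(max_size, raw))


if __name__ == "__main__":
    # 4 instances at 80% CPU with a 50% target and ASG bounds of 2..10.
    print(desired_capacity(4, 80.0, 50.0, min_size=2, max_size=10))
```

The clamp is the part interviewers tend to probe: no scaling policy can push an Auto Scaling Group outside its min/max bounds.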

Q8. What is CloudFront and how does it differ from S3?

S3 is the origin storage. CloudFront sits in front of S3 (or EC2, ALB) to:

Cache static content at edge
Terminate SSL/TLS at edge
Apply WAF rules
Sign URLs for private content
Compress content with gzip/brotli

Origin Access Control (OAC) replaces the old OAI — it restricts S3 bucket access only to CloudFront, so your S3 URL is never exposed publicly.

Q9. What is Route 53? What routing policies does it support?

Routing Policy	Use Case
Simple	Single resource
Weighted	A/B testing, canary deployments (e.g., 90% v1, 10% v2)
Latency-based	Route users to the lowest-latency Region
Failover	Active-passive DR with health checks
Geolocation	Route by user's geographic location
Geoproximity	Route by geographic proximity (with bias)
Multi-Value	Return multiple IPs, basic load balancing
IP-based	Route based on client IP ranges (CIDR blocks)

Health checks can monitor endpoints and trigger failover automatically.
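Weighted routing is easy to reason about with a small simulation: each record is returned with probability equal to its weight divided by the sum of all weights. This is a local model of that behaviour, not a call to Route 53; the record names and function names are made up for the example.

```python
import random
from collections import Counter


def pick_endpoint(weights: dict[str, int], rng: random.Random) -> str:
    """Pick one record with probability weight / sum(weights)."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def simulate(weights: dict[str, int], queries: int = 10_000, seed: int = 42) -> Counter:
    """Count how often each record is returned over many simulated DNS queries."""
    rng = random.Random(seed)
    return Counter(pick_endpoint(weights, rng) for _ in range(queries))


if __name__ == "__main__":
    # Canary: roughly 90% of answers should be v1 and 10% v2.
    print(simulate({"v1": 90, "v2": 10}))
```

Setting a weight to 0 removes that record from rotation, which is how a canary is rolled back.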

Q10. What is the difference between Application Load Balancer (ALB) and Network Load Balancer (NLB)?

Feature	ALB	NLB
Layer	Layer 7 (HTTP/HTTPS/WebSocket)	Layer 4 (TCP/UDP/TLS)
Routing	Path, header, host, query, IP	Port and protocol
Static IP	No (DNS only)	Yes (Elastic IP per AZ)
Performance	—	Ultra-low latency, millions of RPS
WebSockets	Yes	Yes
gRPC	Yes	—
Best for	Microservices, HTTP APIs	Gaming, IoT, financial trading
SSL Termination	At ALB	At NLB or pass-through

A third type, Gateway Load Balancer (GWLB), is used for inline network appliances (firewalls, IDS).

Asked by: Amazon, Swiggy, Zomato, Razorpay

Q11. What is Lambda? What are its limits?

Key limits (2026):

Max execution timeout: 15 minutes
Memory: 128 MB – 10 GB
Ephemeral disk (/tmp): 512 MB – 10 GB (configurable)
Deployment package: 50 MB (zipped direct upload), 250 MB (unzipped), 10 GB (container image)
Concurrent executions: 1,000 per account per Region by default (can be raised through a Service Quotas request)
Max response payload: 6 MB (synchronous), 256 KB (async)

Lambda integrates natively with API Gateway, S3, DynamoDB Streams, Kinesis, SQS, SNS, EventBridge, and 200+ more services.

Q12. What is RDS? Which engines does it support?

Supported engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora (MySQL- and PostgreSQL-compatible, AWS proprietary).

RDS Multi-AZ: Synchronous replication to a standby in a different AZ for high availability. Automatic failover in 1–2 minutes. Standby cannot be used for reads (use Read Replicas for that).

Read Replicas: Asynchronous replication. Supports up to 15 replicas (Aurora). Can be promoted to standalone. Can be in different Regions (cross-region read replicas).

Q13. What is DynamoDB?

Core concepts:

Partition Key (PK): Distributes data across partitions. Must be unique for simple primary keys.
Sort Key (SK): Optional secondary dimension. PK + SK must be unique together.
GSI (Global Secondary Index): Alternate access patterns with different PK/SK; can span all partitions
LSI (Local Secondary Index): Same PK, different SK; must be created at table creation

Capacity modes:

On-demand: Pay per request. Best for unpredictable traffic.
Provisioned: Set RCU/WCU. Use with Auto Scaling. Better for predictable, steady workloads.

DynamoDB Streams + Lambda = real-time event-driven pipelines.

Asked by: Amazon, Flipkart, Meesho

Q14. What is CloudFormation?

Template anatomy:

AWSTemplateFormatVersion: '2010-09-09'
Description: My application stack
Parameters:
  InstanceType:
    Type: String
    Default: t3.micro
Resources:
  MyEC2:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref InstanceType
      ImageId: ami-0abcdef1234567890
Outputs:
  InstanceId:
    Value: !Ref MyEC2

Change Sets let you preview changes before applying. Drift Detection identifies manual changes to resources outside CloudFormation.

CloudFormation vs. Terraform: CloudFormation is AWS-only but has tighter native integration. Terraform is multi-cloud and has a larger ecosystem but requires more setup.

Q15. What is the Shared Responsibility Model?

AWS Responsibility ("Security OF the cloud")	Customer Responsibility ("Security IN the cloud")
Physical infrastructure (data centers)	Customer data encryption
Hardware and network	IAM configuration and policies
Hypervisor	Operating system patching (for EC2)
Managed service software	Application-level security
Global network	Security groups and NACLs
AWS Region/AZ infrastructure	Compliance certifications for their apps

For managed services like RDS, Lambda, S3 — AWS takes more responsibility (OS, runtime). For EC2, the customer manages everything above the hypervisor.

One of the most commonly asked conceptual questions at all levels.

Intermediate-Level AWS Questions (Q16-Q35)

This is the section that separates "I've used the console" from "I've built production systems." These questions come up in SDE-2 and Cloud Architect interviews at Amazon, Flipkart, and Razorpay.

Q16. Explain VPC Peering vs. AWS Transit Gateway vs. PrivateLink.

Feature	VPC Peering	Transit Gateway	PrivateLink
Connectivity	1-to-1 VPC	Hub-and-spoke for many VPCs	Service endpoint to VPC
Transitive routing	No	Yes	N/A
Cost	Free (data transfer fees apply)	Per attachment + data	Per endpoint + data
Cross-account	Yes	Yes	Yes
Cross-region	Yes	Yes	Yes (cross-Region support added in late 2024)
Use case	Few VPCs	10+ VPCs, multi-account	Access AWS or SaaS services privately

PrivateLink exposes a service from one VPC to consumers in other VPCs without the networks being peered — traffic never leaves AWS backbone.

Asked by: Amazon senior rounds, Razorpay, PhonePe infrastructure teams

Q17. What is AWS ECS vs. EKS? When would you choose each?

Feature	ECS	EKS
Orchestrator	AWS proprietary	Kubernetes
Learning curve	Lower	Higher
Portability	AWS-only	Multi-cloud (K8s standard)
Control plane cost	Free	$0.10/hour per cluster (~$73/month)
Fargate support	Yes	Yes
Ecosystem	AWS-native integrations	Massive CNCF ecosystem
Debugging	AWS Console, CloudWatch	kubectl, K8s dashboards

Choose ECS when: team is new to containers, you want tight AWS integration, minimal operational overhead matters more than portability.

Choose EKS when: you need Kubernetes compatibility, multi-cloud strategy, using Helm charts, custom operators, or service meshes like Istio.

Asked by: Amazon, Swiggy, Razorpay, Flipkart

Q18. How does Lambda handle concurrency? What is provisioned concurrency?

Unreserved concurrency: Shared pool, default 1,000/account/Region
Reserved concurrency: Guarantees a set amount for one function (isolates it from others); also acts as a throttle cap
Provisioned concurrency: Pre-warms instances to eliminate cold starts — critical for latency-sensitive APIs

Cold start breakdown:

AWS spins up a new execution environment (container)
Downloads deployment package
Initializes runtime (JVM, Node, Python interpreter)
Runs init code outside the handler

For Java/JVM functions, cold starts can be 2–10 seconds. Provisioned concurrency keeps N execution environments initialized in advance, so requests served by them skip the cold-start steps listed above.
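To size reserved or provisioned concurrency, a common rule of thumb is that concurrency equals average requests per second multiplied by average request duration in seconds. The sketch below applies that arithmetic; the function names are illustrative and the 1,000 figure is the default account quota mentioned above.

```python
import math

DEFAULT_ACCOUNT_CONCURRENCY = 1_000  # default per account per Region


def required_concurrency(requests_per_second: float, avg_duration_seconds: float) -> int:
    """Concurrent executions needed to serve the load without throttling."""
    return math.ceil(requests_per_second * avg_duration_seconds)


def exceeds_default_quota(requests_per_second: float, avg_duration_seconds: float) -> bool:
    """True if the workload would need a quota increase at default limits."""
    return required_concurrency(requests_per_second, avg_duration_seconds) > DEFAULT_ACCOUNT_CONCURRENCY


if __name__ == "__main__":
    # 200 requests/second at 500 ms each needs about 100 concurrent executions.
    print(required_concurrency(200, 0.5))
```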

Q19. Design a highly available 3-tier web application on AWS.

Architecture (text diagram):

Internet
    |
Route 53 (latency-based routing, health checks)
    |
CloudFront (SSL termination, WAF, edge caching)
    |
ALB (Application Load Balancer — Multi-AZ)
   / \
EC2 ASG (AZ-a)    EC2 ASG (AZ-b)     [Web/App Tier — Private Subnet]
       \               /
        ElastiCache (Redis — in-memory session, cache)
            |
        RDS Aurora (Multi-AZ — writer in AZ-a, reader in AZ-b)

Key HA decisions:

Route 53 health checks trigger failover at DNS level
ALB distributes across 2+ AZs
ASG min=2 ensures at least one instance per AZ
RDS Multi-AZ gives automatic failover (<120 seconds)
ElastiCache Multi-AZ replication group
S3 + CloudFront for static assets (decoupled from compute)

Classic architecture question at Amazon L5/L6 and Flipkart SDE-2/SDE-3

Q20. What is the difference between SQS, SNS, and EventBridge?

Service	Pattern	Delivery	Retention	Consumers
SQS	Queue (point-to-point)	Pull	60 seconds – 14 days (default 4 days)	One consumer per message
SNS	Pub/Sub (fan-out)	Push	No storage	Multiple
EventBridge	Event bus	Push	Optional archive	Rules-based routing

Common pattern — Fan-out: SNS topic → multiple SQS queues (so each microservice gets its own copy of the event).

EventBridge is preferred for event-driven architectures — it can filter events by content, route to Lambda/SQS/Step Functions/API destinations, and integrate with 200+ SaaS sources.
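The SNS-to-SQS fan-out pattern described above can be modelled in a few lines. This is an in-memory stand-in to show the data flow, not the AWS API: the Topic class and the queue names are hypothetical.

```python
from collections import deque


class Topic:
    """In-memory stand-in for an SNS topic (not the AWS API)."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue: deque) -> None:
        self.subscribers.append(queue)

    def publish(self, message: dict) -> None:
        # Every subscribed queue receives its own independent copy.
        for queue in self.subscribers:
            queue.append(dict(message))


if __name__ == "__main__":
    orders = Topic()
    billing_queue, shipping_queue = deque(), deque()
    orders.subscribe(billing_queue)
    orders.subscribe(shipping_queue)
    orders.publish({"order_id": "o-1", "status": "PLACED"})
    print(len(billing_queue), len(shipping_queue))
```

Because each service drains its own queue, a slow or failing consumer does not block the others.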

Q21. What is S3 Object Lock and how does it work?

Governance mode: Users with special IAM permissions can bypass the lock
Compliance mode: Nobody (including root) can delete the object until the retention period expires — used for regulatory compliance (SEBI, HIPAA, SOC2)

Legal Hold: Independent of retention period — can be applied/removed by privileged users.

Required for: financial records retention, healthcare data, compliance with India's DPDP Act.

Q22. Explain AWS WAF. What rules can you configure?

Rule types:

IP set rules: Block/allow specific IP ranges
Rate-based rules: Throttle IPs exceeding X requests per 5 minutes (DDoS protection)
Managed rule groups: AWS-managed rules for OWASP Top 10, known bad IPs, SQL injection, XSS
Custom rules: Match on request components (URI, headers, query strings, body — first 8 KB)

Each Web ACL contains rules with Allow/Block/Count/CAPTCHA actions. Rules are evaluated in priority order.

Asked by: Razorpay, PhonePe, Zerodha (security-focused rounds)

Q23. What is AWS KMS? How does envelope encryption work?

Envelope encryption:

A KMS key (formerly called a Customer Master Key, CMK) is created in KMS and never leaves it
Application requests a Data Encryption Key (DEK) from KMS
KMS returns: plaintext DEK + encrypted DEK
Application encrypts data with plaintext DEK in memory
Plaintext DEK is discarded; encrypted DEK is stored alongside encrypted data
To decrypt: send encrypted DEK to KMS → get back plaintext DEK → decrypt data

This pattern means KMS is only involved during key wrapping/unwrapping, not for bulk data encryption, keeping latency and cost low.
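The envelope flow above can be traced end to end with a toy model. Everything here is an assumption for illustration: FakeKMS is a hypothetical in-memory stand-in for KMS, and the cipher is a toy SHA-256 keystream that is NOT secure. A real system would call KMS for the data key and use an authenticated cipher such as AES-GCM.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). NOT secure; demo only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class FakeKMS:
    """Hypothetical stand-in for KMS; the master key never leaves this object."""

    def __init__(self):
        self._master_key = secrets.token_bytes(32)

    def generate_data_key(self) -> tuple[bytes, bytes]:
        """Return (plaintext DEK, DEK encrypted under the master key)."""
        plaintext_dek = secrets.token_bytes(32)
        return plaintext_dek, _keystream_xor(self._master_key, plaintext_dek)

    def decrypt(self, encrypted_dek: bytes) -> bytes:
        return _keystream_xor(self._master_key, encrypted_dek)


def encrypt(kms: FakeKMS, data: bytes) -> dict:
    plaintext_dek, encrypted_dek = kms.generate_data_key()
    ciphertext = _keystream_xor(plaintext_dek, data)
    del plaintext_dek  # drop the plaintext key; only the wrapped DEK is stored
    return {"encrypted_dek": encrypted_dek, "ciphertext": ciphertext}


def decrypt(kms: FakeKMS, envelope: dict) -> bytes:
    plaintext_dek = kms.decrypt(envelope["encrypted_dek"])  # unwrap via KMS
    return _keystream_xor(plaintext_dek, envelope["ciphertext"])
```

Note that the bulk data never passes through FakeKMS; only the 32-byte data key does, which mirrors why the real pattern keeps KMS cost and latency low.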

Q24. How does DynamoDB handle hot partitions? How do you fix them?

Solutions:

Write sharding: Append a random suffix (0–9) to the PK to spread one hot partition key across 10; reads must then scatter-gather across all 10 suffixes and merge the results
Caching layer: DAX (DynamoDB Accelerator), an in-memory cache with microsecond read latency, reduces read load on hot items
Better PK design: Choose high-cardinality attributes (UUID, user_id, order_id)
Adaptive capacity (automatic): DynamoDB shifts capacity to hot partitions automatically within limits
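Write sharding, the first solution listed above, comes down to two helpers: one that picks a random suffix on write, and one that enumerates every suffix for the scatter-gather read. This is a minimal sketch; the "#" separator and the function names are assumptions, not a DynamoDB convention.

```python
import random

SHARD_COUNT = 10  # suffixes 0 to 9, as described above


def sharded_key(base_key: str, rng=None) -> str:
    """Partition key to use on write: the base key plus a random suffix."""
    rng = rng or random
    return f"{base_key}#{rng.randrange(SHARD_COUNT)}"


def all_shard_keys(base_key: str) -> list[str]:
    """Every key a reader must query and merge (scatter-gather)."""
    return [f"{base_key}#{i}" for i in range(SHARD_COUNT)]


if __name__ == "__main__":
    print(sharded_key("flash-sale-2026"))
    print(all_shard_keys("flash-sale-2026"))
```

The trade-off is visible in the code: writes stay a single request, while reads cost SHARD_COUNT requests.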

Deep-dive question at Amazon, Flipkart data platform teams

Q25. What is AWS CloudTrail and how does it differ from CloudWatch?

Feature	CloudTrail	CloudWatch
Purpose	API audit log	Metrics, logs, alarms, dashboards
What it tracks	Who did what (API calls)	How resources are performing
Data type	Events (JSON)	Metrics, log streams
Retention	90 days (free) or S3	Configurable (indefinite)
Use case	Security audit, compliance	Ops monitoring, alerting

CloudTrail records every API call: who made it (user/role/service), from which IP, at what time, what resource was targeted, and whether it succeeded. Management events are enabled by default; data events (S3 object-level, Lambda invocations) cost extra.

Q26. What is Elastic Beanstalk? When would you NOT use it?

Do NOT use Elastic Beanstalk when:

You need fine-grained control over infrastructure configuration
You're running microservices that need container orchestration
You need to customize the underlying OS or runtime beyond what EB offers
Your team already uses Terraform/CDK for IaC (EB's environment configs don't integrate cleanly)
Cost optimization is critical (you'd over-provision with EB's opinionated setup)

Good for: monolithic applications, teams new to AWS, proof-of-concepts.

Q27. What is AWS Step Functions? Give a real use case.

Real use case — Order processing pipeline:

[Start]
  → ValidateOrder (Lambda)
  → CheckInventory (Lambda)
      → [Choice] In stock?
          → YES: ReserveInventory → ProcessPayment → SendConfirmation → [End]
          → NO: NotifyOutOfStock → RefundPayment → [End]
  → [Catch] PaymentFailed → RollbackInventory → SendFailureNotification → [End]

Step Functions handles retries, timeouts, parallel execution, and error handling automatically. Standard Workflows are for long-running (1 year max), exactly-once execution. Express Workflows are for high-throughput, short-duration (5 min max), at-least-once execution.

Asked by: Amazon, Flipkart, Myntra backend rounds

Q28. What is the difference between NAT Gateway and NAT Instance?

Feature	NAT Gateway	NAT Instance
Managed by	AWS	Customer
Availability	Highly available (within AZ)	Single EC2 — SPOF unless you manage HA
Bandwidth	Up to 100 Gbps (auto-scales)	Limited by instance size
Security groups	Cannot attach	Can attach
Cost	$0.045/hour + $0.045/GB	EC2 pricing + management overhead
Maintenance	None	Patching, monitoring required

Rule: Always use NAT Gateway in production. NAT Instances are obsolete except for very cost-sensitive dev environments.
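NAT Gateway cost surprises many teams because the per-GB processing fee applies on top of the hourly charge. The sketch below uses the approximate prices from the table above and assumes a 730-hour month; the function name is illustrative, and data transfer charges are not included.

```python
HOURLY_RATE = 0.045    # USD per NAT Gateway hour (from the table above)
PER_GB_RATE = 0.045    # USD per GB processed (from the table above)
HOURS_PER_MONTH = 730  # common billing approximation


def nat_gateway_monthly_cost(gateways: int, gb_processed: float) -> float:
    """Hourly charge for each gateway plus the data processing fee."""
    hourly = gateways * HOURLY_RATE * HOURS_PER_MONTH
    processing = gb_processed * PER_GB_RATE
    return round(hourly + processing, 2)


if __name__ == "__main__":
    # One gateway per AZ across three AZs, 1 TB (1,000 GB) processed per month.
    print(nat_gateway_monthly_cost(3, 1_000))
```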

Q29. How would you optimize AWS costs for a startup running 24/7 workloads?

Reserved Instances (RI): 1-year or 3-year commitment for EC2, RDS, ElastiCache → up to 72% savings vs. On-Demand
Savings Plans: More flexible than RIs — compute savings plans cover any EC2, Lambda, Fargate; 66% max savings
Spot Instances: 70–90% cheaper than On-Demand for interruption-tolerant workloads (batch jobs, ML training, CI runners)
Graviton instances: Switch t3→t4g, m5→m7g → 20–40% cost reduction, better perf/dollar
S3 Intelligent-Tiering: Auto-moves infrequently accessed objects to cheaper tiers
RDS Aurora Serverless v2: Scales to 0 ACUs in dev/staging (no cost when idle)
Lambda instead of always-on EC2: For bursty or low-frequency workloads
AWS Cost Explorer + Budgets: Set billing alarms, identify waste
Right-sizing: Use Compute Optimizer recommendations to downsize over-provisioned instances
Delete zombie resources: Unused EIPs ($3.65/month each), idle NAT Gateways, forgotten snapshots

Heavily asked at startup interviews (Razorpay, Zerodha, CRED)

Q30. What is AWS Config and how does it enforce compliance?

How it works:

Config Recorder captures resource configurations and changes
Config Rules evaluate configurations (AWS Managed Rules or custom Lambda rules)
Non-compliant resources are flagged
Remediation Actions (manual or auto via SSM Automation) fix violations

Common rules:

ec2-instance-no-public-ip — alert on EC2s with public IPs in private subnets
s3-bucket-public-read-prohibited — detect accidentally public S3 buckets
iam-root-access-key-check — ensure root has no access keys
rds-multi-az-support — enforce Multi-AZ for production RDS

Q31. What is the difference between STS AssumeRole and IAM instance profiles?

Instance Profile: An IAM role attached to an EC2 instance. The EC2 metadata service (169.254.169.254) vends temporary credentials automatically via the IMDSv2 endpoint. Applications running on EC2 call the metadata endpoint and get credentials without any configuration.
STS AssumeRole: An explicit API call (sts:AssumeRole) to obtain temporary credentials for a role. Used for cross-account access, federated identity (SSO), Lambda assuming another role, or EKS pods via IRSA.

IMDSv2 (Instance Metadata Service v2) is session-oriented and requires a PUT request to obtain a session token first. This mitigates SSRF attacks that steal EC2 credentials, the attack vector behind the 2019 Capital One breach.

Q32. What is Aurora Global Database?

Use cases:

Global applications needing low-latency reads worldwide
Disaster recovery with RPO of ~1 second and RTO of <1 minute

Failover: In a disaster, you can promote a secondary Region to primary. Global Write Forwarding allows secondary Regions to write — Aurora routes the write to the primary automatically (slight latency increase).

Cost: ~20% more than standard Aurora due to cross-region replication costs.

Q33. How does API Gateway handle throttling?

Account-level: 10,000 requests per second (RPS) steady-state with a burst capacity of 5,000 requests, per account per Region (adjustable)
Stage-level: Set default throttling per stage
Method-level: Override per route (e.g., /payment stricter than /health)
Usage Plans + API Keys: Throttle per API consumer (for public APIs with paying customers)

When throttled, clients receive HTTP 429 Too Many Requests. Implement exponential backoff + jitter in clients. Use SQS as a buffer in front of Lambda for bursty ingestion instead of direct API Gateway → Lambda if you can tolerate async processing.
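Exponential backoff with jitter, as recommended above, can be sketched as follows. This uses the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing cap. The Throttled exception and the operation callable are hypothetical placeholders for an HTTP 429 response and your client call.

```python
import random
import time


class Throttled(Exception):
    """Hypothetical exception representing an HTTP 429 response."""


def backoff_delays(max_retries: int, base: float = 0.1, cap: float = 10.0, rng=None) -> list[float]:
    """Full jitter: delay i is uniform in [0, min(cap, base * 2**i)]."""
    rng = rng or random.Random()
    return [rng.uniform(0, min(cap, base * 2 ** attempt)) for attempt in range(max_retries)]


def call_with_retries(operation, max_retries: int = 5, sleep=time.sleep, rng=None):
    """Call operation(), sleeping with jittered backoff after each throttle."""
    for delay in backoff_delays(max_retries, rng=rng):
        try:
            return operation()
        except Throttled:
            sleep(delay)
    return operation()  # final attempt; a Throttled error now propagates
```

The jitter matters as much as the exponent: without it, throttled clients retry in lockstep and hit the limit again together.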

Q34. What is AWS CDK vs. CloudFormation vs. Terraform?

Feature	CloudFormation	CDK	Terraform
Language	YAML/JSON	TypeScript, Python, Java, C#, Go	HCL
Multi-cloud	No	No	Yes
Abstraction	Low	High (L3 constructs)	Medium
State management	Managed by AWS	Deploys via CloudFormation	Local/remote tfstate
Import existing resources	Limited	Limited	Yes (`terraform import`)
Module ecosystem	None	Construct Hub	Terraform Registry
Community	Medium	Growing	Very large

CDK synthesizes into CloudFormation templates — so ultimately it's CloudFormation under the hood. CDK's high-level constructs (L3) encapsulate best practices (e.g., ApplicationLoadBalancedFargateService = ALB + ECS Fargate with sane defaults in one construct).

Asked at senior/architect level interviews

Q35. Explain SQS visibility timeout and dead-letter queues.

Visibility Timeout: When a consumer reads a message from SQS, the message becomes invisible to other consumers for the visibility timeout duration (default 30 seconds, max 12 hours). If the consumer successfully processes and deletes the message within this window, it's gone. If it crashes or takes too long, the message reappears for another consumer to pick up.

Dead-Letter Queue (DLQ): After a message is received N times (maxReceiveCount) without being deleted, SQS moves it to the DLQ. The DLQ is a regular SQS queue used for:

Debugging: inspect why messages failed
Alerting: CloudWatch alarm on DLQ depth > 0
Replaying: fix the consumer, then move messages back to the main queue

For FIFO queues, DLQs must also be FIFO.
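The interaction between visibility timeout and maxReceiveCount is easier to see in a small in-memory model. This is a sketch of the behaviour described above, not the SQS API: the Queue class, its method names, and the integer receipt handles are all invented for the example.

```python
from collections import deque


class Queue:
    """In-memory model of receive counts and DLQ redrive (not the SQS API)."""

    def __init__(self, max_receive_count: int):
        self.max_receive_count = max_receive_count
        self.visible = deque()   # entries: [body, receive_count]
        self.in_flight = {}      # receipt handle -> entry
        self.dead_letters = []
        self._next_handle = 0

    def send(self, body: str) -> None:
        self.visible.append([body, 0])

    def receive(self):
        """Return (handle, body), or None if nothing is visible."""
        while self.visible:
            entry = self.visible.popleft()
            if entry[1] >= self.max_receive_count:
                self.dead_letters.append(entry[0])  # exceeded maxReceiveCount
                continue
            entry[1] += 1
            self._next_handle += 1
            self.in_flight[self._next_handle] = entry
            return self._next_handle, entry[0]
        return None

    def delete(self, handle: int) -> None:
        """Consumer finished successfully; the message is gone for good."""
        self.in_flight.pop(handle)

    def expire_visibility(self, handle: int) -> None:
        """Visibility timeout elapsed without a delete: message reappears."""
        self.visible.append(self.in_flight.pop(handle))
```

A message that is received and never deleted cycles through expire_visibility until its receive count reaches the limit, at which point the next receive attempt moves it to dead_letters.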

Advanced-Level AWS Questions (Q36-Q50)

Don't skip the Advanced section — this is where interviewers separate Rs 20 LPA from Rs 50+ LPA candidates. These questions are asked at Amazon SDE-3, Flipkart L3, and senior cloud architect roles.

Q36. Design a serverless data pipeline ingesting 1 million events/minute on AWS.

Architecture:

Mobile/Web Apps
    |
Kinesis Data Streams (100 shards × 1MB/s = 100MB/s capacity)
    |
Kinesis Data Firehose (buffers, transforms, delivers)
   /        \
S3 (raw)    Lambda (real-time processing, enrich/filter)
   |              |
AWS Glue       DynamoDB (hot path — last 5 min aggregations)
(batch ETL)
   |
S3 (parquet, Hive-partitioned: year/month/day/hour)
   |
Amazon Athena (ad-hoc SQL on S3)
   |
QuickSight (dashboards)

Key choices: Kinesis over SQS for ordered, replay-capable, high-throughput streaming. Firehose auto-scales and handles delivery retries. Parquet format gives 3–5x query speedup in Athena. Glue Data Catalog as metastore for Athena schema discovery.
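The shard count in the diagram can be checked with simple arithmetic. Each provisioned shard accepts up to 1 MB/s or 1,000 records/s of writes, whichever limit is hit first. The sketch below treats 1 MB as 1,000,000 bytes for simplicity, and the average event size is an assumption you would replace with your own measurement.

```python
import math

RECORDS_PER_SHARD_PER_SECOND = 1_000
BYTES_PER_SHARD_PER_SECOND = 1_000_000  # 1 MB/s, approximated as 10^6 bytes


def shards_needed(events_per_minute: int, avg_event_bytes: int) -> int:
    """Minimum shards for the write load, before adding any headroom."""
    events_per_second = events_per_minute / 60
    by_records = events_per_second / RECORDS_PER_SHARD_PER_SECOND
    by_bytes = events_per_second * avg_event_bytes / BYTES_PER_SHARD_PER_SECOND
    return math.ceil(max(by_records, by_bytes))


if __name__ == "__main__":
    # 1 million events/minute is about 16,667 events/second.
    print(shards_needed(1_000_000, avg_event_bytes=1_000))
    print(shards_needed(1_000_000, avg_event_bytes=5_000))
```

At 1 KB per event the record-count limit dominates and about 17 shards suffice; the 100 shards in the diagram leave room for events of roughly 6 KB or for traffic spikes.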

Q37. How would you implement blue/green deployments on AWS?

Option 1 — Route 53 weighted routing:

Blue (current production): 100% weight
Deploy Green (new version) to a separate stack
Shift 10% → 50% → 100% traffic via Route 53 weights
Monitor metrics; instant rollback by setting Green weight to 0

Option 2 — ALB listener rules:

Two target groups: Blue (v1) and Green (v2)
Shift traffic by modifying target group weights on the listener
AWS CodeDeploy automates this for ECS and Lambda

Option 3 — CodeDeploy for Lambda:

CodeDeployDefault.LambdaLinear10PercentEvery1Minute: shift 10% of traffic to the new Lambda version every minute
Pre/PostTraffic hooks run validation Lambda functions before/after shift
Automatic rollback if CloudWatch alarms fire during deployment
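A linear traffic shift such as 10% every minute produces a predictable schedule, which is useful when deciding how long alarms need to observe the new version. This sketch only computes the schedule; it assumes the first increment is applied at minute 0, and the function name is illustrative.

```python
def linear_shift_schedule(step_percent: int, interval_minutes: int) -> list[tuple[int, int]]:
    """Return (minute, percent_on_new_version) pairs until 100% is reached."""
    schedule = []
    minute, percent = 0, 0
    while percent < 100:
        percent = min(100, percent + step_percent)
        schedule.append((minute, percent))
        minute += interval_minutes
    return schedule


if __name__ == "__main__":
    for minute, percent in linear_shift_schedule(10, 1):
        print(f"t+{minute} min: {percent}% on new version")
```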

Architecture question at Amazon L6, Flipkart Principal Engineer

Q38. How does EKS handle IAM authentication and authorization?

Authentication: AWS IAM via aws-iam-authenticator. The kubectl command generates a pre-signed STS URL token. The EKS control plane validates the token against IAM.
Authorization: Kubernetes RBAC. IAM identities are mapped to Kubernetes users/groups via the aws-auth ConfigMap in kube-system:

mapRoles:
- rolearn: arn:aws:iam::123456789012:role/developer-role
  username: developer
  groups:
    - system:masters  # cluster-admin (or custom RBAC groups)

IRSA (IAM Roles for Service Accounts): Associate a Kubernetes Service Account with an IAM Role using OIDC federation. Pods get AWS credentials scoped to their service account → no more node-level IAM roles sharing credentials across all pods.

Q39. What is AWS Outposts? When would you deploy it?

When to use:

Data residency requirements preventing cloud migration (banking, government, healthcare in certain jurisdictions)
Ultra-low latency local processing with cloud connectivity
Gradual hybrid migration strategy
Manufacturing/industrial edge computing where internet connectivity is unreliable

Available as Outposts racks (full racks delivered and installed by AWS) and Outposts servers (1U/2U, for smaller locations). Local Zones are a separate offering: AWS-operated facilities close to metro areas, not hardware installed on your premises.

Q40. Explain AWS Shield Standard vs. Advanced and how DDoS protection works.

Shield Standard (free, automatic):

Protects all AWS customers against Layer 3/4 attacks (SYN floods, UDP reflection)
Automatic detection and mitigation at the network edge

Shield Advanced ($3,000/month per organization):

Layer 7 protection (with WAF integration)
24/7 access to the Shield Response Team (SRT, formerly the DDoS Response Team)
Cost protection (AWS credits DDoS-related scaling charges)
Advanced attack diagnostics in real-time
Protects: EC2, ELB, CloudFront, Global Accelerator, Route 53

DDoS mitigation architecture:

Attacker
    |
CloudFront (absorbs HTTP floods at edge — 450+ PoPs)
    |
Shield Advanced (Layer 3/4 scrubbing)
    |
WAF (rate-based rules, IP reputation lists)
    |
ALB (forwards requests only to healthy targets)
    |
Your application (sees only clean traffic)

Asked at Razorpay, Zerodha security rounds

Q41. How do you implement cross-account access in AWS?

Pattern 1 — Cross-account IAM Role:

In Account B (target), create a role with a trust policy allowing Account A to assume it
In Account A, attach an IAM policy to users/roles allowing sts:AssumeRole on Account B's role
Application in Account A calls sts:AssumeRole → gets temporary credentials for Account B resources

Pattern 2 — Resource-based policies (S3, SQS, KMS, Lambda): Directly grant Account A principal access in the resource policy without assuming a role.

Pattern 3 — AWS Organizations + Service Control Policies (SCPs): SCPs are permission guardrails applied at OU/account level — they restrict what IAM policies CAN grant, even if the IAM policy allows it. Used for organization-wide compliance (e.g., "no one can disable CloudTrail", "all resources must be tagged").

Q42. What is the difference between CloudWatch Metrics, Logs, and X-Ray?

Tool	Purpose	Data Type
CloudWatch Metrics	Numeric time-series data (CPU, latency, error rate)	Numbers with dimensions
CloudWatch Logs	Log lines from applications and AWS services	Text streams
CloudWatch Logs Insights	Serverless SQL-like queries over log data	Query language
AWS X-Ray	Distributed tracing across services	Trace segments/subsegments

X-Ray provides an end-to-end trace view: API Gateway → Lambda → DynamoDB → external HTTP calls. It generates a service map showing latency contribution per service and error rates. Essential for debugging distributed latency in microservices.

Q43. How does S3 Transfer Acceleration work?

When it helps: Uploading from geographically distant clients (e.g., a user in India uploading to an S3 bucket in us-east-1 for compliance reasons). AWS provides a speed comparison tool at s3-accelerate-speedtest.s3-accelerate.amazonaws.com.

When it doesn't help (or hurts): Same-region uploads — the public internet path is comparable to the AWS backbone for short distances, and you'd pay the acceleration surcharge unnecessarily.

Q44. What is AWS Global Accelerator? How is it different from CloudFront?

Feature	CloudFront	Global Accelerator
Use case	HTTP/HTTPS content delivery and caching	TCP/UDP traffic acceleration, non-HTTP
Caching	Yes	No
Static IPs	No	Yes (2 anycast IPs)
Protocol	HTTP/HTTPS/WebSocket	Any TCP/UDP
Health routing	No (edge-to-origin)	Yes (routes around failures)
Best for	Web content, APIs with caching	Gaming, IoT, multi-region ALB failover

Global Accelerator's two static anycast IPs can be allowlisted in enterprise firewalls, which is critical for B2B SaaS. It also routes traffic to healthy endpoints across Regions automatically.

Q45. How do you encrypt data at rest in S3?

SSE-S3 (AES-256): AWS manages keys entirely. Zero configuration. Free. Default since January 2023 for all new objects.
SSE-KMS: Uses KMS CMK. You control key policy, audit usage via CloudTrail, enable key rotation. Adds KMS API call latency (~1ms) and cost ($0.03/10,000 requests).
DSSE-KMS: Dual-layer server-side encryption with KMS keys. Applies two independent layers of encryption to each object. Designed for workloads that must meet CNSSP 15 requirements for two layers of encryption.
SSE-C (Customer-Provided Keys): You provide the key per request. AWS encrypts, then discards the key. You manage key storage.

For client-side encryption: use the AWS Encryption SDK. Data is encrypted before leaving the client. AWS never sees plaintext.

Q46. What is Amazon Bedrock? How does it integrate with existing AWS services?

Integration patterns:

Knowledge Bases for Bedrock: RAG pipeline — upload documents to S3 → Bedrock ingests, chunks, embeds into a vector store (OpenSearch Serverless or Pinecone) → query via RetrieveAndGenerate API
Bedrock Agents: Multi-step reasoning agents that call Lambda functions as tools, query databases, and complete tasks autonomously
Guardrails: Content filtering, PII redaction, topic blocking — applied to inputs and outputs
Model Evaluation: Compare models on custom datasets before choosing

Rapidly becoming a standard interview topic at AI-forward companies (2026)

Q47. Explain AWS Well-Architected Framework pillars with examples.

Pillar	Key Principle	Example
Operational Excellence	Run and monitor systems, continuously improve	Use IaC (CloudFormation/CDK), implement runbooks, post-mortems
Security	Protect information and systems	Enable MFA, use KMS, rotate credentials, enable GuardDuty
Reliability	Recover from failures, meet demand	Multi-AZ RDS, circuit breakers, chaos engineering with FIS
Performance Efficiency	Use resources efficiently as demand changes	Choose right instance family, use CDN, profile before optimizing
Cost Optimization	Avoid unnecessary costs	Reserved Instances, Spot for batch, right-sizing
Sustainability	Minimize environmental impact	Graviton (better perf/watt), serverless, archive to Glacier

AWS Fault Injection Service (FIS) is the chaos engineering tool — injects CPU stress, AZ failures, latency on EC2/ECS/EKS to validate resilience.

Q48. How does Amazon Kinesis Data Streams handle shard splitting and merging?

Shard Splitting: Split one shard into two when you need more throughput. The old shard becomes read-only (existing data still readable until retention expires); new shards receive new writes.
Shard Merging: Merge two adjacent shards (by hash key range) to reduce cost when throughput drops.

Re-sharding considerations:

Enhanced fan-out consumers keep their dedicated 2 MB/s of read throughput per consumer per shard after a reshard, so splitting does not dilute their read capacity
Partition key hashing must be understood — all records with the same partition key go to the same shard (ordering guarantee within a key)
Use DescribeStreamSummary to check current shard count before splitting
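
The relationship between partition keys, hash key ranges, and resharding can be sketched in plain Python. Kinesis maps each partition key to a 128-bit integer with MD5 and assigns the record to the shard whose hash key range contains it. The model below is a simplified local illustration, not the service's implementation: shards are plain `(start, end)` tuples, and splits always happen at the midpoint.

```python
import hashlib

MAX_HASH_KEY = 2**128 - 1  # hash keys span 0 .. 2^128 - 1


def hash_partition_key(partition_key):
    """MD5 of the partition key, read as a 128-bit unsigned integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def even_shards(count):
    """Divide the hash key space into `count` contiguous (start, end) ranges."""
    size = (MAX_HASH_KEY + 1) // count
    ranges = []
    for i in range(count):
        start = i * size
        end = MAX_HASH_KEY if i == count - 1 else start + size - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key, shards):
    """Index of the shard whose range contains the key's hash."""
    value = hash_partition_key(partition_key)
    for index, (start, end) in enumerate(shards):
        if start <= value <= end:
            return index
    raise ValueError("hash key not covered by any shard")


def split_shard(shards, index):
    """Split one shard at the midpoint of its range into two children."""
    start, end = shards[index]
    mid = (start + end) // 2
    return shards[:index] + [(start, mid), (mid + 1, end)] + shards[index + 1:]


def merge_shards(shards, index):
    """Merge shard `index` with its right-hand neighbour (adjacent ranges only)."""
    (start, _), (_, end) = shards[index], shards[index + 1]
    return shards[:index] + [(start, end)] + shards[index + 2:]


if __name__ == "__main__":
    shards = even_shards(2)
    print("before split:", shard_for("user-42", shards))
    print("after split: ", shard_for("user-42", split_shard(shards, 0)))
```

Because the hash is deterministic, a given partition key always lands in exactly one open shard, which is what preserves per-key ordering. In the real SplitShard API the caller chooses the split point (the starting hash key of the second child), so an uneven split can be used to isolate a hot key range.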

Q49. What is Service Control Policy (SCP) in AWS Organizations? Give a practical example.

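A Service Control Policy (SCP) is an AWS Organizations policy that sets the maximum permissions available to the accounts in an organization or OU. SCPs grant no permissions themselves; they only limit what IAM policies in those accounts can allow.

Practical example: prevent disabling security services.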

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDisablingSecurityServices",
      "Effect": "Deny",
      "Action": [
        "guardduty:DeleteDetector",
        "guardduty:DisassociateFromMasterAccount",
        "guardduty:DisassociateFromAdministratorAccount",
        "cloudtrail:StopLogging",
        "cloudtrail:DeleteTrail",
        "config:DeleteConfigurationRecorder"
      ],
      "Resource": "*"
    }
  ]
}

Attached to the organization root, this SCP prevents every member account, including each member account's root user, from disabling security monitoring. SCPs do not apply to the management account, so that account needs its own safeguards. Essential for enterprise governance.
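
To reason about what a policy like this blocks, a simplified local check can help. The sketch below tests only whether an action matches an explicit Deny statement. It deliberately ignores `Resource`, `Condition`, and `NotAction`, and it is not how AWS evaluates policies internally; `explicitly_denied` is an illustrative helper name.

```python
import fnmatch
import json

SCP_JSON = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDisablingSecurityServices",
      "Effect": "Deny",
      "Action": [
        "guardduty:DeleteDetector",
        "guardduty:DisassociateFromMasterAccount",
        "cloudtrail:StopLogging",
        "cloudtrail:DeleteTrail",
        "config:DeleteConfigurationRecorder"
      ],
      "Resource": "*"
    }
  ]
}
"""


def explicitly_denied(policy, action):
    """True if any Deny statement's Action list matches `action`."""
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Deny":
            continue
        patterns = statement.get("Action", [])
        if isinstance(patterns, str):
            patterns = [patterns]
        # IAM action names are case-insensitive and allow * and ? wildcards.
        if any(fnmatch.fnmatchcase(action.lower(), p.lower()) for p in patterns):
            return True
    return False


if __name__ == "__main__":
    policy = json.loads(SCP_JSON)
    for action in ("cloudtrail:StopLogging", "s3:GetObject"):
        print(action, "denied" if explicitly_denied(policy, action) else "not denied by this SCP")
```

An action that is not explicitly denied is still only permitted if another SCP (such as the default FullAWSAccess policy) and the principal's IAM policies allow it.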

Asked at senior/lead level at fintech companies

Q50. How do you troubleshoot a Lambda function timing out at 15 minutes?

Diagnosis steps:

Check CloudWatch Logs for the function — identify which part is slow (add timestamps/X-Ray traces)
Check Lambda Insights (enhanced monitoring) for memory pressure, CPU throttling, init duration
Check downstream service latency: DynamoDB, RDS, external HTTP calls, S3
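
One lightweight way to find the slow part is to time each stage and log the results, as in the sketch below. The stage names and the sleep are stand-ins for real work, and `LocalContext` is a hypothetical substitute for the context object the Lambda runtime passes in (only its `get_remaining_time_in_millis()` method is used).

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, timings):
    """Record how long a stage takes, in milliseconds, and log it."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = (time.perf_counter() - start) * 1000
        print(f"stage={stage} duration_ms={timings[stage]:.1f}")


def handler(event, context):
    timings = {}
    with timed("load", timings):
        time.sleep(0.01)  # stand-in for an S3 or DynamoDB read
    with timed("process", timings):
        total = sum(range(1000))  # stand-in for CPU work
    # The Lambda runtime reports how much time is left before the timeout.
    print(f"remaining_ms={context.get_remaining_time_in_millis()}")
    return {"total": total, "timings": timings, "slowest": max(timings, key=timings.get)}


class LocalContext:
    """Hypothetical stand-in for the Lambda context, for local runs only."""

    def get_remaining_time_in_millis(self):
        return 1000


if __name__ == "__main__":
    print(handler({}, LocalContext())["slowest"])
```

The printed lines end up in CloudWatch Logs, where the per-stage durations can be compared across invocations.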

Solutions:

Break into smaller functions: Use Step Functions to orchestrate a workflow instead of one monolithic Lambda
Async pattern: Have the API return 202 Accepted immediately, queue the work on SQS/SNS for a worker Lambda, and let the client poll for results
Optimize the bottleneck: Add connection pooling for RDS (RDS Proxy), add ElastiCache, use batch operations instead of loops
Switch to Fargate/ECS: For truly long-running tasks (video processing, ML inference), Fargate has no 15-minute limit
Increase Lambda memory: Lambda allocates CPU in proportion to configured memory (roughly one vCPU at 1,769 MB). Going from 512MB to 2GB often cuts runtime by 60-75% for CPU-bound tasks
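
The memory-versus-runtime trade-off can be estimated with simple arithmetic. The sketch below assumes a perfectly CPU-bound, single-threaded task whose runtime scales inversely with its CPU share, and uses the documented figure of one vCPU equivalent at 1,769 MB. Real functions rarely scale this cleanly, so treat the output as a starting point for measurement, not a prediction.

```python
MB_PER_VCPU = 1769  # Lambda provides about one vCPU equivalent at 1,769 MB


def vcpu_share(memory_mb, max_threads=1):
    """CPU available to the task, capped by how many threads it can use."""
    return min(memory_mb / MB_PER_VCPU, max_threads)


def estimated_runtime(baseline_seconds, baseline_mb, new_mb, max_threads=1):
    """Scale a measured runtime by the ratio of old to new CPU share."""
    old_share = vcpu_share(baseline_mb, max_threads)
    new_share = vcpu_share(new_mb, max_threads)
    return baseline_seconds * old_share / new_share


def gb_seconds(duration_seconds, memory_mb):
    """Billed compute, the unit Lambda duration charges are based on."""
    return duration_seconds * memory_mb / 1024


if __name__ == "__main__":
    before = 600.0  # hypothetical measured runtime at 512 MB
    after = estimated_runtime(before, 512, 2048)
    print(f"estimated runtime: {after:.0f} s ({1 - after / before:.0%} shorter)")
    print(f"GB-seconds: {gb_seconds(before, 512):.0f} -> {gb_seconds(after, 2048):.0f}")
```

Under these assumptions a 600-second run at 512 MB drops to roughly 174 seconds at 2 GB, about 71% shorter, while billed GB-seconds rise slightly. A single-threaded task gains nothing beyond 1,769 MB, which is why the estimate caps the CPU share at `max_threads`.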

FAQ Section — Your AWS Career Questions Answered

Q: Which AWS certification should I get first? Start with AWS Certified Solutions Architect - Associate. It's the most recognized cloud cert globally and has the highest ROI for career growth — many engineers report Rs 3-5 LPA salary bumps just from this one certification. It covers all core services and is required/preferred at most companies. After that, pursue the Solutions Architect - Professional or DevOps Engineer - Professional depending on your role.

Q: What is the difference between AWS and Azure for Indian companies? AWS dominates India with data centers in Mumbai (ap-south-1) and Hyderabad (ap-south-2). Most Indian unicorns (Flipkart, Razorpay, Zerodha, CRED, Swiggy) run on AWS. Azure is stronger in enterprises using Microsoft stack. GCP is growing in AI/ML workloads.

Q: Is Terraform or CloudFormation better for AWS? For AWS-only teams: CloudFormation or CDK is easier (native integration, no state file management). For multi-cloud or teams with strong Terraform expertise: Terraform. Most companies use Terraform in practice because of the larger ecosystem and better multi-account/multi-region patterns (Terragrunt, Atlantis).

Q: What salary can I expect for AWS roles in India in 2026? Here are the real numbers from verified offers: AWS Cloud Engineer (3-5 years): Rs 18-35 LPA. AWS Architect (7+ years): Rs 40-80 LPA. AWS at FAANG (Amazon): Rs 60 LPA-1.5 Cr+ including RSUs. DevOps/SRE with strong AWS: Rs 20-50 LPA at product companies. The cloud skills gap in India is massive — demand far exceeds supply.

Q: What is the difference between Reserved Instances and Savings Plans? Reserved Instances are tied to a specific instance type, region, and OS. Savings Plans are more flexible — Compute Savings Plans cover any EC2 instance, Fargate, and Lambda regardless of type, size, or region within the commitment amount. Savings Plans are generally recommended now over RIs.

Q: How do you handle secrets in AWS Lambda? Never hardcode secrets. Use AWS Secrets Manager (auto-rotation, versioning, fine-grained IAM policies) or SSM Parameter Store (SecureString type, free tier available). Fetch secrets at function startup and cache them in memory. Avoid environment variables for highly sensitive secrets, because their values are visible in the Lambda console to anyone who can view the function configuration.
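
A minimal sketch of the fetch-once-and-cache pattern is shown below. The `SecretCache` class is illustrative, not an AWS library; the fetch function is injected so the example runs without AWS access, and the five-minute TTL is an arbitrary assumption chosen so rotated secrets are eventually picked up.

```python
import time


class SecretCache:
    """Cache secret values in memory and refetch them after a TTL."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._values = {}  # name -> (value, expires_at)

    def get(self, name):
        now = self._clock()
        cached = self._values.get(name)
        if cached and cached[1] > now:
            return cached[0]
        value = self._fetch(name)
        self._values[name] = (value, now + self._ttl)
        return value


# In a real function the fetcher would call Secrets Manager, for example:
#   client = boto3.client("secretsmanager")
#   fetch = lambda name: client.get_secret_value(SecretId=name)["SecretString"]
# Create the cache at module level so warm invocations reuse it.

if __name__ == "__main__":
    cache = SecretCache(fetch=lambda name: f"value-of-{name}")
    print(cache.get("db/password"))  # "db/password" is a hypothetical secret name
```

Keeping the cache outside the handler means only cold starts and expired entries pay for a Secrets Manager call.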

Q: What is the difference between ECS on EC2 and ECS on Fargate? With EC2 launch type, you manage the underlying EC2 instances (patching, scaling, capacity planning). With Fargate, you specify CPU and memory per task and AWS manages the underlying infrastructure. Fargate is ~20-30% more expensive but eliminates operational overhead. Use Fargate for most workloads unless you have specific kernel requirements or need GPU access.

Q: How does AWS handle data sovereignty for Indian customers? AWS's Mumbai (ap-south-1) and Hyderabad (ap-south-2) Regions let customers keep data in India: data stays in the Region you choose unless you move or replicate it elsewhere. RBI, SEBI, and IRDAI regulated entities can use these Regions to comply with data localization requirements. Compliance is a shared responsibility. AWS holds certifications such as ISO 27001, SOC 2, and PCI DSS for its infrastructure; customers inherit those infrastructure controls but remain responsible for the compliance of their own workloads.

Summary: What Companies Actually Ask

Company	Focus Areas
Amazon (AWS SDE/SRE)	DynamoDB deep dives, distributed systems, Lambda internals, cost optimization, Leadership Principles
Flipkart	Multi-region architecture, Kinesis streaming, EKS, cost optimization at scale
Razorpay	VPC security, WAF, KMS, compliance (PCI DSS), API Gateway, Lambda
PhonePe	High-availability patterns, RDS Aurora, caching strategies, incident response
Zerodha	Security (IAM, SCP, GuardDuty), CloudTrail, cost optimization, minimal infrastructure philosophy
Swiggy/Zomato	Auto-scaling for traffic spikes, ECS/EKS, ElastiCache, SQS patterns

Keep building your cloud & infrastructure interview toolkit:

DevOps Interview Questions 2026 — CI/CD, Terraform, monitoring
Kubernetes Interview Questions 2026 — Container orchestration mastery
Docker Interview Questions 2026 — Container fundamentals
System Design Interview Questions 2026 — Architect systems on AWS
Microservices Interview Questions 2026 — Distributed application patterns
Data Engineering Interview Questions 2026 — Build data pipelines on AWS
