PapersAdda

AWS Interview Questions 2026 — Top 50 with Expert Answers

Last Updated: 30 Mar 2026

AWS certifications command a 25-30% salary premium in India, and AWS skills appear in 74% of all cloud job postings. AWS powers over 32% of the global cloud market and remains the most in-demand platform for engineers worldwide. Whether you're interviewing at Amazon itself, a fintech like Razorpay, or a startup scaling on cloud-native infrastructure, AWS knowledge is non-negotiable in 2026. This guide compiles 50 real questions from interviews at Amazon, Flipkart, Razorpay, PhonePe, Zerodha, and Swiggy — with the authoritative answers that get offers, organized from beginner to advanced.

Cloud roles are the fastest path to high-paying tech careers in India. AWS Cloud Architect roles command Rs 40-80 LPA at product companies. This guide is your roadmap to getting there.



Beginner-Level AWS Questions (Q1-Q15)

These questions are asked at every AWS interview from Wipro to Amazon. Get them right with confidence, and the interviewer immediately takes you seriously for the harder questions.

Q1. What is the difference between a Region, Availability Zone, and Edge Location in AWS?

Concept | Definition | Example
Region | A geographic area with 3+ AZs | ap-south-1 (Mumbai)
Availability Zone (AZ) | One or more discrete data centers with redundant power/networking | ap-south-1a, ap-south-1b
Edge Location | CDN node used by CloudFront and Route 53 | Mumbai, Chennai
Local Zone | Extension of a Region closer to users | Delhi (ap-south-1-del-1)

Regions are completely isolated from each other for fault tolerance. AZs within a region are connected by low-latency fiber links. Edge Locations are not full Regions — they only serve cached content and DNS requests.

Asked by: Amazon, Wipro, Infosys L2 interviews


Q2. What is EC2? Explain instance types and when to use each.

Family | Optimized For | Example Use Case
t3/t4g | Burstable CPU (dev/test) | Development servers
m6i/m7g | General purpose | Application servers
c6i/c7g | Compute intensive | Video encoding, ML inference
r6i/r7g | Memory intensive | In-memory caches, SAP HANA
p3/p4 | GPU | Deep learning training
i3/i4i | High I/O NVMe | Databases, Hadoop
d2/d3 | Dense HDD storage | Data warehousing

The "g" suffix (e.g., m7g) indicates AWS Graviton (ARM-based) processors. Graviton instances are typically around 20% cheaper per hour than comparable x86 instances, and AWS cites up to 40% better price-performance.

Asked by: Flipkart, Myntra, Amazon SDE-2


Q3. What is S3 and what are its storage classes?

Storage classes compared:

Class | Use Case | Retrieval Time | Min Duration | Cost (approx, per GB-month)
S3 Standard | Frequently accessed data | Milliseconds | None | $0.023/GB
S3 Intelligent-Tiering | Unknown access patterns | Milliseconds | None | $0.023/GB + monitoring
S3 Standard-IA | Infrequent access | Milliseconds | 30 days | $0.0125/GB
S3 One Zone-IA | Non-critical infrequent | Milliseconds | 30 days | $0.01/GB
S3 Glacier Instant | Archive with fast retrieval | Milliseconds | 90 days | $0.004/GB
S3 Glacier Flexible | Archives (expedited retrieval in 1–5 min) | Minutes–hours | 90 days | $0.0036/GB
S3 Glacier Deep Archive | Long-term compliance | Within 12 hours | 180 days | $0.00099/GB

S3 buckets are Region-specific, but bucket names must be globally unique.
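
To make the price gap between classes concrete, here is a minimal sketch that multiplies a data size by the approximate per-GB prices from the table above. The function name is hypothetical, and the figures cover storage only: request, retrieval, and monitoring fees and minimum-duration charges are ignored, so treat the output as a rough comparison rather than a bill.

```python
# Approximate storage prices (USD per GB-month) copied from the table above.
PRICE_PER_GB = {
    "S3 Standard": 0.023,
    "S3 Standard-IA": 0.0125,
    "S3 One Zone-IA": 0.01,
    "S3 Glacier Instant": 0.004,
    "S3 Glacier Flexible": 0.0036,
    "S3 Glacier Deep Archive": 0.00099,
}


def monthly_storage_cost(size_gb: float, storage_class: str) -> float:
    """Return the storage-only monthly cost in USD, rounded to cents."""
    return round(size_gb * PRICE_PER_GB[storage_class], 2)


if __name__ == "__main__":
    # Compare what 1 TB (1,024 GB) would cost per month in each class.
    for name in PRICE_PER_GB:
        print(f"{name:<26} ${monthly_storage_cost(1024, name):>7.2f}")
```

In an interview, the useful takeaway is the ratio: at these prices Deep Archive storage is roughly 23x cheaper than Standard, which is why lifecycle rules matter for cold data.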


Q4. What is IAM? Explain users, groups, roles, and policies.

  • User: A permanent identity for a human or application (has long-term credentials — access key + secret)
  • Group: Collection of users sharing the same permissions (e.g., "developers" group)
  • Role: Temporary identity assumed by services, EC2 instances, Lambda, or cross-account entities (no long-term credentials — uses STS)
  • Policy: JSON document defining permissions. Two main types:
    • Identity-based: Attached to users/groups/roles
    • Resource-based: Attached to resources (S3 bucket policy, SQS queue policy)

Best practice: Never use root account for daily operations. Follow the principle of least privilege. Prefer roles over long-term access keys for EC2 and Lambda.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}

Asked by: Amazon, Razorpay, PhonePe


Q5. What is a VPC? What components does it contain?

Core components:

VPC (10.0.0.0/16)
├── Public Subnet (10.0.1.0/24)
│   ├── EC2 instances with public IPs
│   └── NAT Gateway
├── Private Subnet (10.0.2.0/24)
│   ├── RDS databases
│   └── Application servers
├── Internet Gateway (IGW) — entry/exit for public traffic
├── Route Tables — define traffic routing per subnet
├── Security Groups — stateful instance-level firewall
├── Network ACLs — stateless subnet-level firewall
└── VPC Endpoints — private access to AWS services

Key difference: Security Groups are stateful (return traffic automatically allowed), NACLs are stateless (you must explicitly allow inbound AND outbound). SGs operate at instance level; NACLs at subnet level.


Q6. What is the difference between EBS, EFS, and S3?

Feature | EBS | EFS | S3
Type | Block storage | File storage (NFS) | Object storage
Attached to | Single EC2 (mostly) | Multiple EC2s simultaneously | Not attached; accessed via API/URL
Protocol | Block device | NFS v4 | HTTP REST
Performance | Up to 256,000 IOPS (io2 BE) | Scales automatically |
Use case | OS disk, databases | Shared file system | Backups, static assets, data lake
Pricing | Per GB provisioned | Per GB stored | Per GB stored + requests
Multi-AZ | Replicated within AZ | Yes (Regional) | Yes (multiple AZs)

Asked by: Infosys, TCS Digital, Wipro Elite


Q7. What is Auto Scaling and how does it work?

  1. Target Tracking: Maintain a metric at a target value (e.g., keep CPU at 50%)
  2. Step Scaling: Scale by different amounts based on breach thresholds
  3. Scheduled Scaling: Scale at specific times (e.g., scale up before market open at 9 AM IST)

Components: Launch Template (defines AMI, instance type, security groups) + Auto Scaling Group (defines min/max/desired capacity + VPC subnets) + Scaling Policy.

Cooldown period (default 300 seconds) prevents rapid scale in/out oscillation.
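
As a rough illustration of target tracking, the sketch below scales capacity in proportion to how far the metric is from its target and clamps the result to the group's min/max. This is an approximation for intuition only: the function name and parameters are hypothetical, and the real service also relies on CloudWatch alarms, warm-up and cooldown periods, and more conservative scale-in behavior.

```python
import math


def target_tracking_desired(current_capacity, metric_value, target_value,
                            min_size, max_size):
    """Approximate the capacity needed to bring the metric back to target."""
    # Proportional estimate: more load per instance means more instances.
    raw = math.ceil(current_capacity * metric_value / target_value)
    # The Auto Scaling Group never goes outside its min/max bounds.
    return max(min_size, min(max_size, raw))
```

For example, 4 instances at 80% CPU with a 50% target gives ceil(4 × 80 / 50) = 7 instances, assuming load spreads evenly across the group.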


Q8. What is CloudFront and how does it differ from S3?

S3 is the origin storage. CloudFront sits in front of S3 (or EC2, ALB) to:

  • Cache static content at edge
  • Terminate SSL/TLS at edge
  • Apply WAF rules
  • Sign URLs for private content
  • Compress content with gzip/brotli

Origin Access Control (OAC) replaces the old OAI — it restricts S3 bucket access only to CloudFront, so your S3 URL is never exposed publicly.


Q9. What is Route 53? What routing policies does it support?

Routing Policy | Use Case
Simple | Single resource
Weighted | A/B testing, canary deployments (e.g., 90% v1, 10% v2)
Latency-based | Route users to the lowest-latency Region
Failover | Active-passive DR with health checks
Geolocation | Route by user's geographic location
Geoproximity | Route by geographic proximity (with bias)
Multi-Value | Return multiple IPs, basic load balancing
IP-based | Route based on client IP ranges (introduced in 2022)

Health checks can monitor endpoints and trigger failover automatically.
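
Weighted routing is easiest to reason about as a weighted random choice: each record is returned with probability weight / sum of weights. The sketch below simulates the 90/10 canary split from the table. The record names and the helper function are hypothetical, and this models only the selection logic, not DNS caching or health checks.

```python
import random
from collections import Counter


def pick_weighted(records, rng):
    """Pick one record name with probability weight / sum(weights)."""
    names = [name for name, _ in records]
    weights = [weight for _, weight in records]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical canary setup: 90% of answers point at v1, 10% at v2.
records = [("v1", 90), ("v2", 10)]
rng = random.Random(42)  # seeded so the simulation is repeatable
counts = Counter(pick_weighted(records, rng) for _ in range(10_000))
```

Over many lookups the observed split approaches 90/10, and a record with weight 0 is never returned, which is how a rollback to the old version works in practice.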


Q10. What is the difference between Application Load Balancer (ALB) and Network Load Balancer (NLB)?

Feature | ALB | NLB
Layer | Layer 7 (HTTP/HTTPS/WebSocket) | Layer 4 (TCP/UDP/TLS)
Routing | Path, header, host, query, IP | Port and protocol
Static IP | No (DNS only) | Yes (Elastic IP per AZ)
Performance | | Ultra-low latency, millions of RPS
WebSockets | Yes | Yes
gRPC | Yes |
Best for | Microservices, HTTP APIs | Gaming, IoT, financial trading
SSL Termination | At ALB | At NLB or pass-through

A third type, Gateway Load Balancer (GWLB), is used for inline network appliances (firewalls, IDS).

Asked by: Amazon, Swiggy, Zomato, Razorpay


Q11. What is Lambda? What are its limits?

Key limits (2026):

  • Max execution timeout: 15 minutes
  • Memory: 128 MB – 10 GB
  • Ephemeral disk (/tmp): 512 MB – 10 GB (configurable)
  • Deployment package: 50 MB (zipped direct upload), 250 MB (unzipped), 10 GB (container image)
  • Concurrent executions: 1,000 per account per Region by default (can be raised with a Service Quotas request)
  • Max invocation payload: 6 MB (synchronous request and response), 256 KB (asynchronous)

Lambda integrates natively with API Gateway, S3, DynamoDB Streams, Kinesis, SQS, SNS, EventBridge, and 200+ more services.


Q12. What is RDS? Which engines does it support?

Supported engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora (MySQL- and PostgreSQL-compatible, AWS proprietary).

RDS Multi-AZ: Synchronous replication to a standby in a different AZ for high availability. Automatic failover in 1–2 minutes. Standby cannot be used for reads (use Read Replicas for that).

Read Replicas: Asynchronous replication. Supports up to 15 replicas (Aurora). Can be promoted to standalone. Can be in different Regions (cross-region read replicas).


Q13. What is DynamoDB?

Core concepts:

  • Partition Key (PK): Distributes data across partitions. Must be unique for simple primary keys.
  • Sort Key (SK): Optional secondary dimension. PK + SK must be unique together.
  • GSI (Global Secondary Index): Alternate access patterns with different PK/SK; can span all partitions
  • LSI (Local Secondary Index): Same PK, different SK; must be created at table creation

Capacity modes:

  • On-demand: Pay per request. Best for unpredictable traffic.
  • Provisioned: Set RCU/WCU. Use with Auto Scaling. Better for predictable, steady workloads.

DynamoDB Streams + Lambda = real-time event-driven pipelines.

Asked by: Amazon, Flipkart, Meesho


Q14. What is CloudFormation?

Template anatomy:

AWSTemplateFormatVersion: '2010-09-09'
Description: My application stack
Parameters:
  InstanceType:
    Type: String
    Default: t3.micro
Resources:
  MyEC2:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref InstanceType
      ImageId: ami-0abcdef1234567890
Outputs:
  InstanceId:
    Value: !Ref MyEC2

Change Sets let you preview changes before applying. Drift Detection identifies manual changes to resources outside CloudFormation.

CloudFormation vs. Terraform: CloudFormation is AWS-only but has tighter native integration. Terraform is multi-cloud and has a larger ecosystem but requires more setup.


Q15. What is the Shared Responsibility Model?

AWS Responsibility ("Security OF the cloud") | Customer Responsibility ("Security IN the cloud")
Physical infrastructure (data centers) | Customer data encryption
Hardware and network | IAM configuration and policies
Hypervisor | Operating system patching (for EC2)
Managed service software | Application-level security
Global network | Security groups and NACLs
AWS Region/AZ infrastructure | Compliance certifications for their apps

For managed services like RDS, Lambda, S3 — AWS takes more responsibility (OS, runtime). For EC2, the customer manages everything above the hypervisor.

One of the most commonly asked conceptual questions at all levels.


Intermediate-Level AWS Questions (Q16-Q35)

This is the section that separates "I've used the console" from "I've built production systems." These questions come up in SDE-2 and Cloud Architect interviews at Amazon, Flipkart, and Razorpay.

Q16. What is the difference between VPC Peering, Transit Gateway, and PrivateLink?

Feature | VPC Peering | Transit Gateway | PrivateLink
Connectivity | 1-to-1 VPC | Hub-and-spoke for many VPCs | Service endpoint to VPC
Transitive routing | No | Yes | N/A
Cost | Free (data transfer fees apply) | Per attachment + data | Per endpoint + data
Cross-account | Yes | Yes | Yes
Cross-region | Yes | Yes | No (Regional)
Use case | Few VPCs | 10+ VPCs, multi-account | Access AWS or SaaS services privately

PrivateLink exposes a service from one VPC to consumers in other VPCs without the networks being peered — traffic never leaves AWS backbone.

Asked by: Amazon senior rounds, Razorpay, PhonePe infrastructure teams


Q17. What is AWS ECS vs. EKS? When would you choose each?

Feature | ECS | EKS
Orchestrator | AWS proprietary | Kubernetes
Learning curve | Lower | Higher
Portability | AWS-only | Multi-cloud (K8s standard)
Control plane cost | Free | $0.10/hour per cluster (~$73/month)
Fargate support | Yes | Yes
Ecosystem | AWS-native integrations | Massive CNCF ecosystem
Debugging | AWS Console, CloudWatch | kubectl, K8s dashboards

Choose ECS when: team is new to containers, you want tight AWS integration, minimal operational overhead matters more than portability.

Choose EKS when: you need Kubernetes compatibility, multi-cloud strategy, using Helm charts, custom operators, or service meshes like Istio.

Asked by: Amazon, Swiggy, Razorpay, Flipkart


Q18. How does Lambda handle concurrency? What is provisioned concurrency?

  • Unreserved concurrency: Shared pool, default 1,000/account/Region
  • Reserved concurrency: Guarantees a set amount for one function (isolates it from others); also acts as a throttle cap
  • Provisioned concurrency: Pre-warms instances to eliminate cold starts — critical for latency-sensitive APIs

Cold start breakdown:

  1. AWS spins up a new execution environment (container)
  2. Downloads deployment package
  3. Initializes runtime (JVM, Node, Python interpreter)
  4. Runs init code outside the handler

For Java/JVM functions, cold starts can take 2–10 seconds. Provisioned concurrency keeps N execution environments initialized, so requests routed to them skip the init phase and see consistently low start-up latency.
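
A common follow-up is "how much concurrency do you need?". A minimal sketch of the usual estimate, concurrency = requests per second × average duration in seconds, is shown here. The function and constant names are hypothetical, and the 1,000 figure is the default account quota quoted above.

```python
import math

DEFAULT_ACCOUNT_CONCURRENCY = 1_000  # default per account per Region


def required_concurrency(requests_per_second: float, avg_duration_s: float) -> int:
    """Estimate concurrent executions: arrival rate times time in flight."""
    return math.ceil(requests_per_second * avg_duration_s)


if __name__ == "__main__":
    needed = required_concurrency(4_000, 0.5)
    print(needed, "concurrent executions needed")
    print("quota increase required:", needed > DEFAULT_ACCOUNT_CONCURRENCY)
```

At 200 requests per second and 0.5 s per invocation, about 100 environments are busy at any moment. The same arithmetic is a reasonable starting point for sizing provisioned concurrency.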


Q19. Design a highly available 3-tier web application on AWS.

Architecture (text diagram):

Internet
    |
Route 53 (latency-based routing, health checks)
    |
CloudFront (SSL termination, WAF, edge caching)
    |
ALB (Application Load Balancer — Multi-AZ)
   / \
EC2 ASG (AZ-a)    EC2 ASG (AZ-b)     [Web/App Tier — Private Subnet]
       \               /
        ElastiCache (Redis — in-memory session, cache)
            |
        RDS Aurora (Multi-AZ — writer in AZ-a, reader in AZ-b)

Key HA decisions:

  • Route 53 health checks trigger failover at DNS level
  • ALB distributes across 2+ AZs
  • ASG min=2 ensures at least one instance per AZ
  • RDS Multi-AZ gives automatic failover (<120 seconds)
  • ElastiCache Multi-AZ replication group
  • S3 + CloudFront for static assets (decoupled from compute)

Classic architecture question at Amazon L5/L6 and Flipkart SDE-2/SDE-3


Q20. What is SQS vs. SNS vs. EventBridge? When do you use each?

Service | Pattern | Delivery | Retention | Consumers
SQS | Queue (point-to-point) | Pull | 4 days by default (60 seconds to 14 days) | One consumer per message
SNS | Pub/Sub (fan-out) | Push | No storage | Multiple
EventBridge | Event bus | Push | Optional archive | Rules-based routing

Common pattern — Fan-out: SNS topic → multiple SQS queues (so each microservice gets its own copy of the event).

EventBridge is preferred for event-driven architectures — it can filter events by content, route to Lambda/SQS/Step Functions/API destinations, and integrate with 200+ SaaS sources.


Q21. What is S3 Object Lock and how does it work?

  • Governance mode: Users with special IAM permissions can bypass the lock
  • Compliance mode: Nobody (including root) can delete the object until the retention period expires — used for regulatory compliance (SEBI, HIPAA, SOC2)

Legal Hold: Independent of retention period — can be applied/removed by privileged users.

Required for: financial records retention, healthcare data, compliance with India's DPDP Act.


Q22. Explain AWS WAF. What rules can you configure?

Rule types:

  • IP set rules: Block/allow specific IP ranges
  • Rate-based rules: Throttle IPs exceeding X requests per 5 minutes (DDoS protection)
  • Managed rule groups: AWS-managed rules for OWASP Top 10, known bad IPs, SQL injection, XSS
  • Custom rules: Match on request components (URI, headers, query strings, body — first 8 KB)

Each Web ACL contains rules with Allow/Block/Count/CAPTCHA actions. Rules are evaluated in priority order.

Asked by: Razorpay, PhonePe, Zerodha (security-focused rounds)


Q23. What is AWS KMS? How does envelope encryption work?

Envelope encryption:

  1. A KMS key (formerly called a Customer Master Key, or CMK) is created in KMS; its key material never leaves KMS
  2. Application requests a Data Encryption Key (DEK) from KMS
  3. KMS returns: plaintext DEK + encrypted DEK
  4. Application encrypts data with plaintext DEK in memory
  5. Plaintext DEK is discarded; encrypted DEK is stored alongside encrypted data
  6. To decrypt: send encrypted DEK to KMS → get back plaintext DEK → decrypt data

This pattern means KMS is only involved during key wrapping/unwrapping, not for bulk data encryption, keeping latency and cost low.
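
The flow above can be sketched end to end with a toy model. Everything here is an assumption for illustration: FakeKms is a hypothetical in-memory stand-in for KMS (its method names only loosely mirror the GenerateDataKey and Decrypt operations), and the XOR keystream cipher is NOT secure. Real code would call KMS through an SDK and use an authenticated cipher such as AES-GCM.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (SHA-256 keystream XOR). NOT secure; demo only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class FakeKms:
    """Hypothetical stand-in for KMS: the master key stays inside this object."""

    def __init__(self):
        self._master_key = secrets.token_bytes(32)

    def generate_data_key(self):
        # Steps 2-3: return the DEK in plaintext and wrapped by the master key.
        plaintext_dek = secrets.token_bytes(32)
        return plaintext_dek, _keystream_xor(self._master_key, plaintext_dek)

    def decrypt(self, encrypted_dek: bytes) -> bytes:
        return _keystream_xor(self._master_key, encrypted_dek)


def encrypt(kms, plaintext: bytes):
    dek, encrypted_dek = kms.generate_data_key()
    ciphertext = _keystream_xor(dek, plaintext)  # step 4: encrypt locally
    del dek                                      # step 5: drop our plaintext DEK reference
    return encrypted_dek, ciphertext             # stored side by side


def decrypt(kms, encrypted_dek: bytes, ciphertext: bytes) -> bytes:
    dek = kms.decrypt(encrypted_dek)             # step 6: unwrap the DEK via KMS
    return _keystream_xor(dek, ciphertext)
```

Note that the bulk data never passes through the FakeKms object; only the 32-byte key does, which is the point of the pattern.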


Q24. How does DynamoDB handle hot partitions? How do you fix them?

Solutions:

  1. Write sharding: Append a random suffix (0–9) to the PK so that writes to one hot key spread across 10 partition key values; reads must then scatter-gather across all suffixes
  2. Caching layer: DAX (DynamoDB Accelerator), a DynamoDB-specific in-memory cache with microsecond read latency, or ElastiCache; either reduces read load on hot items
  3. Better PK design: Choose high-cardinality attributes (UUID, user_id, order_id)
  4. Adaptive capacity (automatic): DynamoDB shifts capacity to hot partitions automatically within limits
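
Write sharding is the option interviewers most often ask you to sketch. The example below uses an in-memory dictionary as a stand-in for a DynamoDB table, so the helper names and the "pk#suffix" key format are assumptions rather than a DynamoDB API. It shows the trade-off: writes spread across 10 keys, while reads must query every suffix and merge.

```python
import random
from collections import defaultdict

SHARDS = 10  # suffixes 0-9

table = defaultdict(list)  # in-memory stand-in for a DynamoDB table


def sharded_key(pk: str, rng=random) -> str:
    """Append a random suffix so one hot key becomes SHARDS distinct keys."""
    return f"{pk}#{rng.randrange(SHARDS)}"


def put_item(pk: str, item, rng=random) -> None:
    table[sharded_key(pk, rng)].append(item)


def query_all(pk: str) -> list:
    """Scatter-gather read: query each suffix and merge the results."""
    results = []
    for suffix in range(SHARDS):
        results.extend(table.get(f"{pk}#{suffix}", []))
    return results
```

Against a real table, each suffix would be a separate Query call, usually issued in parallel. A deterministic suffix (for example, a hash of the order ID) is a common variation when single items must be read back without scatter-gather.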

Deep-dive question at Amazon, Flipkart data platform teams


Q25. What is AWS CloudTrail and how does it differ from CloudWatch?

Feature | CloudTrail | CloudWatch
Purpose | API audit log | Metrics, logs, alarms, dashboards
What it tracks | Who did what (API calls) | How resources are performing
Data type | Events (JSON) | Metrics, log streams
Retention | 90 days (free) or S3 | Configurable (indefinite)
Use case | Security audit, compliance | Ops monitoring, alerting

CloudTrail records every API call: who made it (user/role/service), from which IP, at what time, what resource was targeted, and whether it succeeded. Management events are enabled by default; data events (S3 object-level, Lambda invocations) cost extra.


Q26. What is Elastic Beanstalk? When would you NOT use it?

Do NOT use Elastic Beanstalk when:

  • You need fine-grained control over infrastructure configuration
  • You're running microservices that need container orchestration
  • You need to customize the underlying OS or runtime beyond what EB offers
  • Your team already uses Terraform/CDK for IaC (EB's environment configs don't integrate cleanly)
  • Cost optimization is critical (you'd over-provision with EB's opinionated setup)

Good for: monolithic applications, teams new to AWS, proof-of-concepts.


Q27. What is AWS Step Functions? Give a real use case.

Real use case — Order processing pipeline:

[Start]
  → ValidateOrder (Lambda)
  → CheckInventory (Lambda)
      → [Choice] In stock?
          → YES: ReserveInventory → ProcessPayment → SendConfirmation → [End]
          → NO: NotifyOutOfStock → RefundPayment → [End]
  → [Catch] PaymentFailed → RollbackInventory → SendFailureNotification → [End]

Step Functions handles retries, timeouts, parallel execution, and error handling automatically. Standard Workflows are for long-running (1 year max), exactly-once execution. Express Workflows are for high-throughput, short-duration (5 min max), at-least-once execution.

Asked by: Amazon, Flipkart, Myntra backend rounds


Q28. What is the difference between NAT Gateway and NAT Instance?

Feature | NAT Gateway | NAT Instance
Managed by | AWS | Customer
Availability | Highly available (within AZ) | Single EC2; SPOF unless you manage HA
Bandwidth | Up to 100 Gbps (auto-scales) | Limited by instance size
Security groups | Cannot attach | Can attach
Cost | $0.045/hour + $0.045/GB | EC2 pricing + management overhead
Maintenance | None | Patching, monitoring required

Rule: Always use NAT Gateway in production. NAT Instances are obsolete except for very cost-sensitive dev environments.


Q29. How would you optimize AWS costs for a startup running 24/7 workloads?

  1. Reserved Instances (RI): 1-year or 3-year commitment for EC2, RDS, ElastiCache → up to 72% savings vs. On-Demand
  2. Savings Plans: More flexible than RIs — compute savings plans cover any EC2, Lambda, Fargate; 66% max savings
  3. Spot Instances: 70–90% cheaper than On-Demand for interruption-tolerant workloads (batch jobs, ML training, CI runners)
  4. Graviton instances: Switch t3→t4g, m5→m7g → 20–40% cost reduction, better perf/dollar
  5. S3 Intelligent-Tiering: Auto-moves infrequently accessed objects to cheaper tiers
  6. RDS Aurora Serverless v2: Scales to 0 ACUs in dev/staging (no cost when idle)
  7. Lambda instead of always-on EC2: For bursty or low-frequency workloads
  8. AWS Cost Explorer + Budgets: Set billing alarms, identify waste
  9. Right-sizing: Use Compute Optimizer recommendations to downsize over-provisioned instances
  10. Delete zombie resources: Unused EIPs ($3.65/month each), idle NAT Gateways, forgotten snapshots

Heavily asked at startup interviews (Razorpay, Zerodha, CRED)


Q30. What is AWS Config and how does it enforce compliance?

How it works:

  1. Config Recorder captures resource configurations and changes
  2. Config Rules evaluate configurations (AWS Managed Rules or custom Lambda rules)
  3. Non-compliant resources are flagged
  4. Remediation Actions (manual or auto via SSM Automation) fix violations

Common rules:

  • ec2-instance-no-public-ip — alert on EC2s with public IPs in private subnets
  • s3-bucket-public-read-prohibited — detect accidentally public S3 buckets
  • iam-root-access-key-check — ensure root has no access keys
  • rds-multi-az-support — enforce Multi-AZ for production RDS


Q31. What is the difference between STS AssumeRole and IAM instance profiles?

  • Instance Profile: An IAM role attached to an EC2 instance. The EC2 metadata service (169.254.169.254) vends temporary credentials automatically via the IMDSv2 endpoint. Applications running on EC2 call the metadata endpoint and get credentials without any configuration.
  • STS AssumeRole: An explicit API call (sts:AssumeRole) to obtain temporary credentials for a role. Used for cross-account access, federated identity (SSO), Lambda assuming another role, or EKS pods via IRSA.

IMDSv2 (Instance Metadata Service v2) is session-oriented: the client must first obtain a session token with a PUT request and then send that token as a header on every metadata call. This mitigates SSRF attacks that steal EC2 role credentials, the technique behind the 2019 Capital One breach.
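
The two-step IMDSv2 exchange can be sketched with the standard library alone. The helper names below are hypothetical; the endpoint paths and header names follow the IMDSv2 convention. The request builders work anywhere, but fetch_role_names only succeeds when run on an EC2 instance with an instance profile attached.

```python
import urllib.request

IMDS = "http://169.254.169.254/latest"


def token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: GET request that presents the session token as a header."""
    return urllib.request.Request(
        f"{IMDS}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_role_names(timeout: float = 2.0) -> str:
    """Only works on EC2: returns the name(s) of the attached IAM role."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    req = metadata_request("iam/security-credentials/", token)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode()
```

The PUT requirement is what blunts most SSRF attacks: a vulnerable application that can only be tricked into issuing GET requests cannot obtain the token. In practice the AWS SDKs perform this exchange for you.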


Q32. What is Aurora Global Database?

Use cases:

  • Global applications needing low-latency reads worldwide
  • Disaster recovery with RPO of ~1 second and RTO of <1 minute

Failover: In a disaster, you can promote a secondary Region to primary. Global Write Forwarding allows secondary Regions to write — Aurora routes the write to the primary automatically (slight latency increase).

Cost: ~20% more than standard Aurora due to cross-region replication costs.


Q33. How does API Gateway handle throttling?

  1. Account-level: 10,000 requests per second (RPS) steady-state with a burst limit of 5,000 requests, per account per Region (adjustable)
  2. Stage-level: Set default throttling per stage
  3. Method-level: Override per route (e.g., /payment stricter than /health)
  4. Usage Plans + API Keys: Throttle per API consumer (for public APIs with paying customers)

When throttled, clients receive HTTP 429 Too Many Requests. Implement exponential backoff + jitter in clients. Use SQS as a buffer in front of Lambda for bursty ingestion instead of direct API Gateway → Lambda if you can tolerate async processing.
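
Rate-plus-burst limits like these are commonly modeled as a token bucket, paired on the client side with exponential backoff and jitter. The sketch below is a simplified model under stated assumptions: the class and function names are hypothetical, time is passed in explicitly (starting at 0) so the behavior is deterministic, and the backoff uses the "full jitter" variant.

```python
import random


class TokenBucket:
    """Refills at rate_per_s tokens per second, holding at most `burst` tokens."""

    def __init__(self, rate_per_s: float, burst: int):
        self.rate = rate_per_s
        self.capacity = burst
        self.tokens = float(burst)
        self.last = 0.0  # timestamp of the previous call, in seconds

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the last call, capped at burst.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a gateway would answer HTTP 429 here


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 5.0, rng=random) -> float:
    """Full-jitter backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return rng.uniform(0, min(cap, base * 2 ** attempt))
```

With TokenBucket(10_000, 5_000), up to 5,000 requests can arrive at the same instant, after which requests are admitted at the refill rate. Jitter matters because clients that all retry on the same schedule recreate the spike that caused the throttling.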


Q34. What is AWS CDK vs. CloudFormation vs. Terraform?

Feature | CloudFormation | CDK | Terraform
Language | YAML/JSON | TypeScript, Python, Java, C#, Go | HCL
Multi-cloud | No | No | Yes
Abstraction | Low | High (L3 constructs) | Medium
State management | Managed by AWS | Deploys via CloudFormation | Local/remote tfstate
Import existing resources | Limited | Limited | Yes (terraform import)
Module ecosystem | None | Construct Hub | Terraform Registry
Community | Medium | Growing | Very large

CDK synthesizes into CloudFormation templates — so ultimately it's CloudFormation under the hood. CDK's high-level constructs (L3) encapsulate best practices (e.g., ApplicationLoadBalancedFargateService = ALB + ECS Fargate with sane defaults in one construct).

Asked at senior/architect level interviews


Q35. Explain SQS visibility timeout and dead-letter queues.

Visibility Timeout: When a consumer reads a message from SQS, the message becomes invisible to other consumers for the visibility timeout duration (default 30 seconds, max 12 hours). If the consumer successfully processes and deletes the message within this window, it's gone. If it crashes or takes too long, the message reappears for another consumer to pick up.

Dead-Letter Queue (DLQ): After a message is received N times (maxReceiveCount) without being deleted, SQS moves it to the DLQ. The DLQ is a regular SQS queue used for:

  • Debugging: inspect why messages failed
  • Alerting: CloudWatch alarm on DLQ depth > 0
  • Replaying: fix the consumer, then move messages back to the main queue

For FIFO queues, DLQs must also be FIFO.
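
The interplay between visibility timeout and maxReceiveCount is easier to see in a small simulation. This is a toy in-memory model, not the SQS API: the class, its methods, and the explicit `now` parameter (seconds) are all assumptions made for illustration.

```python
class ToyQueue:
    """In-memory model of visibility timeout and dead-letter redrive."""

    def __init__(self, visibility_timeout=30, max_receive_count=3):
        self.visibility_timeout = visibility_timeout
        self.max_receive_count = max_receive_count
        self.messages = {}  # id -> {"body", "visible_at", "receives"}
        self.dlq = []       # bodies of messages that exhausted their receives
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self.messages[self._next_id] = {"body": body, "visible_at": 0, "receives": 0}
        return self._next_id

    def receive(self, now):
        for msg_id, msg in list(self.messages.items()):
            if msg["visible_at"] > now:
                continue  # in flight: hidden from other consumers
            if msg["receives"] >= self.max_receive_count:
                # Received too many times without a delete: move to the DLQ.
                self.dlq.append(self.messages.pop(msg_id)["body"])
                continue
            msg["receives"] += 1
            msg["visible_at"] = now + self.visibility_timeout
            return msg_id, msg["body"]
        return None

    def delete(self, msg_id):
        """Called by a consumer after successful processing."""
        self.messages.pop(msg_id, None)
```

A consumer that calls delete before the timeout removes the message for good. One that crashes lets it reappear, and after max_receive_count failed attempts the message lands in the DLQ instead of looping forever.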


Advanced-Level AWS Questions (Q36-Q50)

Don't skip the Advanced section — this is where interviewers separate Rs 20 LPA from Rs 50+ LPA candidates. These questions are asked at Amazon SDE-3, Flipkart L3, and senior cloud architect roles.

Q36. Design a serverless data pipeline ingesting 1 million events/minute on AWS.

Architecture:

Mobile/Web Apps
    |
Kinesis Data Streams (100 shards × 1MB/s = 100MB/s capacity)
    |
Kinesis Data Firehose (buffers, transforms, delivers)
   /        \
S3 (raw)    Lambda (real-time processing, enrich/filter)
   |              |
AWS Glue       DynamoDB (hot path — last 5 min aggregations)
(batch ETL)
   |
S3 (parquet, Hive-partitioned: year/month/day/hour)
   |
Amazon Athena (ad-hoc SQL on S3)
   |
QuickSight (dashboards)

Key choices: Kinesis over SQS for ordered, replay-capable, high-throughput streaming. Firehose auto-scales and handles delivery retries. Parquet format gives 3–5x query speedup in Athena. Glue Data Catalog as metastore for Athena schema discovery.
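
Interviewers often ask how the shard count was derived. A rough sizing sketch follows, assuming the per-shard write limits of 1 MB/s and 1,000 records/s and a hypothetical average event size. The function name is illustrative, and 1 MB is treated as 1,000,000 bytes for simplicity.

```python
import math

SHARD_WRITE_BYTES_PER_S = 1_000_000  # 1 MB/s per shard, as in the diagram
SHARD_WRITE_RECORDS_PER_S = 1_000    # per-shard record limit


def shards_needed(events_per_minute: int, avg_event_bytes: int) -> int:
    """Return the minimum shard count that satisfies both write limits."""
    events_per_s = events_per_minute / 60
    by_records = math.ceil(events_per_s / SHARD_WRITE_RECORDS_PER_S)
    by_bytes = math.ceil(events_per_s * avg_event_bytes / SHARD_WRITE_BYTES_PER_S)
    return max(by_records, by_bytes)
```

One million events per minute is about 16,667 events per second, so at least 17 shards are needed on record count alone, and 5 KB events push the byte-based minimum to 84. The 100 shards in the diagram leave headroom for traffic spikes and uneven partition keys.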


Q37. How would you implement blue/green deployments on AWS?

Option 1 — Route 53 weighted routing:

  • Blue (current production): 100% weight
  • Deploy Green (new version) to a separate stack
  • Shift 10% → 50% → 100% traffic via Route 53 weights
  • Monitor metrics; instant rollback by setting Green weight to 0

Option 2 — ALB listener rules:

  • Two target groups: Blue (v1) and Green (v2)
  • Shift traffic by modifying target group weights on the listener
  • AWS CodeDeploy automates this for ECS and Lambda

Option 3 — CodeDeploy for Lambda:

  • Linear10PercentEvery1Minute: shift 10% of traffic to new Lambda version every minute
  • Pre/PostTraffic hooks run validation Lambda functions before/after shift
  • Automatic rollback if CloudWatch alarms fire during deployment
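
To visualize a linear rollout such as Linear10PercentEvery1Minute, the sketch below generates the traffic schedule. The function name is hypothetical, and it assumes the first increment is applied at the start of the deployment; it models only the percentages, not the alarm checks or hooks.

```python
def linear_shift_schedule(step_percent: int, interval_minutes: int):
    """Return (minute, percent_on_new_version) pairs for a linear rollout."""
    schedule = []
    percent, minute = 0, 0
    while percent < 100:
        percent = min(100, percent + step_percent)  # never exceed 100%
        schedule.append((minute, percent))
        minute += interval_minutes
    return schedule


if __name__ == "__main__":
    for minute, percent in linear_shift_schedule(10, 1):
        print(f"t+{minute:>2} min: {percent:>3}% on new version")
```

Under these assumptions a 10%-per-minute rollout reaches 100% at minute 9, which gives CloudWatch alarms nine evaluation windows in which to trigger a rollback.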

Architecture question at Amazon L6, Flipkart Principal Engineer


Q38. How does EKS handle IAM authentication and authorization?

  1. Authentication: AWS IAM via aws-iam-authenticator. The kubectl command generates a pre-signed STS URL token. The EKS control plane validates the token against IAM.

  2. Authorization: Kubernetes RBAC. IAM identities are mapped to Kubernetes users/groups via the aws-auth ConfigMap in kube-system:

mapRoles:
- rolearn: arn:aws:iam::123456789012:role/developer-role
  username: developer
  groups:
    - system:masters  # cluster-admin (or custom RBAC groups)

IRSA (IAM Roles for Service Accounts): Associate a Kubernetes Service Account with an IAM Role using OIDC federation. Pods get AWS credentials scoped to their service account → no more node-level IAM roles sharing credentials across all pods.


Q39. What is AWS Outposts? When would you deploy it?

When to use:

  • Data residency requirements preventing cloud migration (banking, government, healthcare in certain jurisdictions)
  • Ultra-low latency local processing with cloud connectivity
  • Gradual hybrid migration strategy
  • Manufacturing/industrial edge computing where internet connectivity is unreliable

Outposts comes in two form factors: Outposts rack (a full rack delivered and installed by AWS) and Outposts servers (1U/2U, for smaller locations). Local Zones are a separate offering: AWS-operated facilities close to metro areas, not hardware on your premises.


Q40. Explain AWS Shield Standard vs. Advanced and how DDoS protection works.

Shield Standard (free, automatic):

  • Protects all AWS customers against Layer 3/4 attacks (SYN floods, UDP reflection)
  • Automatic detection and mitigation at the network edge

Shield Advanced ($3,000/month per organization):

  • Layer 7 protection (with WAF integration)
  • 24/7 access to the Shield Response Team (SRT, formerly the DDoS Response Team, DRT)
  • Cost protection (AWS credits DDoS-related scaling charges)
  • Advanced attack diagnostics in real-time
  • Protects: EC2, ELB, CloudFront, Global Accelerator, Route 53

DDoS mitigation architecture:

Attacker
    |
CloudFront (absorbs HTTP floods at edge — 450+ PoPs)
    |
Shield Advanced (Layer 3/4 scrubbing)
    |
WAF (rate-based rules, IP reputation lists)
    |
ALB (health checks drop bad traffic)
    |
Your application (sees only clean traffic)

Asked at Razorpay, Zerodha security rounds


Q41. How do you implement cross-account access in AWS?

Pattern 1 — Cross-account IAM Role:

  1. In Account B (target), create a role with a trust policy allowing Account A to assume it
  2. In Account A, attach an IAM policy to users/roles allowing sts:AssumeRole on Account B's role
  3. Application in Account A calls sts:AssumeRole → gets temporary credentials for Account B resources

Pattern 2 — Resource-based policies (S3, SQS, KMS, Lambda): Directly grant Account A principal access in the resource policy without assuming a role.

Pattern 3 — AWS Organizations + Service Control Policies (SCPs): SCPs are permission guardrails applied at OU/account level — they restrict what IAM policies CAN grant, even if the IAM policy allows it. Used for organization-wide compliance (e.g., "no one can disable CloudTrail", "all resources must be tagged").


Q42. What is the difference between CloudWatch Metrics, Logs, and X-Ray?

Tool | Purpose | Data Type
CloudWatch Metrics | Numeric time-series data (CPU, latency, error rate) | Numbers with dimensions
CloudWatch Logs | Log lines from applications and AWS services | Text streams
CloudWatch Logs Insights | Serverless SQL-like queries over log data | Query language
AWS X-Ray | Distributed tracing across services | Trace segments/subsegments

X-Ray provides an end-to-end trace view: API Gateway → Lambda → DynamoDB → external HTTP calls. It generates a service map showing latency contribution per service and error rates. Essential for debugging distributed latency in microservices.


Q43. How does S3 Transfer Acceleration work?

When it helps: Uploading from geographically distant clients (e.g., a user in India uploading to an S3 bucket in us-east-1 for compliance reasons). AWS provides a speed comparison tool at s3-accelerate-speedtest.s3-accelerate.amazonaws.com.

When it doesn't help (or hurts): Same-region uploads — the public internet path is comparable to the AWS backbone for short distances, and you'd pay the acceleration surcharge unnecessarily.


Q44. What is AWS Global Accelerator? How is it different from CloudFront?

Feature | CloudFront | Global Accelerator
Use case | HTTP/HTTPS content delivery and caching | TCP/UDP traffic acceleration, non-HTTP
Caching | Yes | No
Static IPs | No | Yes (2 anycast IPs)
Protocol | HTTP/HTTPS/WebSocket | Any TCP/UDP
Health routing | No (edge-to-origin) | Yes (routes around failures)
Best for | Web content, APIs with caching | Gaming, IoT, multi-region ALB failover

Global Accelerator's two static anycast IPs can be allowlisted in enterprise firewalls, which is critical for B2B SaaS. It also routes traffic to healthy endpoints across Regions automatically.


Q45. How do you encrypt data at rest in S3?

  1. SSE-S3 (AES-256): AWS manages keys entirely. Zero configuration. Free. Default since January 2023 for all new objects.
  2. SSE-KMS: Uses a KMS key (formerly CMK). You control the key policy, audit usage via CloudTrail, and enable key rotation. Adds KMS API call latency (~1ms) and cost ($0.03/10,000 requests).
  3. DSSE-KMS (dual-layer server-side encryption with KMS keys): Applies two independent layers of encryption to each object. Aimed at workloads that must meet CNSSP 15 requirements for two layers of encryption.
  4. SSE-C (Customer-Provided Keys): You provide the key per request. AWS encrypts, then discards the key. You manage key storage.

For client-side encryption: use the AWS Encryption SDK. Data is encrypted before leaving the client. AWS never sees plaintext.


Q46. What is Amazon Bedrock? How does it integrate with existing AWS services?

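Amazon Bedrock is a fully managed service that exposes foundation models from multiple providers through a single API, with no model infrastructure to run yourself.

Integration patterns: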

  • Knowledge Bases for Bedrock: RAG pipeline — upload documents to S3 → Bedrock ingests, chunks, embeds into a vector store (OpenSearch Serverless or Pinecone) → query via RetrieveAndGenerate API
  • Bedrock Agents: Multi-step reasoning agents that call Lambda functions as tools, query databases, and complete tasks autonomously
  • Guardrails: Content filtering, PII redaction, topic blocking — applied to inputs and outputs
  • Model Evaluation: Compare models on custom datasets before choosing
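To make the Knowledge Bases pattern concrete, here is a hedged sketch of the request shape for the RetrieveAndGenerate API. The helper name `build_rag_request` is an assumption, and the knowledge base ID and model ARN are placeholders you would replace with your own values.

```python
def build_rag_request(question: str, knowledge_base_id: str, model_arn: str) -> dict:
    """Build keyword arguments for a RetrieveAndGenerate call.

    Bedrock retrieves matching chunks from the knowledge base's vector
    store and passes them to the given model to generate the answer.
    """
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }
```

With boto3 the dictionary would be passed as `boto3.client("bedrock-agent-runtime").retrieve_and_generate(**request)`, and the generated answer read from `response["output"]["text"]`.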

Rapidly becoming a standard interview topic at AI-forward companies (2026)


Q47. Explain AWS Well-Architected Framework pillars with examples.

Pillar | Key Principle | Example
Operational Excellence | Run and monitor systems, continuously improve | Use IaC (CloudFormation/CDK), implement runbooks, post-mortems
Security | Protect information and systems | Enable MFA, use KMS, rotate credentials, enable GuardDuty
Reliability | Recover from failures, meet demand | Multi-AZ RDS, circuit breakers, chaos engineering with FIS
Performance Efficiency | Use resources efficiently as demand changes | Choose right instance family, use CDN, profile before optimizing
Cost Optimization | Avoid unnecessary costs | Reserved Instances, Spot for batch, right-sizing
Sustainability | Minimize environmental impact | Graviton (better perf/watt), serverless, archive to Glacier

AWS Fault Injection Service (FIS) is the chaos engineering tool — injects CPU stress, AZ failures, latency on EC2/ECS/EKS to validate resilience.


Q48. How does Amazon Kinesis Data Streams handle shard splitting and merging?

  • Shard Splitting: Split one shard into two when you need more throughput (each shard accepts up to 1 MB/s or 1,000 records/s of writes). The parent shard is closed to new writes but stays readable until its data passes the retention period; the two child shards receive new writes.
  • Shard Merging: Merge two adjacent shards (by hash key range) to reduce cost when throughput drops.

Re-sharding considerations:

  • Enhanced fan-out consumers (dedicated 2 MB/s per consumer per shard) are not affected
  • Partition key hashing must be understood — all records with the same partition key go to the same shard (ordering guarantee within a key)
  • Use DescribeStreamSummary to check current shard count before splitting
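The hashing and splitting behaviour described above can be sketched in a few lines. Kinesis maps each partition key to a 128-bit integer with MD5 and assigns it to the shard whose hash key range contains that value. The function names and the shard IDs below are illustrative assumptions, and the even midpoint split is only one possible choice of new starting hash key.

```python
import hashlib
from typing import Dict, Tuple

MAX_HASH = 2**128 - 1  # upper bound of the Kinesis hash key space

HashRange = Tuple[int, int]  # inclusive (start, end)


def hash_key(partition_key: str) -> int:
    """Map a partition key to its 128-bit hash value (MD5, big-endian)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def split_range(start: int, end: int) -> Tuple[HashRange, HashRange]:
    """Split a parent range at its midpoint into two child ranges."""
    new_start = start + (end - start + 1) // 2  # starting hash key of child 2
    return (start, new_start - 1), (new_start, end)


def find_shard(shards: Dict[str, HashRange], partition_key: str) -> str:
    """Return the ID of the shard whose range contains the key's hash."""
    value = hash_key(partition_key)
    for shard_id, (start, end) in shards.items():
        if start <= value <= end:
            return shard_id
    raise LookupError("no shard covers this hash value")


if __name__ == "__main__":
    left, right = split_range(0, MAX_HASH)
    children = {"child-a": left, "child-b": right}  # hypothetical shard IDs
    print(find_shard(children, "user-42"))
```

Because the hash is deterministic, every record with the same partition key lands in the same child shard after a split, which is what preserves per-key ordering for new writes.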

Q49. What is Service Control Policy (SCP) in AWS Organizations? Give a practical example.

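A Service Control Policy sets the maximum permissions available to the accounts in an organization or organizational unit (OU). It grants nothing by itself; it only limits what IAM policies in those accounts are able to allow.

Practical example (prevent disabling security services):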

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDisablingSecurityServices",
      "Effect": "Deny",
      "Action": [
        "guardduty:DeleteDetector",
        "guardduty:DisassociateFromMasterAccount",
        "cloudtrail:StopLogging",
        "cloudtrail:DeleteTrail",
        "config:DeleteConfigurationRecorder"
      ],
      "Resource": "*"
    }
  ]
}

Attached to the organization root, this SCP prevents every member account, including that account's root user, from disabling security monitoring. SCPs do not restrict the management account, so protect it separately. Essential for enterprise governance.
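The effect of an explicit Deny can be illustrated with a deliberately simplified evaluator. This is a sketch under stated assumptions, not how AWS implements policy evaluation: it ignores `Resource` and `Condition`, and it does not model the rule that an SCP must also allow an action. The function name `scp_denies` is hypothetical.

```python
from fnmatch import fnmatchcase

# The Deny statement from the SCP, reduced to the fields this sketch uses.
SCP = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": [
                "guardduty:DeleteDetector",
                "guardduty:DisassociateFromMasterAccount",
                "cloudtrail:StopLogging",
                "cloudtrail:DeleteTrail",
                "config:DeleteConfigurationRecorder",
            ],
            "Resource": "*",
        }
    ],
}


def scp_denies(policy: dict, action: str) -> bool:
    """Return True if any Deny statement in the policy matches the action.

    Action names are matched case-insensitively, and '*' wildcards in the
    policy are honoured. An explicit Deny wins over any IAM Allow.
    """
    for statement in policy["Statement"]:
        if statement["Effect"] != "Deny":
            continue
        patterns = statement["Action"]
        if isinstance(patterns, str):
            patterns = [patterns]
        if any(fnmatchcase(action.lower(), p.lower()) for p in patterns):
            return True
    return False


if __name__ == "__main__":
    print(scp_denies(SCP, "cloudtrail:StopLogging"))
    print(scp_denies(SCP, "s3:GetObject"))
```

A quick check like this is handy in a CI pipeline to confirm that a proposed SCP change still covers the actions you intend to block, before the real policy is attached.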

Asked at senior/lead level at fintech companies


Q50. How do you troubleshoot a Lambda function timing out at 15 minutes?

Diagnosis steps:

  1. Check CloudWatch Logs for the function — identify which part is slow (add timestamps/X-Ray traces)
  2. Check Lambda Insights (enhanced monitoring) for memory pressure, CPU throttling, init duration
  3. Check downstream service latency: DynamoDB, RDS, external HTTP calls, S3

Solutions:

  1. Break into smaller functions: Use Step Functions to orchestrate a workflow instead of one monolithic Lambda
  2. Async pattern: Trigger Lambda from SQS/SNS, return 202 Accepted immediately, poll for results
  3. Optimize the bottleneck: Add connection pooling for RDS (RDS Proxy), add ElastiCache, use batch operations instead of loops
  4. Switch to Fargate/ECS: For truly long-running tasks (video processing, ML inference), Fargate has no 15-minute limit
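  5. Increase Lambda memory: CPU is allocated in proportion to memory (one full vCPU at 1,769 MB). Going from 512MB to 2GB often cuts runtime by 60-75% for CPU-bound tasks; it does little for functions that spend their time waiting on I/O.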
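One way to apply the "break into smaller functions" idea is to have the handler watch its own time budget and hand back a cursor before it times out. The sketch below assumes an event shape (`items`, `cursor`), a placeholder `process` function, and a 10-second safety margin, all of which are hypothetical; `get_remaining_time_in_millis()` is the method the Lambda context object provides.

```python
SAFETY_MARGIN_MS = 10_000  # stop early, leaving time to return cleanly


def process(item):
    """Hypothetical unit of work; replace with the real per-item logic."""
    return item


def handler(event, context):
    """Process items until done or until the time budget is nearly spent.

    Returns a cursor so an orchestrator (for example a Step Functions
    loop) can invoke the function again from where it stopped.
    """
    items = event["items"]
    cursor = event.get("cursor", 0)
    while cursor < len(items):
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            return {"done": False, "cursor": cursor}
        process(items[cursor])
        cursor += 1
    return {"done": True, "cursor": cursor}
```

A Step Functions Choice state can then loop back to the same task while `done` is false, so no single invocation needs to approach the 15-minute limit.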

FAQ Section — Your AWS Career Questions Answered

Q: Which AWS certification should I get first? Start with AWS Certified Solutions Architect - Associate. It's the most recognized cloud cert globally and has the highest ROI for career growth — many engineers report Rs 3-5 LPA salary bumps just from this one certification. It covers all core services and is required/preferred at most companies. After that, pursue the Solutions Architect - Professional or DevOps Engineer - Professional depending on your role.

Q: What is the difference between AWS and Azure for Indian companies? AWS dominates India with data centers in Mumbai (ap-south-1) and Hyderabad (ap-south-2). Most Indian unicorns (Flipkart, Razorpay, Zerodha, CRED, Swiggy) run on AWS. Azure is stronger in enterprises using Microsoft stack. GCP is growing in AI/ML workloads.

Q: Is Terraform or CloudFormation better for AWS? For AWS-only teams: CloudFormation or CDK is easier (native integration, no state file management). For multi-cloud or teams with strong Terraform expertise: Terraform. Most companies use Terraform in practice because of the larger ecosystem and better multi-account/multi-region patterns (Terragrunt, Atlantis).

Q: What salary can I expect for AWS roles in India in 2026? Here are the real numbers from verified offers: AWS Cloud Engineer (3-5 years): Rs 18-35 LPA. AWS Architect (7+ years): Rs 40-80 LPA. AWS at FAANG (Amazon): Rs 60 LPA-1.5 Cr+ including RSUs. DevOps/SRE with strong AWS: Rs 20-50 LPA at product companies. The cloud skills gap in India is massive — demand far exceeds supply.

Q: What is the difference between Reserved Instances and Savings Plans? Standard Reserved Instances are tied to a specific instance family, region, OS, and tenancy (Convertible RIs allow exchanges in return for a smaller discount). Savings Plans are more flexible: Compute Savings Plans cover EC2, Fargate, and Lambda usage regardless of instance family, size, or region, up to the committed hourly spend. Savings Plans are now generally recommended over RIs.

Q: How do you handle secrets in AWS Lambda? Never hardcode secrets. Use AWS Secrets Manager (auto-rotation, versioning, fine-grained IAM policies) or SSM Parameter Store (SecureString type, free tier available). Access secrets at function startup and cache in memory (not in environment variables for highly sensitive secrets, as they're visible in the console).
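A minimal sketch of the "fetch at startup and cache in memory" advice. The class name `SecretCache`, the 5-minute TTL, and the injected `fetch` callable are assumptions made for illustration; in a real function `fetch` would wrap a Secrets Manager or Parameter Store call.

```python
import time
from typing import Callable, Dict, Tuple


class SecretCache:
    """Cache secret values in memory and refresh them after a TTL."""

    def __init__(self,
                 fetch: Callable[[str], str],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # function that loads a secret by name
        self._ttl = ttl_seconds      # how long a cached value stays valid
        self._clock = clock          # injectable for testing
        self._store: Dict[str, Tuple[str, float]] = {}

    def get(self, name: str) -> str:
        now = self._clock()
        cached = self._store.get(name)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]         # still fresh: no network call
        value = self._fetch(name)    # expired or missing: reload
        self._store[name] = (value, now)
        return value
```

Create the cache at module scope so it survives warm invocations. The real `fetch` could be `lambda name: boto3.client("secretsmanager").get_secret_value(SecretId=name)["SecretString"]`, and the TTL keeps the function from holding a rotated secret for too long.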

Q: What is the difference between ECS on EC2 and ECS on Fargate? With EC2 launch type, you manage the underlying EC2 instances (patching, scaling, capacity planning). With Fargate, you specify CPU and memory per task and AWS manages the underlying infrastructure. Fargate is ~20-30% more expensive but eliminates operational overhead. Use Fargate for most workloads unless you have specific kernel requirements or need GPU access.

Q: How does AWS handle data sovereignty for Indian customers? AWS's Mumbai (ap-south-1) and Hyderabad (ap-south-2) Regions let you keep data in India: data stays in the Region you choose unless you configure replication or transfer it elsewhere. RBI, SEBI, and IRDAI regulated entities can use these regions to comply with data localization requirements. Compliance follows the shared responsibility model: AWS holds certifications like ISO 27001, SOC 2, and PCI DSS for its infrastructure, and customers inherit those infrastructure controls but remain responsible for the compliance of their own workloads.


Summary: What Companies Actually Ask

Company | Focus Areas
Amazon (AWS SDE/SRE) | DynamoDB deep dives, distributed systems, Lambda internals, cost optimization, Leadership Principles
Flipkart | Multi-region architecture, Kinesis streaming, EKS, cost optimization at scale
Razorpay | VPC security, WAF, KMS, compliance (PCI DSS), API Gateway, Lambda
PhonePe | High-availability patterns, RDS Aurora, caching strategies, incident response
Zerodha | Security (IAM, SCP, GuardDuty), CloudTrail, cost optimization, minimal infrastructure philosophy
Swiggy/Zomato | Auto-scaling for traffic spikes, ECS/EKS, ElastiCache, SQS patterns
