AWS Cost Optimisation
10 Cost Optimisation Opportunities I Find in Every Well-Architected Review
After conducting dozens of Well-Architected Reviews, I keep seeing the same cost optimisation opportunities. Here are the 10 most common ways startups overspend on AWS—and how to fix them.
Cloud Associates
Last month, I worked with a startup spending $4,200/month on AWS. By the end of our 2-week review, I’d identified $1,680/month in savings: 40% of their bill, without touching performance or reliability.
The crazy part? None of these optimisations required sophisticated engineering. They were all straightforward configuration changes, rightsizing decisions, and removing waste.
After conducting dozens of Well-Architected Reviews, I keep seeing the same patterns. This guide covers the 10 most common ways startups overspend, how to spot them in your environment, and step-by-step fixes.
Before We Start: The Cost Optimisation Mindset
Cost optimisation is NOT:
- Choosing the cheapest option regardless of impact
- Sacrificing reliability or performance to save $50/month
- A one-time exercise
Cost optimisation IS:
- Using the right resources for your workload
- Removing waste (unused resources, inefficient configurations)
- Ongoing practice (review quarterly)
Every optimisation in this guide maintains or improves your infrastructure while reducing costs.
Opportunity #1: Unused or Underutilised EC2 Instances
What I Usually Find
Typical scenario:
- 5 EC2 instances running 24/7
- 2 of them have <5% average CPU utilisation over 30 days
- 1 is completely stopped but still has attached EBS volumes
How it happens:
- Someone spins up an instance for testing
- Forgets to terminate it after testing
- Instance runs for 6 months, costing $15-60/month
How to Identify
# List all running instances (ID, type, Name tag)
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" \
  --query 'Reservations[*].Instances[*].[InstanceId,InstanceType,Tags[?Key==`Name`].Value|[0]]' \
  --output table
# Check CloudWatch for average CPU utilisation (last 30 days; date -d requires GNU date)
for instance in $(aws ec2 describe-instances \
    --filters "Name=instance-state-name,Values=running" \
    --query 'Reservations[*].Instances[*].InstanceId' --output text); do
  avg_cpu=$(aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value="$instance" \
    --start-time "$(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%S)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
    --period 2592000 \
    --statistics Average \
    --query 'Datapoints[0].Average' \
    --output text)
  echo "$instance: Average CPU = ${avg_cpu}%"
done
How to Fix
For unused instances (stopped or <5% CPU for 30 days):
# Terminate unused instance
aws ec2 terminate-instances --instance-ids i-1234567890abcdef0
# Delete associated EBS volumes (billed even when instance is stopped)
aws ec2 delete-volume --volume-id vol-123456
For underutilised instances (5-20% CPU consistently):
- Rightsize to smaller instance type
- Example: t3.large (2 vCPU, 8GB) → t3.medium (2 vCPU, 4GB) saves $30/month
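# Resize instance (requires stop/start; wait until it is fully stopped before modifying)
aws ec2 stop-instances --instance-ids i-1234567890abcdef0
aws ec2 wait instance-stopped --instance-ids i-1234567890abcdef0
aws ec2 modify-instance-attribute --instance-id i-1234567890abcdef0 --instance-type "{\"Value\": \"t3.medium\"}"
aws ec2 start-instances --instance-ids i-1234567890abcdef0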
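To turn the raw averages into a decision, the thresholds used in this section (below 5% unused, 5-20% underutilised) can be wrapped in a small helper. This is a minimal sketch, not part of any AWS tooling: `classify_cpu` is a hypothetical name, and the `None` case assumes the AWS CLI text output prints `None` when CloudWatch has no datapoint.

```shell
# classify_cpu AVG_CPU_PERCENT
# Buckets a 30-day average CPU figure using this guide's thresholds:
#   below 5%   -> unused (candidate for termination)
#   5% to 20%  -> underutilised (candidate for rightsizing)
#   above 20%  -> ok
classify_cpu() {
  awk -v cpu="$1" 'BEGIN {
    # No datapoint usually means the instance was stopped or newly launched
    if (cpu == "" || cpu == "None") { print "no-data"; exit }
    if (cpu + 0 < 5)        print "unused"
    else if (cpu + 0 <= 20) print "underutilised"
    else                    print "ok"
  }'
}

classify_cpu 3.2    # unused
classify_cpu 12     # underutilised
```

CPU alone is a rough signal, so it is worth checking memory and network usage before terminating anything the helper labels as unused.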
Typical savings: $200-800/month (depending on instance count and types)
Opportunity #2: No Reserved Instances or Savings Plans
What I Usually Find
Typical scenario:
- 10 EC2 instances running 24/7 for production
- All on On-Demand pricing
- Paying full price when 40-60% discount is available
Monthly waste:
- 10 × t3.large On-Demand: $750/month
- 10 × t3.large Reserved (1-year, no upfront): $450/month
- Savings: $300/month (40%) for same resources
How to Identify
# Check last month's spend by service (End is exclusive, so use the first day of the next month)
aws ce get-cost-and-usage \
  --time-period Start=2025-12-01,End=2026-01-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --group-by Type=DIMENSION,Key=SERVICE
# Get EC2 Reserved Instance purchase recommendations
aws ce get-reservation-purchase-recommendation \
  --service "Amazon Elastic Compute Cloud - Compute" \
  --lookback-period-in-days SIXTY_DAYS \
  --term-in-years ONE_YEAR \
  --payment-option NO_UPFRONT
Cost Explorer returns the specific Reserved Instances it recommends, with estimated savings, based on your usage over the lookback period.
How to Fix
Step 1: Identify steady workloads (running 24/7 for >6 months)
Step 2: Purchase Reserved Instances or Compute Savings Plans
Reserved Instances:
- Best for specific instance types (e.g., 5× t3.large)
- Lock-in to instance type and region
- Up to 72% savings (3-year, all upfront)
Compute Savings Plans:
- More flexible (applies to any instance family, region)
- Up to 66% savings (3-year)
- Recommended for most startups
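# Find the offering ID for a 1-year, no-upfront Compute Savings Plan
aws savingsplans describe-savings-plans-offerings \
  --plan-types Compute \
  --durations 31536000 \
  --payment-options "No Upfront"
# Purchase it with a $10/hour commitment (replace the placeholder offering ID)
aws savingsplans create-savings-plan \
  --savings-plan-offering-id your-offering-id \
  --commitment 10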
Recommendation:
- Start with 1-year, no upfront commitment
- Purchase enough to cover 70-80% of your baseline compute (not 100%)
- Re-evaluate every 6 months
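The 70-80% coverage rule can be turned into an hourly commitment figure. The sketch below assumes the commitment is expressed in discounted dollars per hour, so the On-Demand baseline is multiplied by both the coverage target and (1 - discount). `sp_hourly_commitment` is a hypothetical helper, and the 40% discount is taken from the example earlier in this section, not from a real offering.

```shell
# sp_hourly_commitment MONTHLY_ON_DEMAND COVERAGE DISCOUNT
# MONTHLY_ON_DEMAND: steady-state On-Demand compute spend in USD per month
# COVERAGE:          fraction of that baseline to commit to (e.g. 0.75)
# DISCOUNT:          assumed Savings Plan discount as a fraction (e.g. 0.40)
sp_hourly_commitment() {
  awk -v monthly="$1" -v coverage="$2" -v discount="$3" 'BEGIN {
    hourly_on_demand = monthly / 730          # 730 hours per month
    printf "%.3f\n", hourly_on_demand * coverage * (1 - discount)
  }'
}

# $750/month of On-Demand baseline, 75% coverage, 40% assumed discount
sp_hourly_commitment 750 0.75 0.40    # 0.462
```

Committing below the full baseline leaves headroom: if usage drops, the uncovered 20-30% shrinks first and the commitment is still fully used.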
Typical savings: $300-1,000/month (depending on On-Demand spend)
Opportunity #3: Oversized RDS Databases
What I Usually Find
Typical scenario:
- Production database: db.m5.2xlarge (8 vCPU, 32 GB RAM)
- Average CPU: 15%
- Average RAM: 35%
- Database is 3-4× larger than needed
Monthly cost:
- db.m5.2xlarge: $560/month
- db.m5.large (2 vCPU, 8 GB RAM): $140/month
- Savings: $420/month (75%)
How to Identify
# List RDS instances with their instance class and engine
aws rds describe-db-instances \
--query 'DBInstances[*].[DBInstanceIdentifier,DBInstanceClass,Engine]' \
--output table
# Check CPU utilisation (last 30 days)
aws cloudwatch get-metric-statistics \
--namespace AWS/RDS \
--metric-name CPUUtilization \
--dimensions Name=DBInstanceIdentifier,Value=your-db \
--start-time $(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%S) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
--period 2592000 \
--statistics Average,Maximum \
--output table
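Red flag: CPU <30% and RAM <50% consistently = oversized (RDS has no RAM-percentage metric; derive it from the FreeableMemory metric against the instance's total memory)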
How to Fix
Step 1: Take a snapshot (safety first)
aws rds create-db-snapshot \
--db-instance-identifier your-db \
--db-snapshot-identifier before-resize-$(date +%Y%m%d)
Step 2: Resize, either immediately or during the next maintenance window
aws rds modify-db-instance \
  --db-instance-identifier your-db \
  --db-instance-class db.m5.large \
  --apply-immediately
Important:
- The --apply-immediately flag resizes right away and causes brief downtime (1-3 minutes)
- Use --no-apply-immediately to schedule the change for the next maintenance window
- Test in staging first
Typical savings: $300-600/month per database
Opportunity #4: Excessive Data Transfer Costs
What I Usually Find
Typical scenario:
- Monthly AWS bill: $1,200
- Data transfer out: $400 (33% of bill)
- Root cause: Serving static assets from S3 without CloudFront
How it happens:
- Images/CSS/JS served directly from S3
- Every request pays S3 data transfer: $0.09/GB
- CloudFront costs a similar $0.085/GB in its cheapest regions, but transfer from S3 to CloudFront is free and cached objects are not fetched from S3 again
How to Identify
# Check data transfer costs (End is exclusive, so use the first day of the next month)
aws ce get-cost-and-usage \
  --time-period Start=2025-12-01,End=2026-01-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --filter file://filter.json
filter.json:
{
  "Dimensions": {
    "Key": "USAGE_TYPE_GROUP",
    "Values": [
      "EC2: Data Transfer - Internet (Out)",
      "S3: Data Transfer - Internet (Out)"
    ]
  }
}
Red flag: Data transfer >20% of total bill
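The 20% rule is easy to check once you have the two numbers from Cost Explorer. A minimal sketch, using the figures from the scenario above ($400 of data transfer on a $1,200 bill); `transfer_share` is a hypothetical helper name.

```shell
# transfer_share TRANSFER_COST TOTAL_BILL
# Prints data transfer as a percentage of the bill and flags it above 20%.
transfer_share() {
  awk -v transfer="$1" -v total="$2" 'BEGIN {
    if (total + 0 <= 0) { print "invalid total"; exit 1 }
    share = 100 * transfer / total
    printf "%.0f%% %s\n", share, (share > 20 ? "RED FLAG" : "ok")
  }'
}

transfer_share 400 1200    # 33% RED FLAG
transfer_share 100 1000    # 10% ok
```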
How to Fix
Step 1: Add CloudFront in front of S3
# Create CloudFront distribution for S3 bucket
aws cloudfront create-distribution \
--distribution-config file://cloudfront-config.json
cloudfront-config.json:
{
  "CallerReference": "my-static-assets",
  "Comment": "Static assets CDN",
  "Enabled": true,
  "Origins": {
    "Quantity": 1,
    "Items": [
      {
        "Id": "S3-my-bucket",
        "DomainName": "my-bucket.s3.ap-southeast-2.amazonaws.com",
        "S3OriginConfig": {
          "OriginAccessIdentity": ""
        }
      }
    ]
  },
  "DefaultCacheBehavior": {
    "TargetOriginId": "S3-my-bucket",
    "ViewerProtocolPolicy": "redirect-to-https",
    "Compress": true,
    "CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6"
  }
}
The CachePolicyId is the AWS managed CachingOptimized policy; it replaces the deprecated ForwardedValues settings (no query strings or cookies forwarded).
Step 2: Update application to use CloudFront URLs
// Before
const imageUrl = `https://my-bucket.s3.amazonaws.com/images/${image}.jpg`;
// After
const imageUrl = `https://d123456.cloudfront.net/images/${image}.jpg`;
Impact:
- S3 origin fetches reduced by 85-95% (caching)
- Data transfer costs drop 70-90%
Typical savings: $200-500/month
Opportunity #5: Unused Elastic IPs
What I Usually Find
Typical scenario:
- 5 Elastic IPs allocated
- 2 not associated with any running instance
- AWS charges $0.005/hour for every public IPv4 address, so each idle Elastic IP costs about $3.65/month for nothing
How it happens:
- Engineer terminates EC2 instance
- Forgets to release Elastic IP
- Elastic IP sits unused for months
How to Identify
# List all Elastic IPs and their associations
aws ec2 describe-addresses \
--query 'Addresses[*].[PublicIp,InstanceId,AssociationId]' \
--output table
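If AssociationId is empty, the Elastic IP is unused and costing money. An empty InstanceId alone is not enough: addresses attached to NAT Gateways or other network interfaces have no InstanceId but are still in use.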
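With more than a handful of addresses, scanning the table by eye gets error-prone. The sketch below filters the same three columns from the CLI's text output and prints only addresses with no association. It assumes `--output text` renders missing values as `None`; `unattached_eips` is a hypothetical helper name.

```shell
# unattached_eips
# Reads "PublicIp InstanceId AssociationId" rows on stdin and prints the
# public IP of every row whose AssociationId is missing.
unattached_eips() {
  awk '$3 == "None" || $3 == "" { print $1 }'
}

# Usage (needs AWS credentials, so it is left commented out here):
# aws ec2 describe-addresses \
#   --query 'Addresses[*].[PublicIp,InstanceId,AssociationId]' \
#   --output text | unattached_eips

# Offline example with sample rows
printf '203.0.113.10\ti-0abc\teipassoc-01\n203.0.113.11\tNone\tNone\n' | unattached_eips
```

The second sample row is the only one printed, because it is the only address without an association.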
How to Fix
# Release unused Elastic IP
aws ec2 release-address --allocation-id eipalloc-12345678
Warning: Only release if you’re sure you don’t need it. Released IPs can’t be recovered.
Typical savings: $5-20/month (seemingly small, but it’s literally free money)
Opportunity #6: Old EBS Snapshots
What I Usually Find
Typical scenario:
- 150 EBS snapshots
- Most are >90 days old
- Many are from terminated instances
- Cost: $0.05/GB/month × 500 GB = $25/month for unused backups
How it happens:
- Automated daily snapshots created
- Retention policy not configured
- Snapshots accumulate forever
How to Identify
# List snapshots older than 90 days
aws ec2 describe-snapshots \
--owner-ids self \
--query "Snapshots[?StartTime<='$(date -d '90 days ago' -u +%Y-%m-%dT%H:%M:%S.000Z)'].[SnapshotId,StartTime,VolumeSize,Description]" \
--output table
How to Fix
Step 1: Define retention policy
- Production volumes: Keep 30 days of snapshots
- Development volumes: Keep 7 days of snapshots
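- Long-term backups: Move to the EBS Snapshots Archive tier ($0.0125/GB-month vs. $0.05/GB-month, 90-day minimum retention); EBS snapshots cannot be sent to S3 Glacier directly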
Step 2: Delete old snapshots
# Delete snapshots older than 90 days
for snapshot in $(aws ec2 describe-snapshots --owner-ids self \
    --query "Snapshots[?StartTime<='$(date -d '90 days ago' -u +%Y-%m-%dT%H:%M:%S.000Z)'].SnapshotId" \
    --output text); do
  echo "Deleting $snapshot"
  aws ec2 delete-snapshot --snapshot-id "$snapshot"
done
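Before running a delete loop like the one above, it helps to preview what would go. The sketch below is an offline filter rather than an AWS feature: it reads `SnapshotId StartTime VolumeSize` rows and relies on ISO 8601 UTC timestamps sorting correctly as plain strings. `old_snapshot_report` is a hypothetical name. The cost figure is an upper bound at $0.05/GB-month, since snapshots are incremental and usually store less than the full volume size.

```shell
# old_snapshot_report CUTOFF
# CUTOFF is an ISO 8601 date or timestamp, e.g. 2025-10-01.
# Prints the ID of each snapshot started before CUTOFF, then a summary line.
old_snapshot_report() {
  awk -v cutoff="$1" '
    $2 < cutoff { print $1; total += $3 }   # string comparison on timestamps
    END {
      printf "total_gib=%d upper_bound_usd_per_month=%.2f\n", total, total * 0.05
    }
  '
}

# Usage (needs AWS credentials, so it is left commented out here):
# aws ec2 describe-snapshots --owner-ids self \
#   --query 'Snapshots[*].[SnapshotId,StartTime,VolumeSize]' \
#   --output text | old_snapshot_report 2025-10-01

# Offline example with sample rows
printf '%s\n' \
  "snap-aaa 2025-01-15T10:00:00.000Z 100" \
  "snap-bbb 2025-11-20T10:00:00.000Z 50" \
  "snap-ccc 2025-03-02T08:30:00.000Z 200" | old_snapshot_report 2025-10-01
```

Snapshots that back a registered AMI cannot be deleted until the AMI is deregistered, so expect some IDs in the preview to fail in the real loop.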
Step 3: Automate snapshot lifecycle
Use AWS Data Lifecycle Manager to automatically delete old snapshots:
aws dlm create-lifecycle-policy \
--description "Delete EBS snapshots after 30 days" \
--state ENABLED \
--execution-role-arn arn:aws:iam::123456789:role/AWSDataLifecycleManagerDefaultRole \
--policy-details file://lifecycle-policy.json
Typical savings: $20-100/month
Opportunity #7: S3 Storage Class Optimisation
What I Usually Find
Typical scenario:
- 500 GB in S3 Standard ($0.023/GB) = $11.50/month
- Analysis shows:
- 300 GB not accessed in 90+ days (should be Glacier: $0.004/GB)
- 100 GB accessed 1-2 times/month (should be Standard-IA: $0.0125/GB)
Potential savings:
- 300 GB to Glacier: $6.90 → $1.20 (saves $5.70)
- 100 GB to Standard-IA: $2.30 → $1.25 (saves $1.05)
- Total savings: $6.75/month
This scales linearly with data volume: the same access pattern across 10 TB saves roughly $135/month.
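The arithmetic is simply GB moved times the per-GB price difference, so it is easy to rerun with your own volumes. A minimal sketch using the per-GB prices quoted above ($0.023 Standard, $0.0125 Standard-IA, $0.004 Glacier); `tier_savings` is a hypothetical helper, and retrieval or transition request fees are not included.

```shell
# tier_savings GB FROM_PRICE TO_PRICE
# Monthly saving in USD from moving GB between two storage classes.
tier_savings() {
  awk -v gb="$1" -v from="$2" -v to="$3" 'BEGIN { printf "%.2f\n", gb * (from - to) }'
}

glacier=$(tier_savings 300 0.023 0.004)     # 300 GB Standard -> Glacier
ia=$(tier_savings 100 0.023 0.0125)         # 100 GB Standard -> Standard-IA
total=$(awk -v a="$glacier" -v b="$ia" 'BEGIN { printf "%.2f\n", a + b }')

echo "Glacier: \$$glacier  Standard-IA: \$$ia  Total: \$$total per month"
# Glacier: $5.70  Standard-IA: $1.05  Total: $6.75 per month
```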
How to Identify
# Enable S3 Storage Lens (free)
aws s3control put-storage-lens-configuration \
--account-id 123456789 \
--config-id default-account-dashboard \
--storage-lens-configuration file://storage-lens-config.json
S3 Storage Lens shows access patterns and recommends storage class transitions.
How to Fix
Create lifecycle policies to automatically transition objects:
aws s3api put-bucket-lifecycle-configuration \
--bucket my-bucket \
--lifecycle-configuration file://lifecycle.json
lifecycle.json (each rule needs a Filter; an empty prefix applies the rule to the whole bucket):
{
  "Rules": [
    {
      "Id": "Move to IA after 30 days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        }
      ]
    },
    {
      "Id": "Move to Glacier after 90 days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        }
      ]
    },
    {
      "Id": "Delete after 365 days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": {
        "Days": 365
      }
    }
  ]
}
Adjust based on your use case:
- Logs: Standard → Glacier after 30 days, delete after 90 days
- Backups: Standard → Glacier immediately, delete after 1 year
- User uploads: Keep in Standard (frequent access)
Typical savings: $50-500/month (depending on data volume)
Opportunity #8: NAT Gateway Costs
What I Usually Find
Typical scenario:
- 2 NAT Gateways (one per AZ for redundancy)
- Cost: $0.045/hour × 2 = $0.09/hour × 730 hours = $65.70/month
- Data processing: $0.045/GB × 500 GB = $22.50/month
- Total: $88/month just for NAT Gateways
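The same calculation can be rerun for a different number of gateways or traffic volume. A minimal sketch using the rates quoted above ($0.045 per gateway-hour, $0.045 per GB processed, 730 hours per month); `nat_monthly_cost` is a hypothetical helper, and rates differ by region.

```shell
# nat_monthly_cost GATEWAYS GB_PROCESSED
# Prints the hourly charge, the data processing charge and the monthly total.
nat_monthly_cost() {
  awk -v n="$1" -v gb="$2" 'BEGIN {
    hourly = n * 0.045 * 730     # fixed charge, paid even with zero traffic
    data   = gb * 0.045          # scales with traffic, not with gateway count
    printf "hourly=%.2f data=%.2f total=%.2f\n", hourly, data, hourly + data
  }'
}

nat_monthly_cost 2 500    # hourly=65.70 data=22.50 total=88.20
nat_monthly_cost 1 500    # hourly=32.85 data=22.50 total=55.35
```

Splitting the two components shows which lever matters: removing a gateway cuts only the hourly part, while VPC endpoints cut only the data processing part.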
How it happens:
- Following best practice: “Use NAT Gateway in each AZ for redundancy”
- Didn’t consider cost vs. risk trade-off
How to Identify
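# Check NAT Gateway costs (End is exclusive, so use the first day of the next month)
aws ce get-cost-and-usage \
  --time-period Start=2025-12-01,End=2026-01-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --group-by Type=DIMENSION,Key=USAGE_TYPE \
  --filter file://nat-filter.json
nat-filter.json:
{
  "Dimensions": {
    "Key": "SERVICE",
    "Values": ["EC2 - Other"]
  }
}
Look for “NatGateway” line items (NatGateway-Hours and NatGateway-Bytes, usually with a region prefix). Cost Explorer reports them under the “EC2 - Other” service, not under EC2 compute.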
How to Fix
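Option 1: Reduce to a single NAT Gateway (halves the hourly charge)
If you can tolerate single-AZ failure (NAT Gateway unavailable for 10-30 minutes during AZ outage):
# First point every private route table at the NAT Gateway you are keeping
aws ec2 replace-route \
  --route-table-id rtb-123456 \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-remaining
# Then delete the second NAT Gateway
aws ec2 delete-nat-gateway --nat-gateway-id nat-0123456789abcdef
Savings: ~$33/month (one gateway's hourly charge at $0.045 × 730 hours; data processing charges stay the same, and traffic crossing AZs to the remaining gateway is billed as inter-AZ transfer)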
Option 2: Use NAT Instance instead (save 70%)
For very low traffic (<10 GB/day):
# Launch t4g.nano as NAT instance ($3/month)
aws ec2 run-instances \
--image-id ami-nat-instance \
--instance-type t4g.nano \
--key-name your-key \
--security-group-ids sg-123456
# Disable the source/destination check (required for an instance that forwards traffic)
aws ec2 modify-instance-attribute \
--instance-id i-nat \
--no-source-dest-check
# Update route tables
aws ec2 replace-route \
--route-table-id rtb-123456 \
--destination-cidr-block 0.0.0.0/0 \
--instance-id i-nat
Savings: ~$60/month (but less reliable than NAT Gateway)
Option 3: VPC Endpoints for AWS services
If your NAT traffic is mostly to AWS services (S3, DynamoDB, etc.), use VPC Endpoints:
# Create S3 VPC Endpoint (free)
aws ec2 create-vpc-endpoint \
--vpc-id vpc-123456 \
--service-name com.amazonaws.ap-southeast-2.s3 \
--route-table-ids rtb-123456
This bypasses NAT Gateway for S3 traffic (free data transfer within same region).
Typical savings: $40-70/month
Opportunity #9: Load Balancer Optimisation
What I Usually Find
Typical scenario:
- 3 Application Load Balancers (ALBs)
- ALB-1: Production API
- ALB-2: Development environment
- ALB-3: Staging environment
Cost:
- 3 × $16/month base = $48/month
- 3 × $8/month LCU charges = $24/month
- Total: $72/month
Optimisation:
- Combine dev/staging into single ALB using host-based routing
- Saves 1 ALB = $24/month
How to Identify
# List all load balancers
aws elbv2 describe-load-balancers \
--query 'LoadBalancers[*].[LoadBalancerName,LoadBalancerArn,State.Code]' \
--output table
How to Fix
Consolidate using host-based routing:
# Add listener rules to single ALB
aws elbv2 create-rule \
--listener-arn arn:aws:elasticloadbalancing:ap-southeast-2:123456789:listener/app/my-alb/50dc6c495c0c9188/f2f7dc8efc522ab2 \
--priority 10 \
--conditions Field=host-header,Values=dev.example.com \
--actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:ap-southeast-2:123456789:targetgroup/dev-targets/50dc6c495c0c9188
aws elbv2 create-rule \
--listener-arn arn:aws:elasticloadbalancing:ap-southeast-2:123456789:listener/app/my-alb/50dc6c495c0c9188/f2f7dc8efc522ab2 \
--priority 20 \
--conditions Field=host-header,Values=staging.example.com \
--actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:ap-southeast-2:123456789:targetgroup/staging-targets/50dc6c495c0c9188
Delete unused ALB:
aws elbv2 delete-load-balancer --load-balancer-arn arn:aws:elasticloadbalancing:ap-southeast-2:123456789:loadbalancer/app/dev-alb/50dc6c495c0c9188
Typical savings: $20-50/month
Opportunity #10: RDS Reserved Instances
What I Usually Find
Typical scenario:
- Production database: db.m5.large, running 24/7 for 12+ months
- On-Demand cost: $140/month
- Reserved Instance (1-year, no upfront): $90/month
- Savings: $50/month (36%)
This is like #2 (Reserved Instances for EC2) but specifically for RDS.
How to Identify
# Get RDS Reserved Instance recommendations
aws ce get-reservation-purchase-recommendation \
--service "Amazon Relational Database Service" \
--lookback-period-in-days SIXTY_DAYS \
--term-in-years ONE_YEAR \
--payment-option NO_UPFRONT
How to Fix
# Purchase RDS Reserved Instance
aws rds purchase-reserved-db-instances-offering \
--reserved-db-instances-offering-id 12345678-1234-1234-1234-123456789012 \
--reserved-db-instance-id my-reserved-db
Typical savings: $50-300/month per database
Summary: Typical Savings Breakdown
For a startup spending $4,200/month on AWS:
| Opportunity | Typical Savings |
|---|---|
| 1. Terminate unused EC2 instances | $200-400/mo |
| 2. Purchase Reserved Instances | $300-600/mo |
| 3. Rightsize RDS databases | $300-500/mo |
| 4. Add CloudFront for data transfer | $200-400/mo |
| 5. Release unused Elastic IPs | $10-20/mo |
| 6. Delete old EBS snapshots | $30-80/mo |
| 7. S3 storage class optimisation | $50-200/mo |
| 8. Reduce NAT Gateway redundancy | $40-70/mo |
| 9. Consolidate load balancers | $20-40/mo |
| 10. RDS Reserved Instances | $50-200/mo |
Total potential savings: $1,200-2,510/month (30-60% of bill)
Time investment: 1-2 days to implement all optimisations
Ongoing maintenance: Quarterly reviews (4 hours)
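The totals in the table can be checked, or recomputed with your own estimates, by summing the low and high ends of each row. A minimal sketch: `sum_savings` is a hypothetical helper, and the pairs below are the ten rows of the table in order.

```shell
# sum_savings MONTHLY_BILL
# Reads "LOW HIGH" pairs (USD per month) on stdin and prints both totals
# plus their share of the monthly bill.
sum_savings() {
  awk -v bill="$1" '
    { low += $1; high += $2 }
    END {
      printf "low=%d high=%d share=%.0f%%-%.0f%%\n", low, high, 100 * low / bill, 100 * high / bill
    }
  '
}

summary=$(printf '%s\n' \
  "200 400" "300 600" "300 500" "200 400" "10 20" \
  "30 80" "50 200" "40 70" "20 40" "50 200" | sum_savings 4200)
echo "$summary"
# low=1200 high=2510 share=29%-60%
```

Some rows overlap in practice (a rightsized database needs a smaller reservation), so treat the high end as a ceiling rather than a forecast.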
The Cost Optimisation Process
Month 1: Low-hanging fruit
- Terminate unused resources
- Delete old snapshots and unused Elastic IPs
- Add CloudFront
Month 2: Rightsizing
- Analyse CPU/RAM usage
- Rightsize EC2 and RDS
- Implement S3 lifecycle policies
Month 3: Commitment-based savings
- Purchase Reserved Instances / Savings Plans
- Review and optimise
Ongoing (quarterly):
- Review AWS Cost Explorer
- Check for new waste
- Adjust reservations based on growth
Tools to Help
AWS-native:
- AWS Cost Explorer (identify trends)
- AWS Trusted Advisor (cost optimisation checks require a Business or Enterprise Support plan)
- AWS Compute Optimizer (rightsizing recommendations)
- AWS Cost Anomaly Detection (alerts for spikes)
Third-party (optional):
- CloudHealth / CloudCheckr (advanced cost management)
- Infracost (IaC cost estimation)
Conclusion
These 10 optimisations appear in virtually every Well-Architected Review I conduct:
- Terminate unused EC2 instances
- Purchase Reserved Instances / Savings Plans
- Rightsize oversized RDS databases
- Reduce data transfer with CloudFront
- Release unused Elastic IPs
- Delete old EBS snapshots
- Optimise S3 storage classes
- Reduce NAT Gateway redundancy
- Consolidate load balancers
- Purchase RDS Reserved Instances
Average savings: 30-60% of AWS bill
Time to implement: 1-2 days
Payback period: Immediate
Ongoing effort: Quarterly review (4 hours)
The key is treating cost optimisation as an ongoing practice, not a one-time project. Set a calendar reminder for quarterly reviews, and you’ll continuously find new savings as your infrastructure evolves.
Need help identifying cost optimisation opportunities in your AWS environment? My Well-Architected Review includes comprehensive cost analysis, prioritised savings recommendations, and implementation support delivered in weeks instead of months. Most clients save 5-10× the review cost in the first month.