10 Cost Optimisation Opportunities I Find in Every Well-Architected Review

Last month, I worked with a startup spending $4,200/month on AWS. By the end of our 2-week review, I’d identified $1,680/month in savings (40% of their bill) without touching performance or reliability.

The crazy part? None of these optimisations required sophisticated engineering. They were all straightforward configuration changes, rightsizing decisions, and removing waste.

After conducting dozens of Well-Architected Reviews, I keep seeing the same patterns. This guide covers the 10 most common ways startups overspend, how to spot them in your environment, and step-by-step fixes.

Before We Start: The Cost Optimisation Mindset

Cost optimisation is NOT:

Choosing the cheapest option regardless of impact
Sacrificing reliability or performance to save $50/month
A one-time exercise

Cost optimisation IS:

Using the right resources for your workload
Removing waste (unused resources, inefficient configurations)
Ongoing practice (review quarterly)

Every optimisation in this guide maintains or improves your infrastructure while reducing costs.

Opportunity #1: Unused or Underutilised EC2 Instances

What I Usually Find

Typical scenario:

5 EC2 instances running 24/7
2 of them have <5% average CPU utilisation over 30 days
1 is completely stopped but still has attached EBS volumes

How it happens:

Someone spins up an instance for testing
Forgets to terminate it after testing
Instance runs for 6 months, costing $15-60/month

How to Identify

# List all running instances (ID, type, Name tag)
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" \
  --query 'Reservations[*].Instances[*].[InstanceId,InstanceType,Tags[?Key==`Name`].Value|[0]]' \
  --output table

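# Check CloudWatch for average CPU utilization (last 30 days, running instances only)
# Note: `date -d` is GNU date syntax; on macOS use `date -u -v-30d` instead
for instance in $(aws ec2 describe-instances --filters "Name=instance-state-name,Values=running" --query 'Reservations[*].Instances[*].InstanceId' --output text); do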
  avg_cpu=$(aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value=$instance \
    --start-time $(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%S) \
    --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
    --period 2592000 \
    --statistics Average \
    --query 'Datapoints[0].Average' \
    --output text)

  echo "$instance: Average CPU = $avg_cpu%"
done
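
The averages printed by the loop above can be bucketed with a small helper. This is a minimal sketch: `classify_cpu` is a hypothetical name, and the cut-offs (under 5% = unused, 5-20% = underutilised) are the thresholds this guide applies in the fix section, so adjust them for your workloads. A value of `None` (what the CLI prints when CloudWatch returns no datapoints) is reported as `unknown`.

```shell
# classify_cpu AVG_CPU_PERCENT
# Hypothetical helper: buckets a 30-day average CPU figure using the
# thresholds from this guide (<5% unused, 5-20% underutilised).
classify_cpu() {
  awk -v cpu="$1" 'BEGIN {
    # Anything that is not a plain number (e.g. "None") cannot be judged
    if (cpu !~ /^[0-9]+(\.[0-9]+)?$/) { print "unknown"; exit }
    if (cpu + 0 < 5)        print "unused"
    else if (cpu + 0 <= 20) print "underutilised"
    else                    print "ok"
  }'
}

classify_cpu 3.2
classify_cpu 12
```

The two calls print `unused` and `underutilised`. Inside the loop you could replace the final `echo` with `echo "$instance: $(classify_cpu "$avg_cpu")"`.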

How to Fix

For unused instances (stopped or <5% CPU for 30 days):

# Terminate unused instance
aws ec2 terminate-instances --instance-ids i-1234567890abcdef0

# Delete leftover EBS volumes that were not removed with the instance
# (volumes are billed even while the instance is stopped, and keep billing until deleted)
aws ec2 delete-volume --volume-id vol-123456

For underutilised instances (5-20% CPU consistently):

Rightsize to smaller instance type
Example: t3.large (2 vCPU, 8GB) → t3.medium (2 vCPU, 4GB) saves $30/month

# Resize instance (requires stop/start; wait for the stop to finish before modifying)
aws ec2 stop-instances --instance-ids i-1234567890abcdef0
aws ec2 wait instance-stopped --instance-ids i-1234567890abcdef0
aws ec2 modify-instance-attribute --instance-id i-1234567890abcdef0 --instance-type t3.medium
aws ec2 start-instances --instance-ids i-1234567890abcdef0

Typical savings: $200-800/month (depending on instance count and types)

Opportunity #2: No Reserved Instances or Savings Plans

What I Usually Find

Typical scenario:

10 EC2 instances running 24/7 for production
All on On-Demand pricing
Paying full price when 40-60% discount is available

Monthly waste:

10 × t3.large On-Demand: $750/month
10 × t3.large Reserved (1-year, no upfront): $450/month
Savings: $300/month (40%) for same resources
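
The same arithmetic is easy to rerun with your own numbers. A minimal sketch, assuming you already know both monthly figures; `monthly_savings` is a hypothetical helper name, not an AWS tool.

```shell
# monthly_savings ON_DEMAND RESERVED
# Hypothetical helper: prints the monthly saving in dollars and as a
# percentage of the On-Demand cost.
monthly_savings() {
  awk -v od="$1" -v ri="$2" 'BEGIN {
    printf "%.2f %.1f\n", od - ri, (od - ri) / od * 100
  }'
}

# Figures from the scenario above: $750 On-Demand vs $450 Reserved
monthly_savings 750 450
```

This prints `300.00 40.0`, matching the $300/month (40%) figure above. The same helper works for the RDS comparisons later in this guide.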

How to Identify

# Check current spend by service (the End date is exclusive)
aws ce get-cost-and-usage \
  --time-period Start=2025-12-01,End=2026-01-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --group-by Type=DIMENSION,Key=SERVICE

# Get EC2 Reserved Instance purchase recommendations
aws ce get-reservation-purchase-recommendation \
  --service "Amazon Elastic Compute Cloud - Compute" \
  --lookback-period-in-days SIXTY_DAYS \
  --term-in-years ONE_YEAR \
  --payment-option NO_UPFRONT

AWS Cost Explorer recommends which Reserved Instances to buy, based on your usage over the lookback period.

How to Fix

Step 1: Identify steady workloads (running 24/7 for >6 months)

Step 2: Purchase Reserved Instances or Compute Savings Plans

Reserved Instances:

Best for specific instance types (e.g., 5× t3.large)
Lock-in to instance type and region
Up to 72% savings (3-year, all upfront)

Compute Savings Plans:

More flexible (applies to any instance family, region)
Up to 66% savings (3-year)
Recommended for most startups

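# Look up the offering ID for a 1-year, no-upfront Compute Savings Plan
aws savingsplans describe-savings-plans-offerings \
  --plan-types Compute \
  --payment-options "No Upfront" \
  --durations 31536000

# Purchase it with a $10/hour commitment
# (pass a plain number: an unquoted $10 would be expanded by the shell)
aws savingsplans create-savings-plan \
  --savings-plan-offering-id your-offering-id \
  --commitment 10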

Recommendation:

Start with 1-year, no upfront commitment
Purchase enough to cover 70-80% of your baseline compute (not 100%)
Re-evaluate every 6 months
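
To turn the 70-80% guideline into an hourly commitment, a rough calculation helps. This is a sketch under stated assumptions: `sp_hourly_commitment` is a hypothetical helper, a month is taken as 730 hours, the commitment is expressed at the discounted Savings Plan rate, and the 40% discount is borrowed from the Reserved Instance example earlier rather than taken from a real quote. The Cost Explorer recommendations remain the better source for an actual purchase.

```shell
# sp_hourly_commitment MONTHLY_ON_DEMAND COVERAGE DISCOUNT
# Hypothetical helper. Assumes 730 hours per month and that the commitment
# is expressed at the discounted Savings Plan rate.
sp_hourly_commitment() {
  awk -v od="$1" -v cov="$2" -v disc="$3" 'BEGIN {
    printf "%.3f\n", od / 730 * cov * (1 - disc)
  }'
}

# $750/month baseline, 75% coverage, assumed 40% discount
sp_hourly_commitment 750 0.75 0.40
```

This prints `0.462`, i.e. a commitment of roughly $0.46/hour for a $750/month On-Demand baseline.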

Typical savings: $300-1,000/month (depending on On-Demand spend)

Opportunity #3: Oversized RDS Databases

What I Usually Find

Typical scenario:

Production database: db.m5.2xlarge (8 vCPU, 32 GB RAM)
Average CPU: 15%
Average RAM: 35%
Database is 3-4× larger than needed

Monthly cost:

db.m5.2xlarge: $560/month
db.m5.large (2 vCPU, 8 GB RAM): $140/month
Savings: $420/month (75%)

How to Identify

# Check RDS instance sizes and utilisation
aws rds describe-db-instances \
  --query 'DBInstances[*].[DBInstanceIdentifier,DBInstanceClass,Engine]' \
  --output table

# Check CPU utilisation (last 30 days)
aws cloudwatch get-metric-statistics \
  --namespace AWS/RDS \
  --metric-name CPUUtilization \
  --dimensions Name=DBInstanceIdentifier,Value=your-db \
  --start-time $(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 2592000 \
  --statistics Average Maximum \
  --output table

Red flag: CPU <30% and RAM <50% consistently = oversized
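
RDS does not publish a "RAM used" percentage directly; CloudWatch exposes `FreeableMemory` in bytes, so the RAM half of the red flag needs a small conversion. A minimal sketch, assuming you look up the instance class's total memory yourself; `rds_is_oversized` is a hypothetical helper name.

```shell
# rds_is_oversized AVG_CPU_PERCENT TOTAL_RAM_GIB AVG_FREEABLE_BYTES
# Hypothetical helper: converts FreeableMemory to "RAM used %" and applies
# the red-flag rule above (CPU < 30% and RAM < 50%).
rds_is_oversized() {
  awk -v cpu="$1" -v gib="$2" -v free="$3" 'BEGIN {
    ram_used = (1 - free / (gib * 1024 * 1024 * 1024)) * 100
    verdict = (cpu < 30 && ram_used < 50) ? "oversized" : "ok"
    printf "%s cpu=%.1f%% ram=%.1f%%\n", verdict, cpu, ram_used
  }'
}

# db.m5.2xlarge (32 GiB) at 15% CPU with 24 GiB freeable on average
rds_is_oversized 15 32 25769803776
```

This prints `oversized cpu=15.0% ram=25.0%`. Feed it the 30-day averages for `CPUUtilization` and `FreeableMemory`.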

How to Fix

Step 1: Take a snapshot (safety first)

aws rds create-db-snapshot \
  --db-instance-identifier your-db \
  --db-snapshot-identifier before-resize-$(date +%Y%m%d)

Step 2: Resize (immediately, or during the next maintenance window)

aws rds modify-db-instance \
  --db-instance-identifier your-db \
  --db-instance-class db.m5.large \
  --apply-immediately

Important:

--apply-immediately causes brief downtime (1-3 minutes)
Use --no-apply-immediately to schedule during next maintenance window
Test in staging first

Typical savings: $300-600/month per database

Opportunity #4: Excessive Data Transfer Costs

What I Usually Find

Typical scenario:

Monthly AWS bill: $1,200
Data transfer out: $400 (33% of bill)
Root cause: Serving static assets from S3 without CloudFront

How it happens:

Images/CSS/JS served directly from S3
Every request pays S3 data transfer: $0.09/GB
CloudFront costs $0.085/GB, includes 1 TB/month of free data transfer out, and caches at the edge (fewer origin fetches; S3-to-CloudFront transfer is free)
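
To sanity-check whether CloudFront will pay off for your traffic, a rough comparison helps. The sketch below is only that: `egress_cost` is a hypothetical helper, it uses the flat per-GB rates quoted above, and it assumes CloudFront's first 1,000 GB of data transfer out each month is free. It ignores compression, request charges, and tiered pricing, so treat the output as a ballpark.

```shell
# egress_cost GB_PER_MONTH
# Hypothetical helper: prints "s3_direct cloudfront" monthly cost in dollars.
# Assumptions: flat $0.09/GB from S3, flat $0.085/GB from CloudFront with the
# first 1,000 GB free; compression and request charges are ignored.
egress_cost() {
  awk -v gb="$1" 'BEGIN {
    s3 = gb * 0.09
    billable = (gb > 1000) ? gb - 1000 : 0
    printf "%.2f %.2f\n", s3, billable * 0.085
  }'
}

egress_cost 4444   # roughly the $400/month scenario above
egress_cost 500    # a workload that fits inside the assumed free allowance
```

The two calls print `399.96 292.74` and `45.00 0.00`. Under these assumptions the relative saving is largest for smaller workloads, and compression of text assets would reduce the CloudFront figure further.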

How to Identify

# Check data transfer costs
aws ce get-cost-and-usage \
  --time-period Start=2025-12-01,End=2026-01-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --filter file://filter.json

# filter.json
{
  "Dimensions": {
    "Key": "USAGE_TYPE_GROUP",
    "Values": [
      "EC2: Data Transfer - Internet (Out)",
      "S3: Data Transfer - Internet (Out)"
    ]
  }
}

Red flag: Data transfer >20% of total bill

How to Fix

Step 1: Add CloudFront in front of S3

# Create CloudFront distribution for S3 bucket
aws cloudfront create-distribution \
  --distribution-config file://cloudfront-config.json

cloudfront-config.json:

{
  "CallerReference": "my-static-assets",
  "Comment": "Static assets CDN",
  "Enabled": true,
  "Origins": {
    "Quantity": 1,
    "Items": [
      {
        "Id": "S3-my-bucket",
        "DomainName": "my-bucket.s3.ap-southeast-2.amazonaws.com",
        "S3OriginConfig": {
          "OriginAccessIdentity": ""
        }
      }
    ]
  },
  "DefaultCacheBehavior": {
    "TargetOriginId": "S3-my-bucket",
    "ViewerProtocolPolicy": "redirect-to-https",
    "Compress": true,
    "ForwardedValues": {
      "QueryString": false,
      "Cookies": { "Forward": "none" }
    },
    "MinTTL": 0
  }
}

Step 2: Update application to use CloudFront URLs

// Before
const imageUrl = `https://my-bucket.s3.amazonaws.com/images/${image}.jpg`;

// After
const imageUrl = `https://d123456.cloudfront.net/images/${image}.jpg`;

Impact:

S3 origin fetches reduced by 85-95% (caching)
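Data transfer costs drop by up to 70-90% when most traffic fits CloudFront's free tier or compresses well (expect less for large, image-heavy workloads)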

Typical savings: $200-500/month

Opportunity #5: Unused Elastic IPs

What I Usually Find

Typical scenario:

5 Elastic IPs allocated
2 not associated with any running instance
AWS charges $0.005/hour for every public IPv4 address, so each idle Elastic IP costs about $3.65/month for nothing

How it happens:

Engineer terminates EC2 instance
Forgets to release Elastic IP
Elastic IP sits unused for months

How to Identify

# List all Elastic IPs and their associations
aws ec2 describe-addresses \
  --query 'Addresses[*].[PublicIp,InstanceId,AssociationId]' \
  --output table

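If AssociationId is empty, the Elastic IP is not attached to anything and is costing money for nothing. InstanceId alone is not enough: an address attached to a NAT Gateway or a network interface has no InstanceId.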
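
For scripting, the same check can be applied to the CLI's text output. A minimal sketch: `unattached_eips` is a hypothetical helper that expects three columns (public IP, allocation ID, association ID), which means the query needs `AllocationId` added, as shown in the usage comment.

```shell
# unattached_eips
# Hypothetical helper: reads "PublicIp AllocationId AssociationId" rows on
# stdin (the CLI's text output prints missing fields as "None") and prints
# the allocation IDs that have no association.
unattached_eips() {
  awk '$3 == "None" { print $2 }'
}

# Usage (not run here):
#   aws ec2 describe-addresses \
#     --query 'Addresses[*].[PublicIp,AllocationId,AssociationId]' \
#     --output text | unattached_eips
```

The output is a list of allocation IDs to review before passing any of them to `release-address`.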

How to Fix

# Release unused Elastic IP
aws ec2 release-address --allocation-id eipalloc-12345678

Warning: Only release if you’re sure you don’t need it. Released IPs can’t be recovered.

Typical savings: $5-20/month (seemingly small, but it’s literally free money)

Opportunity #6: Old EBS Snapshots

What I Usually Find

Typical scenario:

150 EBS snapshots
Most are >90 days old
Many are from terminated instances
Cost: $0.05/GB/month × 500 GB = $25/month for unused backups

How it happens:

Automated daily snapshots created
Retention policy not configured
Snapshots accumulate forever

How to Identify

# List snapshots older than 90 days
aws ec2 describe-snapshots \
  --owner-ids self \
  --query "Snapshots[?StartTime<='$(date -d '90 days ago' -u +%Y-%m-%dT%H:%M:%S.000Z)'].[SnapshotId,StartTime,VolumeSize,Description]" \
  --output table
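
To put a dollar figure on the list, the volume sizes can be summed and priced. A rough sketch: `snapshot_cost_ceiling` is a hypothetical helper, and because snapshots are incremental while `VolumeSize` is the size of the source volume, the result is an upper bound rather than the billed amount.

```shell
# snapshot_cost_ceiling
# Hypothetical helper: sums volume sizes (GB, one per line on stdin) and
# prices them at the $0.05/GB-month rate used in this guide.
snapshot_cost_ceiling() {
  awk '{ gb += $1 } END { printf "%d GB, at most $%.2f/month\n", gb, gb * 0.05 }'
}

# Two sample snapshots of 100 GB and 400 GB volumes
printf '100\n400\n' | snapshot_cost_ceiling
```

The sample prints `500 GB, at most $25.00/month`. With real data, change the query above to return only `VolumeSize`, use `--output text`, and pipe the result through `tr '\t' '\n'` into the helper.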

How to Fix

Step 1: Define retention policy

Production volumes: Keep 30 days of snapshots
Development volumes: Keep 7 days of snapshots
Long-term backups: Move to the EBS Snapshots Archive tier ($0.0125/GB vs. $0.05/GB)

Step 2: Delete old snapshots

# Delete snapshots older than 90 days (review the list first; snapshots that back a registered AMI will fail to delete)
for snapshot in $(aws ec2 describe-snapshots --owner-ids self --query "Snapshots[?StartTime<='$(date -d '90 days ago' -u +%Y-%m-%dT%H:%M:%S.000Z)'].SnapshotId" --output text); do
  echo "Deleting $snapshot"
  aws ec2 delete-snapshot --snapshot-id "$snapshot"
done

Step 3: Automate snapshot lifecycle

Use AWS Data Lifecycle Manager to automatically delete old snapshots:

aws dlm create-lifecycle-policy \
  --description "Delete EBS snapshots after 30 days" \
  --state ENABLED \
  --execution-role-arn arn:aws:iam::123456789:role/AWSDataLifecycleManagerDefaultRole \
  --policy-details file://lifecycle-policy.json
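
The command above reads its rules from lifecycle-policy.json. A minimal sketch of that file follows, written with a heredoc so it can be pasted into a terminal. The `Backup=true` target tag and the schedule name are hypothetical placeholders, and the 30-day retention mirrors the production policy from Step 1. Note that Data Lifecycle Manager only manages snapshots it creates itself, so existing snapshots still need the manual clean-up from Step 2.

```shell
# Write a minimal policy document for `aws dlm create-lifecycle-policy`.
# Assumption: volumes to protect carry the (hypothetical) tag Backup=true.
cat > lifecycle-policy.json <<'EOF'
{
  "ResourceTypes": ["VOLUME"],
  "TargetTags": [{ "Key": "Backup", "Value": "true" }],
  "Schedules": [
    {
      "Name": "Daily30DayRetention",
      "CopyTags": true,
      "CreateRule": { "Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"] },
      "RetainRule": { "Interval": 30, "IntervalUnit": "DAYS" }
    }
  ]
}
EOF
```

For development volumes, a second policy with a 7-day `RetainRule` and a different target tag would follow the same shape.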

Typical savings: $20-100/month

Opportunity #7: S3 Storage Class Optimisation

What I Usually Find

Typical scenario:

500 GB in S3 Standard ($0.023/GB) = $11.50/month
Analysis shows:
- 300 GB not accessed in 90+ days (should be Glacier: $0.004/GB)
- 100 GB accessed 1-2 times/month (should be Standard-IA: $0.0125/GB)

Potential savings:

300 GB to Glacier: $6.90 → $1.20 (saves $5.70)
100 GB to Standard-IA: $2.30 → $1.25 (saves $1.05)
Total savings: $6.75/month (59% of the $11.50 bill)

This scales linearly with data volume. The same mix at 10 TB saves about $135/month.
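
The per-tier arithmetic can be checked, or rerun for your own volumes, with a one-line helper. A minimal sketch: `tier_savings` is a hypothetical name, the prices are the per-GB monthly rates quoted in the scenario, and retrieval fees and minimum storage durations are ignored.

```shell
# tier_savings GB OLD_PRICE NEW_PRICE
# Hypothetical helper: monthly saving from moving GB of data between two
# per-GB monthly prices.
tier_savings() {
  awk -v gb="$1" -v old="$2" -v new="$3" 'BEGIN {
    printf "%.2f\n", gb * (old - new)
  }'
}

tier_savings 300 0.023 0.004    # Standard -> Glacier
tier_savings 100 0.023 0.0125   # Standard -> Standard-IA
```

The two calls print `5.70` and `1.05`, for a combined saving of $6.75/month on the 500 GB example.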

How to Identify

# Enable S3 Storage Lens (free)
aws s3control put-storage-lens-configuration \
  --account-id 123456789 \
  --config-id default-account-dashboard \
  --storage-lens-configuration file://storage-lens-config.json

S3 Storage Lens shows access patterns and recommends storage class transitions.

How to Fix

Create lifecycle policies to automatically transition objects:

aws s3api put-bucket-lifecycle-configuration \
  --bucket my-bucket \
  --lifecycle-configuration file://lifecycle.json

lifecycle.json:

{
  "Rules": [
    {
      "Id": "Move to IA after 30 days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        }
      ]
    },
    {
      "Id": "Move to Glacier after 90 days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        }
      ]
    },
    {
      "Id": "Delete after 365 days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": {
        "Days": 365
      }
    }
  ]
}

Adjust based on your use case:

Logs: Standard → Glacier after 30 days, delete after 90 days
Backups: Standard → Glacier immediately, delete after 1 year
User uploads: Keep in Standard (frequent access)

Typical savings: $50-500/month (depending on data volume)

Opportunity #8: NAT Gateway Costs

What I Usually Find

Typical scenario:

2 NAT Gateways (one per AZ for redundancy)
Cost: $0.045/hour × 2 = $0.09/hour × 730 hours = $65.70/month
Data processing: $0.045/GB × 500 GB = $22.50/month
Total: $88/month just for NAT Gateways
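
The breakdown above is easy to parameterise, which is useful when weighing one gateway against two. A minimal sketch using the rates quoted in this scenario; `nat_monthly_cost` is a hypothetical helper, and your region's rates may differ.

```shell
# nat_monthly_cost GATEWAYS GB_PROCESSED
# Hypothetical helper using the rates quoted above: $0.045 per gateway-hour,
# 730 hours per month, $0.045 per GB processed.
nat_monthly_cost() {
  awk -v n="$1" -v gb="$2" 'BEGIN {
    printf "%.2f\n", n * 0.045 * 730 + gb * 0.045
  }'
}

nat_monthly_cost 2 500   # two gateways
nat_monthly_cost 1 500   # one gateway
```

The two calls print `88.20` and `55.35`. The $32.85 difference is the hourly charge for one gateway; the data processing charge is the same either way.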

How it happens:

Following best practice: “Use NAT Gateway in each AZ for redundancy”
Didn’t consider cost vs. risk trade-off

How to Identify

# Check NAT Gateway costs (billed under "EC2 - Other"; the End date is exclusive)
aws ce get-cost-and-usage \
  --time-period Start=2025-12-01,End=2026-01-01 \
  --granularity MONTHLY \
  --metrics UnblendedCost \
  --group-by Type=DIMENSION,Key=USAGE_TYPE \
  --filter file://nat-filter.json

# nat-filter.json
{
  "Dimensions": {
    "Key": "SERVICE",
    "Values": ["EC2 - Other"]
  }
}

Look for “NatGateway” line items.

How to Fix

Option 1: Reduce to single NAT Gateway (halve the hourly charge)

If you can tolerate single-AZ failure (NAT Gateway unavailable for 10-30 minutes during AZ outage):

# Point the route tables at the remaining NAT Gateway first
aws ec2 replace-route \
  --route-table-id rtb-123456 \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-remaining

# Then delete the second NAT Gateway
aws ec2 delete-nat-gateway --nat-gateway-id nat-0123456789abcdef

Savings: ~$33/month (one gateway's hourly charge; data processing charges stay the same)

Option 2: Use NAT Instance instead (save 70%)

For very low traffic (<10 GB/day):

# Launch t4g.nano as NAT instance ($3/month)
aws ec2 run-instances \
  --image-id ami-nat-instance \
  --instance-type t4g.nano \
  --key-name your-key \
  --security-group-ids sg-123456

# Disable the source/destination check (required for an instance to act as NAT)
aws ec2 modify-instance-attribute \
  --instance-id i-nat \
  --no-source-dest-check

# Update route tables
aws ec2 replace-route \
  --route-table-id rtb-123456 \
  --destination-cidr-block 0.0.0.0/0 \
  --instance-id i-nat

Savings: ~$60/month (but less reliable than NAT Gateway)

Option 3: VPC Endpoints for AWS services

If your NAT traffic is mostly to AWS services (S3, DynamoDB, etc.), use VPC Endpoints:

# Create S3 VPC Endpoint (free)
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-123456 \
  --service-name com.amazonaws.ap-southeast-2.s3 \
  --route-table-ids rtb-123456

This bypasses NAT Gateway for S3 traffic (free data transfer within same region).

Typical savings: $40-70/month

Opportunity #9: Load Balancer Optimisation

What I Usually Find

Typical scenario:

3 Application Load Balancers (ALBs)
ALB-1: Production API
ALB-2: Development environment
ALB-3: Staging environment

Cost:

3 × $16/month base = $48/month
3 × $8/month LCU charges = $24/month
Total: $72/month

Optimisation:

Combine dev/staging into single ALB using host-based routing
Saves 1 ALB = $24/month

How to Identify

# List all load balancers
aws elbv2 describe-load-balancers \
  --query 'LoadBalancers[*].[LoadBalancerName,LoadBalancerArn,State.Code]' \
  --output table

How to Fix

Consolidate using host-based routing:

# Add listener rules to single ALB
aws elbv2 create-rule \
  --listener-arn arn:aws:elasticloadbalancing:ap-southeast-2:123456789:listener/app/my-alb/50dc6c495c0c9188/f2f7dc8efc522ab2 \
  --priority 10 \
  --conditions Field=host-header,Values=dev.example.com \
  --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:ap-southeast-2:123456789:targetgroup/dev-targets/50dc6c495c0c9188

aws elbv2 create-rule \
  --listener-arn arn:aws:elasticloadbalancing:ap-southeast-2:123456789:listener/app/my-alb/50dc6c495c0c9188/f2f7dc8efc522ab2 \
  --priority 20 \
  --conditions Field=host-header,Values=staging.example.com \
  --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:ap-southeast-2:123456789:targetgroup/staging-targets/50dc6c495c0c9188

Delete unused ALB:

aws elbv2 delete-load-balancer --load-balancer-arn arn:aws:elasticloadbalancing:ap-southeast-2:123456789:loadbalancer/app/dev-alb/50dc6c495c0c9188

Typical savings: $20-50/month

Opportunity #10: RDS Reserved Instances

What I Usually Find

Typical scenario:

Production database: db.m5.large, running 24/7 for 12+ months
On-Demand cost: $140/month
Reserved Instance (1-year, no upfront): $90/month
Savings: $50/month (36%)

This is like #2 (Reserved Instances for EC2) but specifically for RDS.

How to Identify

# Get RDS Reserved Instance recommendations
aws ce get-reservation-purchase-recommendation \
  --service "Amazon Relational Database Service" \
  --lookback-period-in-days SIXTY_DAYS \
  --term-in-years ONE_YEAR \
  --payment-option NO_UPFRONT

How to Fix

# Purchase RDS Reserved Instance
aws rds purchase-reserved-db-instances-offering \
  --reserved-db-instances-offering-id 12345678-1234-1234-1234-123456789012 \
  --reserved-db-instance-id my-reserved-db

Typical savings: $50-300/month per database

Summary: Typical Savings Breakdown

For a startup spending $4,200/month on AWS:

Opportunity	Typical Savings
1. Terminate unused EC2 instances	$200-400/mo
2. Purchase Reserved Instances	$300-600/mo
3. Rightsize RDS databases	$300-500/mo
4. Add CloudFront for data transfer	$200-400/mo
5. Release unused Elastic IPs	$10-20/mo
6. Delete old EBS snapshots	$30-80/mo
7. S3 storage class optimisation	$50-200/mo
8. Reduce NAT Gateway redundancy	$40-70/mo
9. Consolidate load balancers	$20-40/mo
10. RDS Reserved Instances	$50-200/mo

Total potential savings: $1,200-2,510/month (30-60% of bill)
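
The total can be re-derived from the table rows, which is handy once you replace the ranges with your own estimates. A minimal sketch: `sum_ranges` is a hypothetical helper that expects each row to end in a range such as `$200-400/mo` and takes the monthly bill as its argument.

```shell
# sum_ranges MONTHLY_BILL
# Hypothetical helper: reads table rows on stdin, sums the low and high ends
# of the trailing "$low-high/mo" range, and prints
# "low high low_pct high_pct" (percentages of the monthly bill).
sum_ranges() {
  awk -v bill="$1" '{
    r = $NF                  # last field holds the range
    gsub(/[$,]/, "", r)      # drop "$" and thousands separators
    sub("/mo$", "", r)       # drop the "/mo" suffix
    split(r, p, "-")
    low += p[1]; high += p[2]
  } END {
    printf "%d %d %.0f %.0f\n", low, high, low / bill * 100, high / bill * 100
  }'
}

# Two sample rows from the table above
printf '%s\n' '5. Release unused Elastic IPs $10-20/mo' \
              '6. Delete old EBS snapshots $30-80/mo' | sum_ranges 4200
```

The two sample rows print `40 100 1 2`. Piping all ten rows through the helper prints `1200 2510 29 60`, which matches the total above.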

Time investment: 1-2 days to implement all optimisations

Ongoing maintenance: Quarterly reviews (4 hours)

The Cost Optimisation Process

Month 1: Low-hanging fruit

Terminate unused resources
Delete old snapshots and unused Elastic IPs
Add CloudFront

Month 2: Rightsizing

Analyse CPU/RAM usage
Rightsize EC2 and RDS
Implement S3 lifecycle policies

Month 3: Commitment-based savings

Purchase Reserved Instances / Savings Plans
Review and optimise

Ongoing (quarterly):

Review AWS Cost Explorer
Check for new waste
Adjust reservations based on growth

Tools to Help

AWS-native:

AWS Cost Explorer (identify trends)
AWS Trusted Advisor (cost optimisation checks require a Business or higher support plan)
AWS Compute Optimizer (rightsizing recommendations)
AWS Cost Anomaly Detection (alerts for spikes)

Third-party (optional):

CloudHealth / CloudCheckr (advanced cost management)
Infracost (IaC cost estimation)

Conclusion

These 10 optimisations appear in virtually every Well-Architected Review I conduct:

Terminate unused EC2 instances
Purchase Reserved Instances / Savings Plans
Rightsize oversized RDS databases
Reduce data transfer with CloudFront
Release unused Elastic IPs
Delete old EBS snapshots
Optimise S3 storage classes
Reduce NAT Gateway redundancy
Consolidate load balancers
Purchase RDS Reserved Instances

Average savings: 30-60% of AWS bill

Time to implement: 1-2 days

Payback period: Immediate

Ongoing effort: Quarterly review (4 hours)

The key is treating cost optimisation as an ongoing practice, not a one-time project. Set a calendar reminder for quarterly reviews, and you’ll continuously find new savings as your infrastructure evolves.

Need help identifying cost optimisation opportunities in your AWS environment? My Well-Architected Review includes comprehensive cost analysis, prioritised savings recommendations, and implementation support delivered in weeks instead of months. Most clients save 5-10× the review cost in the first month.