FinOps for Developers: Cutting Cloud Costs Without Slowing Down
Cloud costs have become a top-three engineering concern. In 2026, with AI workloads layered on top of existing infrastructure, bills that once seemed manageable can spiral quickly. The FinOps discipline — bringing financial accountability to cloud engineering — is no longer just a CFO concern. Developers are increasingly expected to understand and optimize the cost of what they build.
This post covers actionable tactics, not theory. Each section includes specific numbers and implementation details.
The FinOps Mindset for Engineers
FinOps isn’t about being cheap — it’s about cost-efficiency. The goal is the highest business value per dollar, not the lowest bill. Some principles:
- Visibility first — you can’t optimize what you can’t see
- Waste is different from investment — idle EC2 ≠ reserved capacity for scale events
- Cost is a feature — “this will cost 30% more but reduce latency by 200ms” is a valid engineering trade-off
- Shift left on cost — review cost implications in architecture review, not after the bill arrives
1. Right-Sizing: The Biggest Win
Studies consistently show 30-40% of cloud spend is on over-provisioned resources. Right-sizing is the highest-ROI activity.
Compute Right-Sizing on AWS
# Use AWS Compute Optimizer
aws compute-optimizer get-ec2-instance-recommendations \
--account-ids 123456789012 \
--filters name=Finding,values=Overprovisioned
# Or use the CLI to find idle instances
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-0123456789 \
--start-time 2026-02-17T00:00:00Z \
--end-time 2026-03-17T00:00:00Z \
--period 86400 \
--statistics Average Maximum   # period 86400s = one data point per day
Target thresholds for right-sizing candidates:
- CPU average < 5% → likely 2+ sizes too large
- CPU average 5-20% → consider one size down
- Memory average < 30% → evaluate memory-optimized ratio
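As a rough sketch, those thresholds can be codified into a tiny triage helper (illustrative only; the function and its cutoffs come straight from the list above, not from any AWS API):

```python
# Classify a right-sizing candidate from its CloudWatch averages.
# Thresholds mirror the bullet list above.
def rightsizing_action(cpu_avg_pct: float, mem_avg_pct: float) -> str:
    if cpu_avg_pct < 5:
        return "downsize 2+ instance sizes"
    if cpu_avg_pct < 20:
        return "downsize one instance size"
    if mem_avg_pct < 30:
        return "evaluate memory-optimized ratio"
    return "leave as-is"

print(rightsizing_action(3.2, 55.0))   # downsize 2+ instance sizes
print(rightsizing_action(12.0, 40.0))  # downsize one instance size
print(rightsizing_action(45.0, 25.0))  # evaluate memory-optimized ratio
```

Feed it the `Average` statistics from the CloudWatch query above and triage a whole fleet in one pass.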
Kubernetes Pod Right-Sizing
# VPA (Vertical Pod Autoscaler) for recommendations
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-deployment
  updatePolicy:
    updateMode: "Off"  # Recommendation-only mode (safe)
# Read VPA recommendations
kubectl get vpa api-vpa -o jsonpath='{.status.recommendation}'
Typical finding: pods requested 2 CPU / 4Gi memory, VPA recommends 0.3 CPU / 512Mi. An 85% cost reduction for that workload.
2. Spot/Preemptible Instances for Stateless Workloads
Spot instances are 60-90% cheaper than on-demand. The only cost: they can be interrupted with 2-minute notice. For stateless, horizontally-scaled workloads, this is a non-issue.
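The 2-minute warning is delivered through the instance metadata service; a minimal poller can trigger graceful draining. Sketch below uses a plain IMDSv1-style GET for brevity; on IMDSv2-only instances you must first fetch a session token.

```python
# Poll the EC2 instance metadata service for a spot interruption notice.
# The endpoint 404s until AWS schedules an interruption, then returns JSON
# like {"action": "terminate", "time": "..."} roughly two minutes ahead.
import json
import time
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def interruption_notice(timeout: float = 1.0):
    """Return the pending interruption as a dict, or None if there is none."""
    try:
        with urllib.request.urlopen(METADATA_URL, timeout=timeout) as resp:
            return json.load(resp)
    except OSError:
        # 404 (no interruption scheduled), timeouts, or not running on EC2
        return None

def drain_until_interrupted(poll_seconds: int = 5) -> None:
    """Block until a notice appears, then begin graceful shutdown."""
    while interruption_notice() is None:
        time.sleep(poll_seconds)
    # ~2 minutes left: finish in-flight work, deregister from the LB, checkpoint
    print("Spot interruption scheduled; draining...")
```

On Kubernetes you usually don't hand-roll this: the AWS Node Termination Handler (or Karpenter) watches the same signal and cordons/drains the node for you.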
Kubernetes Mixed Node Groups (AWS)
# EKS Managed Node Group with mixed instance types
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
managedNodeGroups:
  - name: workers
    instanceTypes:
      - m6i.xlarge
      - m6a.xlarge
      - m5.xlarge
      - m5a.xlarge  # fallback pool
    spot: true
    minSize: 3
    maxSize: 50
    desiredCapacity: 10
Pod Disruption Budget (protect critical workloads)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: "75%"  # at least 75% of pods always running
  selector:
    matchLabels:
      app: api
Tolerations for Spot Scheduling
# Allow pods to schedule on spot nodes
# (assumes your spot nodes carry a custom node.kubernetes.io/spot=true
# taint and label; EKS natively labels spot capacity with
# eks.amazonaws.com/capacityType: SPOT)
spec:
  tolerations:
    - key: "node.kubernetes.io/spot"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 80
          preference:
            matchExpressions:
              - key: "node.kubernetes.io/spot"
                operator: In
                values: ["true"]
Real numbers: A team with 40 on-demand m5.xlarge instances ($0.192/hr each = $5,529/mo) switched to 80% spot: savings of ~$3,400/month.
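As a sanity check on those numbers (the 77% average spot discount is our assumption, chosen inside the 60-90% range quoted above):

```python
# Back-of-envelope for the migration above, using 720 billing hours/month.
on_demand_rate = 0.192   # m5.xlarge on-demand, $/hr
instances = 40
hours = 720

baseline = instances * on_demand_rate * hours    # all on-demand
spot_share = 0.80                                # 80% of the fleet moves to spot
spot_discount = 0.77                             # assumed average discount
spot_cost = instances * spot_share * on_demand_rate * (1 - spot_discount) * hours
od_cost = instances * (1 - spot_share) * on_demand_rate * hours
savings = baseline - (spot_cost + od_cost)

print(f"baseline: ${baseline:,.0f}/mo")  # baseline: $5,530/mo
print(f"savings:  ${savings:,.0f}/mo")   # savings:  $3,406/mo
```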
3. Storage Cost Reduction
Storage is often invisible until it’s not. S3, EBS snapshots, and database storage compound silently.
S3 Intelligent-Tiering
import boto3

s3 = boto3.client("s3")

# Apply intelligent tiering to buckets with unpredictable access patterns
s3.put_bucket_intelligent_tiering_configuration(
    Bucket="my-data-bucket",
    Id="EntireS3Bucket",
    IntelligentTieringConfiguration={
        "Id": "EntireS3Bucket",
        "Status": "Enabled",
        "Tierings": [
            {
                "Days": 90,
                "AccessTier": "ARCHIVE_ACCESS",  # 40% cheaper than Standard
            },
            {
                "Days": 180,
                "AccessTier": "DEEP_ARCHIVE_ACCESS",  # 75% cheaper
            },
        ],
    },
)
S3 Lifecycle Policies (Infrastructure as Code)
resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    id     = "logs-lifecycle"
    status = "Enabled"

    filter {} # required by the AWS provider; empty = apply to all objects

    transition {
      days          = 30
      storage_class = "STANDARD_IA" # ~45% cheaper than Standard
    }

    transition {
      days          = 90
      storage_class = "GLACIER_IR" # ~68% cheaper than Standard
    }

    expiration {
      days = 365 # delete after 1 year
    }
  }
}
EBS Snapshot Cleanup
#!/bin/bash
# List snapshots older than 30 days (GNU date syntax). Review before deleting,
# and confirm a snapshot is not referenced by an AMI before removing it.
aws ec2 describe-snapshots \
  --owner-ids self \
  --query 'Snapshots[?StartTime<=`'"$(date -d '30 days ago' --utc +%Y-%m-%dT%H:%M:%S)"'`].[SnapshotId,StartTime,Description]' \
  --output table
4. Database Cost Optimization
Databases are often 20-30% of cloud spend. Several levers:
RDS Reserved Instances
Committing to 1-year reserved instances saves ~40% over on-demand. 3-year saves ~60%.
# Identify RDS instances running > 90 days (reservation candidates)
import boto3
from datetime import datetime, timezone, timedelta

rds = boto3.client('rds')
response = rds.describe_db_instances()
cutoff = datetime.now(timezone.utc) - timedelta(days=90)

candidates = [
    db for db in response['DBInstances']
    if db['InstanceCreateTime'] < cutoff
    and db['DBInstanceStatus'] == 'available'
]

for db in candidates:
    print(f"{db['DBInstanceIdentifier']}: {db['DBInstanceClass']}, "
          f"running since {db['InstanceCreateTime'].date()}")
Read Replica Cost Check
Many teams add read replicas for performance without measuring read traffic:
-- PostgreSQL: check if read replica is actually being used
SELECT
query,
calls,
total_exec_time / calls as avg_ms
FROM pg_stat_statements
WHERE query NOT LIKE 'SET %'
ORDER BY calls DESC
LIMIT 20;
If your read replica serves <10% of queries, it may not justify its cost.
DynamoDB On-Demand vs Provisioned
# Calculate if on-demand or provisioned is cheaper for your pattern
def should_switch_to_provisioned(
    avg_read_capacity: float,      # average consumed RCUs per second
    avg_write_capacity: float,     # average consumed WCUs per second
    peak_multiplier: float = 3.0,  # provision this multiple of the average
) -> dict:
    # On-demand pricing (us-east-1, approximate 2026 rates)
    on_demand_read_price = 0.25 / 1_000_000   # per read request unit
    on_demand_write_price = 1.25 / 1_000_000  # per write request unit

    # Provisioned pricing
    provisioned_read_price = 0.00013   # per RCU-hour
    provisioned_write_price = 0.00065  # per WCU-hour

    seconds_per_month = 3600 * 24 * 30
    hours_per_month = 24 * 30

    # On-demand: pay only for requests actually served
    monthly_reads = avg_read_capacity * seconds_per_month
    monthly_writes = avg_write_capacity * seconds_per_month
    on_demand_monthly = (monthly_reads * on_demand_read_price +
                         monthly_writes * on_demand_write_price)

    # Provisioned: pay for capacity held around the clock, sized for peak
    provisioned_monthly = (
        avg_read_capacity * peak_multiplier * provisioned_read_price * hours_per_month +
        avg_write_capacity * peak_multiplier * provisioned_write_price * hours_per_month
    )

    return {
        "on_demand_monthly": f"${on_demand_monthly:.2f}",
        "provisioned_monthly": f"${provisioned_monthly:.2f}",
        "recommendation": "provisioned" if provisioned_monthly < on_demand_monthly else "on-demand",
    }
5. Serverless Cost Anti-Patterns
Lambda and serverless can cost more than EC2 if misused.
Lambda Memory Tuning
More memory = higher per-invocation cost, but shorter duration. The optimum isn’t always lowest memory:
# Use AWS Lambda Power Tuning (open source tool):
# https://github.com/alexcasalboni/aws-lambda-power-tuning
# Or a rough calculation for your function:
def optimal_lambda_memory(
    baseline_ms_at_128mb: float,
    invocations_per_month: int,
) -> list:
    memory_configs = [128, 256, 512, 1024, 2048, 3008]
    # Rough scaling: 2x memory ≈ 0.66x duration for CPU-bound functions
    results = []
    for mb in memory_configs:
        scale = (128 / mb) ** 0.6  # empirical scaling factor
        duration = baseline_ms_at_128mb * scale
        # Lambda pricing: $0.0000166667 per GB-second
        gb_seconds = (mb / 1024) * (duration / 1000) * invocations_per_month
        cost = gb_seconds * 0.0000166667
        results.append({"memory_mb": mb, "est_duration_ms": duration, "monthly_cost": cost})
    return sorted(results, key=lambda x: x["monthly_cost"])

# Find the sweet spot
print(optimal_lambda_memory(baseline_ms_at_128mb=800, invocations_per_month=1_000_000))
Avoid Lambda for Long-Running Tasks
Lambda bills per millisecond of duration, up to the 15-minute maximum. Anything running > 30 seconds that's invoked frequently is often cheaper on Fargate or even EC2 spot.
Rule of thumb: Lambda is cheapest for sporadic, short-duration workloads. Continuous, predictable workloads belong on containers.
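To make the rule of thumb concrete, here is a rough comparison for a continuously busy workload. The rates are approximate us-east-1 list prices and an assumption of this sketch; Lambda request charges and Fargate's fixed CPU:memory pairings are ignored for simplicity.

```python
# Rough break-even: Lambda vs. Fargate for a workload busy ~100% of the time.
LAMBDA_GB_SECOND = 0.0000166667   # $ per GB-second
FARGATE_VCPU_HOUR = 0.04048       # $ per vCPU-hour
FARGATE_GB_HOUR = 0.004445        # $ per GB-hour

def lambda_monthly(gb: float, busy_seconds_per_month: float) -> float:
    return gb * busy_seconds_per_month * LAMBDA_GB_SECOND

def fargate_monthly(vcpu: float, gb: float, hours: float = 730) -> float:
    return vcpu * FARGATE_VCPU_HOUR * hours + gb * FARGATE_GB_HOUR * hours

seconds = 730 * 3600  # running continuously all month
print(f"Lambda, 1 GB continuous: ${lambda_monthly(1.0, seconds):.0f}/mo")  # $44/mo
print(f"Fargate, 1 vCPU / 2 GB:  ${fargate_monthly(1.0, 2.0):.0f}/mo")     # $36/mo
```

Even before request charges, the continuous Lambda loses to a Fargate task with more memory and a full vCPU; at sporadic utilization the comparison flips sharply in Lambda's favor.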
6. Cost Visibility: Tagging Strategy
You can’t optimize what you can’t attribute. A consistent tagging strategy is foundational:
# Apply tags to all AWS resources via Terraform default tags
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = {
      Team        = "platform"
      Service     = "api"
      Environment = "production"
      CostCenter  = "engineering"
      ManagedBy   = "terraform"
    }
  }
}
Cost Allocation with AWS Cost Explorer API
import boto3
from datetime import date, timedelta

# Cost Explorer's boto3 service name is 'ce', not 'cost-explorer'
ce = boto3.client('ce', region_name='us-east-1')

# Get cost by service+team for last 30 days
response = ce.get_cost_and_usage(
    TimePeriod={
        'Start': (date.today() - timedelta(days=30)).isoformat(),
        'End': date.today().isoformat()
    },
    Granularity='MONTHLY',
    Filter={
        'Tags': {
            'Key': 'Team',
            'Values': ['platform', 'data', 'frontend']
        }
    },
    GroupBy=[
        {'Type': 'TAG', 'Key': 'Team'},
        {'Type': 'DIMENSION', 'Key': 'SERVICE'}
    ],
    Metrics=['BlendedCost']
)

for result in response['ResultsByTime'][0]['Groups']:
    keys = result['Keys']
    cost = result['Metrics']['BlendedCost']['Amount']
    print(f"{keys[0]} / {keys[1]}: ${float(cost):.2f}")
Building a Cost Dashboard
Monthly cost reviews shouldn’t be surprises. Build automated alerting:
# Lambda function: alert on cost anomalies
import boto3
from datetime import date, timedelta

def check_cost_anomaly(event, context):
    ce = boto3.client('ce')  # Cost Explorer's boto3 service name is 'ce'
    sns = boto3.client('sns')

    anomalies = ce.get_anomalies(
        DateInterval={
            'StartDate': (date.today() - timedelta(days=30)).isoformat(),
            'EndDate': date.today().isoformat()
        },
        TotalImpact={
            'NumericOperator': 'GREATER_THAN',
            'StartValue': 100  # alert on $100+ anomalies
        }
    )

    if anomalies['Anomalies']:
        message = "⚠️ Cost Anomalies Detected:\n\n"
        for anomaly in anomalies['Anomalies']:
            impact = anomaly['Impact']['TotalImpact']
            root_causes = anomaly.get('RootCauses') or [{}]  # may be empty
            message += f"- {root_causes[0].get('Service', 'Unknown')}: "
            message += f"${impact:.2f} above expected\n"
        sns.publish(
            TopicArn='arn:aws:sns:us-east-1:123456789:cost-alerts',
            Message=message,
            Subject='AWS Cost Anomaly Alert'
        )
Quick Wins Checklist
| Action | Typical Savings | Effort |
|---|---|---|
| Delete unattached EBS volumes | $50-500/mo | Low |
| Remove unused Elastic IPs | $10-100/mo | Low |
| Enable S3 Intelligent-Tiering | 10-30% S3 bill | Low |
| Right-size obvious outliers | 20-40% compute | Medium |
| Spot instances for stateless K8s | 60-80% node cost | Medium |
| 1-year RDS reservations | 30-40% RDS bill | Medium |
| Clean up old snapshots | $20-200/mo | Low |
| Consolidate CloudWatch log groups | 20-40% logging | Medium |
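Two of the low-effort rows above can be surfaced programmatically. A boto3 sketch (the helper names are ours; the `describe_*` calls need read-only EC2 permissions):

```python
# Sketch: find two of the quick wins above. "available" volumes are
# unattached; Elastic IPs without an AssociationId are idle (and billed).
def unattached_volumes(ec2) -> list:
    resp = ec2.describe_volumes(
        Filters=[{"Name": "status", "Values": ["available"]}]
    )
    return [v["VolumeId"] for v in resp["Volumes"]]

def unused_eips(ec2) -> list:
    resp = ec2.describe_addresses()
    return [a["PublicIp"] for a in resp["Addresses"] if "AssociationId" not in a]

if __name__ == "__main__":
    import boto3  # requires AWS credentials with ec2:Describe* permissions
    ec2 = boto3.client("ec2")
    print("unattached volumes:", unattached_volumes(ec2))
    print("unused EIPs:", unused_eips(ec2))
```

Run it weekly from a scheduled job and pipe the output into your team channel; most of these findings are safe one-click deletions.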
Conclusion
The best time to think about cloud cost is during architecture design. The second best time is now.
Start with visibility (tagging + cost explorer), identify the top 3 cost drivers, and tackle them systematically. Most teams find 20-30% savings with 2-3 weeks of focused effort — and those savings compound as the team builds cost awareness into their development habits.
FinOps isn’t a one-time project. It’s a practice.
Tags: Cloud, AWS, FinOps, Cost Optimization, Kubernetes, Serverless, DevOps
