FinOps for Developers: Cutting Cloud Costs Without Slowing Down
Cloud costs have become a top-three engineering concern. In 2026, with AI workloads layered on top of existing infrastructure, bills that once seemed manageable can spiral quickly. The FinOps discipline — bringing financial accountability to cloud engineering — is no longer just a CFO concern. Developers are increasingly expected to understand and optimize the cost of what they build.
This post covers actionable tactics, not theory. Each section includes specific numbers and implementation details.
The FinOps Mindset for Engineers
FinOps isn’t about being cheap — it’s about cost-efficiency. The goal is the highest business value per dollar, not the lowest bill. Some principles:
- Visibility first — you can’t optimize what you can’t see
- Waste is different from investment — idle EC2 ≠ reserved capacity for scale events
- Cost is a feature — “this will cost 30% more but reduce latency by 200ms” is a valid engineering trade-off
- Shift left on cost — review cost implications in architecture review, not after the bill arrives
1. Right-Sizing: The Biggest Win
Studies consistently show 30-40% of cloud spend is on over-provisioned resources. Right-sizing is the highest-ROI activity.
Compute Right-Sizing on AWS
# Use AWS Compute Optimizer
aws compute-optimizer get-ec2-instance-recommendations \
--account-ids 123456789012 \
--filters name=Finding,values=Overprovisioned
# Or use the CLI to find idle instances
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-0123456789 \
--start-time 2026-02-17T00:00:00Z \
--end-time 2026-03-17T00:00:00Z \
--period 86400 \
--statistics Average Maximum   # period 86400s = one data point per day
Target thresholds for right-sizing candidates:
- CPU average < 5% → likely 2+ sizes too large
- CPU average 5-20% → consider one size down
- Memory average < 30% → evaluate memory-optimized ratio
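As a rough sketch, those thresholds can be codified into a tiny triage helper (illustrative only; the function and its cutoffs come straight from the list above, not from any AWS API):

```python
# Classify a right-sizing candidate from its CloudWatch averages.
# Thresholds mirror the bullet list above.
def rightsizing_action(cpu_avg_pct: float, mem_avg_pct: float) -> str:
    if cpu_avg_pct < 5:
        return "downsize 2+ instance sizes"
    if cpu_avg_pct < 20:
        return "downsize one instance size"
    if mem_avg_pct < 30:
        return "evaluate memory-optimized ratio"
    return "leave as-is"

print(rightsizing_action(3.2, 55.0))   # downsize 2+ instance sizes
print(rightsizing_action(12.0, 40.0))  # downsize one instance size
print(rightsizing_action(45.0, 25.0))  # evaluate memory-optimized ratio
```

Feed it the `Average` statistics from the CloudWatch query above and triage a whole fleet in one pass.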
Kubernetes Pod Right-Sizing
# VPA (Vertical Pod Autoscaler) for recommendations
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-deployment
  updatePolicy:
    updateMode: "Off"  # Recommendation-only mode (safe)
# Read VPA recommendations
kubectl get vpa api-vpa -o jsonpath='{.status.recommendation}'
Typical finding: pods requested 2 CPU / 4Gi memory, VPA recommends 0.3 CPU / 512Mi. An 85% cost reduction for that workload.
2. Spot/Preemptible Instances for Stateless Workloads
Spot instances are 60-90% cheaper than on-demand. The only cost: they can be interrupted with 2-minute notice. For stateless, horizontally-scaled workloads, this is a non-issue.
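The 2-minute warning is delivered through the instance metadata service; a minimal poller can trigger graceful draining. Sketch below uses a plain IMDSv1-style GET for brevity; on IMDSv2-only instances you must first fetch a session token.

```python
# Poll the EC2 instance metadata service for a spot interruption notice.
# The endpoint 404s until AWS schedules an interruption, then returns JSON
# like {"action": "terminate", "time": "..."} roughly two minutes ahead.
import json
import time
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def interruption_notice(timeout: float = 1.0):
    """Return the pending interruption as a dict, or None if there is none."""
    try:
        with urllib.request.urlopen(METADATA_URL, timeout=timeout) as resp:
            return json.load(resp)
    except OSError:
        # 404 (no interruption scheduled), timeouts, or not running on EC2
        return None

def drain_until_interrupted(poll_seconds: int = 5) -> None:
    """Block until a notice appears, then begin graceful shutdown."""
    while interruption_notice() is None:
        time.sleep(poll_seconds)
    # ~2 minutes left: finish in-flight work, deregister from the LB, checkpoint
    print("Spot interruption scheduled; draining...")
```

On Kubernetes you usually don't hand-roll this: the AWS Node Termination Handler (or Karpenter) watches the same signal and cordons/drains the node for you.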
Kubernetes Mixed Node Groups (AWS)
# EKS Managed Node Group with mixed instance types
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
managedNodeGroups:
  - name: workers
    instanceTypes:
      - m6i.xlarge
      - m6a.xlarge
      - m5.xlarge
      - m5a.xlarge  # fallback pool
    spot: true
    minSize: 3
    maxSize: 50
    desiredCapacity: 10
Pod Disruption Budget (protect critical workloads)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: "75%"  # at least 75% of pods always running
  selector:
    matchLabels:
      app: api
Tolerations for Spot Scheduling
# Allow pods to schedule on spot nodes
# (assumes your spot nodes carry a custom node.kubernetes.io/spot=true
# taint and label; EKS natively labels spot capacity with
# eks.amazonaws.com/capacityType: SPOT)
spec:
  tolerations:
    - key: "node.kubernetes.io/spot"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 80
          preference:
            matchExpressions:
              - key: "node.kubernetes.io/spot"
                operator: In
                values: ["true"]
Real numbers: A team with 40 on-demand m5.xlarge instances ($0.192/hr each = $5,529/mo) switched to 80% spot: savings of ~$3,400/month.
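As a sanity check on those numbers (the 77% average spot discount is our assumption, chosen inside the 60-90% range quoted above):

```python
# Back-of-envelope for the migration above, using 720 billing hours/month.
on_demand_rate = 0.192   # m5.xlarge on-demand, $/hr
instances = 40
hours = 720

baseline = instances * on_demand_rate * hours    # all on-demand
spot_share = 0.80                                # 80% of the fleet moves to spot
spot_discount = 0.77                             # assumed average discount
spot_cost = instances * spot_share * on_demand_rate * (1 - spot_discount) * hours
od_cost = instances * (1 - spot_share) * on_demand_rate * hours
savings = baseline - (spot_cost + od_cost)

print(f"baseline: ${baseline:,.0f}/mo")  # baseline: $5,530/mo
print(f"savings:  ${savings:,.0f}/mo")   # savings:  $3,406/mo
```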
3. Storage Cost Reduction
Storage is often invisible until it’s not. S3, EBS snapshots, and database storage compound silently.
S3 Intelligent-Tiering
import boto3

s3 = boto3.client("s3")

# Apply intelligent tiering to buckets with unpredictable access patterns
s3.put_bucket_intelligent_tiering_configuration(
    Bucket="my-data-bucket",
    Id="EntireS3Bucket",
    IntelligentTieringConfiguration={
        "Id": "EntireS3Bucket",
        "Status": "Enabled",
        "Tierings": [
            {
                "Days": 90,
                "AccessTier": "ARCHIVE_ACCESS",  # 40% cheaper than Standard
            },
            {
                "Days": 180,
                "AccessTier": "DEEP_ARCHIVE_ACCESS",  # 75% cheaper
            },
        ],
    },
)
S3 Lifecycle Policies (Infrastructure as Code)
resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    id     = "logs-lifecycle"
    status = "Enabled"

    filter {} # required by the AWS provider; empty = apply to all objects

    transition {
      days          = 30
      storage_class = "STANDARD_IA" # ~45% cheaper than Standard
    }

    transition {
      days          = 90
      storage_class = "GLACIER_IR" # ~68% cheaper than Standard
    }

    expiration {
      days = 365 # delete after 1 year
    }
  }
}
EBS Snapshot Cleanup
#!/bin/bash
# List snapshots older than 30 days (GNU date syntax). Review before deleting,
# and confirm a snapshot is not referenced by an AMI before removing it.
aws ec2 describe-snapshots \
  --owner-ids self \
  --query 'Snapshots[?StartTime<=`'"$(date -d '30 days ago' --utc +%Y-%m-%dT%H:%M:%S)"'`].[SnapshotId,StartTime,Description]' \
  --output table
4. Database Cost Optimization
Databases are often 20-30% of cloud spend. Several levers:
RDS Reserved Instances
Committing to 1-year reserved instances saves ~40% over on-demand. 3-year saves ~60%.
# Identify RDS instances running > 90 days (reservation candidates)
import boto3
from datetime import datetime, timezone, timedelta

rds = boto3.client('rds')
response = rds.describe_db_instances()
cutoff = datetime.now(timezone.utc) - timedelta(days=90)

candidates = [
    db for db in response['DBInstances']
    if db['InstanceCreateTime'] < cutoff
    and db['DBInstanceStatus'] == 'available'
]

for db in candidates:
    print(f"{db['DBInstanceIdentifier']}: {db['DBInstanceClass']}, "
          f"running since {db['InstanceCreateTime'].date()}")
Read Replica Cost Check
Many teams add read replicas for performance without measuring read traffic:
-- PostgreSQL: check if read replica is actually being used
SELECT
query,
calls,
total_exec_time / calls as avg_ms
FROM pg_stat_statements
WHERE query NOT LIKE 'SET %'
ORDER BY calls DESC
LIMIT 20;
If your read replica serves <10% of queries, it may not justify its cost.
DynamoDB On-Demand vs Provisioned
# Calculate if on-demand or provisioned is cheaper for your pattern
def should_switch_to_provisioned(
    avg_read_capacity: float,      # average consumed RCUs per second
    avg_write_capacity: float,     # average consumed WCUs per second
    peak_multiplier: float = 3.0,  # provision this multiple of the average
) -> dict:
    # On-demand pricing (us-east-1, approximate 2026 rates)
    on_demand_read_price = 0.25 / 1_000_000   # per read request unit
    on_demand_write_price = 1.25 / 1_000_000  # per write request unit

    # Provisioned pricing
    provisioned_read_price = 0.00013   # per RCU-hour
    provisioned_write_price = 0.00065  # per WCU-hour

    seconds_per_month = 3600 * 24 * 30
    hours_per_month = 24 * 30

    # On-demand: pay only for requests actually served
    monthly_reads = avg_read_capacity * seconds_per_month
    monthly_writes = avg_write_capacity * seconds_per_month
    on_demand_monthly = (monthly_reads * on_demand_read_price +
                         monthly_writes * on_demand_write_price)

    # Provisioned: pay for capacity held around the clock, sized for peak
    provisioned_monthly = (
        avg_read_capacity * peak_multiplier * provisioned_read_price * hours_per_month +
        avg_write_capacity * peak_multiplier * provisioned_write_price * hours_per_month
    )

    return {
        "on_demand_monthly": f"${on_demand_monthly:.2f}",
        "provisioned_monthly": f"${provisioned_monthly:.2f}",
        "recommendation": "provisioned" if provisioned_monthly < on_demand_monthly else "on-demand",
    }
5. Serverless Cost Anti-Patterns
Lambda and serverless can cost more than EC2 if misused.
Lambda Memory Tuning
More memory = higher per-invocation cost, but shorter duration. The optimum isn’t always lowest memory:
# Use AWS Lambda Power Tuning (open source tool):
# https://github.com/alexcasalboni/aws-lambda-power-tuning
# Or a rough calculation for your function:
def optimal_lambda_memory(
    baseline_ms_at_128mb: float,
    invocations_per_month: int,
) -> list:
    memory_configs = [128, 256, 512, 1024, 2048, 3008]
    # Rough scaling: 2x memory ≈ 0.66x duration for CPU-bound functions
    results = []
    for mb in memory_configs:
        scale = (128 / mb) ** 0.6  # empirical scaling factor
        duration = baseline_ms_at_128mb * scale
        # Lambda pricing: $0.0000166667 per GB-second
        gb_seconds = (mb / 1024) * (duration / 1000) * invocations_per_month
        cost = gb_seconds * 0.0000166667
        results.append({"memory_mb": mb, "est_duration_ms": duration, "monthly_cost": cost})
    return sorted(results, key=lambda x: x["monthly_cost"])

# Find the sweet spot
print(optimal_lambda_memory(baseline_ms_at_128mb=800, invocations_per_month=1_000_000))
Avoid Lambda for Long-Running Tasks
Lambda bills per millisecond of duration, up to the 15-minute maximum. Anything running > 30 seconds that's invoked frequently is often cheaper on Fargate or even EC2 spot.
Rule of thumb: Lambda is cheapest for sporadic, short-duration workloads. Continuous, predictable workloads belong on containers.
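To make the rule of thumb concrete, here is a rough comparison for a continuously busy workload. The rates are approximate us-east-1 list prices and an assumption of this sketch; Lambda request charges and Fargate's fixed CPU:memory pairings are ignored for simplicity.

```python
# Rough break-even: Lambda vs. Fargate for a workload busy ~100% of the time.
LAMBDA_GB_SECOND = 0.0000166667   # $ per GB-second
FARGATE_VCPU_HOUR = 0.04048       # $ per vCPU-hour
FARGATE_GB_HOUR = 0.004445        # $ per GB-hour

def lambda_monthly(gb: float, busy_seconds_per_month: float) -> float:
    return gb * busy_seconds_per_month * LAMBDA_GB_SECOND

def fargate_monthly(vcpu: float, gb: float, hours: float = 730) -> float:
    return vcpu * FARGATE_VCPU_HOUR * hours + gb * FARGATE_GB_HOUR * hours

seconds = 730 * 3600  # running continuously all month
print(f"Lambda, 1 GB continuous: ${lambda_monthly(1.0, seconds):.0f}/mo")  # $44/mo
print(f"Fargate, 1 vCPU / 2 GB:  ${fargate_monthly(1.0, 2.0):.0f}/mo")     # $36/mo
```

Even before request charges, the continuous Lambda loses to a Fargate task with more memory and a full vCPU; at sporadic utilization the comparison flips sharply in Lambda's favor.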
6. Cost Visibility: Tagging Strategy
You can’t optimize what you can’t attribute. A consistent tagging strategy is foundational:
# Apply tags to all AWS resources via Terraform default tags
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = {
      Team        = "platform"
      Service     = "api"
      Environment = "production"
      CostCenter  = "engineering"
      ManagedBy   = "terraform"
    }
  }
}
Cost Allocation with AWS Cost Explorer API
import boto3
from datetime import date, timedelta

# Cost Explorer's boto3 service name is 'ce', not 'cost-explorer'
ce = boto3.client('ce', region_name='us-east-1')

# Get cost by service+team for last 30 days
response = ce.get_cost_and_usage(
    TimePeriod={
        'Start': (date.today() - timedelta(days=30)).isoformat(),
        'End': date.today().isoformat()
    },
    Granularity='MONTHLY',
    Filter={
        'Tags': {
            'Key': 'Team',
            'Values': ['platform', 'data', 'frontend']
        }
    },
    GroupBy=[
        {'Type': 'TAG', 'Key': 'Team'},
        {'Type': 'DIMENSION', 'Key': 'SERVICE'}
    ],
    Metrics=['BlendedCost']
)

for result in response['ResultsByTime'][0]['Groups']:
    keys = result['Keys']
    cost = result['Metrics']['BlendedCost']['Amount']
    print(f"{keys[0]} / {keys[1]}: ${float(cost):.2f}")
Building a Cost Dashboard
Monthly cost reviews shouldn’t be surprises. Build automated alerting:
# Lambda function: alert on cost anomalies
import boto3
from datetime import date, timedelta

def check_cost_anomaly(event, context):
    ce = boto3.client('ce')  # Cost Explorer's boto3 service name is 'ce'
    sns = boto3.client('sns')

    anomalies = ce.get_anomalies(
        DateInterval={
            'StartDate': (date.today() - timedelta(days=30)).isoformat(),
            'EndDate': date.today().isoformat()
        },
        TotalImpact={
            'NumericOperator': 'GREATER_THAN',
            'StartValue': 100  # alert on $100+ anomalies
        }
    )

    if anomalies['Anomalies']:
        message = "⚠️ Cost Anomalies Detected:\n\n"
        for anomaly in anomalies['Anomalies']:
            impact = anomaly['Impact']['TotalImpact']
            root_causes = anomaly.get('RootCauses') or [{}]  # may be empty
            message += f"- {root_causes[0].get('Service', 'Unknown')}: "
            message += f"${impact:.2f} above expected\n"
        sns.publish(
            TopicArn='arn:aws:sns:us-east-1:123456789:cost-alerts',
            Message=message,
            Subject='AWS Cost Anomaly Alert'
        )
Quick Wins Checklist
| Action | Typical Savings | Effort |
|---|---|---|
| Delete unattached EBS volumes | $50-500/mo | Low |
| Remove unused Elastic IPs | $10-100/mo | Low |
| Enable S3 Intelligent-Tiering | 10-30% S3 bill | Low |
| Right-size obvious outliers | 20-40% compute | Medium |
| Spot instances for stateless K8s | 60-80% node cost | Medium |
| 1-year RDS reservations | 30-40% RDS bill | Medium |
| Clean up old snapshots | $20-200/mo | Low |
| Consolidate CloudWatch log groups | 20-40% logging | Medium |
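Two of the low-effort rows above can be surfaced programmatically. A boto3 sketch (the helper names are ours; the `describe_*` calls need read-only EC2 permissions):

```python
# Sketch: find two of the quick wins above. "available" volumes are
# unattached; Elastic IPs without an AssociationId are idle (and billed).
def unattached_volumes(ec2) -> list:
    resp = ec2.describe_volumes(
        Filters=[{"Name": "status", "Values": ["available"]}]
    )
    return [v["VolumeId"] for v in resp["Volumes"]]

def unused_eips(ec2) -> list:
    resp = ec2.describe_addresses()
    return [a["PublicIp"] for a in resp["Addresses"] if "AssociationId" not in a]

if __name__ == "__main__":
    import boto3  # requires AWS credentials with ec2:Describe* permissions
    ec2 = boto3.client("ec2")
    print("unattached volumes:", unattached_volumes(ec2))
    print("unused EIPs:", unused_eips(ec2))
```

Run it weekly from a scheduled job and pipe the output into your team channel; most of these findings are safe one-click deletions.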
Conclusion
The best time to think about cloud cost is during architecture design. The second best time is now.
Start with visibility (tagging + cost explorer), identify the top 3 cost drivers, and tackle them systematically. Most teams find 20-30% savings with 2-3 weeks of focused effort — and those savings compound as the team builds cost awareness into their development habits.
FinOps isn’t a one-time project. It’s a practice.
Tags: Cloud, AWS, FinOps, Cost Optimization, Kubernetes, Serverless, DevOps
