Platform Engineering in 2026: Building Internal Developer Platforms That Teams Actually Love
Platform Engineering has gone from buzzword to essential discipline. As organizations scale their engineering teams, the gap between developers who can ship quickly and those drowning in infrastructure complexity keeps widening. Internal Developer Platforms (IDPs) are the solution — but building one that developers actually embrace is harder than it looks.
What Is Platform Engineering?
Platform Engineering is the practice of building and maintaining an internal platform that provides self-service capabilities to development teams. Think of it as building a product — your customers are your internal developers.
The key insight: you’re not building infrastructure, you’re building developer experience.
A good IDP makes it so that a developer can:
- Deploy a new microservice in under 10 minutes
- Get observability out of the box
- Manage secrets without filing an SRE ticket
- Run database migrations safely
- Set up CI/CD pipelines from a template
The Anatomy of a Modern IDP
┌──────────────────────────────────────────────────────────────────┐
│                   Developer Portal (Backstage)                   │
│         Service Catalog | Templates | TechDocs | Plugins         │
└──────────────────────────────┬───────────────────────────────────┘
                               │
              ┌────────────────┼────────────────┐
              │                │                │
    ┌─────────▼──────┐  ┌──────▼──────┐  ┌──────▼──────┐
    │  CI/CD Layer   │  │ Infra Layer │  │ Observ Layer│
    │  GitHub Actions│  │  Terraform  │  │  Prometheus │
    │  ArgoCD        │  │  Crossplane │  │  Grafana    │
    └────────────────┘  └─────────────┘  └─────────────┘
              │                │                │
              └────────────────┼────────────────┘
                               │
                    ┌──────────▼──────────┐
                    │ Kubernetes Cluster  │
                    │    (EKS/GKE/AKS)    │
                    └─────────────────────┘
Building Blocks of a Production IDP
1. The Developer Portal: Backstage
Backstage (open source, from Spotify) is the de facto standard for developer portals:
# backstage/catalog-info.yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-service
  description: Handles all payment processing
  annotations:
    github.com/project-slug: myorg/payment-service
    grafana/dashboard-url: https://grafana.internal/d/payment
    pagerduty.com/integration-key: "abc123"
    vault.io/role: "payment-service"
  tags:
    - payments
    - critical
    - go
spec:
  type: service
  lifecycle: production
  owner: payments-team
  system: checkout-system
  dependsOn:
    - component:order-service
    - resource:payments-db
  providesApis:
    - payment-api
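The `dependsOn` entries above use Backstage's compact entity reference form, `[kind:][namespace/]name`. To show how such a string breaks down, here is a minimal Python sketch. `parse_entity_ref` and its default values are hypothetical, written only for this post; Backstage ships its own parser in TypeScript.

```python
from typing import NamedTuple


class EntityRef(NamedTuple):
    kind: str
    namespace: str
    name: str


def parse_entity_ref(ref: str, default_kind: str = "component",
                     default_namespace: str = "default") -> EntityRef:
    """Split a '[kind:][namespace/]name' string into its three parts."""
    kind, sep, rest = ref.partition(":")
    if not sep:
        # No "kind:" prefix, so the whole string is "[namespace/]name".
        kind, rest = default_kind, ref
    namespace, sep, name = rest.partition("/")
    if not sep:
        # No "namespace/" prefix either, so fall back to the default namespace.
        namespace, name = default_namespace, rest
    if not name:
        raise ValueError(f"entity ref has no name: {ref!r}")
    return EntityRef(kind.lower(), namespace.lower(), name)


if __name__ == "__main__":
    for ref in ("component:order-service", "resource:payments-db", "payment-api"):
        print(parse_entity_ref(ref))
```

Spelling out the kind, as the catalog file does, keeps references unambiguous when a component and a resource share a name.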
Creating a new service from a template:
// packages/backend/src/plugins/scaffolder.ts
import { createBuiltinActions, createRouter } from '@backstage/plugin-scaffolder-backend';
import { ScmIntegrations } from '@backstage/integration';

// generateDeployment, writeFiles, and gitCommitAndPush are project-specific
// helpers (not Backstage APIs); their implementations are omitted here.

// Custom action to provision infrastructure
export const createDeployAction = () => ({
  id: 'custom:create-k8s-deployment',
  description: 'Creates Kubernetes deployment with best practices',
  schema: {
    input: {
      type: 'object',
      required: ['serviceName', 'team', 'tier'],
      properties: {
        serviceName: { type: 'string' },
        team: { type: 'string' },
        tier: {
          type: 'string',
          enum: ['critical', 'standard', 'batch']
        },
        replicas: { type: 'number', default: 2 },
        memory: { type: 'string', default: '256Mi' },
        cpu: { type: 'string', default: '250m' }
      }
    }
  },
  async handler(ctx) {
    const { serviceName, team, tier, replicas, memory, cpu } = ctx.input;

    // Generate k8s manifests
    const deployment = generateDeployment({
      serviceName, team, tier, replicas, memory, cpu
    });

    // Commit the manifests to the GitOps repo; ArgoCD then syncs them to the cluster.
    // createTemporaryDirectory() resolves to a path that Backstage cleans up afterwards.
    const dir = await ctx.createTemporaryDirectory();
    await writeFiles(dir, deployment);
    await gitCommitAndPush(dir, `feat: add ${serviceName} service`);

    ctx.logger.info(`✅ Created deployment for ${serviceName}`);
  }
});
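The handler above calls `generateDeployment`, which this post does not show. As a rough idea of what such a helper might return, here is a sketch in Python. The per-tier replica floors, the label names, and the default image tag are all assumptions made for illustration, not anything Backstage or Kubernetes prescribes.

```python
import json

# Hypothetical replica floors per tier; a real platform team would pick its own.
TIER_MIN_REPLICAS = {"critical": 3, "standard": 2, "batch": 1}


def generate_deployment(service_name, team, tier, replicas=2,
                        memory="256Mi", cpu="250m", image=None):
    """Return a Kubernetes Deployment manifest as a plain dict."""
    if tier not in TIER_MIN_REPLICAS:
        raise ValueError(f"unknown tier: {tier!r}")
    labels = {"app": service_name, "team": team, "tier": tier}
    resources = {"cpu": cpu, "memory": memory}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": service_name, "labels": labels},
        "spec": {
            # Never go below the tier's floor, whatever the caller asked for.
            "replicas": max(replicas, TIER_MIN_REPLICAS[tier]),
            "selector": {"matchLabels": {"app": service_name}},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": service_name,
                        # Placeholder image reference; CI would normally supply it.
                        "image": image or f"{service_name}:latest",
                        "resources": {"requests": resources, "limits": dict(resources)},
                        "securityContext": {
                            "runAsNonRoot": True,
                            "allowPrivilegeEscalation": False,
                        },
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = generate_deployment("payment-service", "payments-team", "critical")
    print(json.dumps(manifest, indent=2))
```

Keeping the defaults in one function is what makes the template a golden path: teams supply three required fields, and the security context and resource settings come for free.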
2. GitOps with ArgoCD
All infrastructure changes should go through Git:
# argocd/application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payment-service
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: production
  source:
    repoURL: https://github.com/myorg/k8s-manifests
    targetRevision: main
    path: services/payment-service
  destination:
    server: https://kubernetes.default.svc
    namespace: payments
  syncPolicy:
    automated:
      prune: true     # Remove resources not in Git
      selfHeal: true  # Auto-revert manual changes
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
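To see what the `retry` settings above add up to, here is a small Python sketch of the backoff arithmetic. It assumes the delay before each retry is `duration` multiplied by `factor` once per previous failure and capped at `maxDuration`, which is how I read the Argo CD documentation; `backoff_schedule` is a hypothetical helper, not an Argo CD API.

```python
def backoff_schedule(limit: int, duration_s: float, factor: float,
                     max_duration_s: float) -> list:
    """Delay in seconds before each of `limit` retries, capped at max_duration_s."""
    return [min(duration_s * factor ** attempt, max_duration_s)
            for attempt in range(limit)]


if __name__ == "__main__":
    # Values from the Application manifest: limit 5, duration 5s, factor 2, max 3m.
    delays = backoff_schedule(limit=5, duration_s=5, factor=2, max_duration_s=180)
    print(delays)       # [5, 10, 20, 40, 80]
    print(sum(delays))  # 155
```

Under that assumption, a sync that keeps failing is retried for roughly two and a half minutes before Argo CD gives up, and the 3m cap only starts to matter from the seventh retry onward.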
3. Infrastructure as Code with Crossplane
Crossplane lets developers provision cloud resources using Kubernetes CRDs:
# Developer creates a database by applying a simple YAML
apiVersion: database.platform.io/v1alpha1
kind: PostgreSQLInstance
metadata:
  name: payment-db
  namespace: payments
spec:
  parameters:
    storageGB: 20
    tier: standard  # platform team defines what "standard" means
    region: us-east-1
  writeConnectionSecretToRef:
    name: payment-db-credentials
The platform team defines the implementation:
# Crossplane Composition - Platform team defines this
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: postgresql-standard
spec:
  compositeTypeRef:
    apiVersion: database.platform.io/v1alpha1
    # Assumed composite (XR) kind behind the PostgreSQLInstance claim;
    # it must match whatever kind the platform's XRD actually defines.
    kind: XPostgreSQLInstance
  resources:
    - name: rds-instance
      base:
        apiVersion: rds.aws.upbound.io/v1beta1
        kind: Instance
        spec:
          forProvider:
            region: us-east-1
            dbInstanceClass: db.t3.medium
            engine: postgres
            engineVersion: "16"
            multiAz: false
            storageEncrypted: true
            deletionProtection: true
      patches:
        - fromFieldPath: spec.parameters.storageGB
          toFieldPath: spec.forProvider.allocatedStorage
        - fromFieldPath: spec.parameters.region
          toFieldPath: spec.forProvider.region
Developers get a self-service database without knowing AWS internals.
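The `patches` list is where the developer's request meets the platform team's defaults: each entry copies one field from the requested resource onto the base. A simplified Python sketch of that idea follows. `apply_patches` is hypothetical and handles only dotted paths, whereas Crossplane's real field paths also support array indexes and transforms.

```python
import copy


def get_path(obj, path):
    """Read a nested value using a dotted path such as 'spec.parameters.region'."""
    for key in path.split("."):
        obj = obj[key]
    return obj


def set_path(obj, path, value):
    """Write a nested value, creating intermediate dicts as needed."""
    *parents, leaf = path.split(".")
    for key in parents:
        obj = obj.setdefault(key, {})
    obj[leaf] = value


def apply_patches(composite, base, patches):
    """Return a copy of `base` with each patch's source field copied onto it."""
    rendered = copy.deepcopy(base)
    for patch in patches:
        try:
            value = get_path(composite, patch["fromFieldPath"])
        except KeyError:
            continue  # source field not set: keep the base value
        set_path(rendered, patch["toFieldPath"], value)
    return rendered


if __name__ == "__main__":
    request = {"spec": {"parameters": {"storageGB": 20, "region": "eu-west-1"}}}
    base = {"spec": {"forProvider": {"region": "us-east-1", "engine": "postgres"}}}
    patches = [
        {"fromFieldPath": "spec.parameters.storageGB",
         "toFieldPath": "spec.forProvider.allocatedStorage"},
        {"fromFieldPath": "spec.parameters.region",
         "toFieldPath": "spec.forProvider.region"},
    ]
    print(apply_patches(request, base, patches))
```

Anything the patches do not touch, such as `storageEncrypted` and `deletionProtection` in the Composition, stays under the platform team's control.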
4. Secret Management
Zero-trust secrets with External Secrets Operator:
# Developer YAML - simple and clean
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: payment-service-secrets
  namespace: payments
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: payment-service-secrets
    creationPolicy: Owner
  data:
    - secretKey: DATABASE_URL
      remoteRef:
        key: payments/database
        property: url
    - secretKey: STRIPE_API_KEY
      remoteRef:
        key: payments/stripe
        property: api-key
# Application code - just reads env vars, no Vault SDK needed
import os
db_url = os.environ['DATABASE_URL'] # Injected by k8s secret
stripe_key = os.environ['STRIPE_API_KEY'] # Same
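Reading `os.environ[...]` directly fails on the first missing variable with a bare `KeyError`. A slightly more defensive variant, sketched below, checks every required variable at startup and reports all the missing ones together. `Settings`, `MissingSecretError`, and `load_settings` are hypothetical names; only the two variable names come from the ExternalSecret above.

```python
import os
from dataclasses import dataclass

# Maps each Settings field to the environment variable that feeds it.
REQUIRED_VARS = {"database_url": "DATABASE_URL", "stripe_api_key": "STRIPE_API_KEY"}


class MissingSecretError(RuntimeError):
    """Raised when one or more required environment variables are unset."""


@dataclass(frozen=True, repr=False)  # repr=False keeps secret values out of logs
class Settings:
    database_url: str
    stripe_api_key: str


def load_settings(env=None) -> Settings:
    """Build Settings from the environment, failing fast with a full list of gaps."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise MissingSecretError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return Settings(**{field: env[var] for field, var in REQUIRED_VARS.items()})


if __name__ == "__main__":
    try:
        load_settings({"DATABASE_URL": "postgres://example"})
    except MissingSecretError as exc:
        print(exc)
```

Passing a plain dict as `env` also makes the loader easy to unit test without touching the real process environment.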
5. Service Mesh with Istio/Linkerd
Observability and traffic management without application code changes:
# Automatic mTLS, retries, circuit breaking
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-service
  namespace: payments
spec:
  hosts:
    - payment-service
  http:
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: payment-service
            subset: canary
          weight: 100
    - route:
        - destination:
            host: payment-service
            subset: stable
          weight: 100
      retries:
        attempts: 3
        perTryTimeout: 2s
        retryOn: gateway-error,connect-failure,retriable-4xx
      timeout: 10s
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payment-service
spec:
  host: payment-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 1000
        maxRequestsPerConnection: 10
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
  subsets:
    - name: stable
      labels:
        version: stable
    - name: canary
      labels:
        version: canary
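The VirtualService relies on Istio evaluating HTTP routes in order and using the first one that matches. Requests carrying `x-canary: true` hit the canary subset, and everything else falls through to stable. Here is a toy Python model of that rule, covering exact header matches only; `select_subset` and the `ROUTES` structure are hypothetical simplifications, not Istio's data model.

```python
# Ordered like the VirtualService: the canary rule first, then the catch-all.
ROUTES = [
    {"match": {"x-canary": "true"}, "subset": "canary"},
    {"subset": "stable"},
]


def select_subset(routes, headers):
    """Return the subset of the first route whose exact header matches all hold."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    normalized = {name.lower(): value for name, value in headers.items()}
    for route in routes:
        match = route.get("match", {})
        if all(normalized.get(name) == value for name, value in match.items()):
            return route["subset"]
    return None  # no route matched


if __name__ == "__main__":
    print(select_subset(ROUTES, {"X-Canary": "true"}))  # canary
    print(select_subset(ROUTES, {}))                    # stable
```

Order matters: if the catch-all route came first, the canary rule would never be reached.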
The Golden Path: Service Templates
The key to IDP adoption is making the right thing the easy thing. Service templates (Golden Paths) encode best practices:
my-service/
├── .github/
│   └── workflows/
│       ├── ci.yaml              # Test, lint, build, push
│       ├── deploy-staging.yaml
│       └── deploy-prod.yaml     # Requires approval
├── k8s/
│   ├── deployment.yaml          # Best practices baked in
│   ├── service.yaml
│   ├── hpa.yaml                 # Auto-scaling pre-configured
│   ├── pdb.yaml                 # Disruption budget
│   └── servicemonitor.yaml      # Prometheus scraping
├── Dockerfile                   # Multi-stage, non-root
├── catalog-info.yaml            # Backstage registration
└── README.md
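A golden path only helps if services stay on it. One lightweight way to notice drift is a script that checks a repository against the template layout above, sketched here in Python. The choice of which files count as required is my assumption, and `missing_golden_path_files` is a hypothetical helper rather than part of any tool.

```python
import tempfile
from pathlib import Path

# Files from the template tree that this sketch treats as mandatory (an assumption).
REQUIRED_FILES = [
    ".github/workflows/ci.yaml",
    "k8s/deployment.yaml",
    "k8s/service.yaml",
    "k8s/hpa.yaml",
    "k8s/pdb.yaml",
    "Dockerfile",
    "catalog-info.yaml",
]


def missing_golden_path_files(repo_root):
    """Return the required template files that are absent from repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Demo against a throwaway directory that contains only a Dockerfile.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "Dockerfile").write_text("FROM scratch\n")
        for rel in missing_golden_path_files(tmp):
            print(f"missing: {rel}")
```

Run in CI or as a scheduled job, a check like this can feed a scorecard in the portal instead of blocking merges, which fits the adoption-by-value approach.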
CI/CD pipeline template:
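# .github/workflows/ci.yaml (template)
name: CI

on:
  push:
    branches: [main]
  pull_request:

env:
  REGISTRY: 123456789.dkr.ecr.us-east-1.amazonaws.com
  IMAGE_NAME: ${{ github.event.repository.name }}  # assumption: image is named after the repository

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: make test
      - name: Upload coverage
        uses: codecov/codecov-action@v4

  security-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Trivy scan
        uses: aquasecurity/trivy-action@master  # pin to a release tag or commit SHA in real use
        with:
          scan-type: 'fs'
          scanners: 'vuln,secret,misconfig'
          exit-code: '1'
          severity: 'HIGH,CRITICAL'

  build-push:
    needs: [test, security-scan]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    permissions:
      id-token: write  # needed to assume the AWS role via OIDC
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/github-actions
          aws-region: us-east-1
      - name: Log in to Amazon ECR
        uses: aws-actions/amazon-ecr-login@v2
      - name: Set up Docker Buildx  # the gha cache backend requires Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - name: Update image tag in GitOps repo
        # Assumes the runner has a git identity and push credentials for the GitOps repo.
        run: |
          git clone https://github.com/myorg/k8s-manifests
          cd k8s-manifests
          yq e '.spec.template.spec.containers[0].image = "${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}"' \
            -i services/${{ env.IMAGE_NAME }}/deployment.yaml
          git commit -am "chore: update ${{ env.IMAGE_NAME }} to ${{ github.sha }}"
          git push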
Measuring Platform Success
Don’t guess if your platform is working — measure it:
# Platform metrics you should track
metrics = {
    # DORA Metrics
    "deployment_frequency": "deployments_per_day_per_team",
    "lead_time_for_changes": "commit_to_production_minutes",
    "change_failure_rate": "rollbacks_percent",
    "time_to_restore": "incident_resolution_minutes",
    # Platform-specific
    "self_service_rate": "tickets_avoided_percent",
    "new_service_time": "minutes_from_template_to_first_deploy",
    "platform_nps": "developer_satisfaction_score",
    "onboarding_time": "days_from_hire_to_first_pr",
}
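The dictionary above only names the metrics. To show how three of them could be derived from raw deployment records, here is a minimal Python sketch. The `Deployment` record shape and the `summarize` function are hypothetical, and a rollback stands in for a change failure, matching the `rollbacks_percent` unit above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    rolled_back: bool = False


def summarize(deployments, window_days):
    """Compute deployment frequency, median lead time, and change failure rate."""
    if not deployments:
        raise ValueError("no deployments in the window")
    lead_times = [
        (d.deployed_at - d.committed_at).total_seconds() / 60 for d in deployments
    ]
    failures = sum(d.rolled_back for d in deployments)
    return {
        "deployments_per_day": len(deployments) / window_days,
        "median_lead_time_minutes": median(lead_times),
        "change_failure_rate_percent": 100 * failures / len(deployments),
    }


if __name__ == "__main__":
    start = datetime(2026, 1, 5, 9, 0)
    records = [
        Deployment(start, start + timedelta(minutes=30)),
        Deployment(start, start + timedelta(minutes=60)),
        Deployment(start, start + timedelta(minutes=90), rolled_back=True),
        Deployment(start, start + timedelta(minutes=120)),
    ]
    print(summarize(records, window_days=2))
```

The median is used for lead time because one stuck release would otherwise dominate an average.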
Approximate target benchmarks by performance tier, in the style of the DORA reports (the exact thresholds shift from year to year):
| Metric | Elite | High | Medium |
|---|---|---|---|
| Deployment Frequency | Multiple/day | Daily | Weekly |
| Lead Time | < 1 hour | < 1 day | < 1 week |
| Change Failure Rate | < 5% | < 10% | < 15% |
| Time to Restore | < 1 hour | < 4 hours | < 1 day |
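To place a team on the table above, the thresholds can be encoded directly. The Python sketch below does this for two of the rows. The "Low" fallback for values beyond the Medium column is my addition, since the table stops at Medium, and `classify` is a hypothetical helper.

```python
# Upper bounds (exclusive) taken from the table: lead time in hours, failure rate in percent.
LEAD_TIME_HOURS = (("Elite", 1), ("High", 24), ("Medium", 24 * 7))
CHANGE_FAILURE_PERCENT = (("Elite", 5), ("High", 10), ("Medium", 15))


def classify(value, thresholds):
    """Return the first tier whose upper bound the value stays under."""
    for tier, upper_bound in thresholds:
        if value < upper_bound:
            return tier
    return "Low"  # assumption: anything past the Medium bound


if __name__ == "__main__":
    print(classify(0.5, LEAD_TIME_HOURS))        # Elite
    print(classify(7, CHANGE_FAILURE_PERCENT))   # High
```

A team can land in different tiers on different metrics, which is often more useful to know than a single overall label.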
Common Anti-Patterns to Avoid
1. The Ticket Platform (Anti-Pattern)
Developer → Creates JIRA ticket → Platform team → Eventually does the thing
If your “platform” is just a ticketing system, you haven’t built a platform. You’ve built a bureaucracy.
2. Building Before Listening
Talk to your developers. Run surveys. Watch them work. The best IDPs are built around observed pain points, not hypothetical needs.
3. Forcing Migration
# Wrong: Mandate migration by deadline
# Right: Make the platform so good that teams choose to migrate
Adoption through value beats adoption through mandate every time.
4. Over-engineering Early
Start with a portal, CI templates, and basic self-service. Add complexity only when you have evidence it’s needed.
Implementation Roadmap
Month 1-2: Foundation
- Deploy Backstage with a basic service catalog
- Create 2-3 service templates covering most common use cases
- Set up basic observability stack (Prometheus + Grafana)
Month 3-4: Self-Service
- Implement Crossplane for database provisioning
- Add External Secrets Operator
- Create developer portal documentation
Month 5-6: Polish
- Add service mesh (start with metrics, not full mTLS)
- Build cost visibility into the portal
- Create developer SLO dashboard
Month 7+: Optimization
- Measure and iterate based on DORA metrics
- Add advanced features based on team requests
- Consider a commercial internal developer portal (Port, Cortex)
Conclusion
Platform Engineering done right is a force multiplier. When developers can go from idea to production in hours instead of days, when they don’t need to file tickets for basic infrastructure, when observability is automatic — you’ve built something valuable.
The most successful IDPs I’ve seen share one thing: they’re treated as products, with real customers (developers), regular user research, and a commitment to reducing friction over adding features.
Build the platform your developers deserve.
Building a platform at your company? I’d love to hear about your stack and challenges.
