Istio Service Mesh: Complete Production Guide for Kubernetes
Service mesh technology has become essential for managing microservices communication at scale. Istio, one of the most widely adopted service meshes, provides traffic management, security, and observability features for operating distributed systems on Kubernetes.
Why Istio in 2026?
| Challenge | Istio Solution |
|---|---|
| Service-to-service auth | mTLS everywhere |
| Traffic routing | VirtualService & DestinationRule |
| Rate limiting | EnvoyFilter & local rate limit |
| Observability | Distributed tracing, metrics |
| Canary deployments | Weighted routing |
| Circuit breaking | Connection pools & OutlierDetection |
Istio Architecture
┌─────────────────────────────────────────────────────────────────────┐
│                            Control Plane                            │
│     ┌─────────────┐    ┌─────────────┐    ┌─────────────┐           │
│     │   Istiod    │    │   Istiod    │    │   Istiod    │           │
│     │  (Primary)  │    │  (Replica)  │    │  (Replica)  │           │
│     └──────┬──────┘    └──────┬──────┘    └──────┬──────┘           │
│            │                  │                  │                  │
│            └──────────────────┼──────────────────┘                  │
│                               │ xDS API                             │
└───────────────────────────────┼─────────────────────────────────────┘
                                │
┌───────────────────────────────┼─────────────────────────────────────┐
│  Data Plane                   │                                     │
│              ┌────────────────┴────────────────┐                    │
│              │                                 │                    │
│         ┌────┴────┐                       ┌────┴────┐               │
│         │  Pod A  │                       │  Pod B  │               │
│         │┌───────┐│                       │┌───────┐│               │
│         ││ Envoy ││ ─────── mTLS ──────── ││ Envoy ││               │
│         │└───────┘│                       │└───────┘│               │
│         │┌───────┐│                       │┌───────┐│               │
│         ││  App  ││                       ││  App  ││               │
│         │└───────┘│                       │└───────┘│               │
│         └─────────┘                       └─────────┘               │
└─────────────────────────────────────────────────────────────────────┘
Production Installation
Helm-Based Installation
# Add Istio Helm repository
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update

# Create namespace
kubectl create namespace istio-system

# Install Istio base (CRDs)
helm install istio-base istio/base -n istio-system --set defaultRevision=default

# Install Istiod (Control Plane)
helm install istiod istio/istiod -n istio-system --wait \
  --set pilot.resources.requests.memory=512Mi \
  --set pilot.resources.requests.cpu=500m \
  --set pilot.autoscaleMin=2 \
  --set pilot.autoscaleMax=5 \
  --set global.proxy.resources.requests.cpu=100m \
  --set global.proxy.resources.requests.memory=128Mi

# Install Ingress Gateway
helm install istio-ingress istio/gateway -n istio-system \
  --set service.type=LoadBalancer \
  --set autoscaling.enabled=true \
  --set autoscaling.minReplicas=2 \
  --set autoscaling.maxReplicas=10
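The `--set` flags for istiod can also be kept in a values file, which is easier to review and put under version control. The sketch below mirrors the flags above one-to-one; the filename `values-istiod.yaml` is an assumption, not something the chart requires.

```yaml
# values-istiod.yaml (hypothetical filename)
# Same settings as the --set flags passed to the istiod chart above.
pilot:
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
  autoscaleMin: 2
  autoscaleMax: 5
global:
  proxy:
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
```

It would be applied with `helm install istiod istio/istiod -n istio-system --wait -f values-istiod.yaml`.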
IstioOperator Configuration
# istio-operator.yaml (apply with: istioctl install -f istio-operator.yaml)
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-control-plane
  namespace: istio-system
spec:
  profile: default
  meshConfig:
    enableTracing: true
    enableAutoMtls: true
    accessLogFile: /dev/stdout
    accessLogFormat: |
      {"timestamp":"%START_TIME%","method":"%REQ(:METHOD)%","path":"%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%","status":"%RESPONSE_CODE%","duration":"%DURATION%","upstream":"%UPSTREAM_HOST%","trace_id":"%REQ(X-B3-TRACEID)%"}
    defaultConfig:
      tracing:
        # 100% sampling traces every request; consider a lower value under production load
        sampling: 100
        zipkin:
          address: jaeger-collector.observability:9411
      proxyMetadata:
        ISTIO_META_DNS_CAPTURE: "true"
        ISTIO_META_DNS_AUTO_ALLOCATE: "true"
  components:
    pilot:
      k8s:
        resources:
          requests:
            cpu: 500m
            memory: 2Gi
          limits:
            cpu: 2000m
            memory: 4Gi
        hpaSpec:
          minReplicas: 2
          maxReplicas: 5
        podDisruptionBudget:
          minAvailable: 1
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
        k8s:
          hpaSpec:
            minReplicas: 2
            maxReplicas: 10
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
          service:
            type: LoadBalancer
            ports:
              - port: 80
                targetPort: 8080
                name: http2
              - port: 443
                targetPort: 8443
                name: https
    egressGateways:
      - name: istio-egressgateway
        enabled: true
        k8s:
          hpaSpec:
            minReplicas: 2
  values:
    global:
      proxy:
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
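Installing the control plane does not by itself put an Envoy proxy into any pod: a namespace (or individual pod) has to opt in to sidecar injection. A minimal sketch, assuming the `ecommerce` namespace used in the examples that follow and a default, non-revisioned injector:

```yaml
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: ecommerce
  labels:
    # Opts every new pod in this namespace into sidecar injection.
    # With a revisioned control plane, the label would instead be
    # istio.io/rev: <revision-name>.
    istio-injection: enabled
```

Pods that were created before the label was set keep running without a proxy until they are restarted.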
Traffic Management
VirtualService for Routing
# virtualservice.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: product-service
  namespace: ecommerce
spec:
  hosts:
    - product-service
    - products.example.com
  gateways:
    - mesh
    - istio-system/main-gateway
  http:
    # Canary routing based on header
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: product-service
            subset: canary
          weight: 100
    # A/B testing based on user agent
    - match:
        - headers:
            user-agent:
              regex: ".*Mobile.*"
      route:
        - destination:
            host: product-service
            subset: mobile-optimized
    # Default traffic split
    - route:
        - destination:
            host: product-service
            subset: stable
          weight: 90
        - destination:
            host: product-service
            subset: canary
          weight: 10
      # Retry configuration
      retries:
        attempts: 3
        perTryTimeout: 2s
        retryOn: connect-failure,refused-stream,unavailable,cancelled,retriable-4xx
      # Timeout
      timeout: 10s
      # Fault injection for testing
      # fault:
      #   delay:
      #     percentage:
      #       value: 5
      #     fixedDelay: 5s
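The VirtualService above binds to `istio-system/main-gateway`, which this guide never defines. A minimal sketch of what that Gateway could look like follows. The selector label and the TLS secret name `products-example-com-tls` are assumptions; adjust both to match your ingress gateway pods and certificate.

```yaml
# gateway.yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: main-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway   # assumption: must match the gateway pods' labels
  servers:
    # Plain HTTP, redirected to HTTPS
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - products.example.com
      tls:
        httpsRedirect: true
    # HTTPS terminated at the gateway
    - port:
        number: 443
        name: https
        protocol: HTTPS
      hosts:
        - products.example.com
      tls:
        mode: SIMPLE
        credentialName: products-example-com-tls   # hypothetical Secret in the gateway's namespace
```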
DestinationRule for Load Balancing
# destinationrule.yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: product-service
  namespace: ecommerce
spec:
  host: product-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
        connectTimeout: 30s
      http:
        h2UpgradePolicy: UPGRADE
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
        maxRequestsPerConnection: 10
        maxRetries: 3
    loadBalancer:
      simple: LEAST_REQUEST
      localityLbSetting:
        enabled: true
        failover:
          - from: us-west-1
            to: us-east-1
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
      minHealthPercent: 30
  subsets:
    - name: stable
      labels:
        version: v1
      trafficPolicy:
        connectionPool:
          http:
            http2MaxRequests: 500
    - name: canary
      labels:
        version: v2
      trafficPolicy:
        connectionPool:
          http:
            http2MaxRequests: 100
    - name: mobile-optimized
      labels:
        version: v1-mobile
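The same connection pool and outlier detection settings can guard calls that leave the mesh. The sketch below registers an external API with a ServiceEntry and attaches a circuit breaker to it. The host `payments.example.net`, the resource names, and the limits are all hypothetical values chosen for illustration.

```yaml
# external-circuit-breaker.yaml
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: payments-api            # hypothetical
  namespace: ecommerce
spec:
  hosts:
    - payments.example.net      # hypothetical external host
  location: MESH_EXTERNAL
  ports:
    - number: 443
      name: tls
      protocol: TLS
  resolution: DNS
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payments-api
  namespace: ecommerce
spec:
  host: payments.example.net
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 50      # illustrative limit
        connectTimeout: 5s
    outlierDetection:
      # For TCP/TLS traffic, connection failures and timeouts count as 5xx errors
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 60s
```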
Security Configuration
mTLS and Authorization Policies
# peer-authentication.yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
# authorization-policy.yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: product-service-authz
  namespace: ecommerce
spec:
  selector:
    matchLabels:
      app: product-service
  action: ALLOW
  rules:
    # Allow API Gateway
    # (the principal must match the service account your ingress gateway actually runs as)
    - from:
        - source:
            principals: ["cluster.local/ns/istio-system/sa/istio-ingressgateway"]
      to:
        - operation:
            methods: ["GET", "POST", "PUT", "DELETE"]
            paths: ["/api/*"]
    # Allow Order Service
    - from:
        - source:
            namespaces: ["ecommerce"]
            principals: ["cluster.local/ns/ecommerce/sa/order-service"]
      to:
        - operation:
            methods: ["GET"]
            paths: ["/api/products/*", "/api/inventory/*"]
    # Allow health checks from anywhere
    - to:
        - operation:
            methods: ["GET"]
            paths: ["/health", "/ready"]
---
# request-authentication.yaml
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: jwt-auth
  namespace: ecommerce
spec:
  selector:
    matchLabels:
      app: product-service
  jwtRules:
    - issuer: "https://auth.example.com"
      jwksUri: "https://auth.example.com/.well-known/jwks.json"
      audiences:
        - "product-api"
      forwardOriginalToken: true
      outputPayloadToHeader: x-jwt-payload
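A mesh-wide STRICT policy rejects plaintext traffic, which breaks any workload that does not have a sidecar yet. During a migration, a namespace-level policy can override the mesh default for namespaces that are not ready. A minimal sketch, assuming a hypothetical namespace called `legacy-apps`:

```yaml
# peer-authentication-permissive.yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: legacy-apps   # hypothetical namespace still being migrated
spec:
  mtls:
    # Accept both mTLS and plaintext; takes precedence over the
    # mesh-wide policy in istio-system for this namespace only.
    mode: PERMISSIVE
```

Once every workload in the namespace has a sidecar, deleting this resource lets the mesh-wide STRICT policy apply again.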
Rate Limiting with EnvoyFilter
# rate-limit.yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: rate-limit
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      # Must match the gateway pods' labels (check with: kubectl get pods -n istio-system --show-labels)
      istio: ingressgateway
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: "envoy.filters.network.http_connection_manager"
              subFilter:
                name: "envoy.filters.http.router"
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.local_ratelimit
          typed_config:
            "@type": type.googleapis.com/udpa.type.v1.TypedStruct
            type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
            value:
              stat_prefix: http_local_rate_limiter
              # Bucket holds up to 1000 tokens and refills 100 tokens per second
              token_bucket:
                max_tokens: 1000
                tokens_per_fill: 100
                fill_interval: 1s
              filter_enabled:
                runtime_key: local_rate_limit_enabled
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              filter_enforced:
                runtime_key: local_rate_limit_enforced
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              response_headers_to_add:
                - append: false
                  header:
                    key: x-rate-limit-limit
                    value: "1000"
Observability Setup
Distributed Tracing
# tracing-config.yaml
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system
spec:
  tracing:
    - providers:
        - name: jaeger
      randomSamplingPercentage: 100
      customTags:
        environment:
          literal:
            value: production
        cluster:
          literal:
            value: primary
---
# jaeger-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jaeger
  namespace: observability
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jaeger
  template:
    metadata:
      labels:
        app: jaeger
    spec:
      containers:
        - name: jaeger
          image: jaegertracing/all-in-one:1.54
          ports:
            - containerPort: 16686 # UI
            - containerPort: 9411 # Zipkin
            - containerPort: 4317 # OTLP gRPC
          env:
            - name: COLLECTOR_ZIPKIN_HOST_PORT
              value: ":9411"
            - name: COLLECTOR_OTLP_ENABLED
              value: "true"
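The Telemetry resource refers to a tracing provider by name, and Istio looks that name up in `meshConfig.extensionProviders`. The IstioOperator shown earlier does not declare one, so a provider named `jaeger` would need to be added along these lines. This sketch assumes a Service called `jaeger-collector` in the `observability` namespace exposes the Zipkin port; the Deployment above does not create that Service.

```yaml
# Fragment to merge into the IstioOperator spec
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    extensionProviders:
      - name: jaeger   # must match providers[].name in the Telemetry resource
        zipkin:
          # assumption: a Service with this name fronts the Jaeger pod
          service: jaeger-collector.observability.svc.cluster.local
          port: 9411
```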
Kiali Dashboard
# kiali.yaml
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali
  namespace: istio-system
spec:
  auth:
    # anonymous disables login entirely; restrict access by other means in production
    strategy: anonymous
  deployment:
    accessible_namespaces:
      - "**"
    resources:
      requests:
        cpu: 200m
        memory: 256Mi
  external_services:
    prometheus:
      url: "http://prometheus.observability:9090"
    grafana:
      enabled: true
      url: "http://grafana.observability:3000"
    tracing:
      enabled: true
      url: "http://jaeger.observability:16686"
  server:
    web_root: /kiali
Progressive Delivery with Flagger
# flagger-canary.yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: product-service
  namespace: ecommerce
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: product-service
  progressDeadlineSeconds: 600
  service:
    port: 80
    targetPort: 8080
    gateways:
      - main-gateway.istio-system.svc.cluster.local
    hosts:
      - products.example.com
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 1m
    webhooks:
      - name: load-test
        type: rollout
        url: http://flagger-loadtester/
        metadata:
          cmd: "hey -z 2m -q 10 -c 2 http://product-service.ecommerce/"
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester/
        metadata:
          cmd: "curl -s http://product-service-canary.ecommerce/health | grep ok"
Best Practices
- Start with permissive mTLS, then migrate to strict once every workload has a sidecar
- Use revision-based upgrades for zero-downtime Istio updates
- Limit sidecar scope with Sidecar resources
- Configure appropriate resource limits for Envoy proxies
- Implement circuit breakers for all external calls
- Use locality-aware load balancing for multi-region deployments
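As an illustration of limiting sidecar scope, the sketch below restricts which services the proxies in one namespace receive configuration for. By default every proxy is configured for every service in the mesh, which costs memory and slows config pushes as the mesh grows. The choice of namespaces here is an assumption based on the examples in this guide.

```yaml
# sidecar-scope.yaml
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default          # no workloadSelector, so it applies to the whole namespace
  namespace: ecommerce
spec:
  egress:
    - hosts:
        - "./*"              # services in the same namespace
        - "istio-system/*"   # control plane and gateways
        - "observability/*"  # assumption: tracing and metrics backends live here
```

Any destination outside these namespaces has to be added to `hosts` before workloads in `ecommerce` can route to it through the mesh.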
Conclusion
Istio provides a comprehensive solution for managing microservices communication in Kubernetes. By implementing proper traffic management, security policies, and observability, teams can operate complex distributed systems with confidence and gain deep insights into service behavior.
