GitOps in Production: ArgoCD vs Flux and the Patterns That Actually Work
GitOps has moved from buzzword to baseline. The idea — git is the single source of truth, and a controller continuously reconciles the cluster state to match what’s in git — is simple. The implementation at scale is not.
After two years of GitOps rollouts across engineering teams of different sizes and complexity, patterns have emerged: what works, what doesn’t, and where the popular advice falls flat.
This post is the practical guide I wish I had before my first GitOps project.
ArgoCD vs Flux: Making the Call
The most common first question. The honest answer: both are excellent, both are production-proven, and your context matters more than the tools.
ArgoCD
Strengths:
- Rich UI — best-in-class visual representation of application state
- Application-centric model with health status and sync status per app
- Strong RBAC model — easy to give teams access only to their apps
- Better multi-tenant story out of the box
- App-of-Apps and ApplicationSets for managing many apps
Weaknesses:
- Heavier install (more Kubernetes resources)
- Departs from pure Kubernetes-native patterns (its own API server, RBAC model, and CRDs sit alongside the Kubernetes ones)
- Larger learning curve for operators new to it
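To make the RBAC point concrete, here is a minimal sketch of an AppProject that confines one team to its own repo and namespace. The team name, repo URL, and SSO group (team-a, team-a-gitops, team-a-devs) are placeholders for illustration, not values from a real setup.

```yaml
# Hypothetical project for one team; adjust names to your org.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-a
  namespace: argocd
spec:
  description: Apps owned by Team A
  # Apps in this project may only pull manifests from this repo
  sourceRepos:
    - https://github.com/myorg/team-a-gitops
  # ...and may only deploy into this namespace on the local cluster
  destinations:
    - server: https://kubernetes.default.svc
      namespace: team-a
  roles:
    - name: developer
      description: View and sync Team A apps, nothing else
      policies:
        - p, proj:team-a:developer, applications, get, team-a/*, allow
        - p, proj:team-a:developer, applications, sync, team-a/*, allow
      groups:
        - team-a-devs   # assumed SSO group name
```

Members of the mapped group can see and sync applications in this project but cannot touch other teams' apps.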
Flux
Strengths:
- Pure Kubernetes-native — everything is a Custom Resource
- Modular architecture (source, kustomize, helm, notification controllers)
- Multi-tenancy built on namespaces and standard Kubernetes RBAC rather than a separate permission model
- GitRepository + HelmRepository + OCIRepository support out of the box
- Lighter weight
Weaknesses:
- No built-in UI (need Weave GitOps or third-party)
- Steeper conceptual learning curve
- Less hand-holding for newcomers
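For comparison, this is roughly what "everything is a Custom Resource" looks like in Flux: one resource describes where the manifests live, another describes what to apply from them. The repo URL, resource names, and path below are assumptions chosen for illustration.

```yaml
# Source: tells the source controller which repo and branch to fetch
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: config-repo          # hypothetical name
  namespace: flux-system
spec:
  interval: 1m               # how often to check for new commits
  url: https://github.com/myorg/config-repo
  ref:
    branch: main
---
# Reconciliation: tells the kustomize controller what to apply
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: backend-staging      # hypothetical name
  namespace: flux-system
spec:
  interval: 10m              # how often to re-apply and correct drift
  sourceRef:
    kind: GitRepository
    name: config-repo
  path: ./staging
  prune: true                # delete resources removed from git
  targetNamespace: staging
```

Both objects are ordinary Kubernetes resources, so kubectl get and kubectl describe are enough to inspect sync state without a UI.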
Decision Framework
| Scenario | Recommendation |
|---|---|
| Team is new to GitOps | ArgoCD — the UI helps with learning |
| Platform team managing 50+ microservices | ArgoCD + ApplicationSets |
| Security-sensitive, minimizing attack surface | Flux — smaller footprint |
| Full Kubernetes-native, no GUI preference | Flux |
| Need fine-grained team access control | ArgoCD |
For most teams starting fresh: ArgoCD wins on developer experience. For platform teams who are Kubernetes-native and prefer CLI: Flux wins on elegance.
Repository Structure: The Decision That Shapes Everything
How you structure your GitOps repos affects team velocity, blast radius of changes, and security posture. Three viable patterns:
Pattern 1: Monorepo (Everything Together)
gitops-repo/
├── apps/
│   ├── production/
│   │   ├── frontend/
│   │   │   ├── deployment.yaml
│   │   │   └── kustomization.yaml
│   │   └── backend/
│   │       ├── deployment.yaml
│   │       └── kustomization.yaml
│   └── staging/
│       ├── frontend/
│       └── backend/
├── infrastructure/
│   ├── cert-manager/
│   ├── ingress-nginx/
│   └── monitoring/
└── clusters/
    ├── production/
    │   └── kustomization.yaml
    └── staging/
        └── kustomization.yaml
Works well for: Small teams, early-stage projects, high coordination needs.
Problems at scale: Every team modifies the same repo, PR reviews become bottlenecks, CODEOWNERS gets complicated.
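One plausible way to wire this layout together is to let each cluster's kustomization.yaml pull in the infrastructure and app directories it needs. The sketch below assumes every referenced directory contains its own kustomization.yaml; the tree above only shows them for the production apps.

```yaml
# clusters/production/kustomization.yaml (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Shared infrastructure first
  - ../../infrastructure/cert-manager
  - ../../infrastructure/ingress-nginx
  - ../../infrastructure/monitoring
  # Then the production variants of each app
  - ../../apps/production/frontend
  - ../../apps/production/backend
```

The GitOps controller is then pointed at clusters/production, and the whole cluster state follows from that one entry point.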
Pattern 2: Platform + App Split
platform-gitops/        # Platform team owns this
├── clusters/
├── infrastructure/
└── base-apps/

team-a-gitops/          # Team A owns this
├── production/
└── staging/

team-b-gitops/          # Team B owns this
├── production/
└── staging/
Works well for: Multi-team orgs where platform and application teams have different release cadences.
ArgoCD implementation using ApplicationSets:
# platform-gitops/argocd/app-set.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: team-apps
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/myorg/platform-gitops
        revision: HEAD
        directories:
          - path: "teams/*/production"   # one Application per matching directory
  template:
    metadata:
      name: '{{path[1]}}-production'     # path[1] is the team directory name
    spec:
      project: '{{path[1]}}'             # team name as ArgoCD project
      source:
        repoURL: https://github.com/myorg/platform-gitops
        targetRevision: HEAD
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path[1]}}'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
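When each team keeps its manifests in its own repository, as in the tree for this pattern, a list generator is one way to express the same idea. This is a sketch under the assumption that the team repos live at the URLs shown and that an ArgoCD project exists per team; none of these names come from a real setup.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: team-repos             # hypothetical name
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          # One entry per team; repo URLs are assumed
          - team: team-a
            repoURL: https://github.com/myorg/team-a-gitops
          - team: team-b
            repoURL: https://github.com/myorg/team-b-gitops
  template:
    metadata:
      name: '{{team}}-production'
    spec:
      project: '{{team}}'
      source:
        repoURL: '{{repoURL}}'
        targetRevision: HEAD
        path: production
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{team}}'
```

Onboarding a new team becomes a two-line change to the elements list, reviewed by the platform team.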
Pattern 3: Config Repo + App Repo (Separate CD from CI)
app-repo/                       # Application code + CI
├── src/
├── Dockerfile
└── .github/workflows/
    └── ci.yaml                 # Builds image, then updates config repo

config-repo/                    # GitOps source of truth
├── production/
│   └── deployment.yaml         # image: myapp:v1.2.3 ← CI updates this
└── staging/
    └── deployment.yaml         # image: myapp:v1.2.3-rc1
This is the canonical GitOps pattern. CI builds and tests the app, then updates the image tag in the config repo: a direct commit for staging, a PR for production. ArgoCD watches the config repo and deploys automatically.
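# CI workflow: update image tag in config repo
# Assumes the runner already has push credentials for config-repo
# and a token available to the gh CLI.
- name: Update image tag
  run: |
    git clone https://github.com/myorg/config-repo
    cd config-repo
    git config user.name "GitHub Actions"
    git config user.email "ci@myorg.com"

    # Staging: update the image tag using yq and push straight to main
    yq e ".spec.template.spec.containers[0].image = \"${IMAGE}:${TAG}\"" \
      -i staging/deployment.yaml
    git commit -am "ci: update staging image to ${TAG}"
    git push

    # Production: make the same change on a branch so it can be reviewed
    git checkout -b "staging-${TAG}"
    yq e ".spec.template.spec.containers[0].image = \"${IMAGE}:${TAG}\"" \
      -i production/deployment.yaml
    git commit -am "ci: promote ${TAG} to production"
    git push -u origin "staging-${TAG}"

    # Create PR to production (requires human approval)
    gh pr create \
      --title "Deploy ${TAG} to production" \
      --body "Promotes ${TAG} from staging to production" \
      --base main \
      --head "staging-${TAG}"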
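On the cluster side, the piece that "watches the config repo" is an ArgoCD Application pointed at the production directory. A minimal sketch follows; the application name, project, and target namespace are assumptions.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production       # hypothetical name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/config-repo
    targetRevision: main       # only merged (approved) changes are deployed
    path: production
  destination:
    server: https://kubernetes.default.svc
    namespace: production      # assumed target namespace
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

Because the Application tracks main, merging the promotion PR is the deployment trigger; CI never needs cluster credentials.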
Secrets Management: The Hardest Part
“Everything in git” breaks immediately when you have secrets. You cannot put raw secrets in git. Solutions in order of recommendation:
1. External Secrets Operator (ESO)
ESO pulls secrets from external stores (AWS Secrets Manager, HashiCorp Vault, GCP Secret Manager) and creates Kubernetes Secrets. The ExternalSecret manifest references the secret by name — safe to commit.
# Safe to commit: no secret values here
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: database-credentials   # Creates this K8s Secret
    creationPolicy: Owner
  data:
    - secretKey: DB_PASSWORD       # K8s Secret key
      remoteRef:
        key: prod/database         # AWS Secrets Manager key
        property: password
    - secretKey: DB_HOST
      remoteRef:
        key: prod/database
        property: host
Configure the ClusterSecretStore to authenticate with an IAM role rather than static credentials, using IRSA (IAM Roles for Service Accounts):
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
            namespace: external-secrets
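The IRSA half of this lives on the service account the store references. On EKS that is an annotation naming the IAM role to assume. The account ID and role name below are made-up placeholders, and the role itself (with read access to the relevant secrets and a trust policy for the cluster's OIDC provider) has to be created separately.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-secrets-sa
  namespace: external-secrets
  annotations:
    # Placeholder ARN: substitute your own account ID and role name
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/external-secrets-reader
```

No AWS access keys appear anywhere in the cluster or in git; the only thing committed is the role reference.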
2. Sealed Secrets
Encrypt secrets using a cluster-specific public key. The encrypted SealedSecret is safe to commit; only the cluster can decrypt it.
# Encrypt a secret (one-time)
kubectl create secret generic db-password \
  --from-literal=password="super-secret" \
  --dry-run=client -o yaml | \
  kubeseal --controller-namespace kube-system \
    --format yaml > sealed-db-password.yaml
# sealed-db-password.yaml is safe to commit
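For orientation, the generated file looks roughly like this. The ciphertext is shortened to a placeholder, and the namespace is an assumption: kubeseal takes it from the input manifest or the current kubectl context.

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-password
  namespace: default           # assumed; depends on your context
spec:
  encryptedData:
    password: AgB...           # placeholder, real values are long base64 strings
  template:
    metadata:
      name: db-password
      namespace: default
```

The controller in the cluster decrypts this into a regular Secret named db-password. By default the ciphertext is bound to that name and namespace, so renaming or moving it means re-sealing.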
ESO vs Sealed Secrets: ESO wins for most production setups because it keeps secrets out of git entirely. Sealed Secrets are simpler to set up but tie you to cluster-specific encryption keys.
Sync Policies and the Automation Question
The most controversial GitOps decision: how much to automate?
# ArgoCD sync policy options
syncPolicy:
  automated:
    prune: true        # Delete resources removed from git
    selfHeal: true     # Revert manual kubectl changes
    allowEmpty: false  # Refuse an automated sync that would prune every resource in the app
  syncOptions:
    - CreateNamespace=true
    - PrunePropagationPolicy=foreground
  retry:
    limit: 3
    backoff:
      duration: 5s
      factor: 2
      maxDuration: 3m
Recommended policy by environment:
| Environment | Auto-sync | Self-heal | Prune | Manual gate |
|---|---|---|---|---|
| Dev | ✅ | ✅ | ✅ | None |
| Staging | ✅ | ✅ | ✅ | None |
| Production | ⚠️ | ✅ | ✅ | PR approval |
For production, keep auto-sync but put the gate in git: changes reach main only through an approved PR, and the merge triggers the sync. That is safer than syncing every push unreviewed. Self-heal is almost always safe to enable, since it reverts manual kubectl changes that bypass git.
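If the PR gate alone feels too permissive for production, sync windows are one further control ArgoCD offers: automated syncs only run inside a defined time window. The sketch below assumes production apps are named with a -production suffix and live in a project called production; the schedule is an arbitrary example.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: production             # hypothetical project name
  namespace: argocd
spec:
  # sourceRepos and destinations omitted for brevity
  syncWindows:
    - kind: allow
      schedule: '0 9 * * 1-5'  # cron: 09:00 Monday to Friday (UTC unless timeZone is set)
      duration: 8h
      applications:
        - '*-production'
      manualSync: true         # manual syncs are still permitted outside the window
```

A merge at 22:00 then waits until the next morning to roll out, while an operator can still push an urgent fix by syncing manually.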
Health Checks and Progressive Delivery
GitOps ships code to git; progressive delivery ships code to users. Combine them with Argo Rollouts:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: backend-api
spec:
  replicas: 10
  strategy:
    canary:
      steps:
        - setWeight: 10      # Send 10% traffic to new version
        - pause: {}          # Wait for manual approval
        - setWeight: 50      # 50% traffic
        - pause:
            duration: 30m    # Auto-advance after 30 mins if healthy
        - setWeight: 100     # Full rollout
      canaryService: backend-api-canary
      stableService: backend-api-stable
      trafficRouting:
        istio:
          virtualService:
            name: backend-api-vs
      analysis:
        templates:
          - templateName: success-rate
        startingStep: 2
        args:
          - name: service-name
            value: backend-api-canary
ArgoCD tracks the Rollout’s health status and shows you exactly where a deployment is in the canary progression.
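The Rollout above references an AnalysisTemplate named success-rate without showing it. A sketch of what such a template could look like follows, assuming Prometheus is scraping Istio's standard request metrics. The Prometheus address, the 95% threshold, and the failure limit are illustrative choices, not recommendations.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
    - name: service-name       # filled in by the Rollout's analysis.args
  metrics:
    - name: success-rate
      interval: 1m             # run the query once a minute
      successCondition: result[0] >= 0.95
      failureLimit: 3          # abort the rollout after three failed measurements
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090   # assumed address
          query: |
            sum(irate(istio_requests_total{destination_service_name="{{args.service-name}}",response_code!~"5.*"}[5m]))
            /
            sum(irate(istio_requests_total{destination_service_name="{{args.service-name}}"}[5m]))
```

If the canary's share of non-5xx responses stays below the threshold for too many measurements, Argo Rollouts aborts and shifts traffic back to the stable service.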
Drift Detection and Alerting
GitOps without alerting is just hoping. Set up alerts for drift and sync failures:
# ArgoCD notification template + trigger
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
data:
  trigger.on-sync-failed: |
    - when: app.status.operationState.phase in ['Error', 'Failed']
      send: [slack-sync-failed]
  trigger.on-health-degraded: |
    - when: app.status.health.status == 'Degraded'
      send: [slack-degraded]   # needs a matching template.slack-degraded entry
  template.slack-sync-failed: |
    message: |
      :red_circle: *{{.app.metadata.name}}* sync failed
      Reason: {{.app.status.operationState.message}}
      <{{.context.argocdUrl}}/applications/{{.app.metadata.name}}|View in ArgoCD>
  service.slack: |
    token: $slack-token
    channels:
      - name: platform-alerts
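Triggers only fire for applications that subscribe to them. In ArgoCD that subscription is an annotation on the Application (or on an AppProject, to cover all its apps). The application name below is a placeholder.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: backend-production     # hypothetical application
  namespace: argocd
  annotations:
    # Format: notifications.argoproj.io/subscribe.<trigger>.<service>: <recipient>
    notifications.argoproj.io/subscribe.on-sync-failed.slack: platform-alerts
    notifications.argoproj.io/subscribe.on-health-degraded.slack: platform-alerts
# spec omitted; only the annotations matter here
```

Putting the annotations in the ApplicationSet template, where one is used, keeps every generated app subscribed without per-app edits.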
Common Pitfalls to Avoid
Storing environment-specific values in the app repo: they belong in the config/GitOps repo. Mixing them creates tight coupling between CI and CD.
Giant ApplicationSets with no blast radius control: when one ApplicationSet generates 100 apps, one bad template breaks all 100. Use sync waves or ApplicationSet progressive syncs to control rollout order, so a bad change stops at the first group of apps.
Skipping staging: teams that go directly to production GitOps without a staging environment lose the safety net of seeing sync behavior before it matters.
Ignoring the ArgoCD API server's security: the ArgoCD API server should not be publicly exposed. Use SSO + RBAC + private VPN/bastion access.
Pruning without understanding resource ownership: prune: true deletes any resource ArgoCD tracks for an app once it disappears from git. Resources created by other tools (Helm, Terraform) can be deleted too if they end up carrying the app's tracking label or annotation. Be explicit about what ArgoCD owns.
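For resources that must survive even if their manifest is removed from git by mistake, ArgoCD supports a per-resource opt-out from pruning. The example uses a PersistentVolumeClaim because losing one is hard to undo; the name, namespace, and size are illustrative.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backend-data           # hypothetical resource
  namespace: production
  annotations:
    # ArgoCD will not delete this object during pruning
    argocd.argoproj.io/sync-options: Prune=false
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

If the manifest later disappears from git, the app shows as out of sync instead of silently deleting the claim, which turns a destructive mistake into a visible one.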
Key Takeaways
GitOps done right gives you:
- Full audit trail: every change has a git commit and PR
- Fast rollback: git revert + merge, and the controller restores the previous state on its next sync
- Declarative everything: cluster state is defined, not imperative
- Drift prevention: controllers fix drift automatically
GitOps done wrong gives you a repo nobody trusts and a cluster that drifts anyway. The difference is structure, discipline, and the right amount of automation.
Pick one tool (ArgoCD or Flux), commit to a repo structure, solve secrets early, and start with one service. The patterns scale up from there.
