OpenTofu and Terraform IaC Best Practices in 2026: The Complete Production Guide
The infrastructure-as-code landscape shifted significantly in August 2023, when HashiCorp relicensed Terraform from MPL 2.0 to the Business Source License (BSL 1.1). OpenTofu, the community fork hosted by the Linux Foundation, has since emerged as a robust open-source alternative. It stays compatible with most existing Terraform configurations and ships features the community had been requesting for years, such as state encryption.
This guide covers production IaC patterns that work equally well with Terraform and OpenTofu, and highlights where the two diverge.
OpenTofu vs. Terraform in 2026
The fork has matured. Here’s the honest comparison:
| Aspect | OpenTofu | Terraform |
|---|---|---|
| License | MPL 2.0 (open) | BSL 1.1 (source-available) |
| Compatibility | HCL-compatible, drop-in for most | Original |
| State management | Compatible backends | Compatible backends |
| Provider ecosystem | Own registry (registry.opentofu.org) serving the same providers | registry.terraform.io |
| New features | Faster iteration (community-driven) | Slower, enterprise-focused |
| Enterprise support | Community + vendors (Spacelift, env0) | HashiCorp / HCP |
| Cost | Free | Free CLI, paid HCP |
Migration from Terraform to OpenTofu:
# It's largely a drop-in replacement
brew install opentofu
# In your project
tofu init # instead of terraform init
tofu plan
tofu apply
# Check compatibility
tofu version
# OpenTofu v1.8.x
Notable OpenTofu-Only Features (v1.7+)
State encryption — a long-requested feature:
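# terraform.tf
terraform {
  encryption {
    # Derive the encryption key from a passphrase (PBKDF2).
    # Referencing var.* here relies on early variable evaluation (OpenTofu 1.8+).
    key_provider "pbkdf2" "my_key" {
      passphrase = var.state_encryption_passphrase
    }
    method "aes_gcm" "my_method" {
      keys = key_provider.pbkdf2.my_key
    }
    # Encrypt both the state file and saved plan files
    state {
      method = method.aes_gcm.my_method
    }
    plan {
      method = method.aes_gcm.my_method
    }
  }
}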
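Turning encryption on for a state file that already exists unencrypted needs a transition step, because OpenTofu must still be able to read the old state. The sketch below follows the fallback pattern from the OpenTofu encryption documentation and reuses the names from the block above; the method label `migrate` is arbitrary. Treat it as a starting point and check it against the docs for your OpenTofu version.

```hcl
terraform {
  encryption {
    # Temporary: lets OpenTofu read the existing unencrypted state
    method "unencrypted" "migrate" {}

    key_provider "pbkdf2" "my_key" {
      passphrase = var.state_encryption_passphrase
    }

    method "aes_gcm" "my_method" {
      keys = key_provider.pbkdf2.my_key
    }

    state {
      # New state is written with this method...
      method = method.aes_gcm.my_method

      # ...while old state can still be read without encryption
      fallback {
        method = method.unencrypted.migrate
      }
    }
  }
}
```

After one successful `tofu apply` has rewritten the state in encrypted form, the `fallback` block and the `unencrypted` method can be removed.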
Project Structure for Scale
The hardest IaC problem isn’t syntax — it’s organization. Here’s a structure that scales:
infra/
├── modules/ # Reusable components
│ ├── vpc/
│ ├── eks-cluster/
│ ├── rds-postgres/
│ └── ecs-service/
├── environments/ # Environment-specific configs
│ ├── dev/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ └── terraform.tfvars
│ ├── staging/
│ └── prod/
├── platform/ # Shared infrastructure
│ ├── networking/
│ ├── security/
│ └── observability/
└── stacks/ # Terragrunt stacks (optional)
    ├── dev.hcl
    └── prod.hcl
Module Design Principles
# modules/ecs-service/main.tf
# Good module: opinionated defaults, override points
variable "name" {
  description = "Service name (used in all resource names)"
  type        = string
}
variable "image" {
  description = "Container image URI"
  type        = string
}
variable "cpu" {
  description = "CPU units (256, 512, 1024, 2048, 4096)"
  type        = number
  default     = 512
  validation {
    condition     = contains([256, 512, 1024, 2048, 4096], var.cpu)
    error_message = "CPU must be one of: 256, 512, 1024, 2048, 4096."
  }
}
variable "memory" {
  description = "Memory in MB"
  type        = number
  default     = 1024
}
variable "desired_count" {
  description = "Number of task instances"
  type        = number
  default     = 2
}
# Inputs referenced by the tags below
variable "environment" {
  description = "Deployment environment (dev, staging, prod)"
  type        = string
}
variable "cost_center" {
  description = "Cost center used for billing allocation"
  type        = string
}
# Always tag everything
locals {
  common_tags = {
    ManagedBy   = "terraform"
    Environment = var.environment
    Service     = var.name
    CostCenter  = var.cost_center
  }
}
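For illustration, here is how an environment might call this module. The relative path follows the project layout shown earlier; the image tag and cost center values are made up, and a real module would likely need further inputs (for example the cluster and subnets) that the excerpt above does not show.

```hcl
# environments/dev/main.tf
module "api" {
  source = "../../modules/ecs-service"

  name  = "api"
  image = "mycompany/api:1.4.2" # hypothetical tag: pin a version rather than "latest"

  # Override the module defaults only where this environment needs to
  cpu    = 1024
  memory = 2048

  environment = "dev"
  cost_center = "engineering"
}
```

Everything not set here (such as `desired_count`) falls back to the module's opinionated defaults, which keeps environment files short.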
Remote State and Locking
S3 Backend (AWS)
# environments/prod/backend.tf
terraform {
  backend "s3" {
    bucket     = "mycompany-terraform-state-prod"
    key        = "services/api/terraform.tfstate"
    region     = "us-east-1"
    encrypt    = true
    kms_key_id = "arn:aws:kms:us-east-1:123456789012:key/abcd-1234"
    # DynamoDB table for state locking
    dynamodb_table = "terraform-state-lock"
    # Versioning is not a backend argument: enable it on the S3 bucket itself
    # so that earlier state versions can be recovered.
  }
}
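The bucket and lock table have to exist before `init` can use them, and bucket versioning is configured there rather than in the backend block. Below is a minimal bootstrap sketch, assuming the same bucket and table names as the backend above. It would live in its own small configuration that is applied once; the resource labels are arbitrary.

```hcl
# bootstrap/main.tf (hypothetical location)
resource "aws_s3_bucket" "tf_state" {
  bucket = "mycompany-terraform-state-prod"
}

# Keep old state versions so a bad write can be rolled back
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# State files often contain sensitive values: never expose the bucket
resource "aws_s3_bucket_public_access_block" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# The S3 backend expects a string hash key named "LockID"
resource "aws_dynamodb_table" "tf_lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```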
State Isolation Strategy
One of the most important decisions: how to split your state files.
# TOO MONOLITHIC (dangerous)
state/monolith.tfstate # Everything in one file — one mistake destroys everything
# TOO GRANULAR (operationally painful)
state/vpc.tfstate
state/subnet-1a.tfstate
state/subnet-1b.tfstate
...
# GOOD: Split by lifecycle and blast radius
state/networking.tfstate # VPC, subnets (changes rarely)
state/platform.tfstate # EKS, RDS (changes occasionally)
state/apps/api.tfstate # API service (changes frequently)
state/apps/workers.tfstate
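Once state is split, downstream stacks need a way to read values from upstream ones. One common option is the `terraform_remote_state` data source, sketched below for an app stack that reads subnet IDs from the networking state. The bucket and key are placeholders, and `private_subnet_ids` is an assumed output name that the networking stack would have to declare.

```hcl
# apps/api: read outputs published by the networking stack
data "terraform_remote_state" "networking" {
  backend = "s3"

  config = {
    bucket = "mycompany-terraform-state-prod"
    key    = "networking/terraform.tfstate"
    region = "us-east-1"
  }
}

locals {
  # Only root-module outputs of the networking stack are visible here
  private_subnet_ids = data.terraform_remote_state.networking.outputs.private_subnet_ids
}
```

The dependency is read-only: the app stack can never modify networking resources, which is what keeps the blast radius small.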
Terragrunt for DRY Infrastructure
Terragrunt eliminates the boilerplate of repeating backend configs and module versions across environments:
# terragrunt.hcl (root)
locals {
  env_vars    = read_terragrunt_config(find_in_parent_folders("env.hcl"))
  environment = local.env_vars.locals.environment
  aws_region  = local.env_vars.locals.aws_region
  account_id  = local.env_vars.locals.aws_account_id
}
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "mycompany-tf-state-${local.account_id}"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = local.aws_region
    encrypt        = true
    dynamodb_table = "terraform-lock-${local.account_id}"
  }
}
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "${local.aws_region}"
  default_tags {
    tags = {
      Environment = "${local.environment}"
      ManagedBy   = "terragrunt"
    }
  }
}
EOF
}
# services/api/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}
terraform {
  source = "git::https://github.com/mycompany/infra-modules.git//ecs-service?ref=v2.3.0"
}
inputs = {
  name          = "api"
  image         = "mycompany/api:${get_env("IMAGE_TAG", "latest")}"
  cpu           = 1024
  memory        = 2048
  desired_count = 3
}
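The root configuration reads an `env.hcl` file that is not shown above. A plausible version is sketched here: the three local names are the ones the root file looks up, while the file location and the values are assumptions.

```hcl
# prod/env.hcl (hypothetical location: one file per environment folder,
# found by walking up from each service directory)
locals {
  environment    = "prod"
  aws_region     = "us-east-1"
  aws_account_id = "123456789012"
}
```

Because `find_in_parent_folders("env.hcl")` resolves relative to the directory being run, the same root file produces a different bucket, lock table, and provider region for each environment.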
Testing Infrastructure Code
IaC without tests is a liability. Here are three layers of testing.
Static Analysis: tflint + tfsec
# tflint — lints HCL, catches provider-specific errors
tflint --init
tflint --recursive
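# tfsec: security scanning (its checks have been folded into Trivy, where new development now happens)
tfsec . --minimum-severity MEDIUM
# trivy: comprehensive IaC scanning, including the former tfsec rules
trivy config .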
# .tflint.hcl
plugin "aws" {
  enabled = true
  version = "0.27.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}
rule "terraform_required_version" {
  enabled = true
}
rule "terraform_required_providers" {
  enabled = true
}
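The two rules enabled above flag modules that do not pin a CLI version or declare their providers. A minimal file that would satisfy them might look like the following; the version constraints are examples rather than recommendations.

```hcl
# versions.tf
terraform {
  # Checked by terraform_required_version
  required_version = ">= 1.8.0"

  # Checked by terraform_required_providers
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed major version
    }
  }
}
```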
Unit Tests: Terraform Test Framework
# modules/ecs-service/tests/basic.tftest.hcl
run "creates_ecs_service" {
  # Plan only: check the configuration without creating real resources
  # (the default command is apply)
  command = plan
  variables {
    name        = "test-api"
    image       = "nginx:latest"
    cluster_arn = "arn:aws:ecs:us-east-1:123456789012:cluster/test"
    subnet_ids  = ["subnet-12345"]
    environment = "test"
    cost_center = "engineering"
  }
  assert {
    condition     = aws_ecs_service.this.name == "test-api"
    error_message = "ECS service name should match input"
  }
  assert {
    condition     = aws_ecs_service.this.desired_count == 2
    error_message = "Default desired count should be 2"
  }
}
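The same framework can also check that bad input is rejected. The sketch below exercises the `cpu` validation rule from the module using `expect_failures`; the file name is hypothetical and the other variables simply mirror the test above.

```hcl
# modules/ecs-service/tests/validation.tftest.hcl (hypothetical file name)
run "rejects_unsupported_cpu" {
  command = plan

  variables {
    name        = "test-api"
    image       = "nginx:latest"
    cluster_arn = "arn:aws:ecs:us-east-1:123456789012:cluster/test"
    subnet_ids  = ["subnet-12345"]
    environment = "test"
    cost_center = "engineering"
    cpu         = 300 # not in the allowed list
  }

  # The run passes only if the validation on var.cpu fails
  expect_failures = [
    var.cpu,
  ]
}
```

Negative tests like this are cheap and catch the case where someone loosens a validation rule by accident.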
Integration Tests: Terratest
// test/ecs_service_test.go
package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/aws"
	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestECSServiceModule(t *testing.T) {
	t.Parallel()

	terraformOptions := &terraform.Options{
		TerraformDir: "../modules/ecs-service",
		Vars: map[string]interface{}{
			"name":        "terratest-api",
			"image":       "nginx:latest",
			"environment": "test",
			// Inputs also supplied by the unit test; placeholder values
			"cost_center": "engineering",
			"cluster_arn": "arn:aws:ecs:us-east-1:123456789012:cluster/test-cluster",
			"subnet_ids":  []string{"subnet-12345"},
		},
	}

	// Always clean up, even if an assertion fails
	defer terraform.Destroy(t, terraformOptions)
	terraform.InitAndApply(t, terraformOptions)

	serviceName := terraform.Output(t, terraformOptions, "service_name")
	assert.Equal(t, "terratest-api", serviceName)

	// Verify the service actually exists in AWS
	service := aws.GetEcsService(t, "us-east-1", "test-cluster", serviceName)
	assert.Equal(t, "ACTIVE", *service.Status)
}
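The Terratest code reads an output named `service_name`, so the module has to export one. A sketch of that output is below, assuming the service resource is `aws_ecs_service.this` as in the unit test; the file name is a convention, not a requirement.

```hcl
# modules/ecs-service/outputs.tf
output "service_name" {
  description = "Name of the ECS service created by this module"
  value       = aws_ecs_service.this.name
}
```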
CI/CD for Infrastructure
GitHub Actions Workflow
# .github/workflows/terraform.yml
name: Terraform
on:
  pull_request:
    paths:
      - 'infra/**'
  push:
    branches: [main]
    paths:
      - 'infra/**'
jobs:
  plan:
    runs-on: ubuntu-latest
    permissions:
      id-token: write # OIDC auth to AWS
      contents: read
      pull-requests: write # Comment plan on PR
    steps:
      - uses: actions/checkout@v4
      - uses: opentofu/setup-opentofu@v1
        with:
          tofu_version: 1.8.x
      - name: Configure AWS credentials (OIDC)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActions-Terraform
          aws-region: us-east-1
      - name: OpenTofu Init
        run: tofu init
        working-directory: infra/environments/prod
      - name: OpenTofu Format Check
        run: tofu fmt -check -recursive
        working-directory: infra/
      - name: Validate
        run: tofu validate
        working-directory: infra/environments/prod
      - name: tflint
        uses: terraform-linters/setup-tflint@v4
      - run: tflint --recursive
        working-directory: infra/
      - name: Plan
        id: plan
        run: tofu plan -out=tfplan -no-color
        working-directory: infra/environments/prod
      - name: Comment plan on PR
        uses: actions/github-script@v7
        if: github.event_name == 'pull_request'
        env:
          # Plan text captured by the setup-opentofu wrapper; passing it through
          # the environment avoids breaking the script on backticks in the output
          PLAN: ${{ steps.plan.outputs.stdout }}
        with:
          script: |
            const plan = process.env.PLAN
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Terraform Plan\n\`\`\`\n${plan}\n\`\`\``
            })
  apply:
    runs-on: ubuntu-latest
    needs: plan
    if: github.ref == 'refs/heads/main'
    environment: production # Requires manual approval
    permissions:
      id-token: write # This job needs OIDC auth too
      contents: read
    steps:
      # ... same init/auth steps ...
      # The tfplan file from the plan job is not on this runner: pass it
      # between jobs as an artifact, or re-run the plan here before applying.
      - run: tofu apply -auto-approve tfplan
        working-directory: infra/environments/prod
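The workflow assumes an IAM role that GitHub Actions can assume through OIDC. The role could be defined roughly as follows. The role name matches the ARN used in the workflow; the repository `mycompany/infra` is a placeholder, and the sketch assumes the GitHub OIDC provider is already registered in the account.

```hcl
# Look up the existing GitHub OIDC identity provider
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

data "aws_iam_policy_document" "github_assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Only workflows from this repository may assume the role (placeholder repo)
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:mycompany/infra:*"]
    }
  }
}

resource "aws_iam_role" "github_actions" {
  name               = "GitHubActions-Terraform"
  assume_role_policy = data.aws_iam_policy_document.github_assume.json
}
```

Narrowing the `sub` condition further, for example to a single branch or to the `production` environment, limits which workflow runs can obtain apply permissions.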
Drift Detection
Infrastructure drift (manual changes in the console) is dangerous. Detect it automatically:
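# Run plan in CI on a schedule
# If plan shows changes but nothing was merged, that's drift!
# GitHub Actions scheduled job
on:
  schedule:
    - cron: '0 8 * * 1-5' # Weekdays at 08:00 UTC
jobs:
  drift-check:
    runs-on: ubuntu-latest
    steps:
      # ... checkout, setup-opentofu, AWS auth, and tofu init as in the plan job ...
      - run: tofu plan -detailed-exitcode
        id: plan
        working-directory: infra/environments/prod
        continue-on-error: true # Exit code 2 would otherwise fail the job before the alert runs
        # Exit code 0 = no changes (good)
        # Exit code 1 = error
        # Exit code 2 = changes detected (DRIFT!)
      - name: Alert on drift
        if: steps.plan.outputs.exitcode == '2'
        uses: slackapi/slack-github-action@v1
        with:
          payload: |
            {"text": "⚠️ Infrastructure drift detected in production!"}
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }} # assumed secret name
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK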
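Not every difference is drift worth alerting on. Attributes that another system legitimately changes, such as a task count adjusted by autoscaling, will show up in every scheduled plan unless the configuration ignores them. Here is a sketch of how the ECS service might do that; `aws_ecs_task_definition.this` and `var.cluster_arn` are assumed to exist elsewhere in the module.

```hcl
resource "aws_ecs_service" "this" {
  name            = var.name
  cluster         = var.cluster_arn
  task_definition = aws_ecs_task_definition.this.arn
  desired_count   = var.desired_count # initial value only

  lifecycle {
    # Autoscaling owns the running task count after creation,
    # so a different live value is expected and not reported
    ignore_changes = [desired_count]
  }
}
```

Keep the ignore list short: each ignored attribute is one the drift check can no longer protect.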
Secrets Management
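Keep secret values out of Terraform state wherever you can. Anything passed to a resource argument or read through a data source is stored in the state file, so prefer letting the platform generate and hold the secret:
# BAD: Secret in state file (encrypted or not, it's still risky)
resource "aws_db_instance" "postgres" {
  password = var.db_password # This ends up in state!
}
# GOOD: Let RDS create and rotate the password in Secrets Manager
resource "aws_db_instance" "postgres" {
  manage_master_user_password = true # AWS manages rotation; the password never enters state
}
# CAUTION: data sources also copy the secret value into state.
# Use them only with an encrypted, access-restricted backend.
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/myapp/db-password"
}
# Or for non-AWS: Use Vault (the same caveat applies)
data "vault_generic_secret" "db" {
  path = "secret/myapp/db"
}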
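With `manage_master_user_password`, the only thing the configuration needs to hand to applications is the ARN of the generated secret. A fuller sketch is below; the identifier, sizing, and username are illustrative values, not taken from the article.

```hcl
resource "aws_db_instance" "postgres" {
  identifier        = "myapp-prod" # hypothetical
  engine            = "postgres"
  instance_class    = "db.t4g.medium"
  allocated_storage = 50
  username          = "app_admin"

  # RDS generates the password and stores it in Secrets Manager
  manage_master_user_password = true
}

# Applications fetch the password at runtime using this ARN;
# the state file holds the reference, not the secret value
output "db_master_secret_arn" {
  value = aws_db_instance.postgres.master_user_secret[0].secret_arn
}
```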
Common Pitfalls to Avoid
1. Force-destroying stateful resources:
# Protect your database from accidental destruction
resource "aws_db_instance" "postgres" {
  deletion_protection = true
  lifecycle {
    prevent_destroy = true
  }
}
2. Hardcoding availability zones:
# BAD
availability_zone = "us-east-1a"
# GOOD
data "aws_availability_zones" "available" {}
availability_zone = data.aws_availability_zones.available.names[0]
3. Missing resource dependencies:
# Explicit dependency when implicit doesn't work
resource "aws_s3_bucket_policy" "policy" {
  bucket     = aws_s3_bucket.logs.id
  policy     = data.aws_iam_policy_document.logs.json
  depends_on = [aws_s3_bucket_public_access_block.logs]
}
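To make the second pitfall concrete, the availability zone lookup is usually combined with `count` or `for_each` so that subnets spread across whatever zones the region offers. A self-contained sketch follows; the variable names and the choice of three subnets are assumptions.

```hcl
variable "vpc_id" {
  type = string
}

variable "vpc_cidr" {
  type = string # for example "10.0.0.0/16"
}

# Only zones that are currently usable in the provider's region
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_subnet" "private" {
  count = 3 # assumes the region has at least three zones

  vpc_id            = var.vpc_id
  availability_zone = data.aws_availability_zones.available.names[count.index]

  # Carve one /20 per subnet out of a /16 VPC range
  cidr_block = cidrsubnet(var.vpc_cidr, 4, count.index)
}
```

The same code then works unchanged in another region, which a hardcoded `us-east-1a` would not.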
Resources
- OpenTofu documentation
- Terragrunt documentation
- Terraform Test framework
- Terratest
- tflint
- tfsec
- Gruntwork Infrastructure as Code Library
