AI-Powered Code Review with GitHub Actions: Automate Quality Gates in 2026
on GitHub Actions, AI, Code Review, DevOps, CI/CD, Claude, Automation
By 2026, AI code review has moved from novelty to standard practice. Teams that still rely entirely on human reviewers for first-pass review are slower, and frankly, their reviewers are less happy. AI doesn’t catch everything — but it’s available 24/7, never tired, and is surprisingly good at catching security issues, missing error handling, and style drift.
This guide shows you how to build a robust AI code review pipeline with GitHub Actions that complements (not replaces) your human reviewers.
What AI Code Review Is Good At
Before building the pipeline, understand where AI excels vs. where humans still win:
AI excels:
- ✅ Security vulnerabilities (SQL injection, XSS, exposed secrets)
- ✅ Missing error handling and edge cases
- ✅ Code style and consistency
- ✅ Documentation and comment quality
- ✅ Obvious bugs (off-by-one, null deref patterns)
- ✅ Import/dependency issues
- ✅ Accessibility problems in UI code
- ✅ TypeScript type safety issues
Humans still win:
- ✅ Business logic correctness
- ✅ Architecture decisions
- ✅ Product requirements alignment
- ✅ Team-specific context
- ✅ Long-term maintainability judgment
- ✅ “This approach is fine but here’s a better pattern for our codebase”
The right division of labor: AI does the first pass, triaging and catching automatable issues; humans then review the diff together with the AI's feedback.
Architecture Overview
PR Opened/Updated
│
▼
GitHub Action triggers
│
▼
Fetch PR diff (changed files only)
│
▼
Chunk by file/function (stay within token limits)
│
▼
Claude API call with specialized prompts
│
├── Security scan
├── Code quality review
└── Documentation check
│
▼
Post structured comments to PR
│
├── Inline comments (file:line)
└── Summary comment
│
▼
Set PR status check (pass/fail)
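A note on the "chunk by file/function" step: the script later in this post simply reviews file-by-file and skips oversized files. If you want to batch several small files into one API call instead, here is a minimal sketch (the 3,000-line budget mirrors the script's MAX_DIFF_LINES; splitting oversized files by hunk is left as an exercise):

```python
def chunk_diffs(file_diffs: dict[str, str], max_lines: int = 3000) -> list[dict[str, str]]:
    """Group per-file diffs into batches that stay under a total line budget."""
    chunks: list[dict[str, str]] = []
    current: dict[str, str] = {}
    current_lines = 0
    for path, diff in file_diffs.items():
        n = len(diff.splitlines())
        if n > max_lines:
            continue  # oversized file: skip here (or split it by hunk)
        if current and current_lines + n > max_lines:
            chunks.append(current)  # budget exceeded: start a new batch
            current, current_lines = {}, 0
        current[path] = diff
        current_lines += n
    if current:
        chunks.append(current)
    return chunks
```

Each batch then becomes one API call, which cuts per-request overhead on PRs with many small files.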
The GitHub Actions Workflow
# .github/workflows/ai-code-review.yml
name: AI Code Review
on:
pull_request:
types: [opened, synchronize, ready_for_review]
# Skip WIP/draft PRs
workflow_dispatch: # Allow manual trigger
permissions:
contents: read
pull-requests: write # Needed to post comments
jobs:
ai-review:
# Skip draft PRs unless manually triggered
if: >
github.event_name == 'workflow_dispatch' ||
github.event.pull_request.draft == false
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0 # Full history for diff
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: "3.12"
- name: Install dependencies
run: pip install anthropic pygithub requests
- name: Run AI Code Review
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.pull_request.number }}
REPO_NAME: ${{ github.repository }}
BASE_SHA: ${{ github.event.pull_request.base.sha }}
HEAD_SHA: ${{ github.event.pull_request.head.sha }}
run: python .github/scripts/ai_review.py
- name: Upload review artifacts
if: always()
uses: actions/upload-artifact@v4
with:
name: ai-review-results
path: review_output.json
retention-days: 30
The Review Script
# .github/scripts/ai_review.py
import os
import json
import subprocess
import anthropic
from github import Github
from pathlib import Path
# Configuration
MAX_DIFF_LINES = 3000 # Stay within token limits
SKIP_EXTENSIONS = {'.lock', '.svg', '.png', '.jpg', '.ico', '.min.js', '.min.css'}
SKIP_FILES = {'package-lock.json', 'bun.lockb', 'yarn.lock', 'pnpm-lock.yaml'}
def get_pr_diff() -> dict[str, str]:
"""Get changed files and their diffs."""
base_sha = os.environ["BASE_SHA"]
head_sha = os.environ["HEAD_SHA"]
# Get list of changed files
result = subprocess.run(
["git", "diff", "--name-only", base_sha, head_sha],
capture_output=True, text=True
)
changed_files = result.stdout.strip().split('\n')
file_diffs = {}
for filepath in changed_files:
# Skip binary/generated files
if not filepath:
continue
path = Path(filepath)
if path.suffix in SKIP_EXTENSIONS or path.name in SKIP_FILES:
continue
# Get diff for this file
diff_result = subprocess.run(
["git", "diff", base_sha, head_sha, "--", filepath],
capture_output=True, text=True
)
diff = diff_result.stdout
if diff and len(diff.split('\n')) < MAX_DIFF_LINES:
file_diffs[filepath] = diff
return file_diffs
def review_with_claude(file_path: str, diff: str) -> dict:
"""Send diff to Claude for review."""
client = anthropic.Anthropic()
# Determine file type for specialized prompting
extension = Path(file_path).suffix
review_prompt = f"""You are a senior software engineer doing a code review.
Review the following code diff for the file `{file_path}`.
Focus on:
1. **Security issues**: SQL injection, XSS, exposed secrets, insecure dependencies, path traversal, etc.
2. **Bugs**: Off-by-one errors, null/undefined access, race conditions, wrong error handling
3. **Code quality**: Readability, maintainability, overly complex logic
4. **Missing tests**: Changes that likely need tests but don't have them
5. **Performance**: Obvious inefficiencies (N+1 queries, unnecessary re-renders, etc.)
6. **Type safety**: Missing type annotations, unsafe casts, any types
Be concise. Only flag real issues, not style preferences.
For each issue, provide the line number if identifiable from the diff.
Return a JSON object:
{{
  "overall": "lgtm" | "minor" | "major" | "critical",
  "summary": "<one-sentence summary of the change>",
  "issues": [
    {{"severity": "critical" | "major" | "minor" | "suggestion", "title": "<short title>", "description": "<what and why>", "line": <line number or null>}}
  ]
}}
If the changes look good, return "overall": "lgtm" with an empty issues array.
CODE DIFF:
```diff
{diff}
```"""
response = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=2048,
messages=[{"role": "user", "content": review_prompt}]
)
try:
# Extract JSON from response
content = response.content[0].text
# Find JSON in the response
start = content.find('{')
end = content.rfind('}') + 1
if start >= 0 and end > start:
return json.loads(content[start:end])
except (json.JSONDecodeError, IndexError):
pass
return {"overall": "minor", "summary": "Review completed", "issues": []}
def security_scan(file_path: str, diff: str) -> list[dict]:
"""Dedicated security scan with focused prompting."""
client = anthropic.Anthropic()
prompt = f"""You are a security engineer. Scan this code diff for security vulnerabilities ONLY.
Check for:
- Hardcoded secrets, API keys, passwords
- SQL/NoSQL injection
- XSS vulnerabilities
- Path traversal / directory traversal
- Insecure cryptography
- Authentication/authorization flaws
- Sensitive data in logs/errors
- Dependency vulnerabilities (known bad packages)
- CSRF vulnerabilities
- Open redirects
Return a JSON array of security issues found (empty array if none):
[
  {{"severity": "critical" | "high" | "medium" | "low", "title": "<short title>", "description": "<what and where>", "remediation": "<how to fix>"}}
]
FILE: {file_path}
DIFF:
```diff
{diff}
```"""
response = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}]
)
try:
content = response.content[0].text
start = content.find('[')
end = content.rfind(']') + 1
if start >= 0 and end > start:
return json.loads(content[start:end])
except (json.JSONDecodeError, IndexError):
pass
return []
def post_review_to_github(pr_number: int, reviews: dict[str, dict], security_issues: list):
"""Post AI review results as PR comments."""
gh = Github(os.environ["GITHUB_TOKEN"])
repo = gh.get_repo(os.environ["REPO_NAME"])
pr = repo.get_pull(pr_number)
# Determine overall verdict
all_issues = []
for file_review in reviews.values():
all_issues.extend(file_review.get("issues", []))
all_issues.extend([{**i, "severity": i.get("severity", "major")} for i in security_issues])
critical_count = sum(1 for i in all_issues if i.get("severity") == "critical")
major_count = sum(1 for i in all_issues if i.get("severity") == "major")
minor_count = sum(1 for i in all_issues if i.get("severity") in ("minor", "low"))
if critical_count > 0:
verdict = "🚨 Critical issues found"
status_state = "failure"
elif major_count > 0:
verdict = "⚠️ Major issues found"
status_state = "failure"
elif minor_count > 0:
verdict = "💡 Minor suggestions"
status_state = "success"
else:
verdict = "✅ LGTM from AI reviewer"
status_state = "success"
# Build summary comment
comment_parts = [
f"## 🤖 AI Code Review — {verdict}\n",
f"*Reviewed {len(reviews)} files | {len(all_issues)} issues found*\n",
]
if security_issues:
comment_parts.append("\n### 🔒 Security Issues\n")
for issue in security_issues:
emoji = "🚨" if issue["severity"] in ("critical", "high") else "⚠️"
comment_parts.append(f"{emoji} **[{issue['severity'].upper()}]** {issue['title']}")
comment_parts.append(f"\n> {issue['description']}\n")
if issue.get("remediation"):
comment_parts.append(f"> **Fix:** {issue['remediation']}\n")
# Per-file summaries
comment_parts.append("\n### 📁 File Reviews\n")
for filepath, review in reviews.items():
overall = review.get("overall", "lgtm")
emoji = {"lgtm": "✅", "minor": "💡", "major": "⚠️", "critical": "🚨"}.get(overall, "✅")
comment_parts.append(f"\n**{emoji} `{filepath}`** — {review.get('summary', '')}")
issues = review.get("issues", [])
if issues:
for issue in issues[:3]: # Show top 3 per file
sev_emoji = {"critical": "🚨", "major": "❌", "minor": "💡", "suggestion": "💬"}.get(
issue.get("severity", "minor"), "💡"
)
line_ref = f"L{issue['line']}" if issue.get("line") else "general"
comment_parts.append(f"\n - {sev_emoji} **{issue['title']}** ({line_ref})")
comment_parts.append("\n\n---")
comment_parts.append("*AI review by Claude Sonnet 4.5. Always verify AI suggestions with human judgment.*")
summary_comment = "\n".join(comment_parts)
# Delete previous AI review comments (keep thread clean)
for comment in pr.get_issue_comments():
if "🤖 AI Code Review" in comment.body:
comment.delete()
# Post new summary
pr.create_issue_comment(summary_comment)
return status_state
def main():
pr_number = int(os.environ["PR_NUMBER"])
print("🔍 Fetching PR diff...")
file_diffs = get_pr_diff()
if not file_diffs:
print("No reviewable files changed. Skipping AI review.")
return
print(f"📝 Reviewing {len(file_diffs)} files...")
reviews = {}
all_security_issues = []
for filepath, diff in file_diffs.items():
print(f" → {filepath}")
# General code review
reviews[filepath] = review_with_claude(filepath, diff)
# Security scan for backend/config files
if any(filepath.endswith(ext) for ext in ['.py', '.ts', '.js', '.go', '.java', '.env', '.yaml', '.json']):
security_issues = security_scan(filepath, diff)
all_security_issues.extend(security_issues)
# Save results
output = {"reviews": reviews, "security_issues": all_security_issues}
with open("review_output.json", "w") as f:
json.dump(output, f, indent=2)
print("💬 Posting review to GitHub...")
status = post_review_to_github(pr_number, reviews, all_security_issues)
print(f"✅ Review complete. Status: {status}")
# Exit with failure for critical/major issues (blocks merge if branch protection enabled)
blocking = sum(1 for issues in [all_security_issues] + [r.get("issues", []) for r in reviews.values()]
for i in issues if i.get("severity") in ("critical", "major"))
if blocking > 0:
print(f"❌ {blocking} critical/major issues found. Failing CI.")
exit(1)
if __name__ == "__main__":
main()
Adding Custom Rules for Your Team
Extend the system with team-specific rules:
CUSTOM_RULES = """
Additional rules specific to our codebase:
1. Database queries must use our `db` wrapper, never raw `pg` connections
2. API endpoints must have rate limiting middleware
3. All user-facing errors must use our `AppError` class (never expose raw stack traces)
4. React components must have proper loading/error states
5. All API calls must handle network errors (not just HTTP errors)
6. Dates must use our `DateUtils` helper, never `new Date()` directly in components
7. Environment variables must be accessed through our `config` module
"""
def review_with_custom_rules(file_path: str, diff: str) -> dict:
prompt = f"""...standard prompt...
{CUSTOM_RULES}
{diff}
"""
Using AI Review Comments Inline
For more targeted feedback, post comments on specific lines:
def post_inline_comments(pr, filepath: str, issues: list[dict], commits):
"""Post comments on specific lines of the diff."""
for issue in issues:
if not issue.get("line"):
continue
try:
# Get the commit for this file
commit = commits.reversed[0] # Latest commit
pr.create_review_comment(
body=f"**{issue['title']}** ({issue['severity']})\n\n{issue['description']}",
commit=commit,
path=filepath,
line=issue["line"],
)
except Exception as e:
print(f"Could not post inline comment: {e}")
Cost Optimization
AI review can get expensive. Here’s how to keep costs low:
# Only review files with significant changes
- name: Check diff size
id: diff-check
run: |
DIFF_SIZE=$(git diff --stat "$BASE_SHA" "$HEAD_SHA" | tail -1 | grep -oP '\d+(?= insertion)' || echo 0)
echo "diff_size=$DIFF_SIZE" >> "$GITHUB_OUTPUT"
- name: Run AI review
if: |
steps.diff-check.outputs.diff_size != '' &&
steps.diff-check.outputs.diff_size > 10
run: python .github/scripts/ai_review.py
# Use cheaper model for small diffs
model = "claude-haiku-3-5" if len(diff) < 500 else "claude-sonnet-4-5"
# claude-haiku: ~10x cheaper, good for simple reviews
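To pick the model (and sanity-check spend), you can approximate cost from diff size. The ~4 characters-per-token heuristic and the illustrative $3/$15 per million input/output token prices below are assumptions — check current pricing before relying on them:

```python
def estimate_review_cost(diff: str,
                         price_in_per_mtok: float = 3.0,
                         price_out_per_mtok: float = 15.0,
                         expected_output_tokens: int = 1000) -> float:
    """Rough USD estimate for one review call, assuming ~4 chars per token."""
    input_tokens = len(diff) / 4 + 500  # the diff plus the review prompt itself
    return (input_tokens * price_in_per_mtok
            + expected_output_tokens * price_out_per_mtok) / 1_000_000
```

Note that output tokens dominate small reviews, so capping `max_tokens` (as the script does) matters more than trimming the diff.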
Average costs per PR review (2026 pricing):
- Small PR (< 200 lines): ~$0.01-0.05
- Medium PR (200-1000 lines): ~$0.05-0.20
- Large PR (1000+ lines): ~$0.20-0.50
For a team making 50 PRs/week, that’s typically $10-50/month — less than one hour of developer time.
Integration with GitHub Branch Protection
# .github/branch-protection.yml (via GitHub REST API or Terraform)
required_status_checks:
strict: true
contexts:
- "ai-review" # Block merge if critical AI issues found
- "ci/tests"
- "ci/lint"
Keep AI review as a non-blocking, advisory check until your team builds trust in it:
# Start with informational-only (never fails)
- name: AI Review (advisory)
continue-on-error: true # PR can still merge
run: python .github/scripts/ai_review.py
After a few weeks, review the AI’s track record and decide whether to make it blocking.
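"Track record" is easier to judge if reviewers label each AI flag as useful or noise as they go; the signal rate is then a one-liner (a hypothetical helper, not part of the pipeline):

```python
def flag_signal_rate(labels: list[bool]) -> float:
    """Fraction of AI-raised issues that reviewers marked as genuinely useful."""
    return sum(labels) / len(labels) if labels else 0.0
```

A sustained rate above, say, one useful flag in two is a reasonable bar before making the check blocking — though where you set that bar is a team decision.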
Sample Output
Here’s what a typical AI review comment looks like on a PR:
## 🤖 AI Code Review — ⚠️ Major issues found
*Reviewed 3 files | 5 issues found*
### 🔒 Security Issues
🚨 **[HIGH]** Potential SQL injection in user search
> String interpolation used in query: `f"SELECT * FROM users WHERE name = '{search_term}'"`
> **Fix:** Use parameterized query: `db.execute("SELECT * FROM users WHERE name = %s", (search_term,))`
### 📁 File Reviews
**⚠️ `src/api/users.py`** — Adds user search endpoint with filtering
- 🚨 **SQL injection in search** (L47)
- 💡 **Missing pagination** (L52)
- 💡 **No rate limiting on search endpoint** (general)
**✅ `src/models/user.py`** — Adds email validation field
**💡 `tests/test_users.py`** — Adds tests for user model
- 💡 **Missing test for empty search term** (L34)
Conclusion
AI code review in 2026 is a force multiplier, not a replacement. The teams doing it well use AI to handle the mechanical, automatable parts of review — leaving human reviewers to focus on architecture, business logic, and team knowledge transfer.
The pipeline in this guide is production-ready. Start with continue-on-error: true and let your team build trust with the AI reviewer before making it a hard gate. Within a few sprints, you’ll wonder how you reviewed code without it.
Ship faster. Catch more bugs. Keep your human reviewers happy.