Claude Code Complete Architecture Analysis: 7 Execution Modes, 45+ Tools, Coordinator Multi-Agent, and 8 Design Patterns
Claude Code is Anthropic’s AI-powered terminal coding assistant — and beneath its conversational surface lies a remarkably sophisticated engineering system. A source code analysis of the 2026-03-31 release reveals approximately 1,884 TypeScript + React files composing a full-featured agentic platform. This post is a complete technical deep-dive into every major subsystem: execution modes, startup sequence, query loop, tool system, permission architecture, coordinator multi-agent, bridge system, hook system, memory system, UI layer, service layer, and the eight core design patterns that hold it all together.
The Three Core Principles
Every architectural decision in Claude Code traces back to three guiding principles:
- Safety — Dangerous operations are identified, blocked, or explicitly confirmed by the user. An AI classifier (not just regex) evaluates bash commands for risk.
- Performance — Streaming responses, parallel tool execution, memoized context, lazy imports, and frame-throttled terminal rendering keep the experience fast.
- Extensibility — Tools, skills, plugins, and the Model Context Protocol (MCP) allow Claude Code to be extended without modifying the core.
The 4-Phase Execution Flow
At the highest level, Claude Code operates in four phases:
Startup → Query Loop → Tool Execution → Display
Every interaction—from typing a question to seeing a diff rendered in the terminal—passes through these phases. The subsystems described below implement each phase in depth.
Phase 1: Startup Sequence (6 Steps)
The entire application is compiled into a ~800KB monolithic bundle (main.tsx). This deliberate choice eliminates Node.js module resolution overhead at startup — a classic performance-over-modularity trade-off.
Step 1: Parse CLI Arguments → Determine Execution Mode
The very first thing Claude Code does is parse argv. The result determines which of the 7 execution modes to enter (detailed in the next section). Flags like -p (headless), --coordinator, --bridge, and --daemon route to completely different code paths.
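As a rough illustration of this routing step, the sketch below maps the four flags named above to a mode and falls back to the REPL. The `Mode` type, the `selectMode` function, and the precedence between flags are assumptions made for illustration; the real parser is not shown in the source, and the remaining modes are left out because their flags are not given here.

```typescript
// Hypothetical mode names; only the flags mentioned in the text are handled.
type Mode = "repl" | "headless" | "coordinator" | "bridge" | "daemon";

// Ordered flag-to-mode mappings. The precedence order is an assumption.
const FLAG_TO_MODE: ReadonlyArray<readonly [string, Mode]> = [
  ["-p", "headless"],
  ["--coordinator", "coordinator"],
  ["--bridge", "bridge"],
  ["--daemon", "daemon"],
];

function selectMode(argv: readonly string[]): Mode {
  for (const [flag, mode] of FLAG_TO_MODE) {
    if (argv.indexOf(flag) !== -1) return mode;
  }
  return "repl"; // no mode flag present: default interactive REPL
}

console.log(selectMode(["-p", "Review this diff"]));
console.log(selectMode([]));
```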
Step 2: Load Configuration (Settings Hierarchy)
Configuration is loaded from four levels, each overriding the previous:
system defaults
→ ~/.claude/settings.json (global user config)
→ .claude/settings.json (project config)
→ session overrides (runtime flags)
This layered approach means a team can enforce project-level settings (e.g., disallowing --dangerously-skip-permissions) while individual developers keep personal preferences in their global config.
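The override order can be pictured as a left-to-right merge. This is a minimal sketch that assumes a flat key/value settings shape and a shallow merge; the `Settings` type, the `resolveSettings` helper, and the example keys are hypothetical, and the real loader may merge nested values differently.

```typescript
// Hypothetical shape: the real settings schema is not shown in the source.
type Settings = Record<string, unknown>;

// Merge layers left to right; a later layer overrides an earlier one key by key.
function resolveSettings(...layers: Settings[]): Settings {
  return layers.reduce<Settings>((merged, layer) => ({ ...merged, ...layer }), {});
}

const systemDefaults: Settings = { theme: "dark", permissionMode: "default" };
const userConfig: Settings = { theme: "light" };            // ~/.claude/settings.json
const projectConfig: Settings = { permissionMode: "plan" }; // .claude/settings.json
const sessionOverrides: Settings = {};                      // runtime flags

const effective = resolveSettings(systemDefaults, userConfig, projectConfig, sessionOverrides);
console.log(effective);
```

Here the user's theme wins over the default, and the project's permission mode wins over both, because each sits later in the chain.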
Step 3: Parallel I/O Prefetch
Rather than fetching context serially, Claude Code launches three async operations simultaneously:
- Reading CLAUDE.md (project instructions)
- Running git status to understand the repository state
- Collecting environment info (OS, shell, working directory, Node version)
This parallelism shaves hundreds of milliseconds from time-to-first-prompt.
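The idea behind the prefetch can be sketched with `Promise.all`: all three operations are started before any of them is awaited, so the total wait is roughly that of the slowest one. The loader signatures, the `StartupContext` shape, and the `delayed` stub are assumptions for illustration, standing in for the real file, git, and environment calls.

```typescript
// Hypothetical loader signature; the real functions are not shown in the source.
type Loader<T> = () => Promise<T>;

interface StartupContext {
  claudeMd: string;
  gitStatus: string;
  envInfo: Record<string, string>;
}

// Start all three loaders before awaiting any of them, so the total latency
// is roughly that of the slowest loader rather than the sum of all three.
async function prefetchContext(
  readClaudeMd: Loader<string>,
  runGitStatus: Loader<string>,
  collectEnv: Loader<Record<string, string>>,
): Promise<StartupContext> {
  const [claudeMd, gitStatus, envInfo] = await Promise.all([
    readClaudeMd(),
    runGitStatus(),
    collectEnv(),
  ]);
  return { claudeMd, gitStatus, envInfo };
}

// Stub loader that resolves after a delay, standing in for real I/O.
const delayed = <T>(value: T, ms: number): Loader<T> =>
  () => new Promise<T>((resolve) => setTimeout(() => resolve(value), ms));

prefetchContext(
  delayed("# Project rules", 20),
  delayed("working tree clean", 20),
  delayed<Record<string, string>>({ os: "linux", shell: "bash" }, 20),
).then((ctx) => console.log(ctx.gitStatus));
```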
Step 4: Initialize Services
The service layer is instantiated: API client (with auth, retry logic, and streaming), MCP server connections, and the cost tracker (real-time token counting with per-model pricing).
Step 5: Setup React/Ink UI or Headless Renderer
For interactive modes, Claude Code initializes a React + Ink terminal UI. For headless mode (-p flag), it skips the UI entirely and uses a lightweight stream renderer. This clean separation means headless mode has no UI overhead.
Step 6: Enter Execution Mode
Finally, the chosen execution mode takes over. The startup sequence is complete.
The 7 Execution Modes
One of Claude Code’s most distinctive architectural features is its support for seven fundamentally different execution modes — each designed for a specific usage context.
1. REPL (Interactive Terminal)
The default mode. Presents a terminal REPL where users type natural language requests. The query loop runs continuously until the user exits. This is what most developers experience day-to-day.
2. Headless (Non-Interactive / Pipe Mode)
Activated with the -p flag. Accepts a single prompt, executes it, prints the result, and exits. Designed for scripting and automation:
# Pipe Claude Code into a CI pipeline
echo "Review this diff for security issues" | claude -p
No UI is initialized. Output goes directly to stdout/stderr, making it composable with other Unix tools.
3. Coordinator (Multi-Agent Orchestration)
The most architecturally complex mode. When activated, Claude Code spawns a Leader agent that receives the overall task, decomposes it into subtasks, and assigns each subtask to independent Worker agents. Workers are full Claude Code instances with their own isolated contexts.
Communication between Leader and Workers uses structured JSON messages via the Task tool. This mode is used for large-scale refactors, multi-file parallel tasks, and anything that benefits from concurrent AI execution.
4. Bridge (CCR Cloud Connection)
Connects a local Claude Code terminal session to CCR (Claude Code Remote) — Anthropic’s cloud infrastructure. The protocol is WebSocket-based and supports up to 32 parallel sessions. This enables:
- Remote control of a local session from a mobile device or web browser
- Seamless handoff between devices (start on laptop, continue on phone)
- Shared sessions for pair programming scenarios
5. Kairos (Scheduled Task Execution)
A specialized mode for running Claude Code tasks on a schedule. Think cron-but-AI-aware: Kairos can be configured to run specific tasks at specific times, useful for nightly code reviews, automated documentation updates, or periodic refactoring passes.
6. Daemon (Background Service)
Runs Claude Code as a persistent background service. Rather than starting a new process per invocation, Daemon mode keeps the service alive and accepts requests over a local socket or IPC channel. This amortizes startup costs across many requests — critical for editor integrations where latency matters.
7. Viewer (Read-Only Session)
A read-only mode for observing an active Claude Code session without the ability to send commands. Useful for debugging, auditing, or demo scenarios where you want to watch a session without interfering.
Phase 2: The Query Loop (5 Steps)
The query loop is the beating heart of Claude Code, implemented in query.ts (~68KB). Every turn — every time you send a message and Claude Code responds — goes through these five steps:
Step 1: Build System Prompt
A rich system prompt is assembled from multiple sources:
- Identity block: Claude’s role, personality, and behavioral guidelines
- Environment info: OS, shell, working directory, git status, active MCP tools
- Tool definitions: JSON schemas for all available tools
- CLAUDE.md content: Project-specific and user-specific instructions
- Memory: Relevant saved learnings from previous sessions
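The system prompt is reassembled on every turn so that it reflects the current state of the environment; parts whose inputs have not changed can be served from cache (see Pattern 3: Memoized Context).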
Step 2: Stream API Response (Async Generator)
The request is sent to the Claude API. The response is consumed via an async generator pattern — tokens stream in as they are produced, rather than waiting for the full response. This is what makes Claude Code feel responsive even for long completions.
// Simplified pseudocode
async function* streamResponse(prompt: SystemPrompt): AsyncGenerator<ContentBlock> {
  const stream = await anthropicClient.messages.stream({ ... });
  for await (const chunk of stream) {
    yield parseChunk(chunk);
  }
}
Step 3: Parse Content Blocks
The stream produces two types of content blocks:
- text blocks: prose responses to display to the user
- tool_use blocks: structured requests to invoke a specific tool with specific parameters
The parser handles interleaved text and tool calls gracefully.
Step 4: Execute Tools (Parallel Where Safe)
Tool calls are executed. Claude Code intelligently groups tools that can run concurrently (e.g., multiple file reads) while serializing those that cannot (e.g., writes that depend on previous reads). The 10-step tool execution pipeline (described below) handles each individual tool call.
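One way to picture this grouping is a planner that batches consecutive concurrency-safe calls and isolates everything else. The sketch below is an assumption about how such a planner could look, not the actual implementation: the `concurrencySafe` flag, `planBatches`, and `runBatches` are hypothetical names.

```typescript
interface ToolCall {
  name: string;
  // Assumption: each tool declares whether it is safe to run concurrently.
  concurrencySafe: boolean;
}

// Group consecutive concurrency-safe calls into one batch; every unsafe call
// gets a batch of its own, so its order relative to other calls is preserved.
function planBatches(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = [];
  let current: ToolCall[] = [];
  for (const call of calls) {
    if (call.concurrencySafe) {
      current.push(call);
      continue;
    }
    if (current.length > 0) batches.push(current);
    batches.push([call]);
    current = [];
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// Run batches in order; calls inside one batch run concurrently.
async function runBatches<T>(
  batches: ToolCall[][],
  execute: (call: ToolCall) => Promise<T>,
): Promise<T[]> {
  const results: T[] = [];
  for (const batch of batches) {
    results.push(...(await Promise.all(batch.map(execute))));
  }
  return results;
}

const exampleBatches = planBatches([
  { name: "Read", concurrencySafe: true },
  { name: "Read", concurrencySafe: true },
  { name: "Edit", concurrencySafe: false },
  { name: "Grep", concurrencySafe: true },
]);
console.log(exampleBatches.map((batch) => batch.map((call) => call.name)));
```

The two reads share a batch, the edit runs alone, and the search that follows waits for the edit to finish.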
Step 5: Build tool_result Blocks → Loop
Each tool’s output is formatted into a tool_result block and appended to the conversation. The loop then returns to Step 1 to build a new system prompt and continue the turn.
Auto-Compact Trigger
When the conversation context exceeds 95% of the model’s context window, the auto-compact service activates: it summarizes older turns into a compact representation, discards the originals, and continues. This lets long sessions keep running past the point where the raw history would overflow the context window, although detail from the summarized turns is lost.
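The trigger and the replacement step can be sketched as two small functions. Only the 95% threshold comes from the text; the `Turn` shape, the per-turn token counts, and the `summarize` callback (which stands in for the model call that produces the summary) are assumptions.

```typescript
interface Turn {
  role: "user" | "assistant";
  text: string;
  tokens: number; // assumed to be precomputed per turn
}

const COMPACT_THRESHOLD = 0.95; // threshold stated in the text

function shouldCompact(turns: Turn[], contextWindow: number): boolean {
  const used = turns.reduce((sum, turn) => sum + turn.tokens, 0);
  return used > contextWindow * COMPACT_THRESHOLD;
}

// Replace the oldest `count` turns with a single summary turn.
// `summarize` is a hypothetical stand-in for the summarization request.
function compactOldest(
  turns: Turn[],
  count: number,
  summarize: (old: Turn[]) => Turn,
): Turn[] {
  const n = Math.min(count, turns.length);
  if (n === 0) return turns;
  return [summarize(turns.slice(0, n)), ...turns.slice(n)];
}
```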
Phase 3: Tool Execution Pipeline (10 Steps)
Each tool invocation follows a rigorous 10-step pipeline:
1. Parse tool_use block from API response
2. Look up tool definition in registry
3. Run PreToolUse hooks
4. Permission check (Default / Auto / Plan / Bypass)
5. User confirmation if needed
6. Concurrency control
7. Execute tool function
8. Run PostToolUse hooks
9. Format tool_result
10. Add to conversation → continue loop
Step 3 (PreToolUse hooks) and Step 8 (PostToolUse hooks) are user-extensible injection points. A PreToolUse hook can abort the tool call entirely — for example, a hook that prevents any git push without a ticket number in the commit message.
Step 4 (Permission check) is where the safety architecture lives. The AI classifier evaluates whether the operation is dangerous. If it is, and the mode requires user confirmation, the UI presents a PermissionPrompt component.
Step 6 (Concurrency control) uses a token-based semaphore. Some tools (Bash) are inherently sequential; others (Read, WebFetch) can safely run in parallel up to a configured concurrency limit.
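A counting semaphore of this kind fits in a few lines. The class below is a generic sketch of the technique, not the code from the source: a limit of 1 would model a sequential tool such as Bash, while a higher limit would model parallel reads.

```typescript
// A minimal counting semaphore: at most `limit` holders at any one time.
class Semaphore {
  private available: number;
  private waiters: Array<() => void> = [];

  constructor(limit: number) {
    this.available = limit;
  }

  async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available -= 1;
      return;
    }
    // No permit free: wait until release() hands one over directly.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // pass the permit straight to the next waiter
    } else {
      this.available += 1;
    }
  }

  // Run `task` while holding a permit, releasing it even if the task throws.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}
```

A caller would wrap each tool execution in `run`, for example `readLimiter.run(() => readFile(path))`, where `readLimiter` and `readFile` are placeholder names.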
The Tool System: 45+ Built-In Tools
Claude Code ships with a rich built-in toolset organized into categories:
File Operations
| Tool | Description |
|------|-------------|
| Read | Read file contents |
| Write | Create or overwrite a file |
| Edit | Make precise edits (find/replace) |
| MultiEdit | Multiple edits in one operation |
| LS | List directory contents |
| Glob | Find files by pattern |
| Grep | Search file contents |
Execution
| Tool | Description |
|------|-------------|
| Bash | Persistent shell session |
| Computer | GUI automation (screenshots, clicks) |
The Bash tool deserves special attention: it runs in a persistent shell process. Unlike running bash -c "..." per call, Claude Code’s Bash tool maintains the same shell session across calls. This means:
- cd /some/dir in one call persists for the next call
- Environment variable assignments persist
- Shell state (aliases, functions defined in the session) persists
Security is enforced via Tree-sitter AST analysis — Claude Code parses the bash command into an abstract syntax tree and identifies dangerous patterns (pipe to sudo, rm -rf /, etc.) before the AI classifier even sees it.
AI / Agent Tools
| Tool | Description |
|------|-------------|
| Task | Spawn a sub-agent for a subtask |
| Agent | Coordinator agent management |
Web Tools
| Tool | Description |
|------|-------------|
| WebFetch | Fetch and parse a URL |
| WebSearch | Search the web |
Notebook Tools
| Tool | Description |
|------|-------------|
| NotebookRead | Read Jupyter notebook cells |
| NotebookEdit | Edit notebook cells |
Memory Tools
| Tool | Description |
|------|-------------|
| MemoryRead | Read from memory store |
| MemoryWrite | Write to memory store |
MCP Tools
Tools dynamically injected from connected MCP (Model Context Protocol) servers at startup. The registry is populated at runtime, so Claude Code can be extended with domain-specific tools without any code changes.
The Permission System
Claude Code’s permission system has four modes that control how aggressively it confirms actions:
Default Mode
Normal operation. Dangerous operations (file deletion, running unknown scripts, network requests to sensitive endpoints) require explicit user confirmation. Safe operations (file reads, listing directories) proceed automatically.
Auto Mode
Safe operations are auto-approved. Dangerous operations still ask. Useful for developers who trust Claude’s judgment but want a safety net for truly risky ops.
Plan Mode
No execution at all — Claude Code only plans what it would do. The user reviews the plan and then either approves it (switching to another mode) or refines the request. Ideal for complex multi-step operations where you want to review before committing.
Bypass Mode (--dangerously-skip-permissions)
All permission checks are skipped. This flag is intentionally named to be alarming — it is designed for fully trusted automation environments (CI/CD, sandboxed containers) where the overhead of confirmation is unacceptable.
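The four modes can be summarized as a small decision function. This is a simplified sketch under stated assumptions: the `Risk` and `Decision` types and the `decide` function are hypothetical, and the two-level risk split collapses the difference between Default and Auto, which in the text comes down to how much is treated as safe.

```typescript
type PermissionMode = "default" | "auto" | "plan" | "bypass";
type Risk = "safe" | "dangerous";       // assumed output of the risk check
type Decision = "allow" | "ask" | "deny";

function decide(mode: PermissionMode, risk: Risk): Decision {
  switch (mode) {
    case "bypass":
      return "allow"; // all checks skipped
    case "plan":
      return "deny";  // planning only, nothing executes
    case "auto":
    case "default":
      // Safe operations proceed; dangerous ones need confirmation.
      return risk === "dangerous" ? "ask" : "allow";
  }
}

console.log(decide("default", "dangerous"));
console.log(decide("plan", "safe"));
```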
The AI Classifier
What distinguishes Claude Code’s safety system from simple allow/deny lists is the use of Claude itself as a classifier. When a Bash command is submitted, Claude Code sends it to a fast Claude inference call asking: “Is this command dangerous?” The response informs the permission check. This catches novel dangerous patterns that regex could never anticipate.
The Hook System
Hooks are Claude Code’s extensibility mechanism for wrapping tool execution and session lifecycle events. There are four hook types:
PreToolUse
Runs a shell command before a tool executes. Can abort the tool call by returning a non-zero exit code. Use cases:
- Logging all tool calls to an audit trail
- Blocking certain operations based on custom business rules
- Validating input parameters beyond what the AI classifier checks
PostToolUse
Runs after a tool completes. Can inspect the tool’s output. Use cases:
- Sending notifications when specific files are modified
- Triggering external CI/CD webhooks after a write operation
- Accumulating tool call statistics
UserPromptSubmit
Runs before a user’s message is sent to the AI. Can modify or augment the message. Use cases:
- Automatically appending project context to every message
- Sanitizing sensitive data from prompts before they reach the API
- Injecting current ticket/issue numbers for traceability
Stop
Runs when the Claude Code session ends (graceful exit or timeout). Use cases:
- Saving session summaries
- Cleaning up temporary resources
- Sending end-of-session analytics
Hooks are defined as shell commands in CLAUDE.md or settings.json:
{
  "hooks": {
    "PreToolUse": "echo \"Tool: $TOOL_NAME\" >> ~/.claude/audit.log",
    "Stop": "~/.claude/scripts/session-cleanup.sh"
  }
}
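The abort-on-non-zero behavior of PreToolUse can be modeled without spawning any shell. In the sketch below a hook is a plain function that returns an exit code; this modeling, the `runWithHooks` helper, and the ticket-ID pattern in `requireTicket` are assumptions for illustration (the real hooks are shell commands, and the check here looks at the command string rather than a commit message).

```typescript
// Hypothetical model: a hook is a function returning a shell-style exit code.
type Hook = (toolName: string, payload: unknown) => number;

interface ToolOutcome {
  ran: boolean;
  output?: string;
  blockedBy?: number; // index of the PreToolUse hook that aborted the call
}

function runWithHooks(
  toolName: string,
  input: unknown,
  tool: (input: unknown) => string,
  pre: Hook[],
  post: Hook[],
): ToolOutcome {
  let index = 0;
  for (const hook of pre) {
    // Any non-zero exit code from a PreToolUse hook aborts the tool call.
    if (hook(toolName, input) !== 0) return { ran: false, blockedBy: index };
    index += 1;
  }
  const output = tool(input);
  for (const hook of post) hook(toolName, output); // PostToolUse sees the output
  return { ran: true, output };
}

// Example rule: block `git push` unless the command mentions a ticket ID.
const requireTicket: Hook = (toolName, payload) => {
  if (toolName !== "Bash" || typeof payload !== "string") return 0;
  if (!payload.startsWith("git push")) return 0;
  return /[A-Z]+-\d+/.test(payload) ? 0 : 1;
};
```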
Coordinator Mode: Multi-Agent Architecture in Depth
The Coordinator execution mode implements a Leader/Worker multi-agent pattern that allows Claude Code to tackle tasks too large or complex for a single context window.
Architecture
User Request
↓
Leader Agent
(task decomposition)
↓
┌──────────────────────────────┐
│ Worker 1 Worker 2 ...N │
│ (context) (context) │
└──────────────────────────────┘
↓
Leader Agent
(result synthesis)
↓
User Response
Leader Agent Responsibilities
- Receives the full user request
- Analyzes the codebase or task scope
- Decomposes the task into independent (or ordered) subtasks
- Assigns subtasks to Worker agents via the Task tool
- Collects results and synthesizes a coherent response
Worker Agent Characteristics
- Each Worker is a full, independent Claude Code instance
- Each has its own context window (not shared with Leader or other Workers)
- Workers communicate results back to the Leader via structured JSON
- Workers can use the full tool system (Bash, Read, Write, etc.)
- Workers can be created and destroyed dynamically as the task evolves
Communication Protocol
The Task tool sends a structured JSON payload:
{
  "task_id": "refactor-auth-module",
  "description": "Refactor the authentication module to use JWT",
  "context": {
    "files": ["src/auth/index.ts", "src/auth/middleware.ts"],
    "constraints": ["maintain backward compatibility", "add unit tests"]
  }
}
Workers return results in a matching structured format, which the Leader uses to track progress and synthesize the final output.
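On the receiving side, a payload like the one above would typically be validated before a Worker acts on it. The sketch below types the three fields shown in the example and rejects anything else; the `TaskPayload` interface and `parseTaskPayload` function are illustrative names, and the real message schema may carry more fields.

```typescript
interface TaskPayload {
  task_id: string;
  description: string;
  context: { files: string[]; constraints: string[] };
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parse and validate a JSON task message; throws on any shape mismatch.
function parseTaskPayload(json: string): TaskPayload {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("payload must be a JSON object");
  }
  const { task_id, description, context } = raw as Record<string, unknown>;
  if (typeof task_id !== "string" || typeof description !== "string") {
    throw new Error("task_id and description must be strings");
  }
  if (typeof context !== "object" || context === null) {
    throw new Error("context must be an object");
  }
  const { files, constraints } = context as Record<string, unknown>;
  if (!isStringArray(files) || !isStringArray(constraints)) {
    throw new Error("context.files and context.constraints must be string arrays");
  }
  return { task_id, description, context: { files, constraints } };
}
```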
When to Use Coordinator Mode
- Large refactors spanning dozens of files (each file can be a separate Worker task)
- Parallel test generation (one Worker per module)
- Multi-language documentation (one Worker per target language)
- Codebase analysis where different modules can be analyzed independently
The Bridge System
The Bridge execution mode is Claude Code’s answer to multi-device workflows. It establishes a WebSocket-based connection between a local Claude Code terminal session and CCR (Claude Code Remote) cloud infrastructure.
Technical Details
- Protocol: WebSocket (bidirectional, low-latency)
- Maximum parallel sessions: 32
- Authentication: uses the same Anthropic API credentials as the main client
- Session persistence: sessions can be suspended and resumed
Use Cases
Remote Control: Start a coding session on your laptop, then continue it from your phone via a web interface. The local terminal remains the execution environment; the remote interface provides the UI.
Device Handoff: Complete a debugging session on a desktop workstation, hand it off to a colleague who picks it up on their own device — full session state preserved.
Mobile Oversight: Monitor a long-running refactor from a mobile device. You can intervene, ask questions, or redirect the session without being physically at your computer.
CI/CD Integration: A CI pipeline can connect to a running Claude Code Bridge session to inspect progress or inject additional context mid-run.
The Memory System
Claude Code implements a sophisticated 4-tier memory architecture:
Tier 1: User Memory (~/.claude/CLAUDE.md)
Global instructions that apply to every Claude Code session for this user, regardless of project. Examples:
- Preferred code style (“always use semicolons in TypeScript”)
- Personal workflow preferences (“always run tests before committing”)
- Identity context (“I am a senior backend engineer at Acme Corp”)
Tier 2: Project Memory (./CLAUDE.md)
Project-specific instructions committed alongside the code. Every team member using Claude Code on this project gets these instructions automatically. Examples:
- Project architecture documentation
- Build and test commands
- Coding standards specific to this codebase
- Known gotchas and footguns
Tier 3: Feedback Memory
Auto-generated memory. When Claude Code learns something useful during a session — the correct build command, a debugging technique that worked, a non-obvious project convention — it automatically saves this to ~/.claude/projects/<hash>/memory.md. This creates a persistent knowledge base that improves over time.
Tier 4: Reference Files
Additional context files that can be attached to a session. These can be specification documents, API schemas, architecture diagrams (as text), or any other reference material that Claude should have access to.
The UI Layer: React + Ink
The interactive terminal UI is built on React + Ink — a framework that renders React components as terminal output. This gives Claude Code the full power of React’s component model in a terminal context.
Key UI Optimizations
Double-buffering: A custom double-buffer implementation prevents visual tearing during rapid updates (streaming output).
Dirty Tracking: Only components whose data has changed are re-rendered. This is critical for performance when streaming long outputs — updating the cost display shouldn’t re-render the diff viewer.
Frame Throttling: The renderer is capped at 30 frames per second. This prevents excessive CPU usage during rapid streaming while maintaining smooth visual responsiveness.
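Frame throttling of this kind amounts to coalescing updates: many state changes within one frame interval produce a single render with the latest state. The class below is a generic sketch of that idea, not Claude Code's renderer; `FrameThrottle` and its trailing-edge scheduling are assumptions, and only the 30 fps cap comes from the text.

```typescript
const FRAME_INTERVAL_MS = 1000 / 30; // 30 frames per second, per the text

// Coalesce render requests: several request() calls within one interval
// result in one render() call that receives the most recent state.
class FrameThrottle<T> {
  private render: (state: T) => void;
  private intervalMs: number;
  private latest: { state: T } | undefined; // most recent unrendered state
  private timer: ReturnType<typeof setTimeout> | undefined;

  constructor(render: (state: T) => void, intervalMs: number = FRAME_INTERVAL_MS) {
    this.render = render;
    this.intervalMs = intervalMs;
  }

  request(state: T): void {
    this.latest = { state };
    if (this.timer !== undefined) return; // a frame is already scheduled
    this.timer = setTimeout(() => {
      this.timer = undefined;
      const pending = this.latest;
      this.latest = undefined;
      if (pending) this.render(pending.state);
    }, this.intervalMs);
  }
}
```

With streaming output, each incoming token would call `request` with the accumulated text, and the terminal would be repainted at most once per interval.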
Key UI Components
- PermissionPrompt: The interactive confirmation dialog shown when a risky operation requires user approval. Shows the exact command/operation, risk level, and yes/no/explain options.
- DiffViewer: A colored unified diff renderer for file modifications.
- CostDisplay: Real-time token count and USD cost for the current session.
- ToolOutput: Structured display for tool execution results (collapsible, syntax-highlighted).
The Service Layer
Four major services back Claude Code’s operation:
API Client
Handles the full lifecycle of API communication:
- Authentication (API key management, CCH request signing)
- Retry logic with exponential backoff
- Streaming response consumption
- Rate limit handling
Auto-Compact Service
Continuously monitors the size of the conversation context relative to the model’s context limit. When it crosses the 95% threshold, it:
- Takes the oldest N turns
- Sends them to Claude for summarization into a compact representation
- Replaces the original turns with the compact summary
- Continues the session
This runs transparently — users see a brief “compacting…” indicator and the session continues without interruption.
MCP Client
Connects to configured MCP (Model Context Protocol) servers at startup. Each MCP server can provide:
- New tools (injected into the tool registry)
- Resource access (files, databases, APIs)
- Prompt templates
The MCP client handles connection management, tool schema injection, and proxying tool calls to the appropriate server.
Cost Tracker
Maintains real-time token counts broken down by:
- Input tokens (system prompt + conversation)
- Output tokens (Claude’s responses)
- Cache hits (for repeated context — significant discount)
- Per-model pricing (Claude 3.5 Sonnet, Claude 3 Opus, etc.)
- Cumulative session cost in USD
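The bookkeeping behind such a tracker is a weighted sum per request. The sketch below assumes prices expressed in USD per million tokens; the `CostTracker` class, the `example-model` key, and the price figures are placeholders for illustration, not real pricing.

```typescript
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
}

// Prices in USD per million tokens. The numbers used below are placeholders.
interface Pricing {
  input: number;
  output: number;
  cacheRead: number;
}

class CostTracker {
  private pricing: Record<string, Pricing>;
  private totalUsd = 0;

  constructor(pricing: Record<string, Pricing>) {
    this.pricing = pricing;
  }

  // Record one request and return its cost in USD.
  record(model: string, usage: Usage): number {
    const price = this.pricing[model];
    if (!price) throw new Error(`no pricing configured for model ${model}`);
    const cost =
      (usage.inputTokens * price.input +
        usage.outputTokens * price.output +
        usage.cacheReadTokens * price.cacheRead) / 1e6;
    this.totalUsd += cost;
    return cost;
  }

  get sessionCostUsd(): number {
    return this.totalUsd;
  }
}

const tracker = new CostTracker({
  "example-model": { input: 3, output: 15, cacheRead: 0.3 }, // hypothetical prices
});
tracker.record("example-model", { inputTokens: 12000, outputTokens: 800, cacheReadTokens: 50000 });
console.log(tracker.sessionCostUsd.toFixed(4));
```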
Slash Commands (80+)
Claude Code supports 80+ slash commands divided into two categories:
Prompt-Type Commands
Inject text or context into the conversation:
- /help: Show available commands and usage
- /context: Show current context size and memory usage
- /memory: Show and edit memory contents
- /mcp: Show connected MCP servers and their tools
UI-Type Commands
Directly control the terminal UI:
- /clear: Clear conversation history (new session)
- /compact: Manually trigger context compaction
- /settings: Open interactive settings editor
- /plan: Switch to Plan mode (review-before-execute)
The 8 Core Design Patterns
The Claude Code codebase applies eight recurring design patterns across its implementation. Understanding these patterns is key to understanding why the system behaves as it does.
Pattern 1: Generator Streaming
What: Async generators (async function*) are used throughout for streaming data — API responses, tool outputs, progress updates. Why: Generators enable lazy consumption. The UI can start rendering output the moment the first token arrives, without waiting for the complete response. Memory usage stays flat regardless of response length.
async function* streamApiResponse(): AsyncGenerator<Token> {
  for await (const chunk of apiStream) {
    yield parseToken(chunk);
  }
}
// Consumer starts processing immediately
for await (const token of streamApiResponse()) {
  renderToken(token);
}
Pattern 2: Feature Gates
What: A runtime feature flag system that enables/disables features without code changes. Why: Allows Anthropic to roll out new capabilities gradually, run A/B tests, and disable features in specific environments (e.g., disable the Bridge system in air-gapped environments).
if (featureGate('coordinator_mode')) {
  // Coordinator-specific initialization
}
Pattern 3: Memoized Context
What: React’s useMemo hook is used extensively to cache expensive computed values (e.g., the assembled system prompt, tool schemas). Why: The system prompt is assembled from many sources on every turn. Without memoization, this would be recomputed even when the inputs haven’t changed. Memoization ensures the prompt is only rebuilt when CLAUDE.md, tools, or environment actually change.
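Outside of React, the same idea can be shown with a small last-call cache: the expensive function reruns only when one of its inputs changes identity. This is a framework-free analogue of `useMemo`, not the hook itself; `memoizeLast` and the three-part prompt builder are hypothetical.

```typescript
// Cache the most recent call: recompute only when some argument changes.
function memoizeLast<A extends unknown[], R>(fn: (...args: A) => R): (...args: A) => R {
  let cache: { args: A; result: R } | undefined;
  return (...args: A): R => {
    const hit = cache;
    if (
      hit &&
      hit.args.length === args.length &&
      args.every((arg, i) => Object.is(arg, hit.args[i]))
    ) {
      return hit.result;
    }
    const result = fn(...args);
    cache = { args, result };
    return result;
  };
}

let builds = 0;
// Hypothetical prompt builder: joins its parts and counts how often it runs.
const buildPrompt = memoizeLast((claudeMd: string, toolSchemas: string, env: string) => {
  builds += 1;
  return [claudeMd, toolSchemas, env].join("\n");
});

buildPrompt("rules", "tools", "linux");
buildPrompt("rules", "tools", "linux");    // same inputs: served from cache
buildPrompt("rules v2", "tools", "linux"); // CLAUDE.md changed: rebuilt
console.log(builds);
```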
Pattern 4: Withhold & Recover
What: When a tool produces a partial result before failing, the partial result is preserved and returned rather than discarded. Why: In long-running tool operations (large file reads, bash commands with substantial output), a mid-stream error would otherwise lose all accumulated output. Withhold & Recover saves the partial output and adds an error annotation, letting Claude decide whether to retry or use what’s available.
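A minimal sketch of the pattern as described: output is accumulated chunk by chunk, and if the source fails mid-stream the accumulated text is returned together with an error annotation. The `PartialResult` shape, `collectWithRecovery`, and the `flakySource` generator are illustrative assumptions.

```typescript
interface PartialResult {
  output: string;
  complete: boolean;
  error?: string; // set when the stream failed before finishing
}

// Accumulate streamed chunks; on failure, keep what arrived and annotate it.
async function collectWithRecovery(chunks: AsyncIterable<string>): Promise<PartialResult> {
  let output = "";
  try {
    for await (const chunk of chunks) output += chunk;
    return { output, complete: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { output, complete: false, error: message };
  }
}

// Example source that fails after two chunks.
async function* flakySource(): AsyncGenerator<string> {
  yield "line 1\n";
  yield "line 2\n";
  throw new Error("connection reset");
}

collectWithRecovery(flakySource()).then((result) => console.log(result));
```

The caller receives both lines plus the error message and can decide whether to retry or to work with the partial output.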
Pattern 5: Lazy Import
What: Heavy modules are imported dynamically (import() rather than require()) and only when needed. Why: The startup monolith is ~800KB, but many capabilities (the Coordinator system, the Bridge WebSocket stack, notebook support) are only needed in specific execution modes. Lazy imports keep the initial parse-and-execute time minimal.
// Heavy module loaded only when coordinator mode is activated
const { CoordinatorOrchestrator } = await import('./coordinator/orchestrator');
Pattern 6: Immutable State
What: All application state is wrapped in DeepImmutable<T> and managed via Zustand stores. Why: Immutable state makes concurrency reasoning trivial. When multiple async operations (streaming API response + parallel tool executions) are modifying state simultaneously, immutability ensures each operation sees a consistent snapshot. Zustand’s Redux-like update model means state transitions are explicit and auditable.
Pattern 7: Interruption Resilience
What: Claude Code gracefully handles Ctrl+C (SIGINT) at any point in the execution pipeline. Why: Users frequently want to stop a long-running tool or redirect the AI mid-stream. Rather than crashing or leaving state corrupted, Claude Code:
- Captures the SIGINT signal
- Completes any in-flight tool writes (to avoid partial file corruption)
- Returns control to the user with a clean state
- Optionally shows what was accomplished before the interruption
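The steps above can be sketched with an `AbortSignal` that is checked only between steps, so a step already in flight is never cut off halfway. The `runInterruptible` function and its result shape are assumptions for illustration, not the actual signal-handling code.

```typescript
interface RunResult {
  completed: string[];   // what was accomplished before the interruption
  interrupted: boolean;
}

// Run steps in order. An abort request stops the run before the next step
// starts, but a step that is already in flight is allowed to finish.
async function runInterruptible(
  steps: Array<() => Promise<string>>,
  signal: AbortSignal,
): Promise<RunResult> {
  const completed: string[] = [];
  for (const step of steps) {
    if (signal.aborted) return { completed, interrupted: true };
    completed.push(await step());
  }
  return { completed, interrupted: false };
}

// In a Node.js CLI the signal would typically be wired up along these lines:
//   const controller = new AbortController();
//   process.on("SIGINT", () => controller.abort());
//   const result = await runInterruptible(steps, controller.signal);
```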
Pattern 8: Dependency Injection
What: Services (API client, MCP client, cost tracker, etc.) are instantiated once at startup and injected into components/tools that need them, rather than being accessed as global singletons. Why: Dependency injection makes testing trivial (inject a mock API client), enables multiple instances (e.g., two MCP clients for two different servers), and makes the dependency graph explicit. It also enables the Coordinator mode to give each Worker agent its own isolated service instances.
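The testing benefit is easy to see in a small sketch. Here a component receives its API client through the constructor, so a test can pass in a mock instead of a network-backed client. The `ApiClient` interface, the `Summarizer` class, and `mockApi` are hypothetical names chosen for illustration.

```typescript
// Hypothetical service interface; the real client API is not shown in the source.
interface ApiClient {
  complete(prompt: string): Promise<string>;
}

// The component depends on the interface, not on a global singleton.
class Summarizer {
  private api: ApiClient;

  constructor(api: ApiClient) {
    this.api = api;
  }

  async summarize(text: string): Promise<string> {
    return this.api.complete(`Summarize: ${text}`);
  }
}

// In tests, inject a mock that never touches the network.
const mockApi: ApiClient = {
  complete: async (prompt) => `mock:${prompt}`,
};

new Summarizer(mockApi).summarize("hello").then((out) => console.log(out));
```

The same constructor would accept a separate client instance per Worker, which is how isolated service instances could be handed out.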
Putting It All Together
The elegance of Claude Code’s architecture lies in how these systems compose. Consider a complex user request like “Refactor the authentication module to use OAuth2 instead of session cookies, update all callers, and add integration tests”:
- Startup has already loaded CLAUDE.md, git status, and env info in parallel.
- The user’s request enters the Query Loop. The system prompt includes the project context from CLAUDE.md.
- Claude responds with a tool_use block for the Task tool (activating Coordinator mode).
- The Coordinator spawns a Leader agent that decomposes the task into: (a) refactor auth module, (b) update callers in module A, (c) update callers in module B, (d) write integration tests.
- Four Worker agents execute in parallel. Each uses the Bash and Edit tools. Each tool invocation passes through the 10-step execution pipeline, including PreToolUse hooks that log changes to an audit trail.
- The Permission system in Auto mode auto-approves file reads and edits, presenting a confirmation only for a git commit operation.
- Workers return results via structured JSON. The Leader synthesizes them and presents a summary.
- The Cost tracker shows the total token usage. The auto-compact service has been quietly compacting early turns to keep the context window healthy.
- The DiffViewer renders all modified files as a colored unified diff.
All of this, from a single natural language request.
Conclusion
Claude Code is not a thin wrapper around the Claude API. It is a fully realized agentic platform with:
- 7 execution modes covering interactive, headless, multi-agent, remote, scheduled, daemon, and viewer use cases
- 45+ tools covering files, execution, web, notebooks, memory, and MCP extensions
- A 10-step tool execution pipeline with hooks, permissions, and concurrency control
- A 4-tier memory system that persists knowledge across sessions
- 8 design patterns that make the system safe, fast, and extensible
- A terminal UI built on React + Ink with double-buffering, dirty tracking, and frame throttling
For developers building on top of Claude Code — or just trying to use it more effectively — understanding this architecture is the difference between treating it as a chatbot and treating it as the sophisticated agentic platform it actually is.
Analysis based on Claude Code source code as of 2026-03-31. Source: wikidocs.net/338204, “별첨 91. 클로드 코드 소스 코드 분석서” (Appendix 91: Claude Code Source Code Analysis).
