Building Production-Ready AI Agents with LangChain: A Complete Guide
in Ai on Langchain, Ai-agents, Llm, Python, Openai, Production
Building Production-Ready AI Agents with LangChain: A Complete Guide
AI agents are transforming how we build software. Unlike simple chatbots, agents can reason, plan, and execute complex tasks using tools. LangChain makes building them practical.
Photo by Igor Omilaev on Unsplash
What Are AI Agents?
An AI agent is an LLM that can:
- Reason about a problem
- Plan a sequence of actions
- Execute those actions using tools
- Observe results and iterate
Think: an AI that can actually do things, not just say things.
LangChain Agent Architecture
User Query → Agent (LLM) → Thought → Action → Tool → Observation → ... → Final Answer
Key components:
- LLM: The brain (GPT-4, Claude, etc.)
- Tools: Functions the agent can call
- Memory: Conversation and context storage
- Prompt Template: Instructions for the agent
Building Your First Agent
Installation
pip install langchain langchain-openai langgraph
Basic Agent with Tools
from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent, AgentExecutor
from langchain.tools import Tool
from langchain import hub
# Define tools
def search_database(query: str) -> str:
"""Search the product database."""
# Your database logic here
return f"Found 3 products matching '{query}'"
def calculate_price(expression: str) -> str:
"""Calculate price with discount."""
result = eval(expression) # Use safer eval in production
return f"Result: ${result:.2f}"
tools = [
Tool(
name="ProductSearch",
func=search_database,
description="Search for products in the database"
),
Tool(
name="PriceCalculator",
func=calculate_price,
description="Calculate prices. Input: math expression"
),
]
# Create agent
llm = ChatOpenAI(model="gpt-4", temperature=0)
prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
# Run
result = agent_executor.invoke({
"input": "Find laptops under $1000 and calculate 15% discount on $899"
})
Photo by Florian Olivo on Unsplash
Production Patterns
1. Structured Tool Definitions
Use Pydantic for type-safe tools:
from langchain.tools import StructuredTool
from pydantic import BaseModel, Field
class SearchInput(BaseModel):
query: str = Field(description="Search query")
category: str = Field(default="all", description="Product category")
max_results: int = Field(default=10, description="Maximum results")
def search_products(query: str, category: str = "all", max_results: int = 10):
# Implementation
pass
search_tool = StructuredTool.from_function(
func=search_products,
name="ProductSearch",
description="Search products with filters",
args_schema=SearchInput
)
2. Memory Management
For conversations that span multiple turns:
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(
memory_key="chat_history",
k=10, # Keep last 10 exchanges
return_messages=True
)
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
memory=memory,
verbose=True
)
For production, use persistent memory:
from langchain.memory import RedisChatMessageHistory
message_history = RedisChatMessageHistory(
url="redis://localhost:6379",
session_id="user-123"
)
3. Error Handling and Retries
from langchain.agents import AgentExecutor
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
max_iterations=5, # Prevent infinite loops
max_execution_time=30, # Timeout in seconds
handle_parsing_errors=True, # Graceful error handling
return_intermediate_steps=True # For debugging
)
4. Streaming Responses
For better UX with long-running agents:
async for event in agent_executor.astream_events(
{"input": query},
version="v1"
):
if event["event"] == "on_chat_model_stream":
print(event["data"]["chunk"].content, end="")
elif event["event"] == "on_tool_start":
print(f"\n🔧 Using tool: {event['name']}")
Advanced: LangGraph for Complex Workflows
When agents need more control flow:
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator
class AgentState(TypedDict):
messages: Annotated[list, operator.add]
next_action: str
def should_continue(state: AgentState) -> str:
last_message = state["messages"][-1]
if "FINAL ANSWER" in last_message:
return "end"
return "continue"
# Build graph
workflow = StateGraph(AgentState)
workflow.add_node("agent", agent_node)
workflow.add_node("tools", tool_node)
workflow.add_conditional_edges(
"agent",
should_continue,
{"continue": "tools", "end": END}
)
workflow.add_edge("tools", "agent")
workflow.set_entry_point("agent")
app = workflow.compile()
Production Checklist
Observability
from langchain.callbacks import LangChainTracer
from langsmith import Client
# Enable LangSmith tracing
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "my-agent-prod"
Rate Limiting
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
model="gpt-4",
max_retries=3,
request_timeout=30,
max_tokens=1000 # Control costs
)
Input Validation
def validate_input(user_input: str) -> str:
if len(user_input) > 1000:
raise ValueError("Input too long")
# Add more validation
return user_input.strip()
Fallback Strategies
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
primary_llm = ChatOpenAI(model="gpt-4")
fallback_llm = ChatAnthropic(model="claude-3-sonnet")
llm_with_fallback = primary_llm.with_fallbacks([fallback_llm])
Cost Optimization
| Strategy | Implementation | Savings |
|---|---|---|
| Caching | Redis/SQLite cache | 40-60% |
| Model routing | GPT-3.5 for simple, GPT-4 for complex | 30-50% |
| Token limits | max_tokens parameter | Variable |
| Batch processing | Async concurrent calls | Time savings |
Semantic Caching
from langchain.cache import RedisSemanticCache
from langchain_openai import OpenAIEmbeddings
import langchain
langchain.llm_cache = RedisSemanticCache(
redis_url="redis://localhost:6379",
embedding=OpenAIEmbeddings(),
score_threshold=0.95
)
Testing Agents
import pytest
from unittest.mock import patch
def test_agent_uses_correct_tool():
with patch.object(search_tool, 'func') as mock_search:
mock_search.return_value = "Found 5 items"
result = agent_executor.invoke({
"input": "Find all laptops"
})
mock_search.assert_called_once()
assert "laptops" in mock_search.call_args[0][0].lower()
Deployment Architecture
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ FastAPI │────▶│ Agent │────▶│ Tools │
│ Server │ │ Executor │ │ (APIs) │
└─────────────┘ └─────────────┘ └─────────────┘
│ │
▼ ▼
┌─────────────┐ ┌─────────────┐
│ Redis │ │ LangSmith │
│ (Cache) │ │ (Tracing) │
└─────────────┘ └─────────────┘
Conclusion
Building production AI agents requires more than just connecting an LLM to tools. You need:
- Robust error handling - Agents will fail; plan for it
- Observability - You can’t fix what you can’t see
- Cost controls - LLM calls add up fast
- Testing - Especially for tool interactions
- Fallbacks - Have backup plans
Start simple, add complexity as needed, and always measure before optimizing.
The agent paradigm is still evolving rapidly. What works today might change tomorrow. Build for flexibility.
이 글이 도움이 되셨다면 공감 및 광고 클릭을 부탁드립니다 :)
