Why AI Agent Patterns Matter in 2026
Your production AI agent just crashed mid-task. Again. The logs show it looped 47 times trying to parse a malformed API response before hitting the iteration limit. Sound familiar?
This is the reality of building AI agents without understanding the core design patterns that power every successful implementation. While GPT-4 solves only 4% of Game of 24 puzzles with chain-of-thought prompting, agents using the Tree of Thoughts planning pattern achieve 74% success. The difference is not the model - it is the architecture.
AI agent design patterns are reusable architectural solutions that solve common problems in autonomous AI systems. They define how agents reason about tasks, take actions, learn from mistakes, and plan multi-step workflows. The three patterns covered in this guide - ReAct, Reflection, and Planning - power every major AI coding assistant, search engine, and automation tool shipping in 2026.
What You Will Learn
This implementation guide provides production-ready code for each pattern:
- ReAct Pattern: Full LangChain and LangGraph implementations with tool integration
- Reflection Pattern: Reflexion agent with memory persistence using vector databases
- Planning Pattern: Plan-and-Execute with adaptive replanning
- Pattern Combinations: How Claude Code and Perplexity combine patterns
- Production Deployment: Error handling, observability, and scaling strategies
All code examples are verified against January 2026 framework versions: LangChain 0.3.x, LangGraph 1.x, OpenAI Agents SDK 0.6.x, and Claude Agent SDK 1.x.
ReAct Pattern: Reasoning + Acting Implementation
ReAct is a pattern that interleaves reasoning traces with action execution, allowing the agent to think through problems step-by-step while taking actions and observing their outcomes. Introduced by Yao et al. in their 2022 paper "ReAct: Synergizing Reasoning and Acting in Language Models" (ICLR 2023), it remains the foundational pattern for tool-using AI agents.
How ReAct Works
Unlike pure chain-of-thought (reasoning only) or action-only approaches, ReAct creates a feedback loop:
- Thought: Agent reasons about the current state and what action to take next
- Action: Agent invokes a tool (search, calculate, API call, etc.)
- Observation: Agent receives the result from the tool execution
- Repeat: Loop continues until the agent has enough information to answer
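Stripped of any framework, the loop above is just a while-loop. A minimal sketch with a scripted stand-in for the LLM (the `fake_llm` responses and `tools` dict here are illustrative, not a real model or API):

```python
def react_loop(fake_llm, tools, question, max_iterations=10):
    """Bare-bones ReAct loop: ask the model for a thought + action,
    run the tool, feed the observation back, repeat until Finish."""
    transcript = f"Question: {question}"
    for _ in range(max_iterations):
        thought, action, arg = fake_llm(transcript)
        transcript += f"\nThought: {thought}\nAction: {action}[{arg}]"
        if action == "Finish":
            return arg  # final answer
        observation = tools[action](arg)
        transcript += f"\nObservation: {observation}"
    return None  # hit the iteration limit

# Scripted "LLM": decides based on what is already in the transcript
def fake_llm(transcript):
    if "Observation" not in transcript:
        return ("I should look this up", "Search", "capital of France")
    return ("I now know the answer", "Finish", "Paris")

tools = {"Search": lambda q: "Paris is the capital of France."}
print(react_loop(fake_llm, tools, "What is the capital of France?"))  # Paris
```

The real implementations below replace `fake_llm` with an actual model call and `tools` with tool bindings, but the control flow is exactly this.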
ReAct Benchmark Results (Verified from Original Paper)
| Benchmark | Baseline (Act-only) | Chain-of-Thought | ReAct | ReAct + CoT |
|---|---|---|---|---|
| HotpotQA (multi-hop QA) | 29.4% | 34.3% | 34.3% | 47.8% |
| Fever (fact verification) | 56.3% | 64.1% | 71.1% | 69.7% |
| ALFWorld (decision-making) | 45% | - | 79% | - |
| WebShop (e-commerce) | 29.1% | - | 39.3% | - |
Source: Yao et al. (2022) "ReAct: Synergizing Reasoning and Acting in Language Models" - ICLR 2023
Full Python Implementation with LangChain
Here is a production-ready ReAct agent using LangChain's create_react_agent:
"""
ReAct Agent Implementation with LangChain
Requires: pip install langchain langchain-openai langchain-community
"""
from langchain import hub
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.tools import tool
from langchain_core.prompts import PromptTemplate
import os
# Set up environment
os.environ["OPENAI_API_KEY"] = "your-api-key"
# Define custom tools
@tool
def search_wikipedia(query: str) -> str:
"""Search Wikipedia for information about a topic.
Use this tool when you need factual information about people, places, events, or concepts.
"""
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
api_wrapper = WikipediaAPIWrapper(top_k_results=2, doc_content_chars_max=1000)
wiki = WikipediaQueryRun(api_wrapper=api_wrapper)
return wiki.run(query)
@tool
def calculate(expression: str) -> str:
"""Evaluate a mathematical expression.
Use this tool for any calculations. Input should be a valid Python math expression.
Examples: "2 + 2", "math.sqrt(16)", "15 * 7 / 3"
"""
import math
try:
result = eval(expression, {"__builtins__": {}, "math": math})
return str(result)
except Exception as e:
return f"Error evaluating expression: {e}"
# Initialize tools and model
tools = [search_wikipedia, calculate]
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
# ReAct prompt template
react_prompt = PromptTemplate.from_template('''Answer the following questions as best you can. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: {input}
Thought:{agent_scratchpad}''')
# Create ReAct agent
agent = create_react_agent(llm, tools, react_prompt)
# Wrap in executor with error handling
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True, # Show reasoning trace
max_iterations=10, # Prevent infinite loops
handle_parsing_errors=True,
return_intermediate_steps=True # For debugging
)
# Execute agent
def run_react_agent(question: str):
"""Run the ReAct agent and return structured response."""
try:
result = agent_executor.invoke({"input": question})
return {
"answer": result["output"],
"steps": result.get("intermediate_steps", []),
"success": True
}
except Exception as e:
return {
"answer": None,
"error": str(e),
"success": False
}
# Example usage
if __name__ == "__main__":
question = "What is the population of Paris, and what is the square root of that number?"
result = run_react_agent(question)
print(f"\nFinal Answer: {result['answer']}")
ReAct with LangGraph (Full Control)
For maximum flexibility, implement ReAct from scratch with LangGraph's state graph:
"""
ReAct Agent Implementation with LangGraph (from scratch)
Requires: pip install langgraph langchain-openai
"""
from typing import Annotated, Sequence, TypedDict
from langchain_core.messages import BaseMessage, ToolMessage, SystemMessage, HumanMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
import json
# Define agent state
class AgentState(TypedDict):
"""State maintained across the agent's reasoning loop."""
messages: Annotated[Sequence[BaseMessage], add_messages]
iteration_count: int
# Define tools
@tool
def search_web(query: str) -> str:
"""Search the web for current information."""
# Replace with real search API in production
return f"Search results for '{query}': [Relevant information here]"
@tool
def run_code(code: str) -> str:
"""Execute Python code and return the result."""
try:
local_vars = {}
exec(code, {"__builtins__": __builtins__}, local_vars)
return str(local_vars.get('result', 'Code executed successfully'))
except Exception as e:
return f"Error: {e}"
# Initialize model with tools
tools = [search_web, run_code]
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
model_with_tools = model.bind_tools(tools)
tools_by_name = {tool.name: tool for tool in tools}
# Define graph nodes
def call_model(state: AgentState) -> dict:
"""Call the LLM with current state."""
system_message = SystemMessage(content="""You are a helpful AI assistant that uses tools to answer questions.
For each question:
1. Think about what information you need
2. Use available tools to gather that information
3. Synthesize the observations into a clear answer""")
messages = [system_message] + list(state["messages"])
response = model_with_tools.invoke(messages)
return {
"messages": [response],
"iteration_count": state["iteration_count"] + 1
}
def execute_tools(state: AgentState) -> dict:
"""Execute tool calls from the last message."""
last_message = state["messages"][-1]
outputs = []
for tool_call in last_message.tool_calls:
tool_name = tool_call["name"]
tool_result = tools_by_name[tool_name].invoke(tool_call["args"])
outputs.append(
ToolMessage(
content=tool_result if isinstance(tool_result, str) else json.dumps(tool_result),
name=tool_name,
tool_call_id=tool_call["id"]
)
)
return {"messages": outputs}
def should_continue(state: AgentState) -> str:
"""Determine if agent should continue or stop."""
last_message = state["messages"][-1]
if state["iteration_count"] >= 10:
return "end"
if hasattr(last_message, "tool_calls") and last_message.tool_calls:
return "continue"
return "end"
# Build the graph
workflow = StateGraph(AgentState)
workflow.add_node("agent", call_model)
workflow.add_node("tools", execute_tools)
workflow.set_entry_point("agent")
workflow.add_conditional_edges("agent", should_continue, {"continue": "tools", "end": END})
workflow.add_edge("tools", "agent")
graph = workflow.compile()
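One gotcha when invoking the compiled graph: because `call_model` reads and increments `iteration_count`, the initial state must include it explicitly or the first node raises a `KeyError`. A sketch of a correct invocation (the `make_initial_state` helper is illustrative, not part of LangGraph; LangGraph's `add_messages` reducer accepts `(role, content)` tuples as message shorthand):

```python
def make_initial_state(question: str) -> dict:
    """Build the initial AgentState for one run of the ReAct graph."""
    return {
        "messages": [("user", question)],
        "iteration_count": 0,  # must be present: call_model reads and increments it
    }

state = make_initial_state("What is 17 * 23?")
# result = graph.invoke(state)            # requires a configured OPENAI_API_KEY
# print(result["messages"][-1].content)   # final answer
```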
When to Use ReAct
Ideal for:
- External tool integration (search, databases, APIs)
- Multi-step problem solving requiring intermediate data
- Research tasks with web search
- Interactive debugging scenarios
- Tasks where transparency of reasoning is important
Avoid when:
- Simple direct questions (overhead not justified)
- Latency-critical applications (<500ms required)
- Tasks requiring long-term planning with dependencies
- High-accuracy requirements (add the Reflection pattern instead)
Production Examples Using ReAct
| Product | How ReAct is Used |
|---|---|
| GitHub Copilot Chat | Agent Mode for multi-file edits with RAG + ReAct loop |
| Perplexity AI | Search + reasoning + citation in "Pro Search" mode |
| Claude Code | Tool use with terminal, files, LSP integration |
| ChatGPT Plugins | Function calling loop with observation handling |
Reflection Pattern: Self-Improving Agents
The Reflection pattern adds a self-evaluation layer where agents critique their own outputs, checking for accuracy, verifying constraints, and identifying logical gaps. The breakthrough Reflexion paper by Shinn et al. (NeurIPS 2023) demonstrated that agents can learn through linguistic feedback, improving performance without weight updates.
The Reflexion Architecture
Reflexion implements a three-component system:
- Actor: The LLM that generates solutions (attempts the task)
- Evaluator: Assesses the solution quality (pass/fail with feedback)
- Self-Reflection: Generates verbal feedback on failures, stored in memory
The key insight is that reflections are stored in episodic memory and retrieved for future attempts, enabling the agent to learn from past mistakes.
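That store-and-retrieve step can be illustrated without any vector database. A toy in-memory sketch (keyword overlap stands in for the embedding similarity a production system would use; `EpisodicMemory` is an illustrative name, not from the paper or any library):

```python
class EpisodicMemory:
    """Toy episodic memory: stores reflections and retrieves the most
    relevant ones for a new task by keyword overlap."""

    def __init__(self):
        self.entries = []  # list of (task, reflection) pairs

    def store(self, task: str, reflection: str) -> None:
        self.entries.append((task, reflection))

    def retrieve(self, task: str, k: int = 2) -> list:
        words = set(task.lower().split())
        scored = [
            (len(words & set(past_task.lower().split())), reflection)
            for past_task, reflection in self.entries
        ]
        scored.sort(key=lambda s: s[0], reverse=True)
        # Only return reflections that share at least one keyword
        return [reflection for score, reflection in scored[:k] if score > 0]

memory = EpisodicMemory()
memory.store("parse the CSV export", "Remember to handle quoted commas.")
memory.store("sort user records", "Sort keys must be case-insensitive.")
print(memory.retrieve("parse a new CSV file"))  # the CSV reflection surfaces
```

The production version later in this section swaps the keyword overlap for embeddings in Chroma, but the contract is the same: store on failure, retrieve by similarity on the next attempt.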
Reflection Benchmark Results
| Benchmark | GPT-4 Baseline | Reflexion | Improvement |
|---|---|---|---|
| HumanEval (Python) | 80.0% pass@1 | 91.0% | +11% |
| HumanEval (Rust) | 40.6% pass@1 | 55.9% | +15.3% |
| ALFWorld | 24% (2 trials) | 97% (12 trials) | +73% |
| HotpotQA | 31.0% | 51.0% | +20% |
Source: Shinn et al. (2023) "Reflexion: Language Agents with Verbal Reinforcement Learning" - NeurIPS 2023
Full Reflexion Agent Implementation
"""
Reflexion Agent Implementation with LangGraph
Based on Shinn et al. (2023) "Reflexion: Language Agents with Verbal Reinforcement Learning"
"""
from typing import TypedDict, List, Optional, Annotated
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
# State definition
class ReflexionState(TypedDict):
"""State for the Reflexion agent."""
task: str
current_solution: Optional[str]
evaluation: Optional[str]
reflections: List[str] # Memory of past reflections
attempts: int
messages: Annotated[List[BaseMessage], add_messages]
# Initialize models (different models for different roles)
actor_model = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
evaluator_model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
reflector_model = ChatOpenAI(model="gpt-4o-mini", temperature=0.3)
MAX_ATTEMPTS = 5
# Actor Node: Generate solution
def actor_node(state: ReflexionState) -> dict:
"""Generate or improve a solution based on task and past reflections."""
reflection_context = ""
if state["reflections"]:
reflection_context = "\n\nPrevious attempts and learnings:\n"
for i, reflection in enumerate(state["reflections"], 1):
reflection_context += f"\nAttempt {i} Reflection:\n{reflection}\n"
prompt = f"""You are an expert problem solver. Generate a solution for the following task.
Task: {state["task"]}
{reflection_context}
Based on any previous reflections, generate an improved solution.
Focus on avoiding past mistakes and incorporating lessons learned.
Provide your solution:"""
response = actor_model.invoke([HumanMessage(content=prompt)])
return {
"current_solution": response.content,
"attempts": state["attempts"] + 1,
"messages": [AIMessage(content=f"Attempt {state['attempts'] + 1}:\n{response.content}")]
}
# Evaluator Node: Assess solution quality
def evaluator_node(state: ReflexionState) -> dict:
"""Evaluate the current solution and determine if it passes."""
prompt = f"""You are a strict evaluator. Assess if this solution correctly solves the task.
Task: {state["task"]}
Solution:
{state["current_solution"]}
Evaluate:
1. Is the solution correct and complete?
2. Are there any bugs, errors, or missing elements?
3. Does it fully address the task requirements?
Respond with exactly one of:
- PASS: [brief explanation]
- FAIL: [specific issues that need to be fixed]"""
response = evaluator_model.invoke([HumanMessage(content=prompt)])
return {
"evaluation": response.content,
"messages": [AIMessage(content=f"Evaluation: {response.content}")]
}
# Self-Reflection Node: Generate improvement insights
def reflection_node(state: ReflexionState) -> dict:
"""Generate verbal reflection on why the solution failed."""
prompt = f"""You are a thoughtful self-reflector. The solution failed evaluation.
Task: {state["task"]}
Failed Solution:
{state["current_solution"]}
Evaluation Feedback:
{state["evaluation"]}
Generate a reflection that:
1. Identifies the specific mistakes made
2. Explains WHY these mistakes occurred
3. Provides concrete strategies to avoid them in the next attempt
4. Suggests specific improvements to make
Your reflection (be specific and actionable):"""
response = reflector_model.invoke([HumanMessage(content=prompt)])
updated_reflections = state["reflections"] + [response.content]
return {
"reflections": updated_reflections,
"messages": [AIMessage(content=f"Reflection: {response.content}")]
}
# Routing logic
def should_continue(state: ReflexionState) -> str:
if state["evaluation"] and state["evaluation"].startswith("PASS"):
return "end"
if state["attempts"] >= MAX_ATTEMPTS:
return "end"
return "reflect"
# Build the Reflexion graph
workflow = StateGraph(ReflexionState)
workflow.add_node("actor", actor_node)
workflow.add_node("evaluator", evaluator_node)
workflow.add_node("reflection", reflection_node)
workflow.set_entry_point("actor")
workflow.add_edge("actor", "evaluator")
workflow.add_conditional_edges("evaluator", should_continue, {"reflect": "reflection", "end": END})
workflow.add_edge("reflection", "actor") # Loop back for retry
reflexion_graph = workflow.compile()
Memory System for Reflection Agents
For production systems, persist reflections in a vector database for semantic retrieval:
"""
Persistent Memory System for Reflexion using Chroma
"""
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
class ReflectionMemory:
"""Persistent memory for storing and retrieving reflections."""
def __init__(self, persist_directory: str = "./reflexion_memory"):
self.embeddings = OpenAIEmbeddings()
self.vectorstore = Chroma(
collection_name="reflections",
embedding_function=self.embeddings,
persist_directory=persist_directory
)
def store_reflection(self, task: str, reflection: str, success: bool):
"""Store a reflection with metadata."""
doc = Document(
page_content=reflection,
metadata={"task": task, "success": success}
)
self.vectorstore.add_documents([doc])
def retrieve_relevant_reflections(self, task: str, k: int = 3) -> list:
"""Retrieve reflections similar to the current task."""
docs = self.vectorstore.similarity_search(task, k=k)
return [
{"reflection": doc.page_content, "task": doc.metadata.get("task")}
for doc in docs
]
Use Cases for Reflection Pattern
| Use Case | Why Reflection Helps | Expected Improvement |
|---|---|---|
| Code Generation | Catches bugs through self-review | +10-15% pass@1 |
| Creative Writing | Iterative quality improvement | Better coherence |
| Complex Reasoning | Validates logical chains | +20% accuracy |
| Test Generation | Improves coverage through reflection | 65% to 85% coverage |
Planning Pattern: Goal-Oriented Decomposition
Plan-and-execute agents separate planning from execution, achieving 92% task accuracy compared to 85% for ReAct patterns. This pattern is essential for complex multi-step workflows where understanding the full task structure upfront leads to better outcomes.
Planning Frameworks Comparison
| Framework | Architecture | Key Innovation |
|---|---|---|
| BabyAGI | Task Creation > Prioritization > Execution | Three-agent task loop |
| AutoGPT | Recursive goal decomposition | Self-prompted planning |
| LangGraph PlanAndExecute | Planner + Executor + Replanner | Adaptive replanning |
| Tree of Thoughts | Branch exploration + backtracking | 74% on Game of 24 |
| ReWOO | Plan-first, execute-all | 80% token reduction |
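The ReWOO row is worth unpacking: instead of alternating LLM calls and tool calls, the planner emits the entire plan up front with placeholder variables (`#E1`, `#E2`, ...) that later steps reference, so execution needs no further LLM round-trips. A minimal sketch with stub tools (the plan format and `run_rewoo_plan` helper are illustrative, not the paper's exact interface):

```python
import re

def run_rewoo_plan(plan: list, tools: dict) -> dict:
    """Execute a ReWOO-style plan. Each step is (evidence_var, tool_name,
    arg_template); arg templates may reference earlier evidence as #E1, #E2, ..."""
    evidence = {}
    for var, tool_name, arg_template in plan:
        # Substitute previously collected evidence into the argument
        arg = re.sub(r"#E\d+", lambda m: evidence[m.group(0)], arg_template)
        evidence[var] = tools[tool_name](arg)
    return evidence

# Stub tools standing in for real search / calculator calls
tools = {
    "Search": lambda q: "2161000" if "Paris" in q else "unknown",
    "Calc": lambda expr: str(round(eval(expr, {"__builtins__": {}}), 1)),
}

# One LLM call would produce this whole plan; executing it needs zero further calls
plan = [
    ("#E1", "Search", "population of Paris"),
    ("#E2", "Calc", "#E1 ** 0.5"),
]
print(run_rewoo_plan(plan, tools))
```

Because every tool runs against substituted evidence rather than a fresh reasoning trace, the token savings come directly from skipping the per-step Thought/Observation transcript.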
Tree of Thoughts Benchmark Results
Tree of Thoughts demonstrates the power of deliberate planning:
| Task | Chain-of-Thought | Tree of Thoughts | Improvement |
|---|---|---|---|
| Game of 24 | 4% success | 74% success | +70% |
| Creative Writing | 6.19 coherency | 7.67 coherency | +24% |
| Mini Crosswords | <2% solved | 20% solved | +18% |
Source: Yao et al. (2023) "Tree of Thoughts: Deliberate Problem Solving with Large Language Models" - NeurIPS 2023
Plan-and-Execute Implementation with LangGraph
"""
Plan-and-Execute Agent Implementation with LangGraph
Based on Plan-and-Solve paper and BabyAGI project
"""
from typing import TypedDict, List, Optional
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
from pydantic import BaseModel, Field
import json
# Pydantic models for structured output
class Task(BaseModel):
"""A single task in the plan."""
id: int = Field(description="Unique task identifier")
description: str = Field(description="What needs to be done")
dependencies: List[int] = Field(default=[], description="IDs of tasks this depends on")
status: str = Field(default="pending")
result: Optional[str] = Field(default=None)
class Plan(BaseModel):
"""A complete plan with tasks."""
goal: str = Field(description="The overall goal")
tasks: List[Task] = Field(description="Ordered list of tasks")
# State definition
class PlanExecuteState(TypedDict):
goal: str
plan: Optional[Plan]
current_task_idx: int
completed_results: List[str]
final_answer: Optional[str]
# Initialize models
planner_model = ChatOpenAI(model="gpt-4o", temperature=0) # Strong planner
executor_model = ChatOpenAI(model="gpt-4o-mini", temperature=0) # Efficient executor
# Planner Node
def planner_node(state: PlanExecuteState) -> dict:
"""Create a plan to achieve the goal."""
prompt = f"""Create a detailed step-by-step plan to achieve this goal.
Goal: {state["goal"]}
Requirements:
1. Break down into discrete, actionable tasks
2. Order tasks logically (dependencies first)
3. Each task should be independently executable
Return as JSON:
{{"goal": "the goal", "tasks": [{{"id": 1, "description": "task", "dependencies": []}}]}}"""
response = planner_model.invoke([HumanMessage(content=prompt)])
# Parse the plan
content = response.content
if "```json" in content:
content = content.split("```json")[1].split("```")[0]
plan_data = json.loads(content)
plan = Plan(**plan_data)
return {"plan": plan, "current_task_idx": 0}
# Executor Node
def executor_node(state: PlanExecuteState) -> dict:
"""Execute the current task."""
plan = state["plan"]
task_idx = state["current_task_idx"]
if task_idx >= len(plan.tasks):
return {}
current_task = plan.tasks[task_idx]
context = ""
if state["completed_results"]:
context = "\n\nPrevious results:\n"
for i, result in enumerate(state["completed_results"]):
context += f"Task {i+1}: {result[:200]}...\n"
prompt = f"""Execute this task:
Goal: {plan.goal}
Current Task: {current_task.description}
{context}
Provide a thorough result:"""
response = executor_model.invoke([HumanMessage(content=prompt)])
current_task.status = "completed"
current_task.result = response.content
updated_results = state["completed_results"] + [response.content]
return {
"plan": plan,
"current_task_idx": task_idx + 1,
"completed_results": updated_results
}
# Synthesizer Node
def synthesizer_node(state: PlanExecuteState) -> dict:
"""Synthesize final answer from all task results."""
plan = state["plan"]
prompt = f"""Synthesize these results into a final answer:
Goal: {plan.goal}
Task Results:
{json.dumps([{"task": t.description, "result": t.result} for t in plan.tasks], indent=2)}
Final Answer:"""
response = executor_model.invoke([HumanMessage(content=prompt)])
return {"final_answer": response.content}
# Routing logic
def should_continue_execution(state: PlanExecuteState) -> str:
if state["current_task_idx"] >= len(state["plan"].tasks):
return "synthesize"
return "execute"
# Build the graph
workflow = StateGraph(PlanExecuteState)
workflow.add_node("planner", planner_node)
workflow.add_node("executor", executor_node)
workflow.add_node("synthesizer", synthesizer_node)
workflow.set_entry_point("planner")
workflow.add_edge("planner", "executor")
workflow.add_conditional_edges("executor", should_continue_execution, {"execute": "executor", "synthesize": "synthesizer"})
workflow.add_edge("synthesizer", END)
plan_execute_graph = workflow.compile()
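As with any LangGraph state, `invoke` needs every field initialized: `executor_node` reads `completed_results` before anything has written it, so an empty list must be supplied up front. A sketch (the `make_plan_state` helper is illustrative):

```python
def make_plan_state(goal: str) -> dict:
    """Initial PlanExecuteState for one run of the plan-and-execute graph."""
    return {
        "goal": goal,
        "plan": None,             # filled in by planner_node
        "current_task_idx": 0,
        "completed_results": [],  # executor_node reads this on the first task
        "final_answer": None,
    }

state = make_plan_state("Compare the 2024 revenue of Apple and Microsoft")
# result = plan_execute_graph.invoke(state)   # requires a configured OPENAI_API_KEY
# print(result["final_answer"])
```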
ReAct vs Plan-and-Execute Comparison
| Metric | ReAct | Plan-and-Execute |
|---|---|---|
| Response Time | ~2-5s (faster) | ~5-15s |
| Token Usage | 2000-3000 | 3000-4500 |
| Task Accuracy | 85% | 92% |
| API Calls | 3-5 | 5-8 |
| Cost per Task | $0.06-0.09 | $0.09-0.14 |
Choose Plan-and-Execute when: Complex multi-step tasks with dependencies, high-accuracy requirements (financial analysis), long-term planning scenarios, tasks requiring strategic decision-making.
Choose ReAct when: Simple direct objectives, real-time interactive scenarios, cost-sensitive applications, quick responses needed.
Pattern Combinations in Production
Modern production systems rarely use a single pattern. Here is how major AI products combine patterns for optimal results.
Claude Code Architecture
Claude Code combines Reflection and Planning patterns:
- Plan Mode Check: For complex requests, forces planning first
- Planning Phase: Goal decomposition and task prioritization
- ReAct Execution: For each task - thought, action (LSP, terminal, files), observation
- Reflection Check: After key milestones, self-critique and approach updates
Perplexity AI Architecture (200M queries/day)
Perplexity uses ReAct + Multi-Agent:
- Query Analysis: Understand user intent
- Plan Generation: For Pro Search mode
- Retrieval Agent: Search stack execution
- Synthesis Agent: GPT-5/Claude 4.5 for answer generation
- Verification Agent: Citation checking and grounding
Pattern Selection Decision Tree
Use this logic to select the right pattern combination:
- Is the task simple and direct? YES: Use Direct Prompting (no agent needed)
- Does quality matter more than speed? YES: Add Reflection pattern
- Is the task genuinely complex with dependencies? YES: Consider Planning pattern
- Are multiple specialized skills needed? YES: Use Multi-Agent system
- Default: ReAct with appropriate tools
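The decision tree above maps directly onto a routing function. A sketch with boolean flags standing in for the classifier calls a real system would make (`select_pattern` is an illustrative helper, not a library API):

```python
def select_pattern(simple: bool, quality_critical: bool,
                   complex_dependencies: bool, multi_skill: bool) -> list:
    """Map the decision tree onto an ordered list of patterns to apply."""
    if simple:
        return ["direct_prompting"]    # no agent needed
    patterns = []
    if complex_dependencies:
        patterns.append("planning")
    if multi_skill:
        patterns.append("multi_agent")
    if not patterns:
        patterns.append("react")       # the default workhorse
    if quality_critical:
        patterns.append("reflection")  # layered on top for quality
    return patterns

# "Summarize this paragraph" -> direct prompting
print(select_pattern(True, False, False, False))   # ['direct_prompting']
# "Refactor the billing module and verify the result" -> planning + reflection
print(select_pattern(False, True, True, False))    # ['planning', 'reflection']
```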
Combined Pattern Implementation
Here is an adaptive agent that selects patterns based on task complexity:
"""
Combined Pattern Agent: Adaptive Pattern Selection
"""
from typing import TypedDict, List, Optional, Literal
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
class CombinedState(TypedDict):
query: str
complexity: Literal["simple", "medium", "complex"]
plan: Optional[List[str]]
current_step: int
react_history: List[dict]
reflections: List[str]
final_answer: Optional[str]
# Complexity classifier
def classify_complexity(state: CombinedState) -> dict:
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
prompt = f"""Classify complexity of this task:
Task: {state["query"]}
Levels:
- SIMPLE: Direct answer, no tools, single step
- MEDIUM: Requires tools, 2-3 steps
- COMPLEX: Multi-step, dependencies, requires planning
Respond with one word: SIMPLE, MEDIUM, or COMPLEX"""
response = model.invoke([{"role": "user", "content": prompt}])
complexity = response.content.strip().upper()
return {"complexity": complexity.lower() if complexity in ["SIMPLE", "MEDIUM", "COMPLEX"] else "medium"}
# Route based on complexity
def route_by_complexity(state: CombinedState) -> str:
if state["complexity"] == "simple":
return "direct"
elif state["complexity"] == "complex":
return "plan"
else:
return "react"
Production Deployment Guide
Error Handling Best Practices
"""
Production-grade error handling for AI agents
"""
from tenacity import retry, stop_after_attempt, wait_exponential
import logging
logger = logging.getLogger(__name__)
class AgentError(Exception):
"""Base exception for agent errors."""
pass
class MaxIterationsError(AgentError):
"""Agent exceeded maximum iterations."""
pass
class ReasoningLoopError(AgentError):
"""Agent stuck in reasoning loop."""
pass
@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=2, max=10))
def safe_tool_call(tool, args):
"""Execute tool with retry logic."""
try:
return tool.invoke(args)
except Exception as e:
logger.warning(f"Tool {tool.name} failed: {e}")
raise
def detect_reasoning_loop(history: list, window: int = 3) -> bool:
"""Detect if agent is stuck repeating the same actions."""
if len(history) < window * 2:
return False
recent = history[-window:]
previous = history[-window*2:-window]
recent_actions = [h.get("action") for h in recent]
previous_actions = [h.get("action") for h in previous]
return recent_actions == previous_actions
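A quick check of the detector's behavior, with the function repeated here so the snippet stands alone: six identical actions trip it, while varied actions and short histories do not.

```python
def detect_reasoning_loop(history: list, window: int = 3) -> bool:
    """True when the last `window` actions exactly repeat the `window` before them."""
    if len(history) < window * 2:
        return False
    recent = [h.get("action") for h in history[-window:]]
    previous = [h.get("action") for h in history[-window * 2:-window]]
    return recent == previous

stuck = [{"action": "search('population of Paris')"}] * 6
making_progress = [{"action": f"search('step {i}')"} for i in range(6)]

print(detect_reasoning_loop(stuck))            # True: same action six times
print(detect_reasoning_loop(making_progress))  # False: actions keep changing
print(detect_reasoning_loop(stuck[:4]))        # False: not enough history yet
```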
Observability Setup
"""
Structured tracing for AI agents
"""
from dataclasses import dataclass, asdict
from datetime import datetime
import json
@dataclass
class AgentTrace:
trace_id: str
timestamp: str
pattern: str # "react", "reflection", "planning"
input_query: str
steps: list
tools_called: list
tokens_used: int
latency_ms: int
success: bool
error: str = None
def create_trace(trace_id, pattern, query, steps, tools, tokens, latency, success, error=None):
trace = AgentTrace(
trace_id=trace_id,
timestamp=datetime.utcnow().isoformat(),
pattern=pattern,
input_query=query,
steps=steps,
tools_called=tools,
tokens_used=tokens,
latency_ms=latency,
success=success,
error=error
)
return asdict(trace)
Cost and Performance Comparison
| Pattern | Avg Latency | Token Overhead | Cost/Query |
|---|---|---|---|
| Direct Prompting | 1-2s | Baseline | $0.01-0.02 |
| ReAct (3 steps) | 5-10s | +200-300% | $0.06-0.09 |
| Reflection (2 iter) | 8-15s | +100-200% | $0.08-0.12 |
| Plan-and-Execute | 10-20s | +300-400% | $0.12-0.18 |
| Combined (all) | 15-30s | +500-600% | $0.15-0.25 |
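These per-query figures compound quickly at scale. A back-of-envelope projection using the upper bounds from the table above (the query volume is an illustrative assumption):

```python
# Upper-bound cost per query from the table above, in USD
cost_per_query = {
    "direct": 0.02,
    "react": 0.09,
    "reflection": 0.12,
    "plan_and_execute": 0.18,
    "combined": 0.25,
}

queries_per_day = 10_000  # illustrative volume

for pattern, cost in cost_per_query.items():
    monthly = cost * queries_per_day * 30
    print(f"{pattern:>18}: ${monthly:,.0f}/month")
```

At 10,000 queries a day, the gap between direct prompting and combined patterns is tens of thousands of dollars a month, which is why routing by complexity matters.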
Framework Recommendations (January 2026)
| Use Case | Recommended Framework | Why |
|---|---|---|
| General agents | LangGraph | Flexible, production-ready |
| OpenAI-native | OpenAI Agents SDK | Best GPT integration |
| Anthropic-native | Claude Agent SDK | MCP support, tool use |
| Multi-agent | CrewAI or AutoGen | Role-based collaboration |
| Simple prototyping | LangChain AgentExecutor | Quick start |
Getting Started: Your Implementation Roadmap
Step-by-Step Implementation Checklist
- Start with ReAct: Implement a basic ReAct agent with 2-3 tools. This handles 80% of use cases and builds foundational understanding.
- Add Observability: Implement structured tracing from day one. You cannot debug what you cannot observe.
- Implement Loop Detection: Add maximum iteration limits and action pattern detection to prevent infinite loops.
- Add Reflection for Quality: Once ReAct works, add a reflection step for code generation or high-stakes outputs.
- Implement Planning for Complexity: For multi-step workflows, add plan-and-execute with adaptive replanning.
- Combine Patterns: Use complexity classification to route requests to appropriate pattern combinations.
Common Pitfalls and Solutions
| Pitfall | Solution |
|---|---|
| Agent loops forever | Set max_iterations, implement loop detection, add circuit breakers |
| Tool errors crash agent | Use retry with exponential backoff, handle_parsing_errors=True |
| Costs spiral out of control | Route by complexity, use smaller models for execution |
| Reflection does not improve quality | Use different models for actor/evaluator, be specific in evaluation prompts |
| Plans are too vague | Use stronger model for planning (GPT-4o), require specific task descriptions |
Required Dependencies
```shell
# Core frameworks
pip install langchain langchain-openai langgraph

# Vector database for memory
pip install chromadb

# Error handling
pip install tenacity

# Optional: alternative frameworks
pip install crewai pyautogen openai-agents
```
Next Steps
Ready to implement these patterns in your own projects? Here is what to do next:
- Clone the code examples from the LangGraph tutorials: https://langchain-ai.github.io/langgraph/tutorials/
- Read the original papers for deeper understanding of the theory
- Subscribe to our YouTube channel for video walkthroughs of these implementations
Frequently Asked Questions
What is the ReAct pattern in AI agents?
ReAct (Reasoning and Acting) is an agent pattern that interleaves reasoning traces with action execution in a Thought-Action-Observation loop. The agent thinks about what to do, takes an action using external tools, observes the result, and uses that information for the next reasoning step. ReAct achieves 47.8% accuracy on HotpotQA multi-hop QA tasks versus 29.4% baseline.
How does the Reflection pattern improve AI agent performance?
The Reflection pattern enables agents to critique and improve their own outputs through self-evaluation. After generating an initial response, the agent assesses it for accuracy, identifies gaps, and iteratively refines the output. Reflexion agents achieve 91% pass@1 on HumanEval coding benchmarks versus GPT-4's baseline of 80%, without any fine-tuning.
When should I use Plan-and-Execute instead of ReAct?
Use Plan-and-Execute for complex multi-step tasks with dependencies, high-accuracy requirements, and long-term planning scenarios. It achieves 92% task accuracy versus 85% for ReAct but costs 2x more in API calls. ReAct is better for simple objectives requiring quick responses and real-time interactive scenarios.
What frameworks support AI agent patterns in 2026?
LangGraph (LangChain) is the leading framework with native support for ReAct, Reflection, and Plan-and-Execute patterns. OpenAI Agents SDK 0.6.x, Claude Agent SDK 1.x, CrewAI, and AutoGen also provide production-ready implementations. LangGraph offers the most flexible graph-based architecture for custom agent workflows.
How do production AI systems combine these patterns?
Real production systems combine 2-3 patterns for optimal results. Perplexity uses ReAct plus Multi-Agent architecture for search with separate retrieval, synthesis, and verification agents. Claude Code uses Reflection plus Planning with a plan mode that forces architectural thinking before execution. GitHub Copilot Chat uses ReAct with RAG for multi-file code edits.
What benchmarks prove these patterns work?
Verified benchmarks from academic papers show: ReAct achieves 47.8% on HotpotQA (vs 29.4% baseline), Reflexion reaches 91% on HumanEval (vs GPT-4's 80%), Tree of Thoughts solves 74% of Game of 24 puzzles (vs 4% for chain-of-thought), and Plan-and-Execute achieves 92% task accuracy (vs 85% for ReAct alone).
How do I detect and prevent reasoning loops in AI agents?
Implement loop detection by tracking the last N actions in a sliding window and comparing for repeated patterns. Set maximum iteration limits (typically 10-15 iterations). Use exponential backoff with retry logic for tool failures. Monitor token usage and implement circuit breakers that fall back to simpler patterns when agents exceed thresholds.
What is the cost difference between AI agent patterns?
Direct prompting costs approximately $0.01-0.02 per query. ReAct with 3 steps costs $0.06-0.09 with 200-300% token overhead. Reflection with 2 iterations costs $0.08-0.12. Plan-and-Execute costs $0.12-0.18 with 300-400% token overhead. Combined patterns can cost $0.15-0.25 per query with 500-600% token overhead.
Which pattern should I start with for my first AI agent?
Start with the ReAct pattern for your first AI agent. It is the most widely understood, battle-tested, and suitable for 80% of use cases. ReAct provides a good balance of capability and simplicity. Once you have ReAct working, add Reflection for quality-critical tasks like code generation, or Planning for complex multi-step workflows.
How do I implement memory persistence for Reflection agents?
Use a vector database like Chroma or Pinecone to store reflections with embeddings for semantic similarity search. Store metadata including the original task, success/failure status, and timestamp. Retrieve relevant reflections by similarity to the current task, filtering by outcome type. This enables agents to learn from past failures without weight updates.
Conclusion
The three AI agent design patterns covered in this guide - ReAct, Reflection, and Planning - represent the architectural foundation of every major AI coding assistant, search engine, and automation tool shipping in 2026.
ReAct grounds reasoning in tool observations, achieving 47.8% accuracy on multi-hop QA versus 29.4% baseline. Reflection enables self-improvement that reaches 91% on HumanEval, surpassing GPT-4's 80%. Planning unlocks complex problem-solving, with Tree of Thoughts achieving 74% on puzzles that chain-of-thought solves only 4% of the time.
Production systems like Claude Code, Perplexity, and GitHub Copilot combine these patterns for optimal results. The key is not choosing a single pattern, but understanding when to apply each and how to combine them effectively.
Start with ReAct for your first agent. Add Reflection for quality-critical outputs. Use Planning for complex multi-step workflows. Implement observability from day one. Your production AI agents will thank you.