LangGraph Tutorial (Python): debugging agent loops for beginners

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to debug a LangGraph agent that gets stuck in loops, repeats the same tool call, or never reaches a final answer. You need this when your graph “works” in the sense that it runs, but the agent keeps spinning and you can’t tell which node or state update is causing it.

What You'll Need

  • Python 3.10+
  • langgraph
  • langchain-openai
  • langchain-core
  • An OpenAI API key in OPENAI_API_KEY
  • Basic familiarity with LangGraph nodes, edges, and state
  • A terminal and a place to run Python scripts

Install the packages:

pip install langgraph langchain-openai langchain-core

Step-by-Step

  1. Start with a minimal agent loop that can fail in a visible way.
    The point is not to build a perfect agent first; it’s to reproduce the loop so you can inspect state transitions.
import os
from typing import Annotated, Literal, TypedDict

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

class State(TypedDict):
    messages: Annotated[list, add_messages]
    step_count: int

def agent_node(state: State):
    response = llm.invoke(state["messages"])
    return {"messages": [response], "step_count": state["step_count"] + 1}

def route(state: State) -> Literal["agent", "end"]:
    last = state["messages"][-1]
    if isinstance(last, AIMessage) and "done" in last.content.lower():
        return "end"
    if state["step_count"] >= 5:
        return "end"
    return "agent"
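If you run this shape as-is, it often loops until the step cap, because nothing forces the model to say "done". You can reproduce the dynamics without any LangGraph or OpenAI dependency; in this sketch, `FakeMessage` and `fake_model` are stand-ins invented for illustration, not LangGraph or LangChain APIs:

```python
# Dependency-free simulation of the loop above. FakeMessage and
# fake_model are illustrative stand-ins, not real library APIs.
class FakeMessage:
    def __init__(self, content):
        self.content = content

def fake_model(messages):
    # A model that never emits a stop phrase -- the worst case for
    # free-text termination checks.
    return FakeMessage("Here is something useful.")

state = {"messages": [FakeMessage("Tell me something useful.")], "step_count": 0}

while True:
    response = fake_model(state["messages"])
    state["messages"].append(response)
    state["step_count"] += 1
    last = state["messages"][-1]
    if "done" in last.content.lower():  # natural stop: never fires here
        break
    if state["step_count"] >= 5:  # hard cap: the only thing that exits
        break

print(state["step_count"])  # reaches the cap: 5
```

The takeaway: when the natural stop condition can never fire, the step cap is the only thing standing between you and an infinite loop.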
  2. Add explicit debug logging inside your node and router.
    When an agent loops, you want to see what state came in, what came out, and why the router picked the next edge.
def debug_agent_node(state: State):
    print("\n--- AGENT NODE ---")
    print("step_count:", state["step_count"])
    print("last_message:", state["messages"][-1].content)

    response = llm.invoke(state["messages"])
    print("model_output:", response.content)

    return {"messages": [response], "step_count": state["step_count"] + 1}

def debug_route(state: State) -> Literal["agent", "end"]:
    last = state["messages"][-1]
    print("\n--- ROUTER ---")
    print("step_count:", state["step_count"])
    print("last_type:", type(last).__name__)
    print("last_content:", getattr(last, "content", ""))

    if isinstance(last, AIMessage) and "done" in last.content.lower():
        return "end"
    if state["step_count"] >= 5:
        return "end"
    return "agent"
  3. Build the graph with a hard stop so loops cannot run forever while you debug.
    A max step counter is boring, but it saves time when the model keeps bouncing between the same states.
graph = StateGraph(State)
graph.add_node("agent", debug_agent_node)
graph.set_entry_point("agent")
graph.add_conditional_edges("agent", debug_route, {"agent": "agent", "end": END})

app = graph.compile()

initial_state = {
    "messages": [HumanMessage(content="Tell me something useful.")],
    "step_count": 0,
}
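The step counter is an application-level guard; LangGraph also enforces its own per-run recursion limit (default 25 super-steps) and raises `GraphRecursionError` when it is exceeded. Lowering it while debugging makes runaway loops fail fast instead of burning tokens. A sketch, assuming the `app` and `initial_state` defined above:

```python
# Sketch: lower LangGraph's built-in recursion limit so a runaway loop
# raises quickly instead of running to the default cap of 25.
from langgraph.errors import GraphRecursionError

try:
    result = app.invoke(initial_state, config={"recursion_limit": 10})
except GraphRecursionError:
    print("Hit the recursion limit -- the graph never terminated on its own.")
```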
  4. Run the graph once and inspect the printed trace.
    If the agent loops, you should now know whether it’s because the model never emits a stop signal or because your router never recognizes one.
result = app.invoke(initial_state)

print("\n=== FINAL RESULT ===")
print(result["step_count"])
for msg in result["messages"]:
    role = type(msg).__name__
    content = getattr(msg, "content", "")
    print(f"{role}: {content}")
  5. Fix common loop causes by making termination deterministic.
    In production systems, do not rely on vague phrases like “I’m done” unless you control them tightly; use structured outputs or explicit flags.
from pydantic import BaseModel, Field

class AgentOutput(BaseModel):
    answer: str = Field(...)
    done: bool = Field(...)

structured_llm = llm.with_structured_output(AgentOutput)

def structured_agent_node(state: State):
    output = structured_llm.invoke(state["messages"])
    content = f'{output.answer}\nDONE={output.done}'
    return {
        "messages": [AIMessage(content=content)],
        "step_count": state["step_count"] + 1,
    }

def structured_route(state: State) -> Literal["agent", "end"]:
    last = state["messages"][-1].content
    if "DONE=True" in last:
        return "end"
    if state["step_count"] >= 5:
        return "end"
    return "agent"
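Before wiring the structured node into your graph, you can sanity-check the routing logic on its own. This dependency-free version drops the type hints and uses `types.SimpleNamespace` as a stand-in for message objects, so it runs without langgraph installed:

```python
from types import SimpleNamespace

def structured_route(state):
    # Same logic as structured_route above, minus the State type hint,
    # so it runs without langgraph installed.
    last = state["messages"][-1].content
    if "DONE=True" in last:
        return "end"
    if state["step_count"] >= 5:
        return "end"
    return "agent"

# Natural termination: the structured flag fires on this turn.
done_state = {"messages": [SimpleNamespace(content="42\nDONE=True")], "step_count": 1}
# Safety-net termination: the flag never fires, so the step cap exits.
capped_state = {"messages": [SimpleNamespace(content="42\nDONE=False")], "step_count": 5}
# Normal continuation.
running_state = {"messages": [SimpleNamespace(content="42\nDONE=False")], "step_count": 2}

print(structured_route(done_state))     # end
print(structured_route(capped_state))   # end
print(structured_route(running_state))  # agent
```

Testing the router as a plain function like this is much faster than rerunning the whole graph for every tweak.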

Testing It

Run the script and watch the console output for each node execution. You should see exactly which message entered the agent node, what the model returned, and why the router chose agent or end.

If it still loops, check two things first: whether your node is returning new messages into state correctly, and whether your router is inspecting the latest message rather than an older one. Most beginner loops come from stale routing logic or from appending messages without changing any termination condition.
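The first check matters because of how the `add_messages` reducer merges node output into state: the returned `messages` list is appended to the existing one, not substituted for it. A simplified stand-in (not the real implementation, which also matches messages by ID) shows the append semantics:

```python
# Simplified sketch of append-style reducer semantics. The real
# add_messages reducer also deduplicates by message ID; this stand-in
# only shows why returning {"messages": [response]} from a node appends
# to state instead of overwriting it.
def add_messages_stub(existing, update):
    return existing + update

state_messages = ["human: hi"]
node_return = {"messages": ["ai: hello"]}

state_messages = add_messages_stub(state_messages, node_return["messages"])
print(state_messages)  # ['human: hi', 'ai: hello'] -- appended, not replaced
```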

A good test is to force a known stop condition in your prompt or structured schema and confirm that the graph exits on that turn. If it only stops when step_count hits the limit, your natural termination signal is broken.

Next Steps

  • Add LangGraph checkpoints so you can resume from a failing step instead of rerunning from scratch.
  • Learn how to inspect intermediate state with custom callbacks and tracing.
  • Replace string-based stop checks with typed tool calls or structured outputs for production agents.
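For the checkpointing bullet, a minimal sketch using LangGraph's built-in `MemorySaver`, assuming the `graph` and `initial_state` from the steps above (the `thread_id` value is arbitrary):

```python
# Sketch: compile with an in-memory checkpointer so state is saved at
# every super-step and can be inspected or resumed per thread.
from langgraph.checkpoint.memory import MemorySaver

app = graph.compile(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "debug-1"}}

result = app.invoke(initial_state, config=config)

# Inspect the latest checkpoint for this thread.
snapshot = app.get_state(config)
print(snapshot.values["step_count"])
```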

By Cyprian Aarons, AI Consultant at Topiax.