Appendix

Chaining Prompts

Single prompts have limits. Prompt chaining breaks complex tasks into a pipeline of focused steps — each prompt's output feeding the next as input. This is how production AI systems actually work.


Why Single Prompts Have Limits

A single prompt works well when a task can be completed in one focused step. But complex real-world tasks often involve multiple phases: researching, synthesizing, deciding, formatting, and validating. Cramming all of this into one prompt creates several problems:

📏
Context Window Limits
A single prompt can't process more text than fits in Claude's context window. A chain can handle documents of any size by breaking them into chunks across sequential steps.
🎯
Focus Dilution
Asking Claude to do five things in one prompt reduces performance on each. A chain of five focused prompts — each doing one thing well — consistently outperforms one sprawling prompt.
🔍
No Intermediate Validation
A single prompt fails silently. In a chain, you can inspect and validate each intermediate output — catching errors before they propagate and corrupt downstream steps.
🔄
No Conditional Logic
Single prompts can't branch. Chains can: inspect the output of Step 1, then decide whether to run Step 2A or Step 2B based on what Claude found.
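The context-window point above can be made concrete with a minimal sketch of chunked processing. This is an illustrative assumption, not a fixed recipe: the chunk size, prompts, and the `llm` callable (standing in for any Claude call, like the `call_claude` helper used later in this appendix) are all placeholders.

```python
from typing import Callable

def summarize_long_document(
    text: str,
    llm: Callable[[str, str], str],  # (system, user) -> response text
    chunk_size: int = 8000,
) -> str:
    """Summarize a document of any length by chaining chunk summaries."""
    # Step 1: split the document into chunks that fit comfortably in context
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    # Step 2: summarize each chunk independently
    partials = [
        llm(
            "You summarize document excerpts concisely.",
            f"Summarize the key points of this excerpt:\n\n{chunk}",
        )
        for chunk in chunks
    ]

    # Step 3: merge the partial summaries into one final summary
    return llm(
        "You merge partial summaries into one coherent summary.",
        "Combine these partial summaries into a single summary:\n\n"
        + "\n\n".join(partials),
    )
```

Passing the model call in as a parameter also makes the chain testable: you can run the pipeline logic against a stub before spending tokens on the real API.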

The Pipeline Metaphor

Think of prompt chaining as a data pipeline. Each stage transforms the data and passes it downstream:

Pipeline Visualization
Input Document
      │
      ▼
[STEP 1: Extract]  →  "What are the key entities and facts?"
      │                Output: structured JSON with extracted data
      ▼
[STEP 2: Analyze]  →  "Analyze these facts. What are the implications?"
      │                Output: prose analysis
      ▼
[STEP 3: Format]   →  "Format this analysis as an executive brief"
      │                Output: formatted document
      ▼
[STEP 4: Validate] →  "Check: does this brief match the original facts?"
      │                Output: validation report or approved flag
      ▼
Final Output

Types of Chains

🔗
Refinement Chains
Generate → Critique → Improve. Use when quality matters more than speed. Each step makes the previous output better: draft → editorial feedback → revised draft.
🔬
Analysis Chains
Extract → Analyze → Synthesize. For processing complex information sources. Break analysis into: what does it say? → what does it mean? → what should we do?
🌿
Decision Chains
Classify → Branch. Step 1 categorizes the input; subsequent steps handle each category differently. Essential for building routing logic in AI applications.
⚡
Parallel Chains
Run multiple independent prompts simultaneously using async/threading. Collect all results and pass to a synthesis step. Dramatically reduces latency for multi-angle analysis.
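A decision chain from the list above can be sketched in a few lines. The categories, prompts, and `llm` callable here are hypothetical; the point is the shape — one cheap classification step, then ordinary Python branching to route the input.

```python
from typing import Callable

def route_support_ticket(ticket: str, llm: Callable[[str, str], str]) -> str:
    """Classify a ticket, then branch to a category-specific prompt."""
    # Step 1: classify into one of a small fixed set of categories
    category = llm(
        "You classify support tickets.",
        "Classify this ticket as exactly one word: "
        f"billing, technical, or other.\n\n{ticket}",
    ).strip().lower()

    # Step 2: branch on the classification
    if "billing" in category:
        return llm("You are a billing specialist.",
                   f"Draft a reply to this billing question:\n\n{ticket}")
    elif "technical" in category:
        return llm("You are a support engineer.",
                   f"Draft troubleshooting steps for:\n\n{ticket}")
    else:
        return llm("You are a support generalist.",
                   f"Draft a polite reply to:\n\n{ticket}")
```

Matching on a substring (`"billing" in category`) rather than exact equality is a small robustness choice: it tolerates a model that answers "billing." or "Category: billing".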

Sequential Chain: Research → Summarize → Format

Research Pipeline (Python)
import anthropic

client = anthropic.Anthropic()

def call_claude(system: str, user: str, max_tokens: int = 1024) -> str:
    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=max_tokens,
        system=system,
        messages=[{"role": "user", "content": user}]
    )
    return response.content[0].text

def research_pipeline(topic: str, raw_sources: list[str]) -> str:
    """3-step chain: extract key info → analyze → produce executive brief."""

    # STEP 1: Extract key facts from each source
    extracted_facts = []
    for i, source in enumerate(raw_sources):
        facts = call_claude(
            system="You are a research analyst. Extract key facts, dates, and statistics from documents.",
            user=f"Extract all key facts from this source. Return as a JSON array of strings.\n\n<source id='{i+1}'>\n{source}\n</source>"
        )
        extracted_facts.append(facts)

    facts_combined = "\n".join([f"Source {i+1}: {f}" for i, f in enumerate(extracted_facts)])

    # STEP 2: Analyze and synthesize
    analysis = call_claude(
        system="You are a strategic analyst. Synthesize information from multiple sources into coherent insights.",
        user=f"""Analyze these extracted facts about '{topic}'.
Identify: key themes, conflicting information, important gaps.

<facts>
{facts_combined}
</facts>

Return: 3-5 key insights in prose.""",
        max_tokens=2048
    )

    # STEP 3: Format as executive brief
    brief = call_claude(
        system="You are a senior business writer specializing in executive communications.",
        user=f"""Format the following analysis as a one-page executive brief about '{topic}'.

Structure: Background (2 sentences) → Key Findings (3 bullet points) → Implications (2 sentences) → Recommended Actions (3 bullet points)

<analysis>
{analysis}
</analysis>""",
        max_tokens=1024
    )

    return brief

# Usage
brief = research_pipeline("AI adoption in healthcare", [doc1, doc2, doc3])
print(brief)

Document Extract → Validate → Output Chain

Extraction with Validation
import json

def extract_and_validate(contract_text: str) -> dict:
    """Extract contract data and validate the extraction before returning."""

    # Step 1: Extract structured data
    extraction_prompt = f"""Extract the following fields from this contract.
Return valid JSON only — no explanation.

Schema:
{{
  "parties": ["list of party names"],
  "effective_date": "YYYY-MM-DD or null",
  "termination_date": "YYYY-MM-DD or null",
  "payment_terms_days": number or null,
  "liability_cap": number or null,
  "governing_law": "state/jurisdiction or null"
}}

<contract>
{contract_text}
</contract>"""

    raw_extraction = call_claude("You extract structured data from legal documents.", extraction_prompt)

    try:
        extracted = json.loads(raw_extraction)
    except json.JSONDecodeError:
        return {"error": "extraction_failed", "raw": raw_extraction}

    # Step 2: Validate the extraction
    validation_prompt = f"""Review this data extraction from a contract.
Verify: (1) are the field values consistent with the contract text?
(2) are any fields incorrectly null that should have values?
(3) are any field values wrong?

Return JSON: {{"valid": true/false, "issues": ["list of issues found"], "corrections": {{}}}}

<contract>{contract_text[:2000]}</contract>
<extraction>{json.dumps(extracted)}</extraction>"""

    validation = call_claude("You validate data extractions for accuracy.", validation_prompt)

    try:
        val_data = json.loads(validation)
    except json.JSONDecodeError:
        # Don't discard a good extraction just because validation didn't parse
        return {"data": extracted, "validation": {"valid": None, "raw": validation}}

    if not val_data.get("valid") and val_data.get("corrections"):
        extracted.update(val_data["corrections"])

    return {"data": extracted, "validation": val_data}

Parallel Chains: Running Multiple Prompts Simultaneously

Parallel Analysis Chain (asyncio)
import asyncio
import anthropic

async_client = anthropic.AsyncAnthropic()

async def analyze_from_perspective(topic: str, perspective: str) -> dict:
    response = await async_client.messages.create(
        model="claude-opus-4-5",
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": f"Analyze '{topic}' from the perspective of {perspective}. 3 key points."
        }]
    )
    return {"perspective": perspective, "analysis": response.content[0].text}

async def multi_perspective_analysis(topic: str) -> str:
    # Run all perspectives in parallel
    perspectives = ["a CFO", "an operations manager", "a frontline employee", "a customer"]
    tasks = [analyze_from_perspective(topic, p) for p in perspectives]
    results = await asyncio.gather(*tasks)

    # Synthesize all perspectives into one coherent view
    combined = "\n".join([f"{r['perspective']}: {r['analysis']}" for r in results])
    synthesis_response = await async_client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"Synthesize these four perspectives on '{topic}' into a balanced summary.\n\n{combined}"
        }]
    )
    return synthesis_response.content[0].text

# Usage
result = asyncio.run(multi_perspective_analysis("moving to a 4-day work week"))

State Management Between Prompts

Each prompt in a chain receives only what you explicitly pass to it — there's no shared memory between API calls. You must actively manage state:

📦
Pass Full Context
The simplest approach: pass the full accumulated context to each step. Works for short chains with small outputs, but becomes expensive and hits context limits as chains grow.
🗜️
Compress Between Steps
Add a compression step between long steps: "Summarize the key findings from this analysis in 200 words." Pass the summary rather than the full output to the next step.
📊
Structured State Object
Use a JSON state object that accumulates outputs. Each step adds its results to the state object. The final step receives the full structured state and produces the final output.
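The structured-state pattern above can be sketched as follows. The step names and prompts are illustrative assumptions; what matters is that every step reads from and writes to one explicit state object, so downstream steps see exactly what was recorded and nothing else.

```python
from typing import Callable

def run_chain(document: str, llm: Callable[[str, str], str]) -> dict:
    """Accumulate each step's output in a structured state object."""
    state: dict = {"input": document, "steps": {}}

    # Step 1: extract — record the output under its own key
    state["steps"]["extract"] = llm(
        "You extract key facts.",
        f"List the key facts:\n\n{state['input']}",
    )

    # Step 2: analyze — reads only the extracted facts, not the raw input
    state["steps"]["analyze"] = llm(
        "You analyze facts.",
        f"Analyze these facts:\n\n{state['steps']['extract']}",
    )

    # Final step receives the full structured state
    state["output"] = llm(
        "You write briefs.",
        f"Write a brief from this chain state:\n\n{state['steps']}",
    )
    return state
```

A side benefit: the state object doubles as an audit log. If the final output looks wrong, every intermediate result is sitting in `state["steps"]` for inspection.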

Best Practices for Chain Design

1
Atomic prompts: one job per step
Each step in the chain should do exactly one thing well. If you find yourself writing "and then also..." in a step's instructions, split it into two steps.
2
Design for structured handoffs
Each step's output should be in a format that makes it easy to inject into the next step. Prefer JSON or XML for intermediate outputs — they're easy to pass as template variables.
3
Add validation steps for high-stakes chains
After any step that makes a critical decision (classification, extraction, analysis), add a validation step that checks the output for correctness before passing it downstream.
4
Handle errors gracefully
Wrap each step in try/except. Never let a parsing error in Step 2 crash the whole pipeline. Log errors, attempt recovery, or fail gracefully with informative error messages.
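Practice 4 can be sketched as a reusable wrapper. The retry count, reminder phrasing, and the shape of the fallback dict are illustrative choices, not a fixed API: the point is that a parsing failure produces a structured error for downstream code instead of an unhandled exception.

```python
import json
from typing import Callable

def safe_json_step(
    llm: Callable[[str, str], str],
    system: str,
    user: str,
    retries: int = 2,
) -> dict:
    """Run one chain step expecting JSON; retry on parse failure."""
    last_raw = ""
    for _attempt in range(retries + 1):
        last_raw = llm(system, user)
        try:
            return {"ok": True, "data": json.loads(last_raw)}
        except json.JSONDecodeError:
            # Ask again, reminding the model of the format requirement
            user = f"Return ONLY valid JSON, no prose.\n\n{user}"
    # Fail gracefully: callers get a structured error, not an exception
    return {"ok": False, "error": "invalid_json", "raw": last_raw}
```

Callers then branch on `result["ok"]` rather than wrapping every step in their own try/except, which keeps the pipeline code itself linear and readable.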
Appendix A Takeaway
Prompt chaining is the foundational pattern for production AI systems. Break complex tasks into atomic, focused steps. Use structured (JSON/XML) outputs for clean handoffs between steps. Add validation steps for high-stakes decisions. Use parallel chains with asyncio when steps are independent, to minimize total latency. When single prompts fail, the answer is usually "break it into a chain."