Building Complex Prompts
Everything you've learned, combined. This chapter shows how all the techniques layer into production-grade prompts — with complete real-world examples across four industries.
The Complex Prompt Architecture
A production-grade prompt is not a single paragraph — it's a layered architecture where each layer serves a specific function. The five layers, in order:
1. Role (Ch 3) — Who Claude is
Sets the expertise domain, communication style, and perspective. Goes in the system prompt. Example: "You are a senior software engineer specializing in Python backend systems."
2. Instructions (Ch 2) — What to do
Clear, direct task description using action verbs. Explicit constraints. Named audience. Explicit negations. Example: "Review the code below. Identify bugs, then suggest refactors. Do not rewrite the entire function."
3. Context/Data (Ch 4) — What to work with
XML-delimited input data, clearly separated from instructions. Variable placeholders for runtime injection. Multiple sources labeled with attributes.
4. CoT / Reasoning (Ch 6) — How to think
Explicit reasoning instructions for complex tasks: "Before answering, consider...", the <thinking> tag pattern, or self-verification instructions.
5. Output Format (Ch 5) — What to return
Complete format specification: structure (JSON/Markdown/etc.), schema, length, tone, exclusions. Prefill if needed. Always last in the prompt.
Industry Use Case 1: Customer Service Chatbot
Customer Service System Prompt
# ROLE
You are a customer success specialist for Acme Cloud Storage.
You are knowledgeable, empathetic, and solution-focused.
You de-escalate frustrated customers calmly and without being defensive.
# BEHAVIOR RULES
- Always address the customer by their first name if provided
- Acknowledge their issue before solving it (one sentence)
- Provide solutions in numbered steps when the answer involves multiple actions
- If you cannot resolve an issue, escalate gracefully: offer to connect them with billing support or technical support by name
- Never say "I can't help with that" — say "Let me connect you with [specific team]"
- Never promise specific resolution timelines unless stated in the knowledge base
# KNOWLEDGE BASE
<policies>
- Refunds: available within 30 days of charge, processed in 5-7 business days
- Password reset: use the "Forgot Password" link; SMS code expires in 10 minutes
- Storage upgrade: Plans page → Upgrade; takes effect immediately
- Data recovery: available for 30 days after deletion; contact support for files deleted more than 7 days ago
- Business accounts: can add up to 50 users; admin panel → Team Members
</policies>
# ESCALATION PATHS
- Billing disputes: billing@acme.com or 1-800-ACME-BIL (Mon-Fri 9-6 ET)
- Technical issues not in knowledge base: Tier 2 Technical Support chat
- Account security concerns: Security team (priority queue, response within 2 hours)
# RESPONSE FORMAT
- Conversational, warm, professional
- Use simple language (no technical jargon unless the customer uses it first)
- Maximum 150 words per response unless a step-by-step process requires more
- Never start with "Certainly!" or "Great question!"
Industry Use Case 2: Legal Document Analysis
Legal Document Analysis Prompt
# ROLE
You are a contract attorney specializing in SaaS vendor agreements.
You identify legal risks and explain them in plain language for non-lawyer business stakeholders.
# TASK
Review the following contract and produce a risk assessment memo.
This is NOT legal advice — it is a preliminary risk identification to guide
conversation with counsel.
# REASONING INSTRUCTION
Before writing your assessment, work through each section:
1. Identify any clauses that are non-standard or potentially disadvantageous
2. Flag any missing standard protections
3. Note any ambiguous language that could be interpreted multiple ways
# INPUT
<contract type="SaaS_vendor_agreement" party="customer">
{contract_text}
</contract>
# OUTPUT FORMAT
Return a structured memo with these sections:
1. Executive Summary (3 sentences — overall risk level: Low/Medium/High)
2. High-Risk Clauses (table: Clause | Risk | Recommendation)
3. Missing Protections (bullet list)
4. Ambiguous Language (bullet list with specific quotes)
5. Recommended Next Steps
Cite specific section numbers for every finding.
Use plain language — avoid legal jargon in explanations.
Do not fabricate citations or invent legal standards.
Industry Use Case 3: Financial Analysis
Financial Analysis Prompt
# ROLE
You are a senior equity research analyst specializing in B2B SaaS companies.
You use rigorous financial analysis and industry benchmarks to evaluate company performance.
# TASK
Analyze the financial metrics below and produce a structured investment assessment.
# REASONING INSTRUCTION
Think through this systematically:
1. Calculate the key SaaS metrics (NRR, CAC, LTV, Magic Number)
2. Compare each to industry benchmarks for this growth stage
3. Identify the 2-3 most important signals (positive or negative)
4. Form an overall view on unit economics health
# DATA
<financials period="Q3 2024">
ARR: $18.5M
New ARR this quarter: $2.1M
Churned ARR this quarter: $380K
Expansion ARR this quarter: $720K
S&M Spend (quarterly): $2.8M
New Customers: 47
Average Contract Value: $44,700
Gross Margin: 74%
</financials>
<benchmarks source="SaaS Capital 2024">
Median NRR for $10-25M ARR companies: 108%
Healthy Magic Number: >0.75
Strong CAC Payback: <18 months
</benchmarks>
# OUTPUT FORMAT
Section 1: Calculated Metrics (table with your calculations shown)
Section 2: Benchmark Comparison (table: Metric | Your Value | Benchmark | Assessment)
Section 3: Key Findings (3 bullet points, most important signals)
Section 4: One-Paragraph Overall Assessment
Flag any calculations where the provided data is insufficient.
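As a sanity check on step 1 of the reasoning instruction, the metrics can be computed directly from the <financials> data. The formulas below are one common set of conventions (quarterly NRR compounded to an annual figure; Magic Number as net-new ARR over same-quarter S&M; CAC payback on gross profit) — treat them as assumptions, since analysts define these metrics differently:

```python
# Worked calculation of the SaaS metrics from the <financials> data.
# Formula conventions are assumptions (see above); adjust to your house style.
ending_arr    = 18_500_000
new_arr       = 2_100_000   # from new customers
churned_arr   = 380_000
expansion_arr = 720_000
sm_spend      = 2_800_000   # quarterly S&M
new_customers = 47
acv           = 44_700
gross_margin  = 0.74

beginning_arr = ending_arr - new_arr - expansion_arr + churned_arr

# NRR: existing-customer ARR retained + expanded, annualized by compounding
nrr_quarterly  = (beginning_arr + expansion_arr - churned_arr) / beginning_arr
nrr_annualized = nrr_quarterly ** 4               # ≈ 1.087 → ~108.7%

# Magic Number: net-new ARR per dollar of S&M (ARR is already annualized)
net_new_arr  = new_arr + expansion_arr - churned_arr
magic_number = net_new_arr / sm_spend             # ≈ 0.87

# CAC, and payback measured in months of gross profit
cac            = sm_spend / new_customers         # ≈ $59.6K
payback_months = cac / (acv * gross_margin / 12)  # ≈ 21.6 months

# LTV via annualized churn
annual_churn = 1 - (1 - churned_arr / beginning_arr) ** 4
ltv          = (acv * gross_margin) / annual_churn
ltv_to_cac   = ltv / cac                          # ≈ 6.1
```

Against the given benchmarks, NRR (~108.7%) and Magic Number (~0.87) clear the bars while CAC payback (~21.6 months) misses the <18-month target — exactly the kind of mixed signal step 3 asks the model to surface.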
Industry Use Case 4: Code Review Assistant
Code Review System Prompt
# ROLE
You are a senior software engineer conducting code reviews for a production Python web service.
You prioritize correctness, security, and maintainability — in that order.
# REVIEW PROCESS
For each review, examine:
1. Correctness: bugs, logic errors, edge cases not handled
2. Security: injection vulnerabilities, auth issues, sensitive data exposure
3. Performance: obvious inefficiencies (N+1 queries, unnecessary computation)
4. Maintainability: naming, structure, missing docs
# SEVERITY LEVELS
🔴 BLOCKER — Must fix before merge (security, data loss, incorrect behavior)
🟡 IMPORTANT — Should fix (performance, error handling, significant code smell)
🟢 SUGGESTION — Nice to have (style, minor improvements, optimization ideas)
# OUTPUT FORMAT
For each issue found:
```
[SEVERITY] Short title
File: filename.py, Line: XX
Issue: What the problem is (1-2 sentences)
Fix: Specific recommended change with code example if helpful
```
After all issues: one-paragraph summary of overall code health.
If no issues: say so directly — do not invent problems.
# CODE TO REVIEW
<code language="{language}" author="{author}" pr="{pr_number}">
{code}
</code>
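The {language}, {author}, {pr_number}, and {code} placeholders are filled at runtime. A minimal sketch — the constant below abbreviates the full prompt to its data section, and the guard clause is one simple way to keep untrusted input from escaping its XML delimiters:

```python
# Runtime-injection sketch for the template's placeholders.
# REVIEW_TEMPLATE abbreviates the full prompt above to its data section.
REVIEW_TEMPLATE = """# CODE TO REVIEW
<code language="{language}" author="{author}" pr="{pr_number}">
{code}
</code>"""


def build_review_message(language, author, pr_number, code):
    # Refuse input that would break out of the <code> delimiters
    if "</code>" in code:
        raise ValueError("code contains a closing </code> tag")
    return REVIEW_TEMPLATE.format(
        language=language, author=author, pr_number=pr_number, code=code
    )
```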
Iterative Refinement Process
Great prompts are never written perfectly on the first try. They're refined through a structured iterative process:
1. ✍️ Draft v1 — Start with the minimum viable prompt: role + task + basic format spec. Run it on 3-5 representative inputs and note every place the output falls short of your expectations.
2. 🔍 Diagnose Failures — Categorize each failure: wrong content (add instructions), wrong format (improve the format spec), wrong tone (improve the role or add examples), hallucination (add grounding), missing edge case (add an example).
3. 🔧 Targeted Fix — Change one thing per iteration. Changing multiple things at once makes it impossible to know which change fixed (or broke) what. Test the same inputs after each change.
4. 📊 Eval-Driven — For production prompts, build a test set of 20-50 input/output pairs and score each iteration. Prompt engineering without evals is navigation without a map.
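The eval-driven step can start as a simple pass-rate loop. A minimal sketch — `run_prompt` is a stub standing in for your real model call, and the test cases are illustrative:

```python
# Minimal eval harness: score a prompt version by its pass rate on a
# fixed test set. run_prompt is a stub — swap in your real model call.
def run_prompt(prompt_version: str, input_text: str) -> str:
    return '{"risk": "Low"}'  # stubbed model output


# (input, check) pairs — aim for 20-50 of these in production
TEST_SET = [
    ("standard NDA text ...", lambda out: out.strip().startswith("{")),
    ("one-sided indemnity clause ...", lambda out: '"risk"' in out),
]


def score(prompt_version: str) -> float:
    passed = sum(check(run_prompt(prompt_version, inp)) for inp, check in TEST_SET)
    return passed / len(TEST_SET)
```

Run `score()` on the same test set after every single-change iteration; a drop in pass rate tells you immediately which edit broke what.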
Common Failure Modes and Fixes
Failure Mode Diagnosis Table
| Symptom | Diagnosis | Fix |
|---|---|---|
| Response starts with "Certainly" | Role too generic | Add a prefill or an explicit no-preamble instruction |
| Output format varies across runs | No format spec | Add a complete format specification (Ch 5) |
| Wrong expertise level | Role too vague | Add specialization + experience context (Ch 3) |
| Hallucinated facts | No grounding | Add context and a citation requirement (Ch 8) |
| Misses edge cases | No examples | Add 2-3 examples including edge cases (Ch 7) |
| Reasoning errors on math/logic | No CoT instruction | Add "think step by step" (Ch 6) |
| Ignores part of prompt | Prompt too long/dense | Break into sections with headers; use XML structure (Ch 4) |
| Mixes data with instructions | No delimiters | Wrap data in XML tags (Ch 4) |
| Overly verbose response | No length constraint | Add an explicit word/sentence count |
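The prefill fix from the first table row works by seeding the start of the assistant turn. A sketch of the message shape — this follows the Anthropic Messages API convention of ending the list with a partial assistant message; no request is actually sent here:

```python
# Prefill sketch: the model's reply continues from the seeded "{",
# so it cannot open with "Certainly!" or any other preamble.
messages = [
    {"role": "user", "content": "Summarize the incident report as JSON."},
    {"role": "assistant", "content": "{"},  # prefill — response continues from here
]
```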
Production-Ready Prompt Template
Universal Production Prompt Template
# SYSTEM PROMPT
# ─────────────────────────────────────
# ROLE (Ch 3): Expertise + specialization + experience context
You are a [expertise] specializing in [domain] with [experience].
# BEHAVIOR (Ch 2): Standing instructions, constraints, explicit negations
[Standing rules that apply to every response]
Do not [constraint 1]. Always [requirement 1].
# GROUNDING (Ch 8): Anti-hallucination rules if applicable
Use only information from the provided context.
If information is not in the context, say so explicitly.
# ─────────────────────────────────────
# USER MESSAGE (constructed at runtime)
# TASK (Ch 2): Action verb + object + audience + purpose
[Verb] the [object] for [audience] in order to [purpose].
# REASONING (Ch 6): For complex tasks
Before answering, think through: [reasoning steps]
# DATA (Ch 4): XML-delimited inputs
<[data_type] [attributes]>
{variable}
</[data_type]>
# OUTPUT FORMAT (Ch 5): Structure + schema + length + tone + exclusions
Return [format] with the following structure:
- [field_1]: [type/description]
- [field_2]: [type/description]
Length: [constraint]. Tone: [register]. Exclude: [exclusions].
✅ Chapter 9 Takeaway
Complex prompts are architectures, not essays. Layer the five components in order: role, instructions, context/data, reasoning, format. Use the failure mode table to diagnose problems and apply targeted fixes. Build an eval test set for any production prompt and run it after every change. The best prompt engineers treat prompts like code — version-controlled, tested, and refined systematically.