Claude Extended Thinking: A Complete Guide to Deep Reasoning Mode in Complex Reasoning Scenarios (2026)

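In normal mode, Claude generates the answer directly. With Extended Thinking enabled, Claude first reasons internally (exploring several lines of thought, checking its own work, deriving the result step by step) and only then gives the final answer. On complex problems this noticeably improves answer quality, at the cost of extra latency and output tokens.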

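When to use Extended Thinking

Suitable for:

Math and algorithm problems that need multi-step reasoning
Complex system architecture design (weighing several options)
Security audits (enumerating the attack surface)
Root-cause analysis of code bugs (tracing the execution path)
Critical decisions that need a high-confidence answer

Not suitable for:

Simple Q&A and format conversion
Latency-sensitive real-time applications
High-frequency calls on a tight budget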
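
The two lists above can be turned into a small routing rule. The following is a minimal sketch, not an official API: the task labels, the `thinking_config` helper, and the default budget of 5000 are all assumptions made for illustration. It returns a `thinking` parameter only for tasks that resemble the "suitable" list and returns `None` otherwise.

```python
from typing import Optional

# Hypothetical task labels mirroring the "suitable" list; adapt to your own taxonomy.
COMPLEX_TASKS = {
    "math",
    "architecture",
    "security_audit",
    "root_cause",
    "critical_decision",
}


def thinking_config(
    task_type: str,
    latency_sensitive: bool = False,
    budget_tokens: int = 5000,
) -> Optional[dict]:
    """Return a `thinking` parameter for complex tasks, or None to leave it off."""
    if latency_sensitive or task_type not in COMPLEX_TASKS:
        return None
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config("security_audit"))     # thinking enabled
print(thinking_config("format_conversion"))  # None: simple task, thinking stays off
```

A caller would add the returned dict to the request only when it is not `None`, so simple or latency-sensitive requests go out unchanged.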

Supported models

claude-opus-4-5 (best results)
claude-sonnet-4-5 (good value for money)

Basic usage

python

import anthropic
client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # maximum number of tokens allotted to the thinking process
    },
    messages=[{
        "role": "user",
        "content": "Analyze the time complexity of this algorithm and suggest optimizations: [code]"
    }]
)

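# Parse the response: thinking blocks come first, then the text answer
for block in response.content:
    if block.type == "thinking":
        print("Thinking process:")
        print(block.thinking[:500])  # reasoning process (first 500 characters)
    elif block.type == "text":
        print("Answer:")
        print(block.text)  # final answer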

How to set budget_tokens

budget_tokens sets the maximum number of tokens Claude may spend on its internal thinking process.

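Scenario	Suggested budget_tokens
Simple reasoning	1024-3000 (1024 is the API minimum)
Medium complexity	5000-8000
Complex architecture/algorithms	10000-16000
Extremely complex problems	32000+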

Note: max_tokens must be greater than budget_tokens, because the thinking budget counts toward max_tokens and the answer needs room as well.
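
One way to keep the two values consistent is to derive max_tokens from the budget rather than setting both by hand. The helper below is a sketch under stated assumptions: `build_thinking_kwargs` and `answer_tokens` are hypothetical names, and the 1024 minimum is an assumption about the API limit at the time of writing.

```python
MIN_BUDGET_TOKENS = 1024  # assumed API minimum for budget_tokens


def build_thinking_kwargs(budget_tokens: int, answer_tokens: int = 4000) -> dict:
    """Return max_tokens/thinking kwargs that always satisfy max_tokens > budget_tokens."""
    if budget_tokens < MIN_BUDGET_TOKENS:
        raise ValueError(f"budget_tokens must be at least {MIN_BUDGET_TOKENS}")
    if answer_tokens <= 0:
        raise ValueError("answer_tokens must be positive so the answer has room")
    return {
        # The thinking budget counts toward max_tokens, so add the answer space on top.
        "max_tokens": budget_tokens + answer_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


kwargs = build_thinking_kwargs(budget_tokens=10000, answer_tokens=6000)
print(kwargs["max_tokens"])  # 16000, the same pair used in the basic example
```

The result can be unpacked into the request, for example `client.messages.create(model=..., messages=..., **kwargs)`.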

Streaming output (recommended for production)

python

# Reuses `client` from the basic example; `user_question` is a placeholder prompt.
user_question = "Analyze the time complexity of this algorithm: [code]"

with client.messages.stream(
    model="claude-sonnet-4-5",
    max_tokens=8000,
    thinking={"type": "enabled", "budget_tokens": 5000},
    messages=[{"role": "user", "content": user_question}]
) as stream:
    thinking_shown = False
    for event in stream:
        event_type = getattr(event, "type", None)
        if event_type == "content_block_start":
            # Announce each phase when its content block opens
            block_type = getattr(event.content_block, "type", None)
            if block_type == "thinking" and not thinking_shown:
                print("[Thinking...]", end="", flush=True)
                thinking_shown = True
            elif block_type == "text":
                print("\n[Answer]")
        elif event_type == "content_block_delta":
            # Only text deltas carry `.text`; thinking deltas are skipped,
            # so the user sees just the final answer as it streams
            if hasattr(event.delta, "text"):
                print(event.delta.text, end="", flush=True)
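
If the reasoning should be logged rather than hidden, the deltas can be separated by type. This sketch assumes the streaming delta types `thinking_delta` (with a `.thinking` field) and `text_delta` (with a `.text` field). `collect_stream` is a hypothetical helper, and the events below are hand-built stand-ins so the example runs without a network call.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate thinking and answer text separately from content_block_delta events."""
    thinking_parts, text_parts = [], []
    for event in events:
        if getattr(event, "type", None) != "content_block_delta":
            continue  # ignore start/stop and message-level events
        delta = event.delta
        if delta.type == "thinking_delta":
            thinking_parts.append(delta.thinking)
        elif delta.type == "text_delta":
            text_parts.append(delta.text)
    return "".join(thinking_parts), "".join(text_parts)


# Stand-in events that mimic the shape of the stream (not real API output).
fake_events = [
    SimpleNamespace(type="content_block_start"),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="thinking_delta", thinking="Check the loop. ")),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="thinking_delta", thinking="It is nested.")),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="text_delta", text="O(n^2)")),
]

thinking, answer = collect_stream(fake_events)
print(answer)  # O(n^2)
```

In a real call, the same function could be fed the `stream` object from `client.messages.stream(...)` in place of `fake_events`.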

Three practical scenarios

Scenario 1: Root-cause analysis of a complex bug

python

bug_prompt = """
This distributed system occasionally produces duplicate records.
Reproduce rate: ~0.1% under high load.
Stack: Redis distributed lock + PostgreSQL + async Python workers.

Analyze the root cause systematically. Consider:
1. Race conditions in the locking mechanism
2. Network partition scenarios
3. Worker crash/restart during transaction
4. Clock skew between services

Code: [paste relevant code]
"""

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=12000,
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{"role": "user", "content": bug_prompt}]
)

Scenario 2: Choosing an architecture

python

arch_prompt = """
We need to design a real-time notification system for 10M users.
Requirements: <100ms delivery, 99.9% reliability, support push/email/SMS.
Current stack: Python/FastAPI, PostgreSQL, Redis.

Evaluate these approaches:
1. WebSocket + Redis Pub/Sub
2. Server-Sent Events + message queue
3. Third-party service (Firebase/Pusher)

Give detailed tradeoff analysis and a final recommendation.
"""

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 12000},
    messages=[{"role": "user", "content": arch_prompt}]
)

Scenario 3: Comprehensive security vulnerability audit

python

security_prompt = """
Perform a comprehensive security audit of this authentication module.
Be exhaustive - think through every possible attack vector:
- Authentication bypass
- Session management flaws
- Injection vulnerabilities
- Cryptographic weaknesses
- Race conditions in auth flow

Code: [paste auth code]
"""

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=10000,
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{"role": "user", "content": security_prompt}]
)
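
The three scenarios differ only in the prompt, the budget, and the room left for the answer, so the call can be wrapped once. This is a sketch, not part of the SDK: `ask_with_thinking` and its defaults are hypothetical, and `client` is assumed to be an `anthropic.Anthropic()` instance (or any object with the same `messages.create` interface).

```python
def ask_with_thinking(
    client,
    prompt: str,
    *,
    model: str = "claude-opus-4-5",
    budget_tokens: int = 8000,
    answer_tokens: int = 4000,
) -> str:
    """Send one prompt with extended thinking enabled and return only the answer text."""
    response = client.messages.create(
        model=model,
        # Leave room for the answer on top of the thinking budget.
        max_tokens=budget_tokens + answer_tokens,
        thinking={"type": "enabled", "budget_tokens": budget_tokens},
        messages=[{"role": "user", "content": prompt}],
    )
    # Skip thinking blocks and join the text blocks into the final answer.
    return "".join(block.text for block in response.content if block.type == "text")
```

Scenario 1 would then read `ask_with_thinking(client, bug_prompt, budget_tokens=8000, answer_tokens=4000)`, which sends the same max_tokens of 12000 as the explicit call.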

Cost control

Thinking tokens are billed as regular output tokens: cost = thinking_tokens / 1,000,000 × the output price per million tokens.

Model	Thinking token price (USD per million tokens)
Sonnet 4.5	$15.00
Opus 4.5	$25.00

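Strategies:

Test with a small budget first (2000-5000); if the quality is good enough, there is no need to raise it
Enable thinking only for genuinely complex problems and turn it off for simple ones to save cost
Sonnet with thinking is usually more cost-effective than Opus without thinking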
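
To make the billing rule concrete, here is a small worked calculation. It is a sketch: `estimate_thinking_cost` is a hypothetical helper, the $15.00 figure is the Sonnet 4.5 price from the table above, and actual usage may come in below the budget because the budget is only a ceiling.

```python
def estimate_thinking_cost(thinking_tokens: int, price_per_million_usd: float) -> float:
    """Upper-bound cost in USD for a given number of thinking tokens."""
    if thinking_tokens < 0:
        raise ValueError("thinking_tokens must be non-negative")
    return thinking_tokens * price_per_million_usd / 1_000_000


SONNET_OUTPUT_PRICE = 15.00  # USD per million output tokens, from the table above

per_call = estimate_thinking_cost(5000, SONNET_OUTPUT_PRICE)
print(f"${per_call:.3f} per call")                 # $0.075 per call
print(f"${per_call * 10_000:.2f} per 10k calls")   # $750.00 per 10k calls
```

At high call volumes the thinking budget quickly becomes the main cost item, which is why the strategies above recommend starting with a small budget.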

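Common misconceptions

Misconception 1: The thinking process can be edited. Thinking blocks are read-only: you cannot prefill them or supply hand-written ones as few-shot examples.

Misconception 2: The bigger the budget, the better. budget_tokens is only an upper bound and Claude decides how much of it to use, but an oversized budget can still add cost and latency without improving the answer. Start at 5000 and adjust from there.

Misconception 3: Thinking should be enabled for every task. On simple tasks, enabling thinking can actually reduce efficiency (overthinking).

Source: Extended Thinking, Anthropic official documentation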
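
The "start small and adjust" advice can be sketched as an escalation loop. Everything here is hypothetical: `escalate_budget`, the budget ladder, and the `ask` and `good_enough` callables are illustrations only, and deciding what counts as "good enough" is left to the application (a test suite, a rubric, a human review).

```python
from typing import Callable, Sequence, Tuple


def escalate_budget(
    ask: Callable[[int], str],
    good_enough: Callable[[str], bool],
    budgets: Sequence[int] = (2000, 5000, 10000),
) -> Tuple[int, str]:
    """Try budgets from smallest to largest; stop at the first acceptable answer."""
    if not budgets:
        raise ValueError("budgets must not be empty")
    result = (budgets[0], "")
    for budget in budgets:
        answer = ask(budget)
        result = (budget, answer)
        if good_enough(answer):
            break  # no need to pay for a larger budget
    return result  # falls back to the largest budget's answer if none passed


# Stand-in callables so the sketch runs offline; a real `ask` would call the API.
budget, answer = escalate_budget(
    ask=lambda b: f"answer produced with budget {b}",
    good_enough=lambda a: a.endswith("5000"),
)
print(budget)  # 5000
```

Each failed attempt is still billed, so this pattern fits offline tuning of a default budget better than per-request use in production.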
