The Complete Guide to OpenClaw Logging and Health Checks: Monitoring, Alerting, and Ops Automation (2026)


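A complete tutorial on OpenClaw logging and health checks: log level configuration (debug/info/warn/error) and log formats (text/json), log file persistence paths, filtering logs by channel/Agent/Provider, using the Health Check HTTP endpoint (/health) with its status codes and response format, liveness/readiness probe configuration for container orchestration, interpreting the detailed output of the Gateway Doctor command, Prometheus metrics export (the /metrics endpoint), Grafana Dashboard visualization, and log rotation and alerting setups for production.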

2026/3/25 · 3 min read · ClaudeEagle

A production OpenClaw deployment needs solid observability: logs, health checks, and metrics are the foundation of operational stability.

Logging configuration

Basic logging settings

json

{
  "logging": {
    "level": "info",
    "format": "json",
    "file": "/var/log/openclaw/gateway.log"
  }
}

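Log levels (from most to least verbose):

debug: all debugging information (for development; do not enable in production)
info: normal operational messages (the recommended default for production)
warn: warnings (potential problems that do not affect operation)
error: errors (problems that affect functionality)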

JSON-format logs (recommended for production)

json

{
  "logging": {
    "level": "info",
    "format": "json",
    "file": "/var/log/openclaw/gateway.log",
    "rotate": {
      "maxSize": "100mb",
      "maxFiles": 7
    }
  }
}

Example of JSON-format output:

json

{"level":"info","ts":"2026-03-25T14:23:01Z","channel":"telegram",
 "userId":"@alice","tokens":{"input":1234,"output":456},"latencyMs":1823}

This format is easy for log systems such as Elasticsearch or Loki to parse.
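
Because every entry is a self-contained JSON object, a few lines of script are enough for quick ad hoc analysis. The sketch below assumes one JSON object per line with the field names from the sample above (`channel`, `tokens.input`, `tokens.output`, `latencyMs`); the second sample entry and the `summarize` helper are made up for illustration.

```python
import json
from collections import defaultdict

# Sample lines in the shape shown above; the second entry is invented for the example.
SAMPLE_LINES = [
    '{"level":"info","ts":"2026-03-25T14:23:01Z","channel":"telegram",'
    '"userId":"@alice","tokens":{"input":1234,"output":456},"latencyMs":1823}',
    '{"level":"info","ts":"2026-03-25T14:23:05Z","channel":"telegram",'
    '"userId":"@bob","tokens":{"input":100,"output":50},"latencyMs":177}',
    "not json at all",
]


def summarize(lines):
    """Aggregate request count, total tokens, and mean latency per channel."""
    stats = defaultdict(lambda: {"requests": 0, "tokens": 0, "latency_ms": 0})
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # skip valid JSON that is not a log object
        tokens = entry.get("tokens", {})
        s = stats[entry.get("channel", "unknown")]
        s["requests"] += 1
        s["tokens"] += tokens.get("input", 0) + tokens.get("output", 0)
        s["latency_ms"] += entry.get("latencyMs", 0)
    return {
        channel: {
            "requests": s["requests"],
            "tokens": s["tokens"],
            "mean_latency_ms": s["latency_ms"] / s["requests"],
        }
        for channel, s in stats.items()
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_LINES))
```

To run it against a real file, pass an open file handle (for example the configured /var/log/openclaw/gateway.log) instead of `SAMPLE_LINES`.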

Per-component levels

json

{
  "logging": {
    "level": "info",
    "components": {
      "gateway": "info",
      "channels.telegram": "debug",
      "channels.slack": "warn",
      "providers.anthropic": "info",
      "exec": "debug"
    }
  }
}

You can enable debug for one specific channel while keeping everything else at info, which lets you troubleshoot precisely.
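
One way to picture how such a configuration could be resolved is a lookup that falls back from the most specific component name to the global level. The sketch below is only a model of that idea: the prefix-fallback rule, the numeric level values, and the `effective_level` and `should_log` helpers are assumptions for illustration, not confirmed OpenClaw behavior.

```python
# Numeric ordering of the four levels; the values themselves are arbitrary.
LEVELS = {"debug": 10, "info": 20, "warn": 30, "error": 40}

# Mirrors the "logging" section of the configuration shown above.
CONFIG = {
    "level": "info",
    "components": {
        "gateway": "info",
        "channels.telegram": "debug",
        "channels.slack": "warn",
        "providers.anthropic": "info",
        "exec": "debug",
    },
}


def effective_level(component, config=CONFIG):
    """Return the level for a component, trying shorter dotted prefixes (assumed rule)."""
    parts = component.split(".")
    while parts:
        name = ".".join(parts)
        if name in config["components"]:
            return config["components"][name]
        parts.pop()  # drop the last segment and try the parent name
    return config["level"]  # nothing matched: use the global default


def should_log(component, level, config=CONFIG):
    """True if a message at `level` passes the component's threshold."""
    return LEVELS[level] >= LEVELS[effective_level(component, config)]


if __name__ == "__main__":
    print(effective_level("channels.telegram"))   # configured explicitly
    print(effective_level("channels.matrix"))     # falls back to the global level
    print(should_log("channels.slack", "info"))   # below the warn threshold
```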

Viewing logs from the command line

bash

# Live log stream
openclaw logs --follow

# Last 100 lines
openclaw logs --tail 100

# Filter by level
openclaw logs --level error
openclaw logs --level warn

# Filter by channel
openclaw logs --channel telegram
openclaw logs --channel slack

# Search for a keyword
openclaw logs --grep "rate_limit"
openclaw logs --grep "403"

# Time range
openclaw logs --since 30m
openclaw logs --since "2026-03-25 14:00"

# Export
openclaw logs --since 1h > /tmp/debug.log

Health Check endpoint

The Gateway exposes a standard HTTP health check interface:

bash

# Basic health check
curl http://127.0.0.1:18789/health

# Healthy response (200 OK)
{
  "status": "ok",
  "version": "1.5.0",
  "uptime": 86400,
  "channels": {
    "telegram": "connected",
    "slack": "connected",
    "matrix": "disconnected"
  },
  "providers": {
    "anthropic": "ok",
    "deepseek": "ok"
  }
}

# Unhealthy response (503 Service Unavailable)
{
  "status": "degraded",
  "errors": ["anthropic: api key invalid"]
}
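
For scripted checks outside of curl, the same endpoint can be polled from Python's standard library. The sketch below assumes the response fields shown above (`status`, `errors`, `channels`, `providers`); the `interpret` and `check` helper names are hypothetical. Note that `urllib` raises `HTTPError` for a 503, so the body has to be read from the exception.

```python
import json
import urllib.error
import urllib.request

HEALTH_URL = "http://127.0.0.1:18789/health"  # address from the curl example above


def interpret(status_code, body):
    """Turn a /health status code and parsed JSON body into a list of problems."""
    problems = []
    if status_code != 200:
        problems.append(f"HTTP {status_code}")
    if body.get("status") != "ok":
        problems.append(f"status={body.get('status')}")
    problems.extend(body.get("errors", []))
    for name, state in body.get("channels", {}).items():
        if state != "connected":
            problems.append(f"channel {name}: {state}")
    for name, state in body.get("providers", {}).items():
        if state != "ok":
            problems.append(f"provider {name}: {state}")
    return problems


def check(url=HEALTH_URL, timeout=5):
    """Fetch the endpoint and return the list of problems (empty means healthy)."""
    try:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                code, raw = resp.status, resp.read()
        except urllib.error.HTTPError as err:
            # Non-2xx responses such as 503 arrive as exceptions that still carry a body.
            code, raw = err.code, err.read()
        return interpret(code, json.loads(raw or b"{}"))
    except (OSError, ValueError) as err:
        # Connection failures (OSError) or a body that is not valid JSON (ValueError).
        return [f"unreachable or invalid response: {err}"]
```

Calling `check()` on a host where the Gateway is running returns an empty list when everything is healthy; a cron job or wrapper script can exit non-zero whenever the list is not empty.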

Kubernetes probe configuration

yaml

# deployment.yaml
livenessProbe:
  httpGet:
    path: /health
    port: 18789
  initialDelaySeconds: 30
  periodSeconds: 30
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /health/ready
    port: 18789
  initialDelaySeconds: 10
  periodSeconds: 10

/health: liveness check (is the process running)
/health/ready: readiness check (are all channels connected)
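
To get a feel for how quickly these settings react, the consecutive-failure rule behind `failureThreshold` can be modeled in a few lines. This is a deliberately simplified model: it assumes the first probe runs exactly at `initialDelaySeconds` and ignores `timeoutSeconds` and kubelet scheduling jitter. The `time_of_failure` helper is hypothetical.

```python
def time_of_failure(results, failure_threshold=3, period_seconds=30,
                    initial_delay_seconds=30):
    """Return the time (seconds after container start) at which the probe is
    declared failed, or None if the threshold is never reached.

    results: one boolean per probe attempt, True meaning the probe succeeded.
    """
    consecutive = 0
    for attempt, ok in enumerate(results):
        t = initial_delay_seconds + attempt * period_seconds
        # Any success resets the streak; only consecutive failures count.
        consecutive = 0 if ok else consecutive + 1
        if consecutive >= failure_threshold:
            return t
    return None


if __name__ == "__main__":
    # Liveness values from the manifest above: one success, then three failures.
    print(time_of_failure([True, False, False, False]))
    # A single success in between resets the count, so the threshold is not reached.
    print(time_of_failure([False, False, True, False, False]))
```

With the liveness values from the manifest, a process that hangs is restarted roughly 60 to 90 seconds after it stops responding, depending on where in the probe period the hang begins.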

Prometheus metrics

json

{
  "metrics": {
    "enabled": true,
    "path": "/metrics",
    "port": 9090
  }
}

Key metrics:

# Request count (by channel/Agent/status)
openclaw_requests_total{channel="telegram",agent="default",status="success"} 1234

# Response latency (P50/P95/P99)
openclaw_latency_seconds{quantile="0.95"} 2.3

# Token usage
openclaw_tokens_total{provider="anthropic",type="input"} 234567

# Active sessions
openclaw_sessions_active 42

# Error count
openclaw_errors_total{type="rate_limit"} 5
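
The lines above follow the Prometheus text exposition format, so they can also be read without a Prometheus server, for example in a smoke test. The parser below is a minimal sketch for the simple `name{labels} value` shape shown here; it does not handle optional timestamps or escaped quotes inside label values, and the `parse_metrics` helper name is hypothetical.

```python
import re

SAMPLE = '''\
openclaw_requests_total{channel="telegram",agent="default",status="success"} 1234
openclaw_latency_seconds{quantile="0.95"} 2.3
openclaw_tokens_total{provider="anthropic",type="input"} 234567
openclaw_sessions_active 42
openclaw_errors_total{type="rate_limit"} 5
'''

# metric name, optional {label="value",...} block, then a single numeric value
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything this simplified pattern does not cover
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```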

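Prometheus configuration (note that 9090 is also the default listen port of Prometheus itself, so choose a different metrics port if both run on the same host):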

yaml

# prometheus.yml
scrape_configs:
  - job_name: 'openclaw'
    static_configs:
      - targets: ['localhost:9090']
    scrape_interval: 30s

Grafana Dashboard

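After configuring the Prometheus data source, import the official OpenClaw Dashboard:

Grafana → + → Import → enter the Dashboard ID (see the official website)

The Dashboard panels include:
  - Real-time request volume (broken down by channel)
  - Token consumption trend (daily/weekly/monthly)
  - P95 response latency trend
  - Error-rate alert graph
  - Model usage distribution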

Alerting configuration

Prometheus AlertManager

yaml

# alerts.yml
groups:
  - name: openclaw
    rules:
      - alert: HighErrorRate
        expr: rate(openclaw_errors_total[5m]) > 0.1
        annotations:
          summary: "OpenClaw error rate is too high"

      - alert: APIKeyExpired
        expr: increase(openclaw_errors_total{type="auth_error"}[5m]) > 0
        annotations:
          summary: "API key authentication failed; check the key"
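
The `HighErrorRate` threshold is easier to judge once it is converted into a count: `rate(...[5m])` is a per-second average over the window, so 0.1 errors per second corresponds to about 0.1 × 300 = 30 errors in five minutes. The sketch below reproduces that arithmetic on raw counter samples. It is a simplification of PromQL `rate()`: it handles counter resets but does not extrapolate to the window edges, and the `simple_rate` helper is hypothetical.

```python
def simple_rate(samples):
    """Per-second increase across (timestamp_seconds, counter_value) samples.

    Simplified relative to PromQL rate(): counter resets are handled, but the
    result is not extrapolated to the edges of the window.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset (for example by a restart),
        # so the new value is counted from zero.
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    # 36 new errors over 300 seconds is above the 0.1/s threshold.
    window = [(0, 100), (150, 120), (300, 136)]
    rate = simple_rate(window)
    print(rate, rate > 0.1)
```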

Simple alerts (built into OpenClaw)

json

{
  "alerts": {
    "errorRate": {
      "threshold": 10,
      "window": "5m",
      "channel": "telegram"
    },
    "providerDown": {
      "channel": "telegram",
      "message": "AI provider connection error; automatically switched to the fallback model"
    }
  }
}

Source: OpenClaw official documentation - docs.openclaw.ai/gateway/logging
