Gateway Sentinel

Production-hardened OpenClaw gateway watchdog. Monitors the gateway process using graduated health checks, performs escalating repairs (restart → doctor fix...

83 downloads
Free
Reviewed

🛡️ OpenClaw Guardian

A battle-hardened watchdog that keeps your OpenClaw gateway running — and tells you when it can't.

What It Does

OpenClaw Guardian runs as a background service and continuously monitors the OpenClaw gateway using two independent health signals. When the gateway goes down, it works through an escalating repair sequence. If every repair level fails, it alerts you, sleeps for a configurable cooldown, and then resumes monitoring. Every significant event is logged and sent to your configured alert channel(s).

Health Check Strategy (graduated)

  1. CLI check: openclaw gateway status (the authoritative signal)
  2. HTTP fallback: curl http://localhost:${OPENCLAW_PORT}/health (5s timeout), tried only if the CLI check fails

The guardian treats the gateway as down only when both checks fail.
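
The two-stage check above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the shipped guardian.sh: the function names cli_check, http_check, and gateway_is_healthy are hypothetical, and the 3578 fallback port is simply the value used in the Quick Start example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the graduated check; not the shipped guardian.sh.

# Primary signal: the CLI's own status command.
cli_check() {
  openclaw gateway status >/dev/null 2>&1
}

# Fallback signal: the HTTP health endpoint, with the 5-second timeout.
# The 3578 default is an assumption taken from the Quick Start example.
http_check() {
  curl -fsS --max-time 5 "http://localhost:${OPENCLAW_PORT:-3578}/health" >/dev/null 2>&1
}

# Healthy if either signal succeeds; down only when both fail.
# The HTTP check runs only after the CLI check has failed.
gateway_is_healthy() {
  cli_check || http_check
}
```

Keeping each signal in its own function makes it easy to swap one out or stub it while testing the surrounding logic.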

Repair Strategy (escalating)

| Level | Action | Trigger |
|---|---|---|
| 1 (Restart) | openclaw gateway restart | First failure |
| 2 (Doctor Fix) | openclaw doctor --fix → openclaw gateway start | After Level 1 fails |
| 3 (Git Rollback) | Stash → reset to last stable commit → pop stash | After GUARDIAN_MAX_REPAIR failures, only if GUARDIAN_ENABLE_ROLLBACK=true |
| Cooldown | Sleep GUARDIAN_COOLDOWN seconds | After all levels exhausted |
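
One possible reading of the escalation order in the table above is sketched below. It is an interpretation, not the shipped guardian.sh: the function names are hypothetical, the loop assumes each attempt tries Level 1 and then Level 2, and any wait between a repair and the follow-up health check is omitted for brevity.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the escalation order; not the shipped guardian.sh.
GUARDIAN_MAX_REPAIR="${GUARDIAN_MAX_REPAIR:-3}"
GUARDIAN_ENABLE_ROLLBACK="${GUARDIAN_ENABLE_ROLLBACK:-false}"

repair_level_1() { openclaw gateway restart; }
repair_level_2() { openclaw doctor --fix && openclaw gateway start; }
repair_level_3() { echo "rollback would run here"; }   # placeholder only
is_healthy()     { openclaw gateway status >/dev/null 2>&1; }

# Returns 0 as soon as a repair restores health, 1 once every level is
# exhausted (the caller would then sleep GUARDIAN_COOLDOWN seconds).
run_repairs() {
  local attempt=1
  while [ "$attempt" -le "$GUARDIAN_MAX_REPAIR" ]; do
    repair_level_1
    is_healthy && return 0
    repair_level_2
    is_healthy && return 0
    attempt=$((attempt + 1))
  done
  # Level 3 is reached only after the attempts above and only on opt-in.
  if [ "$GUARDIAN_ENABLE_ROLLBACK" = "true" ]; then
    repair_level_3
    is_healthy && return 0
  fi
  return 1
}
```

A caller might use it as `run_repairs || sleep "$GUARDIAN_COOLDOWN"` inside the monitoring loop.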

Note: Level 3 rollback is off by default and requires explicit opt-in via GUARDIAN_ENABLE_ROLLBACK=true. Even then, it always stashes uncommitted work before resetting — your changes are never silently discarded.
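
The stash, reset, and pop sequence could look roughly like this. It is a sketch under assumptions rather than the actual implementation: rollback_to is a hypothetical name, and because this page does not say how the "last stable commit" is chosen, the caller passes the target commit in explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of stash -> reset -> pop; not the shipped guardian.sh.

# rollback_to <workspace> <commit>
rollback_to() {
  local ws="$1" target="$2" stashed=0 rc=0
  # Only stash when there is something to save; otherwise a later
  # "stash pop" could restore an unrelated, older stash entry.
  if [ -n "$(git -C "$ws" status --porcelain)" ]; then
    git -C "$ws" stash push --include-untracked -m "guardian: pre-rollback" >/dev/null || return 1
    stashed=1
  fi
  git -C "$ws" reset --hard "$target" >/dev/null || rc=1
  # Restore the stashed work whether or not the reset succeeded.
  if [ "$stashed" -eq 1 ]; then
    git -C "$ws" stash pop >/dev/null || rc=1
  fi
  return "$rc"
}
```

Note that git stash pop can stop with a conflict when the stashed changes touch lines the reset also changed. In that case git keeps the stash entry, so the work is still recoverable by hand.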

Alerting

Guardian supports both Telegram and Discord simultaneously. If neither is configured, it runs in log-only mode.

Alert events:

  • Guardian started / stopped
  • Gateway down detected
  • Each repair attempt (with level)
  • Repair success / failure
  • Rollback triggered
  • All repairs exhausted (cooldown entered)
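
A dual-channel alert helper might be sketched as follows. This is illustrative only: send_alert, http_post, and json_escape are hypothetical names, and the code assumes the public Telegram Bot API sendMessage method and a Discord webhook that accepts a JSON body with a content field. How the shipped script actually builds its requests is not shown on this page.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of dual-channel alerting; not the shipped guardian.sh.

# Thin wrapper so the transport is easy to swap or stub.
http_post() { curl -fsS --max-time 10 -o /dev/null "$@"; }

# Escape backslashes and double quotes for a JSON string (single-line text only).
json_escape() { printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'; }

send_alert() {
  local msg="$1" sent=0 token
  if [ -n "${GUARDIAN_TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${GUARDIAN_TELEGRAM_CHAT_ID:-}" ]; then
    # The Bot API path is /bot<token>/...; drop a leading "bot" in case
    # the variable already carries it (an assumption about the config).
    token="${GUARDIAN_TELEGRAM_BOT_TOKEN#bot}"
    http_post "https://api.telegram.org/bot${token}/sendMessage" \
      --data-urlencode "chat_id=${GUARDIAN_TELEGRAM_CHAT_ID}" \
      --data-urlencode "text=${msg}" && sent=1
  fi
  if [ -n "${GUARDIAN_DISCORD_WEBHOOK_URL:-}" ]; then
    http_post "$GUARDIAN_DISCORD_WEBHOOK_URL" \
      -H "Content-Type: application/json" \
      --data "{\"content\": \"$(json_escape "$msg")\"}" && sent=1
  fi
  # Log-only mode: nothing configured, or every delivery failed.
  [ "$sent" -eq 1 ] || echo "[ALERT, log-only] $msg"
}
```

Both channels are attempted independently, so a failure on one does not block delivery on the other.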

Daily Snapshots

Once per calendar day, guardian runs git add -A && git commit in your workspace. It respects .gitignore, so secrets you've excluded stay excluded. Commit message format: guardian: daily snapshot YYYY-MM-DD.
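
The once-per-day logic could be sketched like this. It is a sketch under assumptions: daily_snapshot is a hypothetical name, and the marker file plays the role of /tmp/openclaw-guardian-last-snapshot from the runtime file list. Skipping the commit when nothing is staged is a choice made here, not a documented behavior.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the once-per-day snapshot; not the shipped guardian.sh.

# daily_snapshot <workspace> <marker-file>
daily_snapshot() {
  local ws="$1" marker="$2" today
  today="$(date +%Y-%m-%d)"
  # Already snapshotted today? Then do nothing.
  if [ -f "$marker" ] && [ "$(cat "$marker")" = "$today" ]; then
    return 0
  fi
  git -C "$ws" add -A || return 1
  # Commit only when something is actually staged.
  if ! git -C "$ws" diff --cached --quiet; then
    git -C "$ws" commit -q -m "guardian: daily snapshot ${today}" || return 1
  fi
  # Record the date so later calls on the same day are no-ops.
  printf '%s\n' "$today" > "$marker"
}
```

For example: `daily_snapshot "$GUARDIAN_WORKSPACE" /tmp/openclaw-guardian-last-snapshot`.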


Quick Start

1. Configure environment variables

Create ~/.openclaw/guardian.env (or export in your shell profile):

# Required for alerts — set at least one
export GUARDIAN_TELEGRAM_BOT_TOKEN="bot123456:ABC..."
export GUARDIAN_TELEGRAM_CHAT_ID="-1001234567890"
# and/or (both channels can be configured at the same time)
export GUARDIAN_DISCORD_WEBHOOK_URL="https://discord.com/api/webhooks/..."

# Optional tuning
export GUARDIAN_CHECK_INTERVAL=30
export GUARDIAN_MAX_REPAIR=3
export GUARDIAN_COOLDOWN=600
export GUARDIAN_ENABLE_ROLLBACK=false  # set true to enable git rollback
export GUARDIAN_WORKSPACE="$HOME/.openclaw/workspace"
export GUARDIAN_LOG="/tmp/openclaw-guardian.log"
export OPENCLAW_PORT=3578

2. Install as a system service

# macOS or Linux — auto-detects
./scripts/install-guardian.sh

# With a custom log path
GUARDIAN_LOG=/var/log/openclaw-guardian.log ./scripts/install-guardian.sh

3. Verify it's running

# macOS
launchctl list | grep openclaw

# Linux
systemctl --user status openclaw-guardian

# Either platform (adjust the path if you set GUARDIAN_LOG)
tail -f /tmp/openclaw-guardian.log

4. Run manually (testing / foreground)

# Source your config first
source ~/.openclaw/guardian.env

# Run guardian in the foreground (Ctrl-C to stop)
./scripts/guardian.sh

5. Uninstall

./scripts/uninstall-guardian.sh

Environment Variable Reference

| Variable | Default | Description |
|---|---|---|
| GUARDIAN_CHECK_INTERVAL | 30 | Seconds between health checks |
| GUARDIAN_MAX_REPAIR | 3 | Max Level 1+2 attempts before Level 3 |
| GUARDIAN_COOLDOWN | 600 | Cooldown sleep (seconds) after all repairs fail |
| GUARDIAN_ENABLE_ROLLBACK | false | Enable Level 3 git rollback (off by default) |
| GUARDIAN_LOG | /tmp/openclaw-guardian.log | Log file path (rotates at 1 MB) |
| GUARDIAN_WORKSPACE | $HOME/.openclaw/workspace | Path to the OpenClaw workspace git repo |
| GUARDIAN_TELEGRAM_BOT_TOKEN | (unset) | Telegram Bot API token |
| GUARDIAN_TELEGRAM_CHAT_ID | (unset) | Telegram chat or channel ID |
| GUARDIAN_DISCORD_WEBHOOK_URL | (unset) | Discord incoming webhook URL |
| OPENCLAW_PORT | (auto-detected) | Gateway HTTP port; parsed from openclaw gateway status if not set |
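
The documented defaults can be applied with ordinary shell parameter expansion, as in the sketch below. The function names are hypothetical, and the validation step is an addition for illustration. Whether the shipped script validates its settings is not stated here.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of applying the documented defaults; not the shipped guardian.sh.

load_guardian_defaults() {
  # ":=" assigns the default only when the variable is unset or empty.
  : "${GUARDIAN_CHECK_INTERVAL:=30}"
  : "${GUARDIAN_MAX_REPAIR:=3}"
  : "${GUARDIAN_COOLDOWN:=600}"
  : "${GUARDIAN_ENABLE_ROLLBACK:=false}"
  : "${GUARDIAN_LOG:=/tmp/openclaw-guardian.log}"
  : "${GUARDIAN_WORKSPACE:=$HOME/.openclaw/workspace}"
}

# Reject non-numeric values before they reach sleep or a loop counter.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) [ "$1" -gt 0 ] ;;
  esac
}

validate_guardian_config() {
  local name value
  for name in GUARDIAN_CHECK_INTERVAL GUARDIAN_MAX_REPAIR GUARDIAN_COOLDOWN; do
    eval "value=\${$name}"   # indirect lookup; names come from the fixed list above
    is_positive_int "$value" || { echo "invalid $name: '$value'" >&2; return 1; }
  done
  case "$GUARDIAN_ENABLE_ROLLBACK" in
    true|false) ;;
    *) echo "GUARDIAN_ENABLE_ROLLBACK must be true or false" >&2; return 1 ;;
  esac
}
```

Calling `load_guardian_defaults && validate_guardian_config` before the monitoring loop starts would catch a typo such as GUARDIAN_COOLDOWN=10m early.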

File Layout

skills/openclaw-guardian/
├── SKILL.md                    ← this file
└── scripts/
    ├── guardian.sh             ← main watchdog (run continuously)
    ├── install-guardian.sh     ← sets up launchd / systemd service
    └── uninstall-guardian.sh   ← clean removal

Runtime files (created automatically, not committed):

| File | Purpose |
|---|---|
| /tmp/openclaw-guardian.lock | Single-instance lockfile containing PID |
| /tmp/openclaw-guardian-last-snapshot | Date of last successful daily snapshot |
| /tmp/openclaw-guardian.log | Current log (rotated to .log.1 at 1 MB) |
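
A PID lockfile with stale-lock detection might be sketched as follows. The names acquire_lock and release_lock are hypothetical, and this simple version is not atomic: two instances starting at the same instant could both pass the check. A stricter variant could create the file with the shell's noclobber option.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a PID lockfile with stale-lock detection; not the shipped guardian.sh.

# acquire_lock <lockfile>; returns 1 if a live instance holds the lock.
acquire_lock() {
  local lock="$1" pid
  if [ -f "$lock" ]; then
    pid="$(cat "$lock" 2>/dev/null)"
    # kill -0 sends no signal; it only tests whether the PID exists.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
      echo "guardian already running (PID $pid)" >&2
      return 1
    fi
    rm -f "$lock"   # owner is gone, so the lock is stale
  fi
  echo "$$" > "$lock"
}

release_lock() { rm -f "$1"; }
```

Typical use would be `acquire_lock /tmp/openclaw-guardian.lock || exit 1` at startup, with `release_lock` called from an exit trap.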

How It Improves on myclaw-guardian

| Issue in myclaw-guardian | Fix in openclaw-guardian |
|---|---|
| git reset --hard without stashing; could silently destroy uncommitted work | Always git stash before any reset; git stash pop to restore regardless of outcome |
| Process detection via pgrep; fragile, can match the wrong process | Uses openclaw gateway status (the actual CLI) as primary, with HTTP fallback |
| No lockfile; multiple instances could run simultaneously | /tmp/openclaw-guardian.lock with PID written; stale lock detection on startup |
| Only Discord alerts | Supports Telegram and Discord simultaneously; log-only if neither configured |
| Level 3 rollback always enabled; risky default | Level 3 off by default (GUARDIAN_ENABLE_ROLLBACK=false), explicit opt-in required |
| No graduated health checking | Two independent checks: CLI → HTTP; both must fail before declaring gateway down |
| No cooldown after exhausting repairs | Configurable cooldown (GUARDIAN_COOLDOWN) before resuming monitoring |

Logging

Logs are timestamped and structured:

[2026-03-05 11:30:00] [INFO] OpenClaw Guardian started (PID 12345)
[2026-03-05 11:30:30] [INFO] Gateway healthy
[2026-03-05 11:31:00] [WARN] CLI status check failed — trying HTTP health endpoint
[2026-03-05 11:31:05] [WARN] Gateway health check FAILED
[2026-03-05 11:31:05] [INFO] ALERT: 🔴 Gateway is DOWN — beginning repair sequence
[2026-03-05 11:31:05] [INFO] Repair Level 1: restarting gateway
[2026-03-05 11:31:35] [INFO] Level 1 repair succeeded

The log rotates automatically once it exceeds 1 MB, keeping a single backup (.log.1).
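
The log format and rotation rule described here could be reproduced with a helper like the one below. It is a sketch, not the shipped code: log, rotate_log, and MAX_LOG_BYTES are hypothetical names, and treating 1 MB as 1048576 bytes is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of timestamped logging with rotation; not the shipped guardian.sh.
GUARDIAN_LOG="${GUARDIAN_LOG:-/tmp/openclaw-guardian.log}"
MAX_LOG_BYTES=1048576   # assumption: "1 MB" read as 1024 * 1024 bytes

rotate_log() {
  [ -f "$GUARDIAN_LOG" ] || return 0
  local size
  # Arithmetic expansion strips the padding some wc implementations emit.
  size=$(( $(wc -c < "$GUARDIAN_LOG") ))
  # Keep exactly one backup: any previous .log.1 is overwritten.
  if [ "$size" -gt "$MAX_LOG_BYTES" ]; then
    mv -f "$GUARDIAN_LOG" "${GUARDIAN_LOG}.1"
  fi
}

# log <LEVEL> <message...>
log() {
  local level="$1"
  shift
  rotate_log
  printf '[%s] [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$*" >> "$GUARDIAN_LOG"
}
```

For example, `log WARN "Gateway health check FAILED"` appends one line in the bracketed timestamp and level format shown in the sample output.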


Security Notes

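  • No secrets in git: daily snapshots use git add -A, which respects .gitignore. Ensure your .gitignore excludes .env, *.key, etc. Note that .gitignore has no effect on files git already tracks, so untrack any previously committed secret first (git rm --cached).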
  • Level 3 rollback is destructive by nature — only enable it if you understand git reset semantics and have tested your .gitignore coverage.
  • Alert tokens in env only — never put GUARDIAN_TELEGRAM_BOT_TOKEN or webhook URLs in files that get committed.


Skill Info

Creator: zurbrick
Downloads: 83
Published: Mar 15, 2026
Updated: Mar 16, 2026