Autonomously breaks coding tasks into spec, plan, code, and QA phases, executing all heavy work via Claude Code with multi-account rate limit rotation.
Autonomous spec → plan → code → QA pipeline powered by Claude Code. All heavy computation runs through Claude Code (Max subscription). OpenClaw only orchestrates.
Flo (minimal tokens) → shell pipeline → Claude Code (all heavy work)
↓
Account rotation on rate limit
Classify the task before planning — each type has a different phase structure:
| Type | When | Phase Order |
|---|---|---|
feature | New capability | Backend → Worker → Frontend → Integration |
refactor | Restructure existing code | Add New → Migrate → Remove Old → Cleanup |
investigation | Bug hunt | Reproduce → Investigate → Fix → Harden |
migration | Move data/infra | Prepare → Test → Execute → Cleanup |
simple | Single-file change | Just subtasks, no phases |
bash ~/clawd/skills/flowforge/scripts/init_forge.sh "<task_description>" "<repo_path>"
Creates ~/.forge/<timestamp>/ with task.md.
bash ~/clawd/skills/flowforge/scripts/run_forge.sh ~/.forge/<timestamp>/
This chains 4 Claude Code calls:
spec.md (high thinking)implementation_plan.json (high thinking)Each step saves output to the workspace directory. Claude Code does ALL the work.
Poll workspace for completion:
tail -f ~/.forge/<timestamp>/progress.log
cat ~/.forge/<timestamp>/qa_report.md
Three Claude Max accounts rotate automatically on rate limit:
account-1@gmail.com → account-2@gmail.com → account-3@gmail.com → retry
Configure your accounts in ~/.flowforge/accounts.txt (one email per line).
Save credentials per account in ~/.claude/accounts/<email>.json.
Switch accounts with: bash <skill-dir>/scripts/rotate_account.sh
To pull a task from a GitHub issue:
gh issue view <number> --repo <owner>/<repo> --json title,body | \
jq -r '"# " + .title + "\n\n" + .body' > ~/.forge/<timestamp>/task.md
Then run the pipeline normally.
On completion, workspace contains:
spec.md — full specificationimplementation_plan.json — phases + subtasks with statusqa_report.md — QA review and scoreprogress.log — timestamped execution logAdd --rubric flag for high-stakes runs. Scores against a universal 200-criterion quality rubric after the spec-based QA pass:
bash ~/clawd/skills/flowforge/scripts/run_forge.sh ~/.forge/<timestamp>/ --rubric
Rubric covers: Architecture (40), Code Quality (40), Testing (40), Error Handling (30), Security (20), Documentation (15), Observability (15).
Verdict thresholds: ≥180 = Ship it | 150–179 = Needs work | <150 = Major rework
Skip --rubric for quick tasks. Use it before shipping to production.
See references/spec-prompt.md, references/planner-prompt.md, references/qa-prompt.md, references/rubric-prompt.md for the full Claude Code prompts used at each stage.
ZIP package — ready to use