Optimized Smart Model Switcher (v2) - Zero-latency, no restart required. Automatically selects and switches to the best available model for each task from yo...
Zero-Latency • No Restart • Runtime Switching
| Feature | V1 | V2 |
|---|---|---|
| Restart Required | ❌ Yes | ✅ No |
| Switch Latency | 5-10s | <100ms |
| Model Preloading | ❌ No | ✅ Yes |
| Parallel Processing | ❌ No | ✅ Yes |
| Auto Model Discovery | ❌ No | ✅ Yes |
| Fallback Logic | Basic | Advanced |
| Performance | Low | High |
| Task Type | Primary Model | Fallback 1 | Fallback 2 | Latency |
|---|---|---|---|---|
| 写小说/创意写作 | qwen3.5-plus | qwen3.5-397b | qwen-plus | <50ms |
| 写代码/编程 | qwen3-coder-plus | qwen3-coder-next | qwen3.5-plus | <50ms |
| 复杂推理/数学 | qwen3-max | qwen3.5-plus | qwen-plus | <50ms |
| 数据分析 | qwen3.5-plus | qwen3-max | qwen-plus | <50ms |
| 日常对话 | qwen3.5-plus | qwen-plus | qwen-turbo | <30ms |
| 长文档处理 | qwen3.5-plus | qwen3.5-397b | qwen-plus | <50ms |
| Debug/修复 | qwen3-coder-plus | qwen3.5-plus | qwen-plus | <50ms |
| 翻译 | qwen3.5-plus | qwen-plus | qwen-turbo | <30ms |
┌─────────────────────────────────────────────────────────┐
│ User Request │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ Task Analyzer (30ms) │
│ • Keyword matching │
│ • Context analysis │
│ • Confidence scoring │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ Model Registry (Preloaded) │
│ • qwen3.5-plus (Ready) │
│ • qwen3-coder-plus (Ready) │
│ • qwen3-max (Ready) │
│ • ... (All models preloaded) │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ Model Selector (20ms) │
│ • Select best model for task │
│ • Check availability │
│ • Apply fallback if needed │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ Model API Call │
│ • Direct API call (no config change) │
│ • Connection pooling │
│ • Auto retry │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ Response │
│ • Return result │
│ • Log performance │
│ • Update statistics │
└─────────────────────────────────────────────────────────┘
| Metric | V1 | V2 | Improvement |
|---|---|---|---|
| Switch Time | 5-10s | <100ms | 50-100x faster |
| Memory Usage | Low | Medium | +20% (worth it) |
| CPU Usage | Low | Low | Same |
| API Calls | 1 | 1-2 | Same |
| User Experience | Poor | Excellent | Significant |
Example 1: Writing Task
User: "帮我写一本科幻小说"
Agent: "🧠 Switched to qwen3.5-plus (best for novel writing, 1M context)"
[Completes task]
Example 2: Coding Task
User: "帮我写个 Python 爬虫"
Agent: "🧠 Switched to qwen3-coder-plus (best for coding)"
[Completes task]
Example 3: Reasoning Task
User: "这道数学题怎么做?"
Agent: "🧠 Switched to qwen3-max (best for reasoning)"
[Completes task]
Example 4: Multi-Step Task
User: "帮我写个贪吃蛇游戏,然后写个游戏说明"
Agent: "🧠 Switched to qwen3-coder-plus (best for coding)"
[Writes code]
Agent: "🧠 Switched to qwen3.5-plus (best for writing)"
[Writes documentation]
| Limitation | Description |
|---|---|
| Plan-Bound | Only uses models from your purchased plan |
| No External | Won't call models outside your plan |
| Requires Config | Needs correct openclaw.json setup |
| Memory | Uses 20% more memory for preloading |
1. Extract keywords from user request
2. Match against task type keywords
3. Calculate confidence score for each type
4. Select type with highest confidence
5. If confidence < threshold, use default
6. Map type to best model
7. Check model availability
8. Apply fallback if needed
{
"models": {
"qwen3.5-plus": {
"status": "ready",
"tasks": ["writing", "analysis", "translation"],
"context": 1000000,
"priority": 1
},
"qwen3-coder-plus": {
"status": "ready",
"tasks": ["coding", "debug"],
"context": 100000,
"priority": 1
},
"qwen3-max": {
"status": "ready",
"tasks": ["reasoning", "math"],
"context": 100000,
"priority": 1
}
}
}
Primary Model (Unavailable?)
↓
Fallback 1 (Unavailable?)
↓
Fallback 2 (Unavailable?)
↓
Default Model (Always available)
| Benefit | Impact |
|---|---|
| No Restart | Save 5-10s per switch |
| Zero Latency | Instant model switching |
| Better UX | Users don't notice switching |
| Auto-Update | New models auto-detected |
| Reliable | Advanced fallback logic |
| Efficient | Connection pooling |
| Feature | V1 | V2 |
|---|---|---|
| Restart Required | Yes | No |
| Switch Latency | 5-10s | <100ms |
| Model Preloading | No | Yes |
| Auto Discovery | No | Yes |
| Fallback | Basic | Advanced |
| Performance | Low | High |
| Memory | Low | Medium (+20%) |
| User Experience | Poor | Excellent |
# Clone repository
git clone https://github.com/davidme6/openclaw.git
# Copy skill to workspace
cp -r openclaw/skills/smart-model-switcher-v2 ~/.openclaw/workspace/skills/
# Restart Gateway (one-time)
openclaw gateway restart
No configuration needed! The skill auto-detects your plan and available models.
Q: Why didn't it switch models? A: Check logs for fallback events. Primary model may be unavailable.
Q: Can I override the selection? A: Yes, manually specify a model and it will use that.
Q: How do I know which model is being used? A: It always tells you at the start of each task.
Q: Memory usage increased? A: Normal. Model preloading uses 20% more memory for instant switching.
Version: 2.0.0 (Optimized) Author: Created for Coding Plan users License: MIT Release Date: 2026-03-10
Upgrade from V1 today and experience zero-latency model switching! 🚀
ZIP package — ready to use