GUI Automation
Control the desktop via the CUA computer server API running on port 8000
Desktop Control via CUA Server
This skill allows OpenClaw to control the desktop using the CUA computer server API.
⚠️ Security Notice
This skill requires installing and running a third-party server (cua-computer-sdk) that has full control over your desktop.
Before using this skill:
- The server can simulate keyboard and mouse input and take screenshots
- Only run on systems where you trust all users and processes
- The server runs with your user privileges (no sudo/admin required)
- By default, only accessible from localhost (safe for local use)
Prerequisites
- Python 3.12+ installed on your system
- CUA computer server running on port 8000 (see installation below)
- Access to localhost:8000 only (network exposure not recommended)
Installation
Recommended: Temporary Session (Safest)
Run the server only when needed, in a terminal you can monitor:
# Install the Computer SDK (official CUA package)
pip install cua-computer-sdk
# Verify package (optional but recommended)
pip show cua-computer-sdk # Check publisher and version
# Run temporarily (Ctrl+C to stop)
cua-server start --port 8000 --bind 127.0.0.1
# In another terminal, verify it's running locally only
curl http://localhost:8000/status
netstat -an | grep 8000 # Should show 127.0.0.1:8000
This is the safest approach: the server runs only when you explicitly start it and stops when you close the terminal.
Alternative: Install from Source
For transparency, you can review and run from source:
# Clone and review the code first
git clone https://github.com/trycua/cua-computer-server
cd cua-computer-server
# Review the code before running
ls -la
cat requirements.txt # Check dependencies
# Install and run
pip install -r requirements.txt
python -m cua_server --port 8000 --bind 127.0.0.1
Running the Server
Option 1: Manual Start (Recommended)
# Start in foreground - you can see what it's doing
cua-server start --port 8000
# Stop with Ctrl+C when done
Option 2: Background Process (Temporary)
# Run in background for current session only
cua-server start --port 8000 &
# Save the process ID of the background job
SERVER_PID=$!
echo "Server PID: $SERVER_PID"
# Stop when done
kill "$SERVER_PID"
Note: This skill does NOT require persistent/system service installation. Running the server temporarily when needed is the recommended approach.
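The start, record-the-PID, and kill sequence can also be scripted so the server is stopped even if the automation fails midway. The following is a minimal sketch, not part of the CUA package: `run_temporarily` is a hypothetical helper name, and the command line is assumed to be the one shown in the installation section.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def run_temporarily(cmd):
    """Start `cmd` as a child process and stop it when the block exits."""
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        # Ask the process to exit, then force-kill it if it ignores the request.
        proc.terminate()
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


# Assumed command line, copied from the installation section.
SERVER_CMD = ["cua-server", "start", "--port", "8000", "--bind", "127.0.0.1"]

# Usage (commented out so importing this file never starts the server):
# with run_temporarily(SERVER_CMD) as server:
#     print(f"Server PID: {server.pid}")
#     ...  # send commands here
```

Because the cleanup lives in a `finally` clause, the server process does not outlive the `with` block, which matches the "run temporarily" recommendation.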
Scope & Limitations
This skill:
- ✅ Controls YOUR desktop when the server is running
- ✅ Runs with YOUR user privileges (no admin/sudo needed)
- ✅ Only accessible from localhost by default
Security Best Practices
- Run Temporarily: Start the server only when needed, stop when done
- Localhost Only: Keep default binding to 127.0.0.1
- No Network Exposure: Avoid --bind 0.0.0.0 unless absolutely necessary
- Monitor Activity: Run in foreground to see what commands are executed
- Limited Scope: The server can only do what your user account can do
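A wrapper script could refuse to start the server on a non-loopback address before launching it. The sketch below shows one way to check a bind address with Python's standard `ipaddress` module. `is_loopback_bind` is a hypothetical helper, and treating the name `localhost` as loopback assumes a standard hosts file.

```python
import ipaddress


def is_loopback_bind(host: str) -> bool:
    """Return True if `host` only accepts connections from this machine."""
    if host == "localhost":
        # Assumption: localhost resolves to a loopback address.
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname): treat it as unsafe.
        return False
```

With this check, `127.0.0.1` and `::1` pass, while `0.0.0.0` and LAN addresses such as `192.168.1.10` are rejected.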
Quick Test
After starting the server, verify it works:
# Simple health check
curl http://localhost:8000/status
# Should return: {"status": "ok"}
# Take a screenshot (safe test)
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "screenshot"}' \
-o screenshot.json
# If successful, you'll get a JSON response with base64 image data
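The same health check can be done from a script before any command is sent. This is a minimal sketch using only the standard library. It assumes the `/status` endpoint and the `{"status": "ok"}` reply shown above; `server_is_up` is a hypothetical helper name.

```python
import json
import urllib.error
import urllib.request

# Ignore proxy settings so requests to localhost never leave the machine.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def server_is_up(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if GET /status answers with {"status": "ok"}."""
    try:
        with _OPENER.open(f"{base_url}/status", timeout=timeout) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a reply that is not valid JSON.
        return False
    return isinstance(body, dict) and body.get("status") == "ok"
```

Calling `server_is_up()` before a workflow lets a script fail early with a clear message instead of a connection error in the middle of a sequence.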
Troubleshooting
Port Already in Use:
# Check what's using port 8000
lsof -i :8000 # macOS/Linux
netstat -ano | findstr :8000 # Windows
# Solution: Use a different port
cua-server start --port 8001
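If checking ports by hand is inconvenient, a short script can look for a free one. The sketch below uses only Python's `socket` module. The function names and the 8000 to 8010 range are illustrative assumptions, not CUA settings.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def first_free_port(start: int = 8000, end: int = 8010) -> int:
    """Return the first port in [start, end] with no listener."""
    for port in range(start, end + 1):
        if not port_in_use(port):
            return port
    raise RuntimeError(f"no free port between {start} and {end}")
```

The result can then be passed to `cua-server start --port`, and the same port must be used in every later `curl` call.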
Permission Denied (Linux):
# You may need to add your user to the input group for keyboard/mouse control
sudo usermod -a -G input $USER
# Log out and back in for changes to take effect
Display Not Found (Linux):
# Check your display variable
echo $DISPLAY
# Set it explicitly
DISPLAY=:0 cua-server start --port 8000
Server Not Responding:
# Check if the process is running
ps aux | grep cua-server # Linux/macOS
tasklist | findstr cua-server # Windows
# Try running in foreground to see errors
cua-server start --port 8000 --debug
Available Commands
Take Screenshot
Capture the current screen:
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "screenshot"}' \
| jq -r '.result.base64' \
| base64 -d > screenshot.png
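Where `jq` or `base64` are not installed, the decoding step can be done in Python instead. This is a minimal sketch that assumes the reply stores the image at `result.base64`, as the `jq` filter above implies. `save_screenshot` is a hypothetical helper name.

```python
import base64
import json
from pathlib import Path


def save_screenshot(response_text: str, out_path: str) -> int:
    """Decode result.base64 from a /cmd reply and write it to out_path.

    Returns the number of bytes written.
    """
    reply = json.loads(response_text)
    # Field path assumed from the jq filter '.result.base64'.
    image_bytes = base64.b64decode(reply["result"]["base64"])
    return Path(out_path).write_bytes(image_bytes)
```

For example, the `screenshot.json` file saved in the Quick Test could be converted with `save_screenshot(Path("screenshot.json").read_text(), "screenshot.png")`.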
Click at Coordinates
Click at specific x,y coordinates:
# Click at center of 1280x720 screen
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "left_click", "params": {"x": 640, "y": 360}}'
Right Click
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "right_click", "params": {"x": 640, "y": 360}}'
Double Click
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "double_click", "params": {"x": 640, "y": 360}}'
Type Text
Type text at the current cursor position:
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "type_text", "params": {"text": "Hello, World!"}}'
Press Hotkey
Press a key combination:
# Ctrl+C
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "hotkey", "params": {"keys": ["ctrl", "c"]}}'
# Ctrl+Alt+T (open terminal)
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "hotkey", "params": {"keys": ["ctrl", "alt", "t"]}}'
Press Single Key
Press a single key:
# Press Enter
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "press_key", "params": {"key": "enter"}}'
# Press Escape
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "press_key", "params": {"key": "escape"}}'
Move Cursor
Move cursor to specific position:
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "move_cursor", "params": {"x": 100, "y": 200}}'
Scroll
Scroll up or down:
# Scroll down 3 units
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "scroll_direction", "params": {"direction": "down", "amount": 3}}'
# Scroll up 5 units
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "scroll_direction", "params": {"direction": "up", "amount": 5}}'
Launch Application
Launch an application by name:
# Launch Firefox
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "launch", "params": {"app": "firefox"}}'
# Launch Terminal
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "launch", "params": {"app": "xfce4-terminal"}}'
Open File or URL
Open a file or URL with default application:
# Open URL
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "open", "params": {"path": "https://example.com"}}'
# Open file
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "open", "params": {"path": "/home/cua/document.txt"}}'
Get Window Information
Get current window ID:
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "get_current_window_id"}'
Window Control
Maximize window:
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "maximize_window", "params": {"window_id": "0x1234567"}}'
Minimize window:
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "minimize_window", "params": {"window_id": "0x1234567"}}'
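Every command above uses the same request shape: a POST to `/cmd` with a JSON body containing `command` and, optionally, `params`. The sketch below wraps that pattern with the standard library. `build_request` and `send_command` are hypothetical helper names, and the reply format is whatever the server returns.

```python
import json
import urllib.request


def build_request(command, params=None, base_url="http://localhost:8000"):
    """Build the POST /cmd request used by the curl examples."""
    payload = {"command": command}
    if params:
        payload["params"] = params
    return urllib.request.Request(
        f"{base_url}/cmd",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_command(command, params=None, base_url="http://localhost:8000", timeout=10.0):
    """Send one command and return the decoded JSON reply."""
    request = build_request(command, params, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

With a server running, `send_command("left_click", {"x": 640, "y": 360})` would be the equivalent of the Click at Coordinates example.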
Demo Workflows
Browser Navigation Demo
Open Firefox and navigate to a website:
# Take initial screenshot
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "screenshot"}' -o initial.json
# Launch Firefox
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "launch", "params": {"app": "firefox"}}'
sleep 3
# Focus address bar (Ctrl+L)
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "hotkey", "params": {"keys": ["ctrl", "l"]}}'
sleep 1
# Type URL
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "type_text", "params": {"text": "https://example.com"}}'
# Press Enter
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "press_key", "params": {"key": "enter"}}'
sleep 5
# Take final screenshot
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "screenshot"}' -o final.json
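The browser demo is a fixed list of commands with pauses between them, so it can be expressed as data and replayed by a small runner. The sketch below mirrors the steps and delays above. `run_steps` is a hypothetical helper, and `send` is a caller-supplied function that posts one command to the server.

```python
import time

# (command, params, seconds to wait afterwards), mirroring the demo steps.
BROWSER_DEMO = [
    ("screenshot", None, 0),
    ("launch", {"app": "firefox"}, 3),
    ("hotkey", {"keys": ["ctrl", "l"]}, 1),
    ("type_text", {"text": "https://example.com"}, 0),
    ("press_key", {"key": "enter"}, 5),
    ("screenshot", None, 0),
]


def run_steps(steps, send, sleep=time.sleep):
    """Run each step through `send` and pause for the listed delay.

    `send(command, params)` must post one command and return its reply.
    """
    replies = []
    for command, params, pause in steps:
        replies.append(send(command, params))
        if pause:
            sleep(pause)
    return replies
```

Passing `send` and `sleep` as arguments keeps the runner testable without a server and makes it easy to lengthen the delays on a slow machine.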
Text Editor Demo
Open text editor and type content:
# Open terminal
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "hotkey", "params": {"keys": ["ctrl", "alt", "t"]}}'
sleep 2
# Type command to open text editor
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "type_text", "params": {"text": "mousepad"}}'
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "press_key", "params": {"key": "enter"}}'
sleep 2
# Type some text
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "type_text", "params": {"text": "Hello from OpenClaw!\nThis is automated desktop control."}}'
# Save file (Ctrl+S)
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "hotkey", "params": {"keys": ["ctrl", "s"]}}'
sleep 1
# Type filename
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "type_text", "params": {"text": "openclaw-demo.txt"}}'
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "press_key", "params": {"key": "enter"}}'
Form Filling Demo
Fill out a web form:
# Assuming browser is open with form visible
# Click on first input field (adjust coordinates)
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "left_click", "params": {"x": 400, "y": 300}}'
# Type name
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "type_text", "params": {"text": "John Doe"}}'
# Tab to next field
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "press_key", "params": {"key": "tab"}}'
# Type email
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "type_text", "params": {"text": "john@example.com"}}'
# Tab to next field
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "press_key", "params": {"key": "tab"}}'
# Type message
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "type_text", "params": {"text": "This form was filled automatically by OpenClaw!"}}'
# Submit form (click submit button)
curl -X POST http://localhost:8000/cmd -H "Content-Type: application/json" -d '{"command": "left_click", "params": {"x": 400, "y": 500}}'
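The form demo follows a regular pattern: click the first field, type a value, press Tab before each further value, then click the submit button. A sketch that generates this sequence for any number of fields is shown below. `form_steps` is a hypothetical helper, and the coordinates remain page-specific values you must supply.

```python
def form_steps(first_field, values, submit_button):
    """Return (command, params) pairs that fill a form field by field.

    first_field and submit_button are (x, y) screen coordinates; values are
    typed in order, with Tab pressed between consecutive fields.
    """
    steps = [("left_click", {"x": first_field[0], "y": first_field[1]})]
    for index, value in enumerate(values):
        if index > 0:
            steps.append(("press_key", {"key": "tab"}))
        steps.append(("type_text", {"text": value}))
    steps.append(("left_click", {"x": submit_button[0], "y": submit_button[1]}))
    return steps
```

Calling `form_steps((400, 300), ["John Doe", "john@example.com"], (400, 500))` yields the click, type, Tab, type, click sequence, which can then be sent one command at a time.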
Helper Functions
Check Server Status
curl http://localhost:8000/status
List All Available Commands
curl http://localhost:8000/commands | jq
Get Screen Size
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "get_screen_size"}'
Get Cursor Position
curl -X POST http://localhost:8000/cmd \
-H "Content-Type: application/json" \
-d '{"command": "get_cursor_position"}'
Environment Variables
CUA_SERVER_URL: Base URL for CUA server (default: http://localhost:8000)
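A script that talks to the server can honor this variable and fall back to the documented default. This is a minimal sketch; `server_url` is a hypothetical helper name, and stripping a trailing slash is a convenience choice rather than documented behavior.

```python
import os


def server_url(env=None):
    """Return CUA_SERVER_URL, or the documented default when it is unset."""
    env = os.environ if env is None else env
    url = env.get("CUA_SERVER_URL") or "http://localhost:8000"
    # Drop a trailing slash so f"{url}/cmd" never contains "//cmd".
    return url.rstrip("/")
```

For example, after `export CUA_SERVER_URL=http://localhost:8001`, `server_url()` returns that address instead of the default.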
Tips
- Wait Between Commands: Add sleep between commands to allow UI to update
- Check Coordinates: Screen is 1280x720, center is at (640, 360)
- Screenshot for Debugging: Take screenshots before and after actions to verify
- Use Variables: Store coordinates and text in variables for reusability
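The coordinate tips can be made concrete with two small helpers: one computes the screen center, and one rescales a point picked on the 1280x720 reference screen to another resolution. Both are illustrative sketches with hypothetical names. The actual screen size should come from the `get_screen_size` command.

```python
def center(width: int, height: int) -> tuple[int, int]:
    """Return the pixel at the middle of a width x height screen."""
    return width // 2, height // 2


def scale_point(x, y, from_size=(1280, 720), to_size=(1280, 720)):
    """Map a point chosen on one screen size to the same relative spot on another."""
    return (
        round(x * to_size[0] / from_size[0]),
        round(y * to_size[1] / from_size[1]),
    )
```

Here `center(1280, 720)` gives `(640, 360)`, and `scale_point(640, 360, to_size=(1920, 1080))` gives `(960, 540)`.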
Example OpenClaw Usage
Once this skill is loaded, you can use it in OpenClaw conversations:
User: "Take a screenshot and open Firefox"
OpenClaw: *executes the screenshot and launch firefox commands*
User: "Type 'Hello World' in the current window"
OpenClaw: *executes the type_text command*
User: "Click at the center of the screen"
OpenClaw: *executes the left_click command at (640, 360)*
Troubleshooting
- Connection Refused: Make sure CUA server is running on port 8000
- No Response: Check whether you are inside the container or have an SSH tunnel set up
- Commands Not Working: Verify with curl http://localhost:8000/status
- Wrong Coordinates: Remember screen is 1280x720, adjust coordinates accordingly
Skill Info
- Creator: sarinali
- Downloads: 44
- Published: Mar 15, 2026
- Updated: Mar 16, 2026