
Willow Inference Server

Local ASR and TTS inference server. Use when the user wants to transcribe audio to text (ASR) or convert text to speech (TTS). Requires a running Willow Infe...

Willow Inference Server Skill

Local ASR (speech-to-text) and TTS (text-to-speech) inference server.

Setup

1. Start Willow Inference Server

git clone https://github.com/toverainc/willow-inference-server.git
cd willow-inference-server
./utils.sh install
./utils.sh gen-cert your-hostname
./utils.sh run

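The server listens at https://your-hostname:19000. The gen-cert step creates a self-signed certificate, so curl refuses the connection unless you pass -k (skip verification) or --cacert with the generated certificate.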
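
The first start can take a while, so it may help to wait until the server answers before sending requests. The following is a minimal bash sketch, not part of the skill itself: wait_for_willow and WILLOW_POLL_INTERVAL are hypothetical names, and it assumes the /api/docs page mentioned later in this document responds once the server is up and that the certificate is self-signed (hence -k).

```bash
# wait_for_willow [TRIES]
# Hypothetical helper: polls the docs page until the server responds.
wait_for_willow() {
  local base=${WILLOW_BASE_URL:-https://localhost:19000}
  local tries=${1:-30} i
  for ((i = 1; i <= tries; i++)); do
    # -k skips certificate verification (assumes a self-signed certificate).
    if curl -ksS --fail -o /dev/null --max-time 5 "${base}/api/docs" 2>/dev/null; then
      return 0
    fi
    # WILLOW_POLL_INTERVAL is an assumed knob for the delay between attempts.
    sleep "${WILLOW_POLL_INTERVAL:-2}"
  done
  return 1
}
```

Calling wait_for_willow 60 && echo "server is up" would poll up to 60 times, two seconds apart by default.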

2. Configure Environment

Set the server URL:

export WILLOW_BASE_URL="https://your-hostname:19000"

Alternatively, write the full server URL into each request in place of ${WILLOW_BASE_URL}.
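
If the variable may be unset, a small wrapper can fall back to the default listed under Environment Variables. This is a sketch in bash; willow_url is a hypothetical helper name, not something the skill provides.

```bash
# Fall back to the documented default when WILLOW_BASE_URL is unset or empty.
: "${WILLOW_BASE_URL:=https://localhost:19000}"

# willow_url PATH
# Hypothetical helper: joins the base URL and an endpoint path with one slash.
willow_url() {
  local base=${WILLOW_BASE_URL%/}   # drop one trailing slash, if present
  local path=${1#/}                 # drop one leading slash, if present
  printf '%s/%s\n' "$base" "$path"
}
```

With this in place, curl -X POST "$(willow_url asr)" behaves the same whether or not the base URL was exported with a trailing slash.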

ASR (Speech-to-Text)

Transcribe Audio File

curl -X POST "${WILLOW_BASE_URL}/asr" \
  -F "audio_file=@/path/to/audio.m4a" \
  -F "language=auto"

Parameters

Parameter    Description                                    Default
audio_file   Audio file to transcribe                       required
language     Language code (en, zh, etc.) or "auto"         auto
model        Whisper model (tiny, base, medium, large-v2)   server config
task         transcribe or translate                        transcribe
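
The parameters above can be wrapped in a small bash function so that optional fields are sent only when supplied. This is a sketch under the assumption that the /asr endpoint accepts the form fields exactly as listed; transcribe is a hypothetical helper name.

```bash
# transcribe FILE [LANGUAGE] [MODEL] [TASK]
# Hypothetical wrapper around the /asr request shown in this document.
transcribe() {
  local file=$1 language=${2:-auto} model=${3:-} task=${4:-transcribe}
  if [ ! -f "$file" ]; then
    echo "transcribe: no such file: $file" >&2
    return 1
  fi
  local args=(-sS --fail -X POST
              "${WILLOW_BASE_URL:-https://localhost:19000}/asr"
              -F "audio_file=@${file}"
              -F "language=${language}"
              -F "task=${task}")
  # Send "model" only when given, so the server-configured model applies otherwise.
  if [ -n "$model" ]; then
    args+=(-F "model=${model}")
  fi
  curl "${args[@]}"
}
```

For example, transcribe meeting.mp3 en base sends the same request as the "With specific model" example below, while transcribe recording.m4a relies on the defaults.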

Supported Formats

  • MP3, WAV, M4A, OGG, FLAC, WebM

Example: Transcribe with curl

# Basic transcription
curl -X POST "${WILLOW_BASE_URL}/asr" \
  -F "audio_file=@recording.m4a" \
  -F "language=zh"

# With specific model
curl -X POST "${WILLOW_BASE_URL}/asr" \
  -F "audio_file=@meeting.mp3" \
  -F "language=en" \
  -F "model=base"

TTS (Text-to-Speech)

Convert Text to Speech

curl -X POST "${WILLOW_BASE_URL}/tts" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world", "voice": "af_sarah"}' \
  -o output.wav

Parameters

Parameter   Description                 Default
text        Text to convert to speech   required
voice       Voice ID (see below)        default voice
speed       Speech speed (0.5-2.0)      1.0
volume      Volume (0.0-1.0)            1.0
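
Hand-written JSON in -d breaks as soon as the text contains a double quote or a newline. The bash sketch below builds the request body with escaping and checks speed against the documented 0.5-2.0 range. json_escape and tts_payload are hypothetical helper names, and the escaping covers only backslash, double quote, newline, tab, and carriage return, not every control character.

```bash
# json_escape STRING
# Escapes the characters that most often break a JSON string literal.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}       # backslash first, so later escapes are not doubled
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  s=${s//$'\t'/\\t}
  s=${s//$'\r'/\\r}
  printf '%s' "$s"
}

# tts_payload TEXT [VOICE] [SPEED]
# Prints a JSON body for the /tts request; omitted fields use server defaults.
tts_payload() {
  local text=$1 voice=${2:-} speed=${3:-}
  local json
  json="{\"text\": \"$(json_escape "$text")\""
  if [ -n "$voice" ]; then
    json+=", \"voice\": \"$(json_escape "$voice")\""
  fi
  if [ -n "$speed" ]; then
    # Accept only plain decimals inside the documented 0.5-2.0 range.
    if ! [[ $speed =~ ^(0|[1-9][0-9]*)(\.[0-9]+)?$ ]] ||
       ! awk -v s="$speed" 'BEGIN { exit !(s >= 0.5 && s <= 2.0) }'; then
      echo "tts_payload: speed must be between 0.5 and 2.0" >&2
      return 1
    fi
    json+=", \"speed\": ${speed}"
  fi
  printf '%s}\n' "$json"
}
```

The result can be passed straight to curl, for example -d "$(tts_payload 'She said "hello"' am_michael 1.2)".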

Available Voices

Common voices (format: prefix_voicename, where the prefix ends in f for female or m for male):

  • af_sarah - Sarah (Female)
  • af_bella - Bella (Female)
  • am_michael - Michael (Male)
  • am_alex - Alex (Male)

Check server docs for full list: ${WILLOW_BASE_URL}/api/docs

Example: TTS with curl

# Basic TTS (the Chinese text means "Hello, this is a test")
curl -X POST "${WILLOW_BASE_URL}/tts" \
  -H "Content-Type: application/json" \
  -d '{"text": "你好,这是测试"}' \
  -o output.wav

# With custom voice
curl -X POST "${WILLOW_BASE_URL}/tts" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello!", "voice": "am_michael", "speed": 1.2}' \
  -o hello.wav

Environment Variables

Variable          Description   Default
WILLOW_BASE_URL   Server URL    https://localhost:19000

Workflow Examples

1. Record and Transcribe

# Record audio (rec is part of SoX, which must be installed; press Ctrl+C to stop)
rec test.wav

# Transcribe
curl -X POST "${WILLOW_BASE_URL}/asr" \
  -F "audio_file=@test.wav" \
  -F "language=auto"

2. Text to Speech

# Convert text to speech (the Chinese text means "Today's task is to learn a new skill")
curl -X POST "${WILLOW_BASE_URL}/tts" \
  -H "Content-Type: application/json" \
  -d '{"text": "今天的任务是学习新技能"}' \
  -o speech.wav

3. Batch Transcription

for f in *.m4a; do
  curl -X POST "${WILLOW_BASE_URL}/asr" \
    -F "audio_file=@$f" \
    -F "language=auto" \
    -o "${f%.m4a}.txt"
done
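
The loop above has two rough edges: when no .m4a files exist, the unmatched pattern is passed to curl literally, and a failed request still leaves a .txt file behind. A slightly more defensive bash sketch follows; transcribe_dir is a hypothetical helper name, and it makes the same assumptions about the /asr endpoint as the rest of this document.

```bash
# transcribe_dir [DIR]
# Hypothetical helper: transcribes every .m4a file in DIR (default: current
# directory), skipping files that already have a non-empty transcript.
transcribe_dir() {
  local dir=${1:-.} f out failed=0
  for f in "$dir"/*.m4a; do
    [ -e "$f" ] || continue        # the glob matched nothing
    out="${f%.m4a}.txt"
    [ -s "$out" ] && continue      # already transcribed on an earlier run
    # --fail makes curl return non-zero on HTTP errors (status 400 and above).
    if ! curl -sS --fail -X POST "${WILLOW_BASE_URL}/asr" \
         -F "audio_file=@${f}" \
         -F "language=auto" \
         -o "$out"; then
      echo "failed: $f" >&2
      rm -f "$out"                 # do not keep a partial or error response
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]              # non-zero exit status if any file failed
}
```

Because finished transcripts are skipped, rerunning transcribe_dir ./recordings after an interruption retries only the files that are still missing.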

API Documentation

Full API docs available at: ${WILLOW_BASE_URL}/api/docs

Notes

  • Endpoints are served over HTTPS by default; plain HTTP works only if the server is configured for it
  • Audio files are processed locally on the server
  • ASR latency depends on model size and hardware
  • Custom TTS voices can be created from your own voice recordings

Skill Info

Creator: DeAntiWang
Downloads: 73
Published: Mar 15, 2026
Updated: Mar 16, 2026