Speech to Text
Transcribe or translate audio files to text using a public Hugging Face Whisper Space over Gradio. Use when the user sends voice notes, audio attachments, me...
Tags: audio, free, speech-to-text, transcription, voice, whisper
Use this skill to turn local audio files into text with a public Whisper-based endpoint.
Quick start
Run:
python3 scripts/transcribe.py /path/to/file.ogg
Return the transcript as plain text. By default, the script also applies lightweight Chinese punctuation and sentence-breaking cleanup.
For machine-readable output:
python3 scripts/transcribe.py /path/to/file.ogg --json
To disable cleanup and keep the raw model text:
python3 scripts/transcribe.py /path/to/file.ogg --format raw
To force Chinese punctuation cleanup:
python3 scripts/transcribe.py /path/to/file.ogg --format zh
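As an illustration of what "lightweight Chinese punctuation cleanup" could involve, here is a minimal sketch. It is not the logic in scripts/transcribe.py, which is not shown here; the function name `cleanup_zh`, the punctuation mapping, and the rule of appending a final "。" are all assumptions made for this example.

```python
import re

# Hypothetical mapping from ASCII punctuation to full-width Chinese punctuation.
ASCII_TO_ZH = {",": "，", "?": "？", "!": "！", ";": "；", ":": "："}

# Basic CJK Unified Ideographs range, used inside regex character classes.
CJK = r"\u4e00-\u9fff"


def cleanup_zh(text: str) -> str:
    """Normalize punctuation that directly follows a Chinese character."""
    text = text.strip()
    # Whisper output sometimes puts spaces between Chinese words; drop them.
    text = re.sub(rf"(?<=[{CJK}])\s+(?=[{CJK}])", "", text)
    # Replace ASCII punctuation after a Chinese character with full-width marks.
    text = re.sub(
        rf"(?<=[{CJK}])([,?!;:])\s*",
        lambda m: ASCII_TO_ZH[m.group(1)],
        text,
    )
    # An ASCII period after a Chinese character becomes a Chinese full stop.
    text = re.sub(rf"(?<=[{CJK}])\.\s*", "。", text)
    # If the text ends on a bare Chinese character, close the sentence.
    if text and re.match(rf"[{CJK}]", text[-1]):
        text += "。"
    return text
```

Text without Chinese characters passes through unchanged, so a sketch like this is safe to apply to mixed-language transcripts.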
For English translation instead of same-language transcription:
python3 scripts/transcribe.py /path/to/file.ogg --task translate
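When the script is driven from other Python code, the flags above can be combined programmatically. The helper below is a small sketch; `build_command` and its parameter names are hypothetical, and it uses only the flags shown in this section (`--json`, `--format`, `--task`).

```python
from typing import List, Optional


def build_command(
    audio_path: str,
    *,
    as_json: bool = False,
    fmt: Optional[str] = None,   # e.g. "raw" or "zh", as shown above
    task: Optional[str] = None,  # e.g. "translate", as shown above
    script: str = "scripts/transcribe.py",
) -> List[str]:
    """Build the argument list for one transcribe.py invocation."""
    cmd = ["python3", script, audio_path]
    if as_json:
        cmd.append("--json")
    if fmt is not None:
        cmd += ["--format", fmt]
    if task is not None:
        cmd += ["--task", task]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`. Passing a list rather than a shell string avoids quoting problems with file names that contain spaces.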
Workflow
- Confirm the input is a local audio file.
- Run scripts/transcribe.py on it.
- If the transcript looks imperfect, tell the user it came from a public Whisper endpoint and may need cleanup.
- If helpful, post-process into:
- cleaned transcript
- summary
- action items
- bilingual output
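The first workflow step, confirming that the input is a local audio file, can be sketched as a small check. The function name and the extension list are assumptions for illustration; the formats the endpoint actually accepts are not stated here.

```python
from pathlib import Path

# Assumed set of common audio extensions; adjust to what the endpoint accepts.
AUDIO_EXTENSIONS = {".ogg", ".opus", ".mp3", ".wav", ".m4a", ".flac", ".webm"}


def is_local_audio_file(path: str) -> bool:
    """Return True if path points to an existing file with an audio extension."""
    p = Path(path)
    # is_file() is False for missing paths, directories, and URLs.
    return p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
```

A check like this only looks at the file name, not the content, so a mislabeled file can still fail later at the endpoint.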
What the script does
The script:
- uploads the local file to a public Gradio-backed Hugging Face Space
- submits a Whisper transcription job
- waits for completion via the Gradio event stream
- prints the resulting text
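The "wait for completion" step can be illustrated with a parser for a server-sent event stream. This is a sketch under assumptions, not the script's actual code: it assumes the stream consists of `event:` and `data:` lines, that the final result arrives with a `complete` event, and that failures arrive with an `error` event. The upload and submit requests are left out because the route and API name used by the Space are not given here.

```python
import json
from typing import Any, Iterable


def read_result(lines: Iterable[str]) -> Any:
    """Scan event-stream lines and return the payload of the 'complete' event."""
    event = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            # Remember the most recent event name; its data line follows.
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if event == "complete":
                return json.loads(payload)
            if event == "error":
                raise RuntimeError(f"Space reported an error: {payload}")
            # Other events (assumed: heartbeats, progress) are ignored.
    raise RuntimeError("stream ended without a 'complete' event")
```

Because the function takes any iterable of strings, it can be fed lines from a streaming HTTP response or from a recorded sample during testing.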
Default endpoint:
https://hf-audio-whisper-large-v3-turbo.hf.space
Override it with:
python3 scripts/transcribe.py input.ogg --space https://your-space.hf.space
or set:
export HF_WHISPER_SPACE=https://your-space.hf.space
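One plausible way to resolve the endpoint from these three sources is sketched below. The precedence (the `--space` flag first, then `HF_WHISPER_SPACE`, then the default) is an assumption, as are the function name and the trailing-slash handling.

```python
import os
from typing import Mapping, Optional

DEFAULT_SPACE = "https://hf-audio-whisper-large-v3-turbo.hf.space"


def resolve_space(
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the Space URL: CLI flag, then environment variable, then default."""
    env = os.environ if env is None else env
    url = cli_value or env.get("HF_WHISPER_SPACE") or DEFAULT_SPACE
    # Strip a trailing slash so request paths can be appended consistently.
    return url.rstrip("/")
```

Accepting the environment as a parameter keeps the function easy to test without modifying the real process environment.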
Guardrails
- Treat this as a best-effort public/free path, not a privacy-grade path.
- Do not use for highly sensitive audio unless the user explicitly accepts public third-party processing.
- Expect rate limits, queueing, and occasional outages.
- If the public endpoint fails, explain that the free backend is unavailable and offer alternatives.
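Since rate limits and outages are expected, a caller may want to retry a few times before giving up. The wrapper below is a minimal sketch; the function name, attempt count, delays, and the wording of the fallback message are all hypothetical.

```python
import time
from typing import Callable, Optional

# Example wording only; adapt to the alternatives actually on offer.
FALLBACK_MESSAGE = (
    "The free public Whisper backend is unavailable right now. "
    "You can try again later or use a different transcription service."
)


def transcribe_with_retry(
    transcribe: Callable[[], str],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call transcribe() with exponential backoff; return None if all attempts fail."""
    for attempt in range(attempts):
        try:
            return transcribe()
        except Exception:
            # The failure type of a public endpoint is unknown, so catch broadly.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return None
```

A caller would show `FALLBACK_MESSAGE` to the user when the function returns `None`, rather than surfacing a raw traceback.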
Output handling
Prefer to return:
- the raw transcript when the user asked for plain speech-to-text or dictation ("转文字/听写")
- a cleaned version when punctuation is poor
- a short note about uncertainty if names, numbers, or jargon may be wrong
Script
scripts/transcribe.py: public Whisper transcription helper
Skill Info
- Creator: shu-hari
- Downloads: 34
- Published: Mar 15, 2026
- Updated: Mar 16, 2026