Speech-to-Text
OpenShrimp can transcribe voice messages and video notes using Moonshine STT, a lightweight on-device speech recognition model. Send a voice message and the transcribed text is sent to Claude as your prompt.
How it works
- You send a voice message or video note in Telegram
- OpenShrimp downloads the audio (OGG/Opus format)
- The Moonshine STT binary transcribes it locally
- The transcribed text is sent to Claude as your message
Transcription involves no external services or API calls; everything runs on your machine.
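The four steps above can be sketched as a small pipeline. This is an illustrative outline, not OpenShrimp's actual code: `handle_voice_message` and the three callables it takes are hypothetical names standing in for the Telegram download, the local Moonshine STT call, and the hand-off to Claude.

```python
from typing import Callable


def handle_voice_message(
    file_id: str,
    download_audio: Callable[[str], bytes],
    transcribe: Callable[[bytes], str],
    send_prompt: Callable[[str], str],
) -> str:
    """Download a voice message, transcribe it locally, forward the text."""
    audio = download_audio(file_id)   # step 2: fetch the OGG/Opus bytes
    text = transcribe(audio).strip()  # step 3: local speech-to-text
    return send_prompt(text)          # step 4: the text becomes the prompt


# Stand-in implementations so the sketch runs without Telegram or a model.
reply = handle_voice_message(
    "voice-123",
    download_audio=lambda file_id: b"OggS...",
    transcribe=lambda audio: "  list the files in this repo \n",
    send_prompt=lambda text: f"prompt: {text}",
)
```

Passing the three stages in as callables keeps the flow testable without a network connection or a model on disk.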
Moonshine STT support is built into OpenShrimp, but the binary itself is downloaded on first use. No manual setup is needed.
Automatic download
The first time you send a voice message, OpenShrimp:
- Detects that the `moonshine-stt` binary isn’t installed
- Downloads it from GitHub releases to `~/.local/share/openshrimp/bin/moonshine-stt`
- Downloads the ONNX model files
- Transcribes your message
Subsequent voice messages use the cached binary and models.
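A download-on-first-use check like the one described above might look roughly like this. It is a minimal sketch under assumptions: `ensure_binary` and its `fetch` callable are hypothetical, the real release URL is not shown in this page and so is not reproduced, and writing to a `.part` file before renaming is a design choice of this example rather than documented OpenShrimp behavior.

```python
import tempfile
from pathlib import Path
from typing import Callable

# Install location named in the docs above.
DEFAULT_BIN = Path.home() / ".local/share/openshrimp/bin/moonshine-stt"


def ensure_binary(path: Path, fetch: Callable[[], bytes]) -> bool:
    """Make sure `path` exists; return True only if it had to be downloaded."""
    if path.exists():
        return False  # cached from an earlier voice message
    path.parent.mkdir(parents=True, exist_ok=True)
    partial = path.with_name(path.name + ".part")
    partial.write_bytes(fetch())  # hypothetical downloader supplies the bytes
    partial.chmod(0o755)          # the binary must be executable
    partial.replace(path)         # rename into place only once fully written
    return True


# Demo against a temporary directory instead of the real install path.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "bin" / "moonshine-stt"
    first = ensure_binary(target, fetch=lambda: b"fake binary")
    second = ensure_binary(target, fetch=lambda: b"fake binary")
```

The first call downloads and the second finds the cached file, which mirrors why only the first voice message pays the download cost.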
Supported platforms
| Platform | Architecture | Supported |
|---|---|---|
| Linux | x86_64 | Yes |
| Linux | aarch64 | Yes |
| macOS | Apple Silicon (aarch64) | Yes |
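To check whether a machine matches one of the rows above, a lookup along these lines could be used. This is a sketch, not OpenShrimp's own detection logic: `is_supported` is a hypothetical helper, and treating every platform missing from the table as unsupported is an assumption, since the table only lists supported combinations.

```python
import platform

# (system, architecture) pairs from the table above.
SUPPORTED = {("Linux", "x86_64"), ("Linux", "aarch64"), ("Darwin", "aarch64")}


def is_supported(system: str, machine: str) -> bool:
    """Check a platform.system()/platform.machine() pair against the table."""
    arch = machine.lower()
    if arch == "arm64":  # macOS reports Apple Silicon as arm64
        arch = "aarch64"
    return (system, arch) in SUPPORTED


# Result for the machine running this snippet.
current_ok = is_supported(platform.system(), platform.machine())
```

Note that `platform.system()` returns `Darwin` on macOS, which is why the table's macOS row appears under that name here.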
The Moonshine model
Moonshine is a small, fast speech recognition model optimized for on-device inference:
- Moonshine V1 — four ONNX model files
- ONNX Runtime — no GPU required, runs on CPU
- Input — any audio format (converted to 16kHz mono float32 PCM via PyAV)
- Output — plain text transcription
Models are automatically downloaded from the sherpa-onnx releases on first use.
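The input conversion step (any format to 16 kHz mono float32 PCM via PyAV) can be approximated as follows. This is a sketch under assumptions rather than OpenShrimp's implementation: `load_audio` is a hypothetical name, and the code assumes a recent PyAV release in which `AudioResampler.resample` returns a list of frames and accepts `None` to flush.

```python
TARGET_RATE = 16_000  # Hz, the sample rate named above


def load_audio(path: str):
    """Decode a PyAV-readable file to a 1-D float32 array at 16 kHz mono."""
    # Imported here so the module still loads where PyAV/NumPy are absent.
    import av
    import numpy as np

    resampler = av.AudioResampler(format="flt", layout="mono", rate=TARGET_RATE)
    chunks = []
    with av.open(path) as container:
        for frame in container.decode(audio=0):
            # Each input frame may yield zero or more resampled frames.
            for out in resampler.resample(frame):
                chunks.append(out.to_ndarray().reshape(-1))
        for out in resampler.resample(None):  # flush buffered samples
            chunks.append(out.to_ndarray().reshape(-1))
    if not chunks:
        return np.zeros(0, dtype=np.float32)
    return np.concatenate(chunks).astype(np.float32, copy=False)
```

Calling `load_audio("voice.ogg")` on a Telegram voice note would yield the flat sample array that a speech model of this kind typically takes as input; the file name is illustrative.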
Limitations
- English only (Moonshine V1)
- Best with clear speech in low-noise environments
- Very long voice messages may take a few seconds to transcribe
- The first transcription is slower due to model loading
Using voice messages effectively
Voice messages are transcribed and sent to Claude as regular text prompts. Some tips:
- Speak clearly and at a normal pace
- Describe your request as you would type it
- You can send follow-up voice messages — they continue the same conversation
- Voice messages work in both private chats and forum topics