WhisperClip is a free, open-source voice-to-text application for macOS that processes everything locally on your device.
The app converts speech to text using OpenAI’s Whisper models and includes built-in AI text enhancement through local language models, all without sending your data to the cloud.
Features
- High-Quality Speech Recognition: Uses WhisperKit models ranging from 216MB to 955MB, depending on your accuracy and speed preferences. The app supports multiple languages with automatic detection and shows real-time waveform visualization while you record.
- Local AI Text Enhancement: Processes transcribed text through local language models (Gemma, Llama, Qwen, Mistral, and others) to fix grammar, format emails, translate languages, or run custom text workflows you define.
- Full Privacy Protection: Runs 100% locally with no cloud services or data collection. The open-source code lets you verify exactly what the app does, and it operates within Apple’s secure sandboxed environment.
- Productivity Automation: Supports global hotkeys (Option+Space by default), automatic clipboard copying, auto-paste into your current application, and auto-enter for instant message sending. The app sits in your menu bar and auto-stops recording after 10 minutes.
- Apple Silicon Optimization: Tuned for Apple Silicon Macs, using WhisperKit for speech recognition and Apple’s MLX framework for local LLM processing.
- User-Friendly Interface: Features a dark-themed design, real-time recording visualization, detailed setup guides, simple model management, and customizable shortcuts with prompt templates.
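The hotkey toggle and the 10-minute auto-stop described above can be pictured as a small state machine. The following is only an illustrative sketch, not WhisperClip's actual code: the `RecorderState` class, its method names, and the idea of a periodic `on_tick` call are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Auto-stop limit taken from the feature list (10 minutes).
MAX_RECORDING_SECONDS = 10 * 60


@dataclass
class RecorderState:
    """Hypothetical model of the hotkey toggle; not WhisperClip's real code."""

    started_at: Optional[float] = None  # None means the recorder is idle

    @property
    def recording(self) -> bool:
        return self.started_at is not None

    def on_hotkey(self, now: float) -> str:
        """Toggle between idle and recording each time the hotkey fires."""
        if self.recording:
            self.started_at = None
            return "stopped"
        self.started_at = now
        return "started"

    def on_tick(self, now: float) -> Optional[str]:
        """Called periodically; ends a recording that has reached the limit."""
        if self.recording and now - self.started_at >= MAX_RECORDING_SECONDS:
            self.started_at = None
            return "auto-stopped"
        return None
```

Timestamps are passed in as plain seconds so the logic can be exercised without a real clock or microphone.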
Use Cases
- Drafting Emails and Messages: Instead of typing out long emails or messages in Slack, you can use WhisperClip to dictate them. The AI enhancement feature can even format the text into a properly structured email.
- Taking Quick Notes: When an idea strikes, you can press the hotkey and speak your thoughts. The text is instantly ready to be pasted into your notes app, so you don’t lose your train of thought.
- Content Creation: For writers and bloggers, WhisperClip can be a useful tool for getting a first draft down. Just speak your ideas for an article or a post, and then edit the transcribed text.
- Language Practice and Translation: You can speak in one language and have the AI translate it into another. This is helpful for language learners or for quickly translating a phrase.
How to Use It
1. Visit WhisperClip’s official website and download the latest release.
2. Drag the WhisperClip.app file into your Applications folder like any other Mac application.
3. When you launch WhisperClip for the first time, macOS will ask for several permissions. Grant the app microphone access so it can record your voice.
4. You’ll also need to enable accessibility permissions (found in System Settings > Privacy & Security > Accessibility) for the global hotkey to work.
5. Allow Apple Events permissions so WhisperClip can copy text to the clipboard and auto-paste it.
6. Download the AI models through the built-in setup guide. Start with OpenAI Whisper Small (216MB) for faster processing, or opt for the Whisper Large v3 Turbo (632MB) for improved accuracy. The app downloads these models from Hugging Face and stores them locally. This is the only time WhisperClip connects to the internet unless you choose to download additional models later.
7. For text enhancement, pick one of the local LLM models based on your Mac’s capabilities. The smaller Gemma 2 (2B) model works well on most machines, while the larger models provide better results if you have sufficient RAM and storage.
8. After setup, press Option+Space (or your custom hotkey) to start recording. Speak naturally. You don’t need to pause for punctuation or speak slowly.
9. Press the hotkey again to stop recording. WhisperClip will then process your speech locally, transcribe it to text, and automatically copy the result to your clipboard. If you enabled auto-paste, the text appears in whatever application you’re currently using.
10. To customize the app, open Settings. Change the hotkey there if Option+Space conflicts with other shortcuts.
11. Add custom prompts under Settings > Prompts to create specialized text processing workflows (for example, “Translate to Spanish” or “Format as bullet points”).
12. Switch between different Whisper and LLM models anytime through the Setup Guide if your needs change.
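To make the custom prompt step more concrete, here is a minimal sketch of how a saved prompt might be combined with a transcript before it is handed to a local language model. How WhisperClip actually assembles its prompts is not documented here, so the function name `build_enhancement_prompt`, the `PROMPTS` table, and the "instruction, blank line, text" layout are assumptions for illustration only.

```python
# Hypothetical prompt table; the two names come from the examples in step 11,
# the instruction wording is invented for this sketch.
PROMPTS = {
    "Translate to Spanish": "Translate the following text to Spanish.",
    "Format as bullet points": "Rewrite the following text as concise bullet points.",
}


def build_enhancement_prompt(prompt_name: str, transcript: str) -> str:
    """Combine a saved instruction with the transcribed text.

    Assumed layout: instruction, blank line, then the transcript.
    """
    instruction = PROMPTS[prompt_name]  # KeyError if the prompt is unknown
    cleaned = " ".join(transcript.split())  # collapse stray whitespace
    if not cleaned:
        raise ValueError("transcript is empty")
    return f"{instruction}\n\nText:\n{cleaned}"
```

For example, `build_enhancement_prompt("Translate to Spanish", "see you tomorrow")` produces a single string that a local model could then complete.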
Pros
- Completely Free: There are no ads, subscriptions, or premium features. It’s 100% free forever.
- Excellent Privacy: Since all processing happens locally, your data stays on your Mac.
- Highly Customizable: You can change the hotkey, create custom AI prompts for different tasks, and choose from a variety of local AI models.
Cons
- macOS 14+ Required: It only works on newer versions of macOS, so users with older operating systems are out of luck.
- High Disk Space Requirement: The AI models take up a significant amount of disk space (20GB recommended).
- No Speaker Diarization: The tool doesn’t distinguish between different speakers in a recording.
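If you want to check the disk space point before installing, a quick script like the one below can help. It is a rough sketch using only Python's standard library; the 20GB figure comes from the recommendation above, while the helper names and the choice to check the root volume are assumptions.

```python
import shutil

# Recommended free space from the Cons list above, in decimal gigabytes.
RECOMMENDED_FREE_GB = 20


def free_space_gb(path: str = "/") -> float:
    """Return free space on the volume containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_recommended_space(path: str = "/", required_gb: float = RECOMMENDED_FREE_GB) -> bool:
    """True if the volume has at least `required_gb` of free space."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    print(f"Free space: {free_space_gb():.1f} GB")
    print("Meets recommendation:", has_recommended_space())
```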
Related Resources
- WhisperKit Documentation: Technical details about the Apple-optimized Whisper implementation that WhisperClip uses for speech recognition.
- MLX Framework: Apple’s machine learning framework for Apple Silicon that enables local LLM processing in WhisperClip.
- OpenAI Whisper: The original Whisper project from OpenAI that pioneered the speech recognition technology WhisperClip builds upon.
FAQs
Q: Can I use WhisperClip completely offline?
A: Yes, after you download the necessary AI models during initial setup. WhisperClip only connects to the internet when downloading models from Hugging Face. All speech processing, transcription, and text enhancement happen locally on your Mac without any network requests. You can even block the app’s network access after setup if you want extra assurance.
Q: Which Whisper model should I choose?
A: Start with OpenAI Whisper Large v3 Turbo (632MB) for the best balance of speed and accuracy. If you have a slower Mac or limited storage, try Whisper Small (216MB) first. It’s noticeably faster but makes more transcription errors with accents or technical terms. The Large v2 Turbo (955MB) offers maximum accuracy but requires more processing time and disk space.
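The advice above boils down to a simple rule of thumb, sketched below. The model names and sizes are the ones quoted in this article; the `pick_model` function, its `prefer` options, and the fallback order are assumptions for illustration and are not part of WhisperClip.

```python
# Download sizes in MB as quoted in this article.
WHISPER_MODELS_MB = {
    "Whisper Small": 216,
    "Whisper Large v3 Turbo": 632,
    "Whisper Large v2 Turbo": 955,
}

# Hypothetical preference orders: first choice first, then fallbacks.
PREFERENCE_ORDER = {
    "speed": ["Whisper Small"],
    "balanced": ["Whisper Large v3 Turbo", "Whisper Small"],
    "accuracy": ["Whisper Large v2 Turbo", "Whisper Large v3 Turbo", "Whisper Small"],
}


def pick_model(free_mb: float, prefer: str = "balanced") -> str:
    """Return the first model in the preference order that fits in `free_mb`."""
    for name in PREFERENCE_ORDER[prefer]:
        if WHISPER_MODELS_MB[name] <= free_mb:
            return name
    raise ValueError("not enough free space for any listed model")
```

With plenty of free space, `pick_model(10_000)` returns "Whisper Large v3 Turbo", and with only 500MB available it falls back to "Whisper Small".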
Q: What languages does WhisperClip support?
A: It supports multiple languages through the WhisperKit framework, which can transcribe audio in roughly 100 languages. The app includes an auto-detection feature to identify the language being spoken.










