Private Voice Dictation & Meeting Transcription for macOS

Ghost Pepper is a free, open-source macOS app for local voice dictation, note-taking, and meeting transcription.

It runs speech recognition, text cleanup, and summary generation on Apple Silicon, then saves results on the Mac as local text or markdown files.

Features

Pastes transcribed speech into any active text field when you release the Control key.
Removes filler words and self-corrections after each transcription using a local LLM.
Records meeting calls and saves the full transcript, notes, and AI summary as a markdown file on your Mac.
Runs all speech and cleanup models on-device via Apple Silicon, with no data sent to any server.
Supports five speech model options ranging from fastest English-only transcription to 50-plus-language coverage.
Lets you edit the cleanup prompt in Settings to control exactly how the LLM post-processes your speech.
Provides optional cloud integrations for Zo AI chat, Trello, and Granola meeting import.

Use Cases

Private dictation for writing in email apps, editors, browsers, and note tools.
Meeting transcription for calls that should stay on a local machine.
Internal documentation where markdown output is useful.
Developer or technical writing workflows that benefit from preferred term correction.
Research interviews and personal knowledge capture on a Mac.
Managed device environments where IT can pre-approve Accessibility access with PPPC profiles.
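
For the managed-device case, the following is a minimal sketch of what a PPPC (Privacy Preferences Policy Control) profile granting Accessibility access might look like, built with Python's standard plistlib. The bundle identifier and code requirement are placeholders, not Ghost Pepper's real values. An admin would normally read the real code requirement from the installed app (for example with codesign -dr -) and deliver the profile through their MDM.

```python
import plistlib
import uuid

# Placeholders only: replace with the values read from the installed app.
BUNDLE_ID = "com.example.GhostPepper"
CODE_REQUIREMENT = 'identifier "com.example.GhostPepper" and anchor apple generic'


def build_pppc_profile(bundle_id: str, code_requirement: str) -> bytes:
    """Return an XML configuration profile that allows Accessibility access."""
    tcc_payload = {
        "PayloadType": "com.apple.TCC.configuration-profile-policy",
        "PayloadIdentifier": f"{bundle_id}.pppc",
        "PayloadUUID": str(uuid.uuid4()).upper(),
        "PayloadVersion": 1,
        "Services": {
            "Accessibility": [
                {
                    "Identifier": bundle_id,
                    "IdentifierType": "bundleID",
                    "CodeRequirement": code_requirement,
                    "Authorization": "Allow",
                }
            ]
        },
    }
    profile = {
        "PayloadType": "Configuration",
        "PayloadIdentifier": f"{bundle_id}.profile",
        "PayloadUUID": str(uuid.uuid4()).upper(),
        "PayloadVersion": 1,
        "PayloadDisplayName": "Accessibility access (example)",
        "PayloadScope": "System",
        "PayloadContent": [tcc_payload],
    }
    return plistlib.dumps(profile)


profile_xml = build_pppc_profile(BUNDLE_ID, CODE_REQUIREMENT)
```

Microphone access cannot be granted this way, so users on managed Macs would still approve the microphone prompt themselves.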

Quick Start

1. Download the Ghost Pepper DMG file.

2. Drag the app into Applications.

3. Open the app and grant Microphone permission.

4. Grant Accessibility permission so the global hotkey and text pasting can work.

5. Pick the speech and cleanup models you want to use.

6. Hold Control and speak to test dictation in a text field.

7. Open Settings and tune the cleanup prompt, microphone, and other options.

8. Start a meeting transcription when you want notes, transcripts, and summaries saved as local markdown files.

Basic Dictation

Hold the Control key and speak. Release Control to end recording. Ghost Pepper transcribes the audio and pastes the result into the active text field.

The local cleanup LLM post-processes each transcription, removing filler words and resolving self-corrections so that only the intended wording remains. Cleanup adds roughly 1 to 7 seconds, depending on which cleanup model you select.
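
The hold-to-talk flow described above can be sketched as a small state machine. This is an illustration only, not Ghost Pepper's code: the PushToTalk class and its three callables are hypothetical stand-ins for the speech model, the cleanup LLM, and the paste step. It also assumes cleanup runs before the paste, which the description does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PushToTalk:
    """Hold-to-record, release-to-paste flow built from hypothetical stages."""

    transcribe: Callable[[bytes], str]  # stand-in for the speech model
    cleanup: Callable[[str], str]       # stand-in for the local cleanup LLM
    paste: Callable[[str], None]        # stand-in for pasting into the active field
    _chunks: List[bytes] = field(default_factory=list, init=False)
    _recording: bool = field(default=False, init=False)

    def key_down(self) -> None:
        # Control pressed: discard old audio and start recording.
        self._chunks.clear()
        self._recording = True

    def audio(self, chunk: bytes) -> None:
        # Audio is kept only while the key is held.
        if self._recording:
            self._chunks.append(chunk)

    def key_up(self) -> None:
        # Control released: transcribe, clean up, then paste once.
        if not self._recording:
            return
        self._recording = False
        raw_text = self.transcribe(b"".join(self._chunks))
        self.paste(self.cleanup(raw_text))


# Usage with toy stages in place of real models.
pasted: List[str] = []
ptt = PushToTalk(
    transcribe=lambda audio: f"{len(audio)} bytes heard",
    cleanup=str.strip,
    paste=pasted.append,
)
ptt.key_down()
ptt.audio(b"ab")
ptt.audio(b"cd")
ptt.key_up()
```

Keeping the three stages as injected callables means a different cleanup model can be swapped in without touching the key handling.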

Meeting Transcription

Start a meeting transcription session from the menu bar icon before or during a call. Ghost Pepper uses AVAudioEngine and ScreenCaptureKit to capture audio locally.

When you end the session, it generates a markdown file on your Mac containing the full transcript, notes, and an AI-generated summary.
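
As a rough illustration of that output step, the sketch below assembles a transcript, notes, and summary into one markdown document. The section order, heading names, and file naming are assumptions chosen for the example. The app's actual file layout is not documented here.

```python
from datetime import datetime
from typing import List


def render_meeting_markdown(title: str, started: datetime, summary: str,
                            notes: List[str], transcript: List[str]) -> str:
    """Build one markdown document from the three parts of a meeting."""
    lines = [
        f"# {title}", "",
        f"Date: {started:%Y-%m-%d %H:%M}", "",
        "## Summary", "", summary.strip(), "",
        "## Notes", "",
    ]
    lines += [f"- {note}" for note in notes]   # one bullet per note
    lines += ["", "## Transcript", ""]
    lines += transcript                        # one line per utterance
    return "\n".join(lines) + "\n"


def meeting_filename(title: str, started: datetime) -> str:
    """Date-prefixed, lowercase, hyphenated file name (assumed scheme)."""
    slug = "-".join(title.lower().split())
    return f"{started:%Y-%m-%d}-{slug}.md"


document = render_meeting_markdown(
    "Weekly sync",
    datetime(2025, 1, 6, 9, 30),
    "Beta ships Friday.",
    ["Ship beta"],
    ["Alice: Are we ready?"],
)
```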

Model Selection

Open Settings from the menu bar icon to choose speech and cleanup models. Each model downloads automatically from Hugging Face on first selection and caches locally for all subsequent use.
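
The download-once, reuse-afterwards behavior can be sketched as follows. The ensure_model helper, the cache layout, and the completion marker are all hypothetical. The download callable stands in for whatever Hugging Face client the app actually uses.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def ensure_model(model_id: str, cache_dir: Path,
                 download: Callable[[str, Path], None]) -> Path:
    """Return the cached model directory, downloading only on first use."""
    target = cache_dir / model_id.replace("/", "--")
    marker = target / ".complete"
    if not marker.exists():
        target.mkdir(parents=True, exist_ok=True)
        download(model_id, target)  # hypothetical fetcher
        marker.touch()              # written last, so a partial download is retried
    return target


# Usage: a fake downloader shows that the second selection hits the cache.
calls: List[str] = []


def fake_download(model_id: str, destination: Path) -> None:
    calls.append(model_id)


with tempfile.TemporaryDirectory() as tmp:
    ensure_model("example-org/example-model", Path(tmp), fake_download)
    ensure_model("example-org/example-model", Path(tmp), fake_download)

print(len(calls))  # 1
```

Writing the marker only after the download finishes is one simple way to avoid treating an interrupted download as a valid cache entry.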

Speech Models

Model	Size	Best For
Whisper tiny.en	~75 MB	Fastest transcription, English only
Whisper small.en (default)	~466 MB	Best accuracy, English only
Whisper small (multilingual)	~466 MB	Multi-language support
Parakeet v3 (25 languages)	~1.4 GB	Multi-language via FluidAudio
Qwen3-ASR 0.6B int8 (50+ languages)	~900 MB	Highest multilingual quality, macOS 15+ required

Cleanup Models

Model	Size	Cleanup Speed
Qwen 3.5 0.8B (default)	~535 MB	Very fast (~1-2 seconds)
Qwen 3.5 2B	~1.3 GB	Fast (~4-5 seconds)
Qwen 3.5 4B	~2.8 GB	Full quality (~5-7 seconds)
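
To estimate the first-run download for a given pairing, add one speech model and one cleanup model from the tables above. The short calculation below uses the approximate sizes listed there and treats 1 GB as 1,000 MB, so the totals are estimates rather than exact figures.

```python
# Approximate sizes in MB, copied from the two tables above.
SPEECH_MB = {
    "Whisper tiny.en": 75,
    "Whisper small.en": 466,
    "Whisper small (multilingual)": 466,
    "Parakeet v3": 1400,
    "Qwen3-ASR 0.6B int8": 900,
}
CLEANUP_MB = {
    "Qwen 3.5 0.8B": 535,
    "Qwen 3.5 2B": 1300,
    "Qwen 3.5 4B": 2800,
}


def first_run_download_mb(speech: str, cleanup: str) -> int:
    """Total download for one speech model plus one cleanup model."""
    return SPEECH_MB[speech] + CLEANUP_MB[cleanup]


default_total = first_run_download_mb("Whisper small.en", "Qwen 3.5 0.8B")
smallest_total = min(SPEECH_MB.values()) + min(CLEANUP_MB.values())
largest_total = max(SPEECH_MB.values()) + max(CLEANUP_MB.values())

print(default_total, smallest_total, largest_total)  # 1001 610 4200
```

By this estimate the default pairing needs about 1 GB, the lightest pairing about 610 MB, and the heaviest about 4.2 GB.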

Pros

Real local processing for core transcription, cleanup, summaries, OCR context, and storage.
Free and open source under the MIT license.
No account required.
Fast push-to-talk dictation in any text field.
Local meeting notes with markdown output.

Cons

Requires macOS 14.0 or later and Apple Silicon (M1 or later); the Qwen3-ASR speech model requires macOS 15 or later.
Initial model downloads range from roughly 600 MB (Whisper tiny.en with Qwen 3.5 0.8B) to about 4.2 GB (Parakeet v3 with Qwen 3.5 4B), depending on which speech and cleanup models you select. The default pair totals about 1 GB.
Accessibility permission requires admin rights on managed enterprise devices, and MDM pre-approval adds a configuration step for IT teams.

Related Resources

WhisperKit: The Swift inference library powering Ghost Pepper’s speech recognition, useful for developers building custom on-device transcription tools.
LLM.swift: The on-device LLM inference library used to run Ghost Pepper’s cleanup models.
FluidAudio: The inference engine backing the Parakeet v3 multilingual speech model in Ghost Pepper.
Hugging Face: The model hosting platform from which Ghost Pepper downloads all speech and cleanup models on first run.
Apple ScreenCaptureKit Documentation: Apple’s framework used for local audio capture in Ghost Pepper’s meeting transcription mode.

FAQs

Q: Does Ghost Pepper work on Intel Macs?
A: Ghost Pepper requires Apple Silicon (M1 or later) and does not support Intel Macs. The app relies on Apple’s Neural Engine for on-device model inference.

Q: Does Ghost Pepper need internet access to work?
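A: An internet connection is required only the first time you use each model, for the initial download from Hugging Face. Once the models are cached locally, dictation, cleanup, and meeting transcription run fully offline. The optional cloud integrations (Zo AI chat, Trello, and Granola import) still need a connection.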

Q: Can I use Ghost Pepper to transcribe Zoom or Google Meet calls?
A: Yes. Ghost Pepper’s meeting transcription mode uses ScreenCaptureKit to capture audio locally from any source on your Mac, including cloud-based video call apps such as Zoom and Google Meet.

Q: Does Ghost Pepper support multiple languages?
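A: Yes. Language support depends on the speech model you select. The two Whisper .en models are English only, Whisper small (multilingual) covers multiple languages, Parakeet v3 covers 25 languages, and Qwen3-ASR covers more than 50.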