Privacy & Data
Does MimicScribe work offline?
Core speech-to-text transcription works completely offline once the speech models have been downloaded on first launch; the model runs on-device using Apple Silicon. AI features (dictation corrections, text transforms, meeting summaries, assistant, and search) require an internet connection.
What data leaves my device?
Data leaves your device only in the following cases:
- AI feature requests: when you use dictation corrections, text transforms, or meeting summaries, the transcribed text is sent to Google Gemini via a server-side proxy. Audio is never sent; it is processed entirely on your Mac. Unlimited subscribers can use their own Gemini API key (Settings > Subscription) to send data directly to Google, bypassing our servers entirely. Depending on the feature, the request may also include:
  - Dictation mode: your configured name, personal context, and vocabulary list (if set in Preferences)
  - Transform mode: the selected text from the active app, plus clipboard text if you say “clipboard” or “pasteboard” in your instruction
  - Meeting mode: meeting attendee names and email addresses (read from your calendar event, if calendar access is granted) as part of the transcript sent for summarization
  - Reference documents: when you add a URL or file as a reference document (Settings > Your Context), its content is sent to Google Gemini once for processing into searchable sections. After processing, only relevant sections are included in AI requests; the full document is not sent on every call. Reference document content and search indexes are stored locally on your device
- License validation — a periodic check to verify your license key against our server. Billing is handled by Stripe; the app never contacts Stripe directly. No personal data beyond the key itself is sent
- Usage reporting (Unlimited plan only) — aggregate token counts and feature names are sent to track subscription usage against your plan. No transcription text is included
- Device identifier — a one-way hash of your Mac’s hardware ID is sent to our proxy server to enforce free-tier usage limits. It is not stored permanently, not shared with Google or any third party, and cannot be used to identify you personally
What data stays on my device?
All audio, transcripts, meeting recordings, and speaker profiles are stored locally. MimicScribe does not sync data between Macs or upload recordings to any server.
Where is my data stored?
| Data | Location |
|---|---|
| Database (transcripts, meetings, speaker profiles, reference document indexes) | ~/Library/Application Support/app.mimicscribe/mimicscribe.db |
| Audio recordings (if enabled) | ~/Documents/MimicScribe/Recordings/ |
| Meeting templates | ~/Library/Application Support/app.mimicscribe/Templates/ |
| Speech models | ~/Library/Application Support/FluidAudio/Models/ |
You can open these folders in Finder by pasting the path into Go > Go to Folder (Shift+Cmd+G).
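If you would rather check these locations from a script, the following is a minimal sketch that reports whether each path from the table exists and how much space it uses. The function names `size_in_bytes` and `report` are illustrative and not part of MimicScribe; only the paths come from the table above.

```python
from __future__ import annotations

from pathlib import Path

# Locations copied from the table above; "~" is expanded when the report runs.
DATA_LOCATIONS = {
    "Database": "~/Library/Application Support/app.mimicscribe/mimicscribe.db",
    "Audio recordings": "~/Documents/MimicScribe/Recordings/",
    "Meeting templates": "~/Library/Application Support/app.mimicscribe/Templates/",
    "Speech models": "~/Library/Application Support/FluidAudio/Models/",
}


def size_in_bytes(path: Path) -> int:
    """Return the size of a file, or the total size of all files under a folder."""
    if path.is_file():
        return path.stat().st_size
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


def report(locations: dict[str, str] = DATA_LOCATIONS) -> dict[str, int | None]:
    """Map each label to its size in bytes, or None if the path does not exist."""
    result: dict[str, int | None] = {}
    for label, raw in locations.items():
        path = Path(raw).expanduser()
        result[label] = size_in_bytes(path) if path.exists() else None
    return result


if __name__ == "__main__":
    for label, size in report().items():
        status = "not found" if size is None else f"{size / 1_000_000:.1f} MB"
        print(f"{label}: {status}")
```

A location reported as "not found" is expected if the corresponding feature has never been used, for example when audio recording is not enabled.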
How do I back up my data?
Time Machine backs up both ~/Library/Application Support/ and ~/Documents/ by default, so your database, templates, and audio recordings are all covered automatically.
If you don’t use Time Machine, the key locations to back up are:
- ~/Library/Application Support/app.mimicscribe/ (database and templates)
- ~/Documents/MimicScribe/Recordings/ (audio files)
These are standard files that can be copied, moved, or synced with any backup tool.
Speech recognition models don’t need to be backed up — see the next section.
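For a manual backup without Time Machine, the copy can be scripted. The sketch below is one possible approach, not an official tool: `back_up` is a hypothetical helper, and the destination you pass in is your own choice. It assumes MimicScribe is not running while the copy is made, as a general precaution so the database file is not copied mid-write.

```python
from __future__ import annotations

import shutil
from datetime import datetime
from pathlib import Path

# The two key locations listed above.
SOURCES = [
    "~/Library/Application Support/app.mimicscribe",
    "~/Documents/MimicScribe/Recordings",
]


def back_up(destination: str, sources: list[str] = SOURCES) -> Path:
    """Copy each existing source folder into a new timestamped backup folder."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target_root = Path(destination).expanduser() / f"mimicscribe-backup-{stamp}"
    target_root.mkdir(parents=True)
    for raw in sources:
        source = Path(raw).expanduser()
        if source.is_dir():  # skip folders that do not exist on this Mac
            shutil.copytree(source, target_root / source.name)
    return target_root
```

Calling `back_up("/Volumes/Backup")` (a placeholder destination) would create a folder such as `mimicscribe-backup-<timestamp>` containing `app.mimicscribe` and `Recordings`, if both exist.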
Speech recognition models
MimicScribe downloads four models on first launch:
| Model | Purpose | Size |
|---|---|---|
| Parakeet TDT 0.6B (v3) | Speech-to-text transcription | ~460 MB |
| Speaker diarization | Identifying who is speaking | ~34 MB |
| Silero VAD | Voice activity detection | ~1 MB |
| MiniLM (L6-v2) | Reference document & meeting search | ~90 MB |
The speech, diarization, and VAD models are stored in ~/Library/Application Support/FluidAudio/Models/ and use about 500 MB of disk space total. The MiniLM embedding model is stored separately in the Hugging Face cache and uses about 90 MB.
Safe to delete. If you need to free up space, you can delete the entire FluidAudio/Models folder. No data is lost — only the pre-trained model weights are removed.
When you next launch MimicScribe, it will automatically re-download the models in the background. If you try to dictate or record before the download finishes, you’ll see a brief “Loading model…” overlay that disappears once the models are ready. The re-download requires an internet connection; after that, all transcription runs fully offline.
Optional analytics
Settings > Privacy offers two toggles:
- Send anonymous usage signals — specific event names are listed in the settings panel so you can see exactly what’s reported
- Send anonymized crash logs — helps us fix bugs faster
Both default to on during onboarding and can be turned off at any time. Choosing Local Mode in onboarding turns both off automatically. No personal data, audio, or transcription text is ever included. Usage signals include a one-way hash of your Mac’s hardware identifier so we can count distinct devices — this hash cannot be reversed to identify you personally and uses a different salt than the billing identifier, so the two cannot be correlated.
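To illustrate why two differently salted hashes of the same hardware identifier do not match each other, here is a minimal sketch. The hash function (SHA-256), the salt values, and the hardware ID are all placeholders chosen for illustration; the actual algorithm and salts MimicScribe uses are not documented here.

```python
import hashlib


def salted_hash(hardware_id: str, salt: str) -> str:
    """Return the hex SHA-256 digest of the salt combined with the hardware ID."""
    return hashlib.sha256((salt + ":" + hardware_id).encode("utf-8")).hexdigest()


# Placeholder values, for illustration only.
hardware_id = "EXAMPLE-HARDWARE-ID"
analytics_id = salted_hash(hardware_id, salt="analytics-salt")
billing_id = salted_hash(hardware_id, salt="billing-salt")

# The same device yields the same analytics ID every time, so distinct
# devices can be counted...
assert analytics_id == salted_hash(hardware_id, salt="analytics-salt")
# ...but the analytics and billing IDs differ, so seeing one tells you
# nothing obvious about the other.
assert analytics_id != billing_id
```

Someone who sees only the two digests has no practical way to tell that they came from the same Mac, which is the property described above.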
Verify it yourself
Every endpoint MimicScribe contacts is documented and inspectable from the command line. See Network Activity for the full inventory and the nettop / lsof commands to confirm what’s leaving your Mac at any moment.
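As one way to script such a check, the sketch below wraps `lsof` and lists the remote endpoints a process is currently connected to. It is not taken from the Network Activity page: the function names are hypothetical, and you need to supply the app's process ID yourself (for example from Activity Monitor). The flags used are standard `lsof` options.

```python
from __future__ import annotations

import subprocess


def lsof_command(pid: int) -> list[str]:
    """Build an lsof invocation that lists network connections for one process."""
    # -i: network files only; -a: AND the filters together; -p: this process;
    # -n / -P: skip host and port name lookups; -Fn: machine-readable name field
    return ["lsof", "-i", "-a", "-p", str(pid), "-n", "-P", "-Fn"]


def remote_endpoints(lsof_output: str) -> list[str]:
    """Extract the remote side of each connection from `lsof -Fn` output."""
    remotes: list[str] = []
    for line in lsof_output.splitlines():
        # Name fields start with "n"; connected sockets look like local->remote.
        if line.startswith("n") and "->" in line:
            remote = line.split("->", 1)[1]
            if remote not in remotes:
                remotes.append(remote)
    return remotes


def current_connections(pid: int) -> list[str]:
    """Run lsof for the given process ID and return its remote endpoints."""
    done = subprocess.run(
        lsof_command(pid), capture_output=True, text=True, check=False
    )
    return remote_endpoints(done.stdout)
```

Running `current_connections(pid)` while using an AI feature, and again while only transcribing, should show the difference between online and offline activity.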