Privacy & Data

Does MimicScribe work offline?

Core speech-to-text transcription works completely offline — the model runs on-device using Apple Silicon. AI features (dictation corrections, text transforms, meeting summaries, assistant, and search) require an internet connection.

What data leaves my device?

Data leaves your device only when you actively use a feature that needs it:

AI feature requests — when you use dictation corrections, text transforms, or meeting summaries, the transcribed text (never audio) is sent to Google Gemini via a server-side proxy. Unlimited subscribers can use their own Gemini API key (Settings → Subscription) to send data directly to Google, bypassing our servers entirely. Audio is processed entirely on your Mac. Depending on the feature, the request may also include:
- Dictation mode: your configured name, personal context, and vocabulary list (if set in Preferences)
- Transform mode: the selected text from the active app, plus clipboard text if you say “clipboard” or “pasteboard” in your instruction
- Reference documents: when you add a URL or file as a reference document (Settings > Your Context), its content is sent to Google Gemini once for processing into searchable sections. After processing, only relevant sections are included in AI requests — the full document is not sent on every call. Reference document content and search indexes are stored locally on your device
License validation — a periodic check to verify your license key against our server. Billing is handled by Stripe; the app never contacts Stripe directly. No personal data beyond the key itself is sent
Usage reporting (Unlimited plan only) — aggregate token counts and feature names are sent to track subscription usage against your plan. No transcription text is included
Device identifier — a one-way hash of your Mac’s hardware ID is sent to our proxy server to enforce free-tier usage limits. It is not stored permanently, not shared with Google or any third party, and cannot be used to identify you personally

What data stays on my device?

All audio, transcripts, meeting recordings, and speaker profiles are stored locally. MimicScribe does not sync data between Macs or upload recordings to any server.

Where is my data stored?

Data	Location
Database (transcripts, meetings, speaker profiles, reference document indexes)	`~/Library/Application Support/app.mimicscribe/mimicscribe.db`
Audio recordings (if enabled)	`~/Documents/MimicScribe/Recordings/`
Prompt templates (dictation / transform)	`~/Library/Application Support/app.mimicscribe/Templates/`
Speech models	`~/Library/Application Support/FluidAudio/Models/`

You can open these folders in Finder by pasting the path into Go > Go to Folder (Shift+Cmd+G).
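If you would rather check these locations from a script, the following Python sketch reports how much disk space each one uses. The paths are copied from the table above; the function names `path_size` and `report` are illustrative and not part of MimicScribe. Any location that does not exist on your Mac is reported as not found.

```python
from pathlib import Path
from typing import Dict, Optional

# Locations copied from the table above; "~" expands to your home folder.
LOCATIONS = {
    "Database": "~/Library/Application Support/app.mimicscribe/mimicscribe.db",
    "Audio recordings": "~/Documents/MimicScribe/Recordings/",
    "Prompt templates": "~/Library/Application Support/app.mimicscribe/Templates/",
    "Speech models": "~/Library/Application Support/FluidAudio/Models/",
}


def path_size(path: Path) -> int:
    """Return the size in bytes of a file, or of all files under a folder."""
    if path.is_file():
        return path.stat().st_size
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


def report(locations: Dict[str, str]) -> Dict[str, Optional[int]]:
    """Map each label to its size in bytes, or None if the path is missing."""
    sizes: Dict[str, Optional[int]] = {}
    for label, raw in locations.items():
        path = Path(raw).expanduser()
        sizes[label] = path_size(path) if path.exists() else None
    return sizes


if __name__ == "__main__":
    for label, size in report(LOCATIONS).items():
        status = "not found" if size is None else f"{size / 1_000_000:.1f} MB"
        print(f"{label}: {status}")
```

The script only reads file sizes; it does not open, modify, or upload anything.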

How do I back up my data?

Time Machine backs up both ~/Library/Application Support/ and ~/Documents/ by default, so your database, templates, and audio recordings are all covered automatically.

If you don’t use Time Machine, the key locations to back up are:

~/Library/Application Support/app.mimicscribe/ — database and templates
~/Documents/MimicScribe/Recordings/ — audio files

These are standard files that can be copied, moved, or synced with any backup tool.
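As one way to do this by hand, here is a minimal Python sketch that copies both locations into a backup folder. The function name `back_up` and the destination path in the usage comment are hypothetical; only the two source paths come from this page. Quitting MimicScribe first is a sensible precaution, since copying a database file while it is being written can produce an inconsistent copy.

```python
import shutil
from pathlib import Path
from typing import Iterable, List

# The two locations listed above.
SOURCES = [
    "~/Library/Application Support/app.mimicscribe/",
    "~/Documents/MimicScribe/Recordings/",
]


def back_up(sources: Iterable[str], destination: str) -> List[Path]:
    """Copy each existing source folder into destination; return the copies made."""
    dest_root = Path(destination).expanduser()
    dest_root.mkdir(parents=True, exist_ok=True)
    copied = []
    for raw in sources:
        src = Path(raw).expanduser()
        if not src.is_dir():
            continue  # Skip locations that don't exist (e.g. recordings disabled).
        target = dest_root / src.name
        # dirs_exist_ok lets a repeated run refresh an earlier backup in place.
        shutil.copytree(src, target, dirs_exist_ok=True)
        copied.append(target)
    return copied


# Example (the destination is a placeholder; use your own backup drive):
# back_up(SOURCES, "/Volumes/Backup/MimicScribe")
```

Each source folder keeps its own name inside the destination, so the result contains `app.mimicscribe/` and `Recordings/` side by side.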

Speech recognition models don’t need to be backed up — see the next section.

Speech recognition models

MimicScribe downloads five models on first launch:

Model	Purpose	Size
Parakeet TDT 0.6B (v3)	Speech-to-text transcription	~460 MB
Speaker diarization	Identifying who is speaking	~34 MB
Silero VAD	Voice activity detection	~1 MB
MiniLM (L6-v2)	Reference document & meeting search	~90 MB
LocalVQE	Echo cancellation for meetings	~5 MB

The speech, diarization, and VAD models are stored in ~/Library/Application Support/FluidAudio/Models/ and use about 500 MB of disk space total. The MiniLM embedding model is cached by HuggingFace and uses about 90 MB.

Safe to delete. If you need to free up space, you can delete the entire ~/Library/Application Support/FluidAudio/Models/ folder. No data is lost; only the pre-trained model weights are removed.

When you next launch MimicScribe, it will automatically re-download the models in the background. If you try to dictate or record before the download finishes, you’ll see a brief “Loading model…” overlay that disappears once the models are ready. The re-download requires an internet connection; after that, all transcription runs fully offline.

Optional analytics

Settings > Privacy offers two toggles:

Send anonymous usage signals — specific event names are listed in the settings panel so you can see exactly what’s reported
Send anonymized crash logs — helps us fix bugs faster

Both default to on during onboarding and can be turned off at any time. Choosing Local Mode in onboarding turns both off automatically. No personal data, audio, or transcription text is ever included, and everything sent is automatically deleted from our server after 90 days. Usage signals include a one-way hash of your Mac’s hardware identifier so we can count distinct devices — this hash cannot be reversed to identify you personally and uses a different salt than the billing identifier, so the two cannot be correlated.
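To illustrate why two differently salted hashes of the same hardware identifier cannot be matched against each other, here is a small Python sketch. It is an illustration of the general technique only: this page does not say which hash function or salts MimicScribe uses, so SHA-256, the salt strings, and the placeholder hardware ID below are all assumptions.

```python
import hashlib


def device_hash(hardware_id: str, salt: str) -> str:
    """Return the hex SHA-256 digest of the salt combined with the hardware ID."""
    return hashlib.sha256(f"{salt}:{hardware_id}".encode("utf-8")).hexdigest()


# Placeholder values, not the real identifier or salts.
hardware_id = "EXAMPLE-HARDWARE-UUID"
analytics_id = device_hash(hardware_id, "analytics-salt")
billing_id = device_hash(hardware_id, "billing-salt")

# The same device produces two unrelated-looking values, one per salt.
print(analytics_id)
print(billing_id)
```

The same input and salt always give the same digest, which is what lets distinct devices be counted, while a different salt gives a completely different digest for the same device.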

Verify it yourself

Settings → Network Log shows every request the app sends, live — what it was for, where it went, and when. Every endpoint MimicScribe contacts is also documented and inspectable from outside the app: see Network Activity for the full inventory and the nettop / lsof commands to confirm what’s leaving your Mac at any moment.
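For an outside-the-app check, the sketch below wraps a standard `lsof` invocation in Python. The exact commands recommended by MimicScribe are on the Network Activity page; this is only a rough equivalent. The process name passed in is an assumption (confirm the real one in Activity Monitor), and the function names are illustrative.

```python
import subprocess
from typing import List


def lsof_command(process_name: str) -> List[str]:
    """Build an lsof invocation listing network connections for one process.

    -a  AND the selections together (network files AND this command name)
    -i  select Internet network files
    -n  show numeric addresses instead of resolving host names
    -P  show numeric ports instead of service names
    -c  select processes whose command name starts with process_name
    """
    return ["lsof", "-a", "-i", "-n", "-P", "-c", process_name]


def list_connections(process_name: str) -> str:
    """Run lsof and return its text output (empty if nothing is open)."""
    # lsof exits non-zero when it finds no matching files, so don't raise on that.
    result = subprocess.run(
        lsof_command(process_name), capture_output=True, text=True, check=False
    )
    return result.stdout


# Example (the process name "MimicScribe" is an assumption):
# print(list_connections("MimicScribe") or "No open network connections found.")
```

Running it while the app is idle and again while using an AI feature should show the difference between the two states.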