Performance & Memory

MimicScribe runs speech recognition, speaker diarization, and echo cancellation entirely on your Mac. Here’s what that costs in practice.

Memory Footprint

Measured on an M1 MacBook Pro (16 GB RAM) running a release build during an active meeting recording with live transcription and diarization:

| State | Physical footprint |
| --- | --- |
| Idle (menu bar, models loaded) | ~275 MB |
| Active meeting recording | ~330 MB |
| Peak (highest observed) | ~330 MB |

For context, a single Safari tab typically uses 150-400 MB depending on page complexity.

What counts and what doesn’t

The ~330 MB figure is the physical footprint — the memory macOS actually charges to MimicScribe. This is the same number you’d see in Activity Monitor’s “Memory” column and what macOS uses for memory pressure decisions.

The speech recognition and diarization models (~500 MB of weights) run on the Neural Engine (ANE), Apple’s dedicated ML accelerator. macOS maps these into a separate memory region tagged (neural)(nofootprint) — they do not count against the app’s memory budget and do not compete with your other applications for RAM.

| Category | Size | Counts against app? |
| --- | --- | --- |
| App heap + framework data | ~165 MB | Yes |
| UI surfaces (IOSurface, CoreAnimation) | ~130 MB | Yes |
| ML model weights on Neural Engine | ~500 MB | No |
| SQLite page cache | Under 1 MB dirty | Yes (reclaimable under pressure) |

Heap stability

During a meeting recording, the heap does not grow over time. The peak footprint stays within 6 MB of the steady-state value, with no allocation spikes or slow leaks. Under memory pressure, macOS can also reclaim ~25 MB of cached memory without any impact on the app.

Processing Latency

MimicScribe uses a custom BNNS Graph decoder pipeline (Apple’s Accelerate framework) instead of standard CoreML inference, cutting decode time nearly in half:

| Stage | Time |
| --- | --- |
| Keypress to audio capture | ~2 ms |
| ASR decode per 15 s audio window | ~80 ms |
| Full ASR pipeline per window (preprocess + encode + decode) | ~133 ms |
| Model warmup (first launch) | ~1.5 s |

Dictation and voice commands feel instantaneous — the ASR result is ready before you finish lifting your finger from the hotkey.
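
As a rough illustration of how the table figures relate to each other, the sketch below derives the non-decode time, the decode share of the pipeline, and the speed relative to real time. The function and variable names are hypothetical and exist only for this example; the inputs are the approximate values from the table, so the outputs are approximate too.

```python
# Approximate figures from the latency table, in milliseconds.
WINDOW_MS = 15_000   # one ASR window covers 15 s of audio
PIPELINE_MS = 133    # preprocess + encode + decode
DECODE_MS = 80       # decode stage alone


def latency_summary(window_ms, pipeline_ms, decode_ms):
    """Return (non-decode ms, decode share of pipeline, real-time factor)."""
    non_decode_ms = pipeline_ms - decode_ms      # preprocess + encode
    decode_share = decode_ms / pipeline_ms       # fraction spent decoding
    real_time_factor = window_ms / pipeline_ms   # audio seconds per compute second
    return non_decode_ms, decode_share, real_time_factor


non_decode_ms, decode_share, rtf = latency_summary(WINDOW_MS, PIPELINE_MS, DECODE_MS)
print(f"preprocess + encode: ~{non_decode_ms} ms")
print(f"decode share of pipeline: {decode_share:.0%}")
print(f"speed relative to real time: ~{rtf:.0f}x")
```

With the table values this works out to about 53 ms outside the decoder, with decode taking roughly 60% of the pipeline, and processing running on the order of 113 times faster than real time.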

CPU Usage

During an active meeting, MimicScribe uses moderate CPU for audio capture, echo cancellation, and streaming ASR. Between ASR windows (most of the time), CPU usage drops to near zero. The Neural Engine handles the heavy ML inference, keeping CPU available for your other work.

Battery Impact

MimicScribe’s ASR pipeline processes each 15-second audio window in ~133 ms, then idles until the next window is ready. That works out to a duty cycle of under 1% for ASR during an active recording: the pipeline is idle more than 99% of the time.
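
The duty-cycle figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes windows are processed back to back with no overlap, which the text implies but does not state outright; the function names are hypothetical.

```python
def asr_duty_cycle(window_s=15.0, pipeline_s=0.133):
    """Fraction of wall-clock time spent running the ASR pipeline."""
    return pipeline_s / window_s


def asr_busy_seconds(recording_s, window_s=15.0, pipeline_s=0.133):
    """Total ASR compute time for a recording, assuming non-overlapping windows."""
    windows = recording_s / window_s
    return windows * pipeline_s


duty = asr_duty_cycle()
busy = asr_busy_seconds(3600)  # a one-hour meeting
print(f"duty cycle: {duty:.2%}")
print(f"ASR compute in one hour: {busy:.1f} s")
```

Under these assumptions the duty cycle comes to about 0.89%, and a one-hour recording (240 windows) needs roughly 32 seconds of ASR compute in total.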

The ML models (speech recognition, speaker diarization) run on Apple’s Neural Engine, which is purpose-built for low-power inference and draws from a power budget separate from that of the CPU and GPU. Audio capture and echo cancellation run on the CPU but require minimal sustained work.

When you’re not recording, MimicScribe sits idle in the menu bar with no background processing, no network polling, and no periodic sync.

Disk Usage

| Data | Size |
| --- | --- |
| Speech + diarization models | ~500 MB (one-time download) |
| Database (transcripts, meetings, speaker profiles) | Grows with usage, typically under 50 MB |
| Audio recordings (if enabled) | ~8 MB per hour |
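
To get a feel for how these numbers add up, here is a small estimate based only on the figures in the table. It is a sketch, not a description of how MimicScribe accounts for storage: the function name and the `database_mb` and `keep_audio` parameters are hypothetical, and the database size uses the "typically under 50 MB" figure as an upper bound.

```python
MODELS_MB = 500          # one-time model download
AUDIO_MB_PER_HOUR = 8    # only applies if audio recording is enabled


def estimated_disk_mb(recorded_hours, database_mb=50, keep_audio=True):
    """Rough upper-bound disk usage in MB for a given number of recorded hours."""
    audio_mb = AUDIO_MB_PER_HOUR * recorded_hours if keep_audio else 0
    return MODELS_MB + database_mb + audio_mb


print(estimated_disk_mb(100))                    # audio kept
print(estimated_disk_mb(100, keep_audio=False))  # transcripts only
```

For 100 hours of meetings this gives about 1,350 MB with audio kept and about 550 MB without, so the models dominate until you have stored several dozen hours of audio.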

Measuring It Yourself

You can verify these numbers on your own Mac. Open Terminal and run:

footprint $(pgrep mimicscribe)

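This shows the full memory breakdown including the Neural Engine region. The phys_footprint line at the bottom is the number macOS charges to the app. If the command finds no process, the process name may be capitalized differently; pgrep -i mimicscribe matches case-insensitively.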
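
If you want to log that number over time, you could pull the phys_footprint line out of the tool's text output. The sketch below assumes the line looks like `phys_footprint: 330 MB`; the exact layout of the footprint output may differ between macOS versions, so treat the sample text and the helper name as illustrative rather than authoritative.

```python
import re


def extract_phys_footprint(footprint_output):
    """Return the text after 'phys_footprint:' or None if no such line exists."""
    match = re.search(
        r"^[ \t]*phys_footprint:[ \t]*(.+?)[ \t]*$",
        footprint_output,
        re.MULTILINE,
    )
    return match.group(1) if match else None


# Illustrative sample only: the surrounding layout is an assumption.
sample = """\
Auxiliary data:
    phys_footprint_peak: 330 MB
    phys_footprint: 330 MB
"""
print(extract_phys_footprint(sample))
```

On a Mac, the input string could come from running the command above through `subprocess.run(..., capture_output=True, text=True)` and reading its `stdout`.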