Benchmarking Whisper's Speed on Raspberry Pi 5: How Fast Can It Get on a CPU?

TL;DR
Need minimal RAM: vanilla Whisper is the slowest option (RTF 1.48) but stays under 1 GB of RAM.
Need top accuracy: Native Parakeet‑TDT‑v2 0.6 B scores 2.69 % WER at RTF 0.71, consuming 5.3 GB RAM.
Best edge balance: Sherpa‑onnx Parakeet‑TDT 0.11 B lands 4.19 % WER at near‑real‑time RTF 0.12 with 1.2 GB RAM.
Intro
Speech‑to‑text on the edge is no longer a science‑fair project. Today you can transcribe audio on a laptop—or, in our case, on our Distiller CM5 (Raspberry Pi CM5 compute module: 4 × Cortex‑A76 @ 2.4 GHz, 8 GB LPDDR4, ≤ 10 W)—without touching a GPU. The real debate is which stack gives you the best balance of accuracy, speed, and memory.
In this post we pit the most popular CPU‑only Whisper variants against two sizes of Parakeet‑TDT. Same hardware, same dataset, zero GPUs.
The Three Numbers That Matter
Word‑Error Rate (WER) – how many words the model gets wrong. Excellent < 10 %; usable < 20 %.
Real‑Time Factor (RTF) – inference time divided by audio length. RTF < 1 means you transcribe faster than you speak.
Memory – peak RAM during inference. Many edge boards have only 2‑4 GB free after the OS boots.
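The first two metrics are simple enough to sketch in code. Below is a minimal, self-contained illustration (the reference/hypothesis strings and timings are made-up examples, not clips from this benchmark):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word-error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

def rtf(inference_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: < 1.0 means faster than real time."""
    return inference_seconds / audio_seconds

# One dropped word out of six -> WER ≈ 16.7 %
print(wer("the cat sat on the mat", "the cat sat on mat"))
# 3.6 s to transcribe 10 s of audio -> RTF 0.36
print(rtf(inference_seconds=3.6, audio_seconds=10.0))
```

Production benchmarks typically normalize punctuation and casing before scoring; the numbers in the table below were computed with that kind of normalization in place.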
Test Bed
Dataset: 250 clips from Common Voice Delta 20.0 (validated split)
Hardware: Distiller CM5 (4 × Cortex-A76 @ 2.4 GHz, 8 GB LPDDR4, ≤ 10 W)
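For each run we record wall-clock time and peak resident memory around the transcription call. A hedged sketch of that harness on Linux (the `transcribe` function here is a stand-in workload, not a real model API):

```python
import resource
import time

def measure(fn, *args):
    """Return (result, elapsed_seconds, peak_rss_mb) for one call."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    # ru_maxrss is the process-lifetime peak, in kilobytes on Linux
    # (bytes on macOS) — run each model in a fresh process.
    peak_mb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024
    return result, elapsed, peak_mb

def transcribe(n: int) -> int:
    # placeholder workload standing in for model.transcribe(audio)
    return sum(i * i for i in range(n))

_, secs, mb = measure(transcribe, 1_000_000)
print(f"{secs:.3f} s, peak RSS ≈ {mb:.0f} MB")
```

Because `ru_maxrss` only ever grows, each model variant should be benchmarked in its own process so one model's peak doesn't mask another's.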
Benchmark Numbers
| Variant | WER | RTF | Peak RAM | Notes |
| --- | --- | --- | --- | --- |
| Vanilla Whisper | 11.48 % | 1.48 | 761 MB | — |
| Fast‑Whisper | 10.08 % | 0.55 | 1,007 MB | — |
| Sherpa‑onnx Whisper (base) | 9.66 % | 0.36 | 900 MB | — |
| Sherpa‑onnx Whisper (base‑int8) | 10.99 % | 0.34 | 819 MB | — |
| OpenVINO Whisper | 11.31 % | 0.29 | 1,392 MB | — |
| Sherpa‑onnx Parakeet‑TDT 0.11 B | 4.19 % | 0.12 | 1,232 MB | 9 s load |
| Native Parakeet‑TDT‑v2 0.6 B | 2.69 % | 0.71 | 5,384 MB | 97 s load |
| Sherpa‑onnx Parakeet‑TDT‑v2‑int8 0.6 B | 3.51 % | 0.21 | 1,760 MB | 8 s load |
Takeaways
Need < 3 % WER? Native Parakeet‑TDT‑v2 0.6 B delivers—if you have 5 GB.
Parakeet‑TDT 0.11 B is the new all‑round king: at 4.19 % WER and RTF 0.12 it beats every Whisper variant on both accuracy and speed.
If you want a plug-and-play devkit for experimenting with edge LLMs, check out our shop and YouTube videos.
