Why we chose Core ML over MLX

When we started building transcription for Sequence, we weren’t chasing benchmarks... but then we went and broke one.

We needed something that could live inside high-flow editing pipelines, day after day, project after project, without slowing the storytelling down.

We did start with MLX, and it was fun to work with. There are many great Rust bindings out there, so I get why so many people choose it.

Something about Core ML kept calling to me, so I had to experiment and learn what I was being drawn to. At the end of the day, we needed an architecture we could trust repeatedly, and something that felt rock-solid on macOS, even for laptop users editing on battery.

So what did I discover about Core ML?

CPU, GPU, and the Apple Neural Engine (ANE): three for the price of one. That was the first thing that made me curious: how could we use the full Apple silicon stack, rather than relying on a single execution path? What does that give you? Performance that lands each piece of work on the hardware best suited to it, plus real power-efficiency gains on Apple silicon (M1 onward). Being able to reach the ANE through Core ML was especially impressive, and we could clearly see it in the resource profile of our tests.
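To make the compute-unit idea concrete, here is a minimal sketch using the coremltools Python package rather than our shipping code. The helper name, the short option names, and the model path are made up for illustration. The setting only tells Core ML which hardware it is allowed to use; Core ML still decides where each part of the model actually runs.

```python
# Sketch only: needs the coremltools package on macOS to actually load a model.
# The short option names on the left are hypothetical; the values on the right
# are members of coremltools' ComputeUnit enum.
COMPUTE_UNIT_NAMES = {
    "all": "ALL",                # CPU, GPU, and Neural Engine are all allowed
    "cpu_only": "CPU_ONLY",      # useful as a slow but predictable baseline
    "cpu_and_gpu": "CPU_AND_GPU",
    "cpu_and_ne": "CPU_AND_NE",  # CPU plus Neural Engine, skipping the GPU
}


def load_with_compute_units(model_path, units="all"):
    """Load a Core ML model while restricting which hardware it may use."""
    if units not in COMPUTE_UNIT_NAMES:
        raise ValueError(f"unknown compute unit option: {units!r}")

    # Imported lazily so the option table above can be inspected on any machine.
    import coremltools as ct

    compute_units = getattr(ct.ComputeUnit, COMPUTE_UNIT_NAMES[units])
    return ct.models.MLModel(model_path, compute_units=compute_units)
```

Loading the same model once with "cpu_only" and once with "all" is a simple way to compare speed and power draw between execution paths on your own machine.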

Second: packaging, meaning how we ship models to users and know they'll get results they love. This is where Core ML shines, with compiled .mlmodelc bundles that let us remove the overhead of extra interpretation layers we didn't need.

With the compiled models we built, we could set explicit I/O contracts and run a single native runtime path that makes proper use of Apple silicon, while matching the behavior of NVIDIA's NeMo implementation of Parakeet v3. Read more about how to transcribe on your Mac here.
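As a rough illustration of what an explicit I/O contract can look like, here is a small pure-Python sketch. It is not our harness: the tensor names, dtypes, and shapes below are hypothetical placeholders, and the check works on anything that exposes .dtype and .shape (a NumPy array, for example).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorSpec:
    name: str
    dtype: str    # e.g. "float32"
    shape: tuple  # None marks a dimension that is allowed to vary


# Hypothetical contract for a speech model; the real names and shapes may differ.
EXAMPLE_SPECS = [
    TensorSpec("audio_signal", "float32", (1, None)),
    TensorSpec("length", "int32", (1,)),
]


def check_contract(specs, tensors):
    """Raise ValueError on the first tensor that violates its spec."""
    extra = set(tensors) - {spec.name for spec in specs}
    if extra:
        raise ValueError(f"unexpected tensors: {sorted(extra)}")
    for spec in specs:
        if spec.name not in tensors:
            raise ValueError(f"missing tensor: {spec.name}")
        tensor = tensors[spec.name]
        if str(tensor.dtype) != spec.dtype:
            raise ValueError(
                f"{spec.name}: expected dtype {spec.dtype}, got {tensor.dtype}"
            )
        shape = tuple(tensor.shape)
        rank_ok = len(shape) == len(spec.shape)
        dims_ok = rank_ok and all(
            want is None or want == got for want, got in zip(spec.shape, shape)
        )
        if not dims_ok:
            raise ValueError(
                f"{spec.name}: expected shape {spec.shape}, got {shape}"
            )
```

Running a check like this right before every prediction call turns a silent shape or dtype mismatch into an immediate, readable error.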

So once we were set on Core ML, how did we squeeze every ounce out of the model? The custom harness.

We wired up export parity checks against reference behavior, tensor-boundary validation, decode-loop corrections, and layered post-processing hardening. That system got us to a 2.54% word error rate (WER) on LibriSpeech test-clean. NVIDIA's published 6.05% figure for Parakeet v3 is a macro average across 7 datasets, so it isn't a direct apples-to-apples comparison with our test-clean number, but it's an impressive result all the same!
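For anyone who wants to see why a single-dataset WER and a macro average aren't the same kind of number, here is a minimal sketch of both calculations. It is a textbook word-level edit distance, not our evaluation code, and it assumes the transcripts have already been normalized (casing, punctuation) the same way on both sides.

```python
def word_errors(reference, hypothesis):
    """Return (edit_distance, reference_word_count) at the word level."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1], len(ref)


def corpus_wer(pairs):
    """WER over one dataset: total word errors divided by total reference words."""
    errors = 0
    words = 0
    for reference, hypothesis in pairs:
        e, n = word_errors(reference, hypothesis)
        errors += e
        words += n
    return errors / words


def macro_average(per_dataset_wer):
    """Unweighted mean of per-dataset WERs, regardless of dataset size."""
    return sum(per_dataset_wer.values()) / len(per_dataset_wer)
```

A macro average gives every dataset the same weight, so one hard dataset can pull the figure well above what the same model scores on a clean read-speech set like test-clean.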

Core ML is the runtime we trust editors with.

Thanks for reading,

Peace,


James Seddon, Founder
