Adobe and Speechmatics Enhance On-Device Speech Recognition for Premiere

  • Speechmatics and Adobe have deepened their partnership with a new on-device speech-to-text (STT) model in Adobe Premiere, delivering near-cloud accuracy while keeping audio local to the device.
  • The new model processes 1 hour of audio in about 55 seconds and stays within 5% (relative) of cloud accuracy, as evaluated on nearly 10 million words of diverse real-world data.
  • The model delivers a 12-16% accuracy improvement over Whisper-powered creative solutions and supports 55+ languages.
  • The on-device model is available as a C/C++ library on macOS and Windows, integrating with Adobe Premiere for seamless transcription and speaker diarization.
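
To put the speed and accuracy figures above in more familiar terms, the short Python sketch below converts "1 hour of audio in about 55 seconds" into a speed-up over real time and shows how a "5% relative" gap is typically computed. The function names are illustrative, and the two error-rate values are hypothetical placeholders: the announcement does not state which accuracy metric was used or what the underlying scores were.

```python
def speedup_over_real_time(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of processing."""
    return audio_seconds / processing_seconds


def real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Processing time divided by audio duration (lower is faster)."""
    return processing_seconds / audio_seconds


def relative_gap(reference: float, candidate: float) -> float:
    """Absolute difference between two scores, as a fraction of the reference."""
    return abs(candidate - reference) / reference


# Figures quoted in the announcement.
AUDIO_SECONDS = 60 * 60      # 1 hour of audio
PROCESSING_SECONDS = 55      # "about 55 seconds"

speedup = speedup_over_real_time(AUDIO_SECONDS, PROCESSING_SECONDS)
rtf = real_time_factor(AUDIO_SECONDS, PROCESSING_SECONDS)
print(f"Speed-up over real time: {speedup:.1f}x")   # 65.5x
print(f"Real-time factor: {rtf:.4f}")               # 0.0153

# HYPOTHETICAL scores, chosen only to illustrate a 5% relative gap.
# They are not taken from the announcement.
cloud_error_rate = 0.100
on_device_error_rate = 0.105
gap = relative_gap(cloud_error_rate, on_device_error_rate)
print(f"Relative gap: {gap:.1%}")                   # 5.0%
```

Read this way, the quoted speed works out to roughly 65 times faster than real time. The relative-gap calculation is only one plausible reading of "within 5% relative"; the announcement itself does not spell out the formula.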

The partnership between Adobe and Speechmatics underscores the growing importance of on-device AI solutions in content creation, driven by increasing data sovereignty concerns and the rise of LLM-centric workflows. As natural language becomes the interface for shaping stories, accurate and private speech-to-text technology is becoming a critical foundation for optimizing content workflows and enabling agentic AI. This development is part of a broader trend towards decentralized, privacy-focused AI solutions that cater to the needs of global creator communities.

Adoption Pace
How quickly studios, agencies, and production companies will integrate the new on-device model into their workflows, particularly in privacy-sensitive environments.
Competitive Edge
Whether Speechmatics can sustain its 12-16% accuracy advantage over Whisper-powered solutions as competitors enhance their own on-device offerings.
Market Expansion
The pace at which Speechmatics can expand its on-device model to other industries beyond media, such as healthcare and contact centers.