📊 Key Data

Up to 93% reduction in medical transcription errors compared to leading general-purpose AI
1.4% word error rate (WER) in English medical terminology tests
98.3% recall on formatted clinical data (e.g., dosages, measurements)

🎯 Expert Consensus

Experts would likely conclude that Corti's Symphony for Speech-to-Text represents a significant advancement in medical transcription accuracy, offering superior performance over existing solutions and setting a new benchmark for clinical AI applications.

Sam Lidman

Corti's AI Slashes Medical Transcription Errors by up to 93%

COPENHAGEN, Denmark – May 20, 2026 – The Danish artificial intelligence lab Corti today launched a new speech recognition system that it says sharply improves the accuracy of clinical transcription. According to the company, the new model, dubbed Symphony for Speech-to-Text, reduces word error rates by up to 93% compared to leading general-purpose AI, a gain that could significantly enhance patient safety and power a new generation of intelligent healthcare tools.

In a field where a single misheard word can have critical consequences, Corti's announcement positions Symphony not just as an improvement, but as a foundational shift. The company is targeting the gap between general-purpose transcription services, which often struggle with complex medical jargon, and legacy dictation products that are not built for the sophisticated, AI-driven workflows now emerging in modern medicine.

A New Benchmark in Clinical Accuracy

The performance metrics released by Corti are striking. In tests focused on English medical terminology, Symphony for Speech-to-Text achieved a word error rate (WER) of just 1.4%. This figure stands in stark contrast to the performance of well-known generalist models on the same data, with Corti reporting WERs of 17.4% for Whisper, 17.7% for OpenAI's model, and 18.1% for ElevenLabs.
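For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of reference words. The following is a minimal Python sketch of that standard definition; it is not Corti's evaluation code, and the example sentences and the simple lowercase-and-split tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the word-level edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one misheard drug name in a nine-word sentence.
reference = "patient was started on metoprolol fifty milligrams twice daily"
hypothesis = "patient was started on metropolis fifty milligrams twice daily"
wer = word_error_rate(reference, hypothesis)
print(f"{wer:.1%}")  # 11.1%
```

A single wrong word in nine already yields a WER above 11%, which gives a sense of scale for the reported gap between 1.4% and roughly 17% to 18%.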

Even more significantly, Corti's system appears to outperform the long-standing industry benchmark, Nuance's Dragon Medical One. In a head-to-head comparison on real-world English medical dictation, Symphony achieved a 4.6% WER, a relative improvement of roughly 19% over Dragon's 5.7%. It also demonstrated a slightly higher recall rate for specific medical terms, correctly identifying 93.5% compared to Dragon's 92.9%.
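The 19% figure is a relative reduction in error rate, not a difference in percentage points (which would be 1.1 points). A short Python sketch of the arithmetic, using only the two WER values quoted above; the function name is an illustrative choice:

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline


dragon_wer = 5.7    # percent, as reported in the article
symphony_wer = 4.6  # percent, as reported in the article

improvement = relative_reduction(dragon_wer, symphony_wer)
print(f"{improvement:.1%}")  # 19.3%
```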

This level of precision is about more than just cleaner notes. The system's ability to produce "clinically usable formatting" is a key differentiator. Corti's research, published in a detailed paper on arXiv, shows Symphony achieving 98.3% recall on formatted entities like dosages, measurements, and dates. The strongest general-purpose baseline model tested managed only 44.3% recall on these structured data points. In practice, this means the AI doesn't just hear the words "take fifty milligrams"; it understands and structures them as "50 mg," a crucial distinction for downstream clinical software and patient safety checks.
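To make the recall figure concrete, here is a simplified Python sketch of how recall on formatted entities could be scored: the share of expected entities, written in their clinical form, that actually appear in a transcript. The scoring method in Corti's paper is not described in this article, so the verbatim substring match, the function name, and the example sentences below are all assumptions for illustration.

```python
def entity_recall(reference_entities: list[str], transcript: str) -> float:
    """Fraction of expected entities found verbatim in the transcript.

    Assumption: a case-insensitive substring match counts as a hit.
    A real evaluation would need stricter matching (for example,
    "50 mg" should not be credited inside "150 mg").
    """
    if not reference_entities:
        raise ValueError("need at least one reference entity")
    text = transcript.lower()
    found = sum(1 for entity in reference_entities if entity.lower() in text)
    return found / len(reference_entities)


# Hypothetical expected entities for one dictated note.
expected = ["50 mg", "120/80 mmHg", "12 March 2026"]

formatted = "Take 50 mg daily. Blood pressure 120/80 mmHg on 12 March 2026."
spoken_form = (
    "take fifty milligrams daily blood pressure one twenty over eighty "
    "on the twelfth of march"
)

print(entity_recall(expected, formatted))    # 1.0
print(entity_recall(expected, spoken_form))  # 0.0
```

The second transcript contains every spoken word correctly yet scores zero, which is why a model can have a reasonable WER and still perform poorly on structured data.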

Beyond Transcription: Powering the 'Agentic Era' of Healthcare

Corti executives frame this launch as a crucial step into what they call the "agentic era" of healthcare, where AI systems move beyond simple tasks to become active assistants in clinical workflows. The accuracy of the underlying data is paramount for this vision to become a reality.

"Speech has always been one of healthcare's most important inputs," said Andreas Cleve, co-founder and CEO of Corti, in the company's press release. "What is changing is what happens after the words are captured. In the agentic era, speech recognition requires more than simply producing a transcript - we need to give AI systems accurate clinical facts to reason from. If a model mishears a medication, dosage, or symptom, every downstream step becomes less reliable."

This philosophy is built into Symphony's architecture. Instead of just providing a raw text file, the system is designed to deliver structured, verified information through its API. This approach helps ensure that subsequent AI applications—whether for automated coding, clinical decision support, or ambient documentation—are operating on a foundation of facts, not just a string of potentially flawed text. This focus on providing reliable "facts" is designed to mitigate the risk of AI "hallucinations" and improve the safety and utility of all connected systems.

Breaking Global Language Barriers in Medicine

The challenge of accurate speech recognition is compounded in global healthcare settings, where multiple languages and dialects are often used within a single institution. Corti's research demonstrates that Symphony's accuracy gains are not limited to English.

The system achieved a 2.4% WER in German, compared to 13.0% for the next-best system tested. In French, it recorded a 3.9% WER versus 10.6% for its nearest competitor. This consistent multilingual performance is a critical feature for international healthcare organizations and technology partners.

Early adopter Voicepoint, which provides clinical documentation solutions in the linguistically diverse landscape of Switzerland, underscored the importance of this capability. "In a clinical conversation, every word matters - a missed medication name, a misheard dosage, or a mistranscribed symptom can change the meaning of an encounter," stated Pierre Corboz, Head of Solutions & Business Development at Voicepoint. "Symphony's accuracy on clinical terminology gives us the foundation to bring more trusted AI capabilities into clinical workflows... When Corti improves the speech layer, the workflows we build together become sharper, safer, and more useful for clinicians in Switzerland."

Navigating a Complex and Crowded Market

Corti is entering a competitive field, but it has carved out a specific, high-value niche. The market for medical speech recognition has historically been bifurcated. On one side are the powerful, general-purpose APIs from tech giants, which offer scalability but can lack the domain-specific accuracy required for medicine. On the other are dedicated medical dictation products, which are highly tuned for physician monologues but are often architecturally rigid and not designed as an infrastructure layer for other AI tools.

Symphony for Speech-to-Text is engineered to combine the strengths of both: a highly accurate, clinically aware engine delivered through a flexible, developer-friendly API. The system's training regimen, which uses a large proprietary corpus of real-world clinical audio alongside synthetic data to cover over 150,000 medical terms, is central to its advantage. By focusing on real clinical language and conditions, Corti aims to provide a model that is robust in the noisy, unpredictable environments of actual healthcare delivery.

With built-in adherence to regulations like HIPAA and GDPR, Corti is also addressing the critical need for compliance. The platform is now generally available to developers and healthcare organizations through the Corti API, and the company is inviting the industry to build the next generation of clinical tools on a more reliable foundation of speech.