ConnexAI’s Speech Recognition Outperforms OpenAI, Google, and Amazon on Real-World Customer Service Audio

ConnexAI releases best-in-class ASR model, achieving lowest error rate across 16,311 production recordings

ConnexAI, home to one of the UK’s leading AI research labs, today published benchmark results demonstrating that its realtime automatic speech recognition (ASR) model outperforms major providers including OpenAI, Google, Amazon, and Deepgram on real-world customer service and sales audio.

In testing across 16,311 recordings totalling twenty-five hours of authentic customer service conversation audio, ConnexAI’s ASR model achieved a median Word Error Rate (WER) of 7.7%, compared to 10.5% for the next-best performing model, Amazon Transcribe. OpenAI’s gpt-4o-transcribe recorded 28.6%, while Google STT and Deepgram Nova-3 achieved 20.0% and 15.8%, respectively.

The results reflect nearly a decade of focused development. ConnexAI established its AI Research Lab in Manchester in 2017, building speech recognition models specifically for the demands of customer service and sales environments and training on real-world audio from global sources rather than clean studio recordings.

The benchmark evaluated seven leading streaming ASR models using audio captured from production contact centre environments. The test set included speakers from the US, UK, Australia, and South Africa, encompassing both native and non-native English speakers across varying acoustic conditions and industry domains, including insurance, healthcare, and finance.

“In agentic AI systems, speech recognition is the foundation of success. When an AI agent mishears a customer, that error cascades through the entire conversation, potentially causing failed transactions or frustrated callers,” said Kris Hong, Head of Speech at ConnexAI. “Our ASR model is purpose-built for the realities of contact centre audio: compressed signals from phone lines, background noise, disfluent speech, and the full spectrum of accents spoken by customers across our clients’ global operations.”

Benchmark Methodology

ConnexAI’s AI Research Lab designed the benchmark to reflect genuine production conditions. All recordings were streamed to providers in their original 8 kHz format, with human-verified transcriptions serving as ground truth. The team developed provider-specific normalisation rules to ensure fair comparison, with all outputs manually reviewed for accuracy.
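For readers unfamiliar with the metric, WER is the number of word-level substitutions, deletions, and insertions needed to turn a model's output into the reference transcript, divided by the number of words in the reference. The Python sketch below is illustrative only and is not ConnexAI's benchmark code. The `normalise` rules are a hypothetical stand-in for the provider-specific normalisation described above, and taking the median over per-recording scores is an assumption about how the headline figure is aggregated.

```python
import re
from statistics import median


def normalise(text: str) -> list[str]:
    """Lowercase, drop punctuation, and split into words.

    Hypothetical rules for illustration; the benchmark's actual
    provider-specific normalisation is not reproduced here.
    """
    text = text.lower()
    text = re.sub(r"[^a-z0-9' ]+", " ", text)
    return text.split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    ref, hyp = normalise(reference), normalise(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def median_wer(pairs: list[tuple[str, str]]) -> float:
    """Median of per-recording WER over (reference, hypothesis) pairs."""
    return median(word_error_rate(ref, hyp) for ref, hyp in pairs)
```

For example, scoring the output "my post code is m1 4bt" against the reference "My postcode is M1 4BT." gives two errors over five reference words, a WER of 40%. Small splits like this illustrate why normalisation choices matter when scores from different providers are compared.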

ConnexAI’s advantage was most pronounced in alphanumeric sequences, such as postcodes and reference numbers, where it achieved 9.1% WER compared to 10.0% for Amazon Transcribe and over 33% for most other models. This capability is critical for contact centre applications where accurate capture of account numbers, addresses, financial data, and booking references directly impacts customer outcomes.

Utterance length analysis further demonstrated ConnexAI’s best-in-class performance. Because the model is trained on real-world conversational speech, including short and fragmented responses, it maintains strong accuracy even for brief one-to-five-word utterances where contextual information is limited. Many competing models show higher error rates on shorter inputs, improving only as utterance length increases.

Why ASR Accuracy Matters for Agentic AI

Enterprise-ready agentic AI solutions combine automatic speech recognition for hearing, large language models for reasoning, and text-to-speech for speaking. In these autonomous pipelines, any ASR model error can propagate downstream, potentially causing critical system failures or poor customer experiences.

While many ASR providers optimise for general-purpose applications, ConnexAI’s AI Research Lab trains its models on real-world contact centre audio from global sources. This specialisation enables the model to maintain, and often exceed, performance on prevalent accents while also supporting under-represented accents under challenging acoustic conditions where general-purpose models typically struggle.

The full benchmark report, including detailed methodology and category-level results, is available at:

www.connex.ai/resources/connexai-leading-automatic-speech-recognition-benchmark

About ConnexAI

ConnexAI is an award-winning agentic AI platform designed by a world-class engineering team to help organisations maximise profitability, accelerate revenue growth, and redefine productivity. Its AI Research Lab, established in 2017, develops best-in-class speech recognition, natural language processing, and voice synthesis models that power ConnexAI’s suite of enterprise-grade applications, including AI Agent, AI Analytics, ASR, AI Voice, and AI Quality. Learn more at www.connex.ai.

Media Enquiries:

hello@connex.ai

0333 344 2435