January 28, 2026
10 minutes
Written by
Minah Han
Company News

Why InnoCaption Offers Both Human and AI Captions

For the D/deaf and Hard of Hearing (DHH) community, the past decade has brought a profound transformation in communication access. Advancements in automated speech recognition (ASR) have broadened accessibility, and the number of call captioning solutions has grown rapidly. It’s an exciting development: everyone’s accessibility needs are different, so the more options to choose from, the better.

But at InnoCaption, our approach to captioning is a little different from everyone else’s. While we’re proud of the AI technology we’ve built, we still believe that human expertise is invaluable — and that our users should have the freedom to choose what works best for them, every time. That’s why we’re proud to be the only captioned phone service provider to offer captioning by live stenographers in addition to ASR, with the ability to switch caption modes mid-call with a tap.

Here’s how we see the value of AI and human captioning, respectively.

The Speed and Scale of Automated Speech Recognition 

In 2019, InnoCaption became the first FCC-certified call captioning provider to offer AI captions using automated speech recognition. It was a major step forward that enabled us to keep up with our rapidly growing user base without being constrained by the limited number of high-quality human captioners we could train and onboard. It also let us address a number of specific needs: we had users who valued consistently fast captioning over accuracy, and others who wanted a way to have intimate conversations without a third party on the line.

Today, our AI caption mode provides fast, accurate, consistent captioning to tens of thousands of users every month. Many people use it as their default captioning mode, and some prefer it for everyday calls such as customer service lines with automated voice prompts to navigate. Using ASR also allows us to support captioning in 20+ commonly used languages besides English, empowering even more people to make calls with confidence.

But not all ASR technology is created equal, and the performance of different speech recognition models is changing all the time. To make sure we’re always providing our users the best AI captioning solution available, we’re committed to an ongoing process of benchmarking and optimization.

“Captioning live phone call audio is an incredible technical challenge,” says Paul Lee, our Chief Operating Officer. “It takes a lot of work to get it right, and you can never let yourself think you’ve solved the problem for good.”

Every month, we run tests on the top speech recognition models on the market to measure their performance against key benchmarks, including accuracy and delay. Our software is built to swap models easily, so when we find that another model will serve our users better than the one we’re currently using, we’re able to make the change. You can learn more about our benchmarking process here.
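For readers curious what a monthly comparison like this might look like in practice, here is a minimal sketch. It assumes accuracy is measured as word error rate (WER), a common ASR metric, and that delay is averaged in seconds. The names `word_error_rate`, `BenchmarkResult`, and `pick_model`, the model labels, and the delay budget are all hypothetical illustrations, not InnoCaption's actual tooling or criteria.

```python
from dataclasses import dataclass


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # Dynamic-programming edit distance over words, kept one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the captions
                curr[j - 1] + 1,     # extra word inserted in the captions
                prev[j - 1] + cost,  # match, or one word substituted
            ))
        prev = curr
    return prev[-1] / len(ref)


@dataclass
class BenchmarkResult:
    model_name: str      # hypothetical label for a candidate ASR model
    mean_wer: float      # average WER across the test calls
    mean_delay_s: float  # average caption delay in seconds


def pick_model(results: list[BenchmarkResult], max_delay_s: float) -> BenchmarkResult:
    """Return the lowest-WER model among those within the delay budget."""
    eligible = [r for r in results if r.mean_delay_s <= max_delay_s]
    if not eligible:
        raise ValueError("no model meets the delay budget")
    return min(eligible, key=lambda r: r.mean_wer)


if __name__ == "__main__":
    # Illustrative numbers only; these are not real benchmark results.
    candidates = [
        BenchmarkResult("model-a", mean_wer=0.08, mean_delay_s=1.2),
        BenchmarkResult("model-b", mean_wer=0.06, mean_delay_s=2.9),
        BenchmarkResult("model-c", mean_wer=0.07, mean_delay_s=1.0),
    ]
    print(pick_model(candidates, max_delay_s=1.5).model_name)
```

Treating delay as a hard limit and accuracy as the tiebreaker is just one possible design choice; a real evaluation could weigh the two differently or add further criteria.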

The Precision and Nuance of Live Stenographers

For all the work we put into our automated captioning, though, there are still some situations that AI struggles with. Factors like poor audio quality, heavy accents, or background noise can all cause error rates to rise. And for someone relying on captions during an important call, such as a phone interview, “mostly right” won’t cut it. That’s why we offer live human captioning, not just AI.

“Good enough doesn’t always mean accessible,” says Paul. “If someone is being interviewed for a job and they’re anxious that the ASR may not work properly, that’s more than an inconvenience — many people aren’t comfortable disclosing their hearing loss early in the hiring process. Sensitive situations require the highest level of reliability.”

Our stenographers go through three to four years of court reporting school, where they’re trained to use specialized stenotype machines that let them write over 200 words per minute with a high degree of accuracy. With their human understanding and experience, they can deliver high-quality captions even when faced with the types of challenges that tend to trip up ASR. Plus, all 911 calls are automatically routed to one of our live captioners for safety.

But the human captioning option isn’t just about getting the words right: it’s about added context and nuance, too. Our stenographers caption non-verbal sounds like laughter or a dog barking in the background, giving the user a richer, more fully realized understanding of what’s happening in the moment. (You can meet one of our captioners here.)

One user told us that, thanks to having a live stenographer captioning her call, she was able to see that her granddaughter was laughing. That’s the kind of connection you don’t get with AI.

Accessibility Your Way

At InnoCaption, we refuse to choose between AI and human solutions because we know accessibility isn’t one-size-fits-all. Our role is to give our users flexibility: ASR when consistent speed matters, stenographers when human context is critical, and the ability to switch back and forth seamlessly. It’s a level of control you won’t get anywhere else.

“Accessibility isn’t about indiscriminately adopting the newest technology,” says Paul. “We believe that embracing the benefits of technology, while understanding its limitations and empowering users with choice, is the best way to provide the most accessible experience.”

Minah Han

About the author

Minah Han is a marketing professional dedicated to advancing accessible communication solutions for the deaf and hard of hearing community. At InnoCaption, she leverages her expertise in digital marketing and storytelling to amplify the voices of individuals who rely on innovative technologies for everyday conversations. Minah is passionate about bridging the gap between technology and accessibility, helping to drive awareness and education around captioned calling solutions.

Make calls with confidence

InnoCaption provides real-time captioning technology that makes phone calls easy and accessible for the deaf and hard of hearing community. It is offered at no cost to individuals with hearing loss because we are certified by the FCC. InnoCaption is the only mobile app that offers real-time captioning of phone calls through live stenographers and automated speech recognition software. The choice is yours.
