The Advantages of an AI IoT Speech Interface on the Edge

A lively group of semiconductor engineers, hardware and software design managers, and tech journalists from around the globe joined a virtual happy hour session I recently co-hosted on AI IoT devices as part of the Design Automation Conference. We discussed the usual smartwatches and temperature controls, as well as a few lesser-known technologies, including Wi-Fi-enabled smart-kill rat traps and AI-enabled IoT chicken collars. Of course, everyone had their own story about the cloud-based voice recognition technologies used in smart speakers; as it turns out, most have had unhappy experiences with voice technology.

This feedback comes as no surprise to BabbleLabs. It validates our belief – and basis for part of our business model – that cloud-based automatic speech recognition (ASR) technology accomplishes only part of the goal for IoT devices. As stated by session participants, the technology disappoints when you most want it to recognize what you say, compromises user privacy, can be painfully slow, diminishes brand experience, and has a high cost, limiting potential for useful applications. All of these issues can be addressed through noise-optimized speech recognition technology deployed locally, on the edge, as offered through BabbleLabs Clear Command.

The Cloud Conundrum
Today, most popular speech recognition technologies rely on a wake word, recognized by embedded technology, that signals when it’s time to act. For example, following “OK Google,” the device passes speech to the cloud, where complex algorithms are applied to the phrases to identify the best match – a sometimes time-consuming, error-prone process that leaves users with accuracy rates as low as 73 percent, unacceptable for anything other than entertainment purposes. Additionally, when OEMs build this technology into kitchen appliances, cars, or other IoT devices, they allow Google to brand the product experience – a lost opportunity.

Acing Accuracy and Speed
BabbleLabs Clear Command takes a different approach to IoT speech recognition. You create a customized library of up to a few hundred phrases designed for each application’s needs, and the technology recognizes these phrases on the device. This local, targeted-phrase approach, combined with unique speech enhancement and noise cancellation technology, produces at least 14X better accuracy than cloud-based speech recognition technology, especially in noisy environments like kitchens or traffic, and provides instant results. With better accuracy and speed on its side, Clear Command emerges as the best choice for mission-critical IoT devices such as the radio handsets used by police, firefighters, and medical personnel.
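To make the idea of a small, application-specific phrase library concrete, here is a minimal sketch in Python. It is not Clear Command's algorithm, which works on audio on the device. It only illustrates, on plain text, why restricting recognition to a short list of known phrases makes it easier to pick the right command or to reject input that matches nothing. The phrase list, the function name match_command, and the 0.6 cutoff are all assumptions made for this illustration.

```python
import difflib

# Hypothetical phrase library for a kitchen appliance (illustrative only).
PHRASES = [
    "turn on the oven",
    "turn off the oven",
    "set timer ten minutes",
    "preheat to four hundred",
]


def match_command(heard, phrases=PHRASES, cutoff=0.6):
    """Return the library phrase closest to `heard`, or None if none is close.

    Uses fuzzy string similarity from the standard library; a real edge
    recognizer would score audio features instead of text.
    """
    # Normalize case and whitespace before comparing.
    normalized = " ".join(heard.lower().split())
    best = difflib.get_close_matches(normalized, phrases, n=1, cutoff=cutoff)
    return best[0] if best else None


if __name__ == "__main__":
    print(match_command("turn on the ovn"))   # a slightly garbled command
    print(match_command("play some jazz"))    # not in the library
```

Because the candidate set is only a handful of phrases, a garbled input still lands on the nearest valid command, while unrelated speech falls below the cutoff and is ignored.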

Protecting Privacy
Forty-one percent of the general public has concerns about privacy when it comes to voice assistants. Anecdotal evidence repeatedly shows that devices mistake common language for wake words, triggering unintentionally and opening the door for personal speech data to be captured and reviewed through recordings and text transcriptions managed by tech providers. BabbleLabs speech enhancement and command recognition eliminate this privacy risk because speech data never leaves the device, so people do not have to worry that their information may land in the wrong hands.

Building Brand Customization
As demand for voice recognition technology in everyday life accelerates, manufacturers that move quickly to implement it without careful consideration of the user experience and their brand’s image can adversely affect consumer perception and ultimately weaken their position. Rushing to integrate Google’s or Amazon’s voice assistant – and wake words – on top of their own carefully cultivated ecosystem contributes to the creation of an undifferentiated product and falls short of delivering on a unique brand experience. Clear Command delivers voice solutions that enhance the brand experience through customized phrases, allowing users to remain within the ecosystem of the brand.

Offering Better Battery Life at an Affordable Price
Integrating voice interfaces is especially challenging for small IoT devices such as headphones, smartwatches, and hearables, as most solutions carry high initial costs and shorten battery life. To be successful, manufacturers need to find cost-effective ways to innovate while managing power consumption on rechargeable batteries. Most cloud-based automatic speech recognition systems require too much power to be effective in tiny devices. Clear Command is a low-latency solution that costs less and uses far less memory (typically in the low hundreds of kilobytes) and power (0.5 mW to 5 mW) than cloud-based speech recognition software.
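To put the 0.5 mW to 5 mW figures in perspective, the short Python sketch below estimates how long a small battery could sustain that load on its own. The 100 mAh, 3.7 V cell is a hypothetical example chosen for illustration, and the calculation ignores conversion losses and every other consumer in the device (radio, display, microphone), so treat the result as a rough upper bound rather than a product specification.

```python
def runtime_hours(capacity_mah, voltage_v, load_mw):
    """Hours a battery can sustain a constant load, ignoring losses."""
    energy_mwh = capacity_mah * voltage_v  # mAh x V = mWh
    return energy_mwh / load_mw


if __name__ == "__main__":
    # Hypothetical hearable-sized cell: 100 mAh at a nominal 3.7 V.
    for load_mw in (0.5, 5.0):
        hours = runtime_hours(100, 3.7, load_mw)
        print(f"{load_mw} mW -> about {hours:.0f} hours")
```

Under these assumptions, the speech recognition load alone would run for roughly 740 hours at 0.5 mW and 74 hours at 5 mW.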

Accessing Edge for Customized Speech Recognition
AiThority estimates that nearly every application will need to integrate voice technology in some way in the next five years. Manufacturers will require a versatile, cost-effective AI IoT voice interface that will reflect their brand’s unique persona and meet the expectations of increasingly sophisticated consumers, such as those in the happy hour session. If you’re working on your next product, consider adding Clear Command’s customizable technology – we’ll help you get the most out of your ASR investment!

Shout out to learn how you can deploy Clear Command and provide your customers with a better experience for a fraction of the cost of cloud-based speech recognition solutions.

Last Updated: August 14, 2020 6:48am
