Enhancing Speech to Solve the Pervasive Problem in Conferencing

Seventy years ago, the journalist William H. Whyte coined a popular adage, “The single biggest problem in communication is the illusion that it has taken place.” Regardless of whom the quote is ascribed to (sometimes even George Bernard Shaw is given credit), it gets at the perennial tension between the necessity of communication and the daunting difficulty of making it happen. This is especially true in large organizations with distributed teams.

Large organizations emerge because they make humans more effective. Corporations, volunteer groups and the military all harness the coordinated energy and diverse talents of teams to create benefits unavailable to individuals. Everything that organizations need for success – shared vision, efficient allocation of resources, coordinated action, communal learning processes – is ultimately built on investment in good communication.

How does the modern organization communicate?
With a marvelous and complex diversity of methods – face-to-face meetings, mail, email, texts, live meetings, phone calls, video and audio conferences, video broadcasts and more. While many of these methods are asynchronous, live video and especially live audio are particularly pervasive, yet often problematic.

We can roughly break this list of communication methods down into two broad categories – non-real-time content sharing methods and real-time audio-video methods. Within audio-video collaboration channels, audio is clearly central. After all, you can have a productive audio conferencing experience without video, but video conferencing without good audio is sadly ineffective. All these tools play different roles in the overall team collaboration experience. Text and documents are precise and even executable; the audio-video experience carries more nuances that enable building personal connections in ways that document collaboration tools cannot. Organizations, however, emphatically need both.

If audio is critical to a live collaboration experience, what can potentially go wrong?
Consider a live video or audio collaboration session in a modern organization. You have people in multiple locations, often on multiple continents, with different accents. Some are together in conference rooms, traditional offices or new “huddle rooms,” and some are at home, in their cars or even in cafes and airports. Everyone needs to understand and be understood. The team works with the available audio and video equipment in each location. In the best case, a high-end conference room, this may include high-end conferencing equipment. In the worst case, they have to use the simple audio of their smartphone or laptop. The level of environmental noise and the quality of the audio connection vary widely. Each location may have significant audio impairments – loud air conditioning, public announcements, barking dogs, loud typing, traffic and even crying babies. Any of these locations may suffer further deterioration of audio due to network connections.

There is also a multiplicative effect as more people join the conference call. 
First, having more people increases the probability that a participant will spoil the audio experience with noise and loud distractions. Second, having more people increases the total value of the time at stake on the call, so each disruption costs more in lost productivity.

Here’s a simple model that looks at how lost productivity depends on the probability that any given user introduces noise problems with their environment or channel and on the number of callers in a conference. Let’s assume a fixed value of $100 per person for the time on the call. Even with very low probabilities of a bad call, the lost productivity is significant. If the probability of a bad call goes higher, much of the total value of the call is lost. This is effectively an N² relationship between the number of callers and the cost of noise: if you double the number of participants, you expect the lost productivity to go up almost 4x.
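One way to make this model concrete is the short Python sketch below. It is an illustration under stated assumptions, not the exact calculation behind the figures above: each caller is assumed to introduce noise independently with the same probability, and a call spoiled by any one caller is assumed to lose its full value for everyone on it. The function name and parameters are hypothetical, chosen only for this example.

```python
def expected_loss(n_callers, p_noise, value_per_person=100.0):
    """Expected lost productivity (in dollars) for one conference call.

    Assumptions (illustrative, not from the original model):
      - each caller independently introduces noise with probability p_noise
      - if at least one caller is noisy, the whole call's value is lost
    """
    # Probability that at least one of the callers spoils the call.
    p_bad_call = 1.0 - (1.0 - p_noise) ** n_callers
    # Total value at stake grows linearly with the number of callers.
    total_value = n_callers * value_per_person
    return total_value * p_bad_call


if __name__ == "__main__":
    # Compare call sizes at a low per-caller noise probability.
    for n in (5, 10, 20):
        print(n, round(expected_loss(n, p_noise=0.01), 2))
```

For small probabilities, the chance of a bad call is roughly N times the per-caller probability, and the value at stake is N times $100, so the expected loss grows roughly as N². Under these assumptions, doubling the number of callers from 5 to 10 at a 1% per-caller noise probability raises the expected loss by a factor of about 3.9.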

Next, let us explore whether this problem is getting worse or getting better.

There are two forces at work that tend to make things worse: a more mobile workforce and a greater reliance on remote coordination among global teams. A mobile workforce means that people can (and often must) work from non-office settings such as cars, cafes and home offices, which offer a variety of conveniences. However, it also means more background noise, embarrassing interruptions and problematic audio connections.

Can emerging technology solutions enhance the experience and improve productivity and engagement?
New enhancements in speech technology can make things much better. Deep-learning-based speech enhancement technologies like BabbleLabs’ Clear Edge and Clear Cloud are aimed at precisely these difficult but critical audio environments. This powerful new family of software products can essentially eliminate background noise and distractions from any kind of professional call. With these interruptions eliminated, conference participants can quickly dive into a more meaningful collaboration experience. The enhanced experience also improves overall productivity, building personal connections and improving engagement across the organization.

BabbleLabs software is available today as a library, an application, a driver, or a speech-enhancement-as-a-service API. It runs on virtually any platform at the edge (e.g., Android, Windows or iOS devices) and on enterprise-grade data center infrastructure, in both public and private cloud environments.

William Whyte can worry about the illusion of communication, but BabbleLabs is ensuring that it really takes place.

Last Updated: December 10, 2019 5:15pm

