FAQs

Let’s start with the questions that seem to be on everyone’s mind. If you don’t see the answers you seek, please fill out the Submit an Issue form below. For non-technical questions, please Contact Us.

Technical

What are the advantages of using deep neural networks for noise reduction?

BabbleLabs applies deep neural networks (sophisticated mathematical models trained to perform a complex task) to mimic specific human cognitive skills. Distinguishing speech from background sounds is remarkably difficult, because the interfering sounds often occur at the same time and in the same frequency ranges as the speech, and often fluctuate just as rapidly.

Traditional methods try to identify statistical characteristics of speech vs. noise, and use these statistics to suppress the noise while preserving the speech. This approach can work for stationary (non-varying) noise, but it cannot handle transient noise well. Just as humans learn to extract the thread of speech from background sounds, including interfering speech, neural networks are trained on extensive sequences of real human speech and real noise. By using a longer context of past speech, these algorithms learn to isolate and regenerate the human speech without the noise.
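To make the limitation of the statistical approach concrete, here is a minimal sketch of a traditional-style suppressor. It is an illustration only, not BabbleLabs code: the function names, the frame layout, and the choice of a single noise estimate taken from the first frames are all assumptions, and real systems apply this idea per frequency band rather than per time-domain frame.

```python
from typing import List, Sequence


def frame_power(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def suppress_stationary_noise(
    frames: List[List[float]], noise_frames: int = 2, floor: float = 0.1
) -> List[List[float]]:
    """Scale each frame by a Wiener-style gain from a fixed noise estimate.

    Assumption: the first `noise_frames` frames contain noise only, and the
    noise stays at that level (stationary) for the rest of the signal.
    """
    noise_power = sum(frame_power(f) for f in frames[:noise_frames]) / noise_frames
    cleaned = []
    for frame in frames:
        power = frame_power(frame)
        # Gain is near the floor when the frame looks like the noise estimate
        # and near 1 when the frame is much louder than the estimate.
        gain = max(floor, 1.0 - noise_power / power) if power > 0 else floor
        cleaned.append([gain * x for x in frame])
    return cleaned


quiet_noise = [0.01, -0.01, 0.01, -0.01]
speech = [0.5, -0.4, 0.6, -0.5]
door_slam = [0.9, -0.9, 0.9, -0.9]  # transient noise, far louder than the estimate

out = suppress_stationary_noise([quiet_noise, quiet_noise, speech, door_slam])
```

In this toy run the steady background frames are attenuated, but the loud transient passes through almost unchanged, because a fixed noise statistic cannot tell it apart from speech.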

BabbleLabs has gathered a unique database of noisy speech and applies hundreds of thousands of hours of natural speech in training its production networks, enabling accurate separation of speech from noise — across different languages, speaking styles, vocabulary, noise types and noise intensities.

Where can I learn more about the API?

Right here! Please let us know if you still have questions after reading the API datasheet. You can submit your question by filling out the form below.

Do you have a specification sheet?

Yes. It contains confidential information. Contact us and we will happily discuss the signing of a mutual NDA so we can share specs with you.

What happens to the audio/video file I send you?

We apply a pre-processing algorithm to regularize the received signal. Next, the signal passes through a neural network (NN) subsystem that isolates and reconstructs the speech. Finally, we use the NN output to derive a statistical model of the speech, which we use to suppress noise in the received audio stream. This yields natural speech with minimal reverberation, mixed with a controlled level of the background noise to preserve the naturalness of the audio file. We also keep the speech close to its level in the received file, so you retain full control over any level adjustment you might want to apply: the average volume of the speech signal is scaled to 0.9x the original value to avoid saturation. Because noise is removed from the stream, the perceived volume can be noticeably lower for signals with a lot of noise, and you might wish to normalize the volume to compensate.
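If you do want to normalize the volume of a returned file, one common option is RMS normalization with a clipping guard. The sketch below is only an illustration under stated assumptions: samples are floats in the range -1.0 to 1.0, and the names `rms`, `normalize_rms`, and the target level of 0.3 are hypothetical choices, not part of the service.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def normalize_rms(
    samples: Sequence[float], target_rms: float, peak_limit: float = 1.0
) -> List[float]:
    """Scale samples toward target_rms without letting any sample exceed peak_limit."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = target_rms / current
    peak = max(abs(x) for x in samples)
    # Cap the gain so the loudest sample stays within the peak limit (no clipping).
    gain = min(gain, peak_limit / peak)
    return [gain * x for x in samples]


enhanced = [0.09, -0.18, 0.09, -0.18]          # stand-in for a quiet enhanced signal
louder = normalize_rms(enhanced, target_rms=0.3)
capped = normalize_rms([0.5, -0.5], target_rms=2.0)  # gain is limited by the peak
```

The gain cap means a very high target is not always reached; that trade-off favors avoiding distortion over hitting the exact level.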

How many calls does the API require?

One call is required to get an authentication token, which is valid for a limited time. Once you have a valid token, only one call is required to enhance audio.
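A client can keep to this two-call pattern by caching the token until it expires. The following is a minimal sketch, not the actual API: the `TokenCache` class, the `fetch_token` callable, and the 60-second lifetime are hypothetical, and the real authentication request and token lifetime are described in the API documentation.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical: returns (token, lifetime_in_seconds). In practice this would
# wrap the real authentication call, whose URL and fields are not shown here.
FetchToken = Callable[[], Tuple[str, float]]


class TokenCache:
    """Reuse one token until it expires, so each enhancement needs a single call."""

    def __init__(self, fetch_token: FetchToken,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_token = fetch_token
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            # Token missing or expired: make one authentication call.
            token, lifetime = self._fetch_token()
            self._token = token
            self._expires_at = now + lifetime
        return self._token


# Demonstration with a fake authentication call and a fake clock.
calls = []


def fake_fetch() -> Tuple[str, float]:
    calls.append(1)
    return f"token-{len(calls)}", 60.0


fake_now = [0.0]
cache = TokenCache(fake_fetch, clock=lambda: fake_now[0])
first = cache.get()
second = cache.get()   # still valid: no new authentication call
fake_now[0] = 61.0
third = cache.get()    # expired: one new authentication call
```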

Do you support stereo?

Yes! Our algorithms work on each channel individually. For stereo inputs, you tell us how many output channels you want, and we either process each channel individually or convert the data to mono. Processing each channel provides a richer experience, but you will be charged separately for each channel. Audio converted to mono is charged at the single-channel rate. Streams with more than two channels are always converted to at most two channels.
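The channel rules above can be sketched as a small helper for estimating how many channels will be processed. This is an illustration only: the function names are hypothetical, and two points are assumptions rather than stated policy, namely that a mono input stays mono even if two outputs are requested, and that charges scale with minutes per processed channel.

```python
def processed_channels(input_channels: int, requested_channels: int) -> int:
    """Number of channels processed, following the rules described above."""
    if input_channels < 1 or requested_channels < 1:
        raise ValueError("channel counts must be at least 1")
    # Never more than two channels, never more than the input provides
    # (assumption), and never more than the caller asked for.
    return min(input_channels, requested_channels, 2)


def billable_channel_minutes(input_channels: int, requested_channels: int,
                             minutes: float) -> float:
    """Assumed billing unit: minutes multiplied by processed channels."""
    return processed_channels(input_channels, requested_channels) * minutes


stereo_kept = processed_channels(2, 2)
stereo_to_mono = processed_channels(2, 1)
surround = processed_channels(6, 4)
```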

Which sample rates do you support?

Our algorithms work natively at 16,000 Hz. Other sample rates are downsampled or upsampled to 16,000 Hz.
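To show what conversion to 16,000 Hz does to the number of samples, here is a minimal linear-interpolation resampler. It is a sketch, not the resampler the service uses: the function name is hypothetical, and plain linear interpolation omits the anti-aliasing filter a production downsampler would normally apply.

```python
from typing import List, Sequence

TARGET_RATE = 16_000  # Hz, the native rate mentioned above


def resample_linear(samples: Sequence[float], source_rate: int,
                    target_rate: int = TARGET_RATE) -> List[float]:
    """Resample by linear interpolation between neighboring input samples."""
    if not samples:
        return []
    if source_rate == target_rate:
        return list(samples)
    out_len = max(1, round(len(samples) * target_rate / source_rate))
    step = source_rate / target_rate  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


one_second_48k = [float(i) for i in range(48_000)]
down = resample_linear(one_second_48k, 48_000)       # one second becomes 16,000 samples
up = resample_linear([0.0, 1.0, 2.0, 3.0], 8_000)    # sample count doubles
```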

Which formats do you support?

Video: .mp4, .mov
Audio: .wav, .mp3, .aac, .ogg, .aiff
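
Before uploading, a client can check a file name against the extensions listed above. This is a minimal sketch: the function name is hypothetical, and it assumes that .mp4 and .mov are the video formats and the remaining extensions are audio formats. It inspects the file name only, not the file contents.

```python
from pathlib import PurePath
from typing import Optional

VIDEO_EXTENSIONS = {".mp4", ".mov"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".aac", ".ogg", ".aiff"}


def media_kind(filename: str) -> Optional[str]:
    """Return "video", "audio", or None if the extension is not listed."""
    suffix = PurePath(filename).suffix.lower()  # case-insensitive match
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    return None


kinds = [media_kind(name) for name in ("talk.MP4", "call.wav", "notes.flac")]
```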

Do you support streaming?

Yes. You can submit audio in either of two ways:

  1. Post an audio file as a multipart form-encoded request, or
  2. Feed any supported audio stream format directly to our streaming endpoint.
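
For the first option, the request body follows the standard multipart/form-data layout. The sketch below builds such a body with the Python standard library only. The helper name, the form field name `file`, and the content type are hypothetical; the actual field names and endpoint are defined by the API documentation, and no request is sent here.

```python
import uuid
from typing import Tuple


def build_multipart(field_name: str, filename: str, payload: bytes,
                    content_type: str = "application/octet-stream"
                    ) -> Tuple[bytes, str]:
    """Return (body, Content-Type header value) for a one-file multipart form."""
    boundary = uuid.uuid4().hex  # random marker separating the form parts
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


# "file" is a placeholder field name; the bytes stand in for real audio data.
body, header = build_multipart("file", "meeting.wav", b"RIFF....WAVE", "audio/wav")
```

The returned header value goes into the request's Content-Type header, alongside the authentication token, when the body is posted.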

What can I use my free minutes for?

You can use them to enhance any of your video or audio recordings, as long as we support the format.

What if I used up my 50 minutes and want more?

Please contact us.

What purchase plans do you offer?

Watch this space! BabbleLabs will soon release a pricing plan that spans volume use from individual to enterprise accounts.

What’s the best way to compare my original audio/video material (what I submitted) against the enhanced file that Clear Cloud returns?

Listen to it! Then send us your feedback; we would love to hear what you have to say. Soon, BabbleLabs will also share metrics and comparative analysis of the input and output streams, to give you a better idea of how we did, objectively and subjectively.

Can I use Clear Cloud to improve Automated Speech Recognition (ASR) output?

We’re all about enhancing audio and speech for human ears, so we don’t recommend using BabbleLabs speech enhancement as a front end for ASR software. It might seem logical to take raw voice input, de-noise it with Clear Cloud, then feed it to Alexa or Siri. This won’t work well; digital assistants are designed to handle noise in other ways. We’re working with ASR providers to enhance your interactions with digital assistants. Want to know more? Contact us so we can figure it out together!

Where can I find real-world samples of audio/video enhanced by BabbleLabs Clear Cloud?

Our users have been busy conquering unwanted noise. Check out Gabby's Lab to see what Clear Cloud can do for you in real-world environments and use cases.

Privacy and Security

How can I be sure that you are protecting the confidentiality of the audio/video material I submit?

All data is transmitted to and from our servers over HTTPS, so it is encrypted in transit. At this time, we do not keep or store the audio/video that is sent.

Will BabbleLabs listen to my audio or video for the purposes of advertising to me?

No! BabbleLabs wants to enhance, improve and personalize your audio and video streams, not use your streaming data to market to you.

How do I remove / delete my account?

Please contact us with your request.

Partners and OEMs

Who do I contact for embedded applications?

Please contact us with your contact information and type of application.

Are you looking for partners?

Yes! We’re looking for partners in speech processing and deep learning (cloud or embedded) as well as experts in speech metrics and researchers from university programs in related fields.

We are working toward establishing an active customer community. Likewise, we want to build and foster a community of enthusiastic experts, developers, end-users, and innovators. Interested? Contact us with your contact info and the area/manner in which you would like to partner.

Are you looking for investors?

We announced a successful round of funding in January 2018. This Series Seed investment of $4 million, led by Cognite Ventures, is being used for initial development and productization. We are always interested in hearing from our colleagues; we know the technology space we’re in is exceptionally active, with a growing focus on voice interfaces and deep learning.

Careers

Are you looking for interns?

Yes, we are looking for interns! We don’t yet have a formal internship program. Take a look at our Careers information. If you believe you have directly relevant skills — and you have a passion for speech enhancement, speech-centric technology, and deep learning — contact us.

Media and Public Relations

I want to write an article/blog about BabbleLabs. Who should I contact?

We welcome your help in spreading the word about the exciting developments here at BabbleLabs. Please send an inquiry with a brief abstract on our contact page.

Submit a help request