Part I: Restoring the Past
Part II: Rebuilding Speech
Part III: What’s Missing?
Part IV: Better = Cheaper
The BabbleLabs community has been delighted at how much easier the product’s availability makes it to explain what BabbleLabs does. In just over a minute, anyone can listen to “before” and “after” clips such as our “Raul-in-Montevideo” recording. Please listen to this original, single-microphone recording:
Then compare it to exactly the same single-microphone track after it has passed through our real-time API:
Even the defects in noise suppression help us educate people about the subtle issues in advanced speech enhancement. In the loudest parts, a little of the traffic noise does leak into the final recording. In other parts, the careful listener can detect slight distortion of Raul’s voice and minor variations in volume. These reflect both some conscious choices (leaving in a small fraction of the original noise actually sounds more comfortable to most listeners) and some lingering technical challenges for the BabbleLabs speech science team.
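To make the "small fraction of the original noise" idea concrete, here is a minimal sketch of one common way to leave a controlled noise floor in an enhanced signal. It is not a description of how the BabbleLabs API works internally: the function name `blend_residual_noise`, the `residual_db` parameter, and the -20 dB default are all assumptions chosen for illustration, and the "enhanced" track is simulated as perfectly clean speech.

```python
import numpy as np

def blend_residual_noise(noisy, enhanced, residual_db=-20.0):
    """Mix a fraction of the removed signal back into the enhanced track.

    Hypothetical helper: `residual_db` is how far below its original level
    the removed component is reintroduced (0 dB returns the noisy input).
    """
    noisy = np.asarray(noisy, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    if noisy.shape != enhanced.shape:
        raise ValueError("noisy and enhanced must have the same shape")
    gain = 10.0 ** (residual_db / 20.0)   # dB -> linear amplitude
    removed = noisy - enhanced            # whatever the enhancer took out
    return enhanced + gain * removed

# Synthetic demo: a 220 Hz tone stands in for speech, white noise for traffic.
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.arange(1600) / 16000)
noise = 0.3 * rng.standard_normal(1600)
noisy = speech + noise
enhanced = speech  # assumption: an idealized, perfect enhancer
comfortable = blend_residual_noise(noisy, enhanced, residual_db=-20.0)
```

With an idealized enhancer, the output is the clean signal plus the original noise scaled to one tenth of its amplitude. A real enhancer also removes or distorts some speech, so the reintroduced component would not be pure noise.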
The spectrogram, a plot of a sound’s energy at each frequency over time, is an essential visual tool for understanding the impact of speech processing. Consider this comparison of the noisy audio track (above) and the enhanced audio track (below) over the 38 seconds of the track, for fine-grained frequency bins from 0 Hz to 8 kHz, where time-frequency samples with the highest energy are dark red, intermediate samples ...
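For readers who want to reproduce this kind of plot, the following is a minimal sketch of computing a spectrogram with a Hann-windowed short-time Fourier transform in NumPy. The 16 kHz sample rate is an assumption, chosen because it gives a 0 Hz to 8 kHz frequency axis. The frame length, hop size, and function name `spectrogram_db` are also illustrative choices rather than the settings used for the figure, and a synthetic tone stands in for the recording.

```python
import numpy as np

def spectrogram_db(signal, sample_rate, frame_len=512, hop=256):
    """Return (times, freqs, power_db) with power_db shaped (freq, time)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    power_db = 10.0 * np.log10(power + 1e-12)  # small offset avoids log(0)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return times, freqs, power_db.T

# Synthetic demo: one second of a 1 kHz tone in light noise, at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
rng = np.random.default_rng(0)
audio = np.sin(2 * np.pi * 1000 * t) + 0.1 * rng.standard_normal(len(t))
times, freqs, power_db = spectrogram_db(audio, sample_rate)
peak_hz = freqs[np.argmax(power_db.mean(axis=1))]
```

Each column of `power_db` is one time slice and each row one frequency bin, so the array can be passed to a plotting call such as matplotlib's `pcolormesh(times, freqs, power_db)` to get a color-coded image. Running the same function on a noisy and an enhanced track and stacking the two images gives a comparison like the one described above.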