BabbleLabs has just launched broad production availability of our commercial speech API, web service, and mobile apps for iPhone and Android. These services clean up video and audio recordings to make the speech much easier to understand. The apps work on existing videos as well as new audio and video recorded inside the app. In either case, simply select the item you want to enhance and the app strips out virtually all of the background noise. You can then choose to post or share the enhanced content, or keep it just for yourself. It’s fun to experiment — download the app for free and process your first 125 minutes of video or 250 minutes of audio at no cost. That amount will last most users for a long time. Rest assured, the apps are completely private; they never store anything in the cloud or share anything with BabbleLabs.
- You can download the apps here:
We have great content to explore on our web site — you can learn more about the App, API, and Web Interface, and how BabbleLabs achieves such great results. Explore the site — in Gabby’s Lab, you can see more examples and send us your own!
This release is a major milestone for BabbleLabs — the culmination of more than a year’s effort by a remarkable team. The milestone has triggered some reflections on my career to date.
I have spent almost my whole career on technology startups — at MIPS Computer Systems, at Tensilica, and now at BabbleLabs. Almost all of my best and most notable contributions have been at startups, but I don’t think of myself as a traditional serial entrepreneur. Some people love the messy early days of a startup when the technology, the market, the business model and the customers are all mysterious and ill-defined. All things are possible and nothing is certain. Startup enthusiasts are willing, even eager, to try things despite the likelihood of failure. These folks make terrific serial entrepreneurs — they may get involved in something new every two or three years. Along the way, they often accumulate the experiences — and scars — of many startups.
I too love the messy early days of a startup, but I also love the gradual gelling of the teams, the technology and the market strategy. I love the building of a sales channel. I love the development of close working relationships with customers. I love the third and fourth and fifth generations of the technology, products and business models.
I’m not a natural serial entrepreneur because I really hate failure. Too much blood, sweat, and tears go into a startup to allow me to ever walk away from the challenge. So I am thoughtful about any new venture I get involved in. Everything has to be as close to perfect as I can imagine — on all four of the interacting dimensions:
- The team — a mix of technological brilliance, experience and collaborative emotional intelligence. Diversity of experience is important too, because startups often work on solving new classes of problems, with few direct precedents.
- The technology — an emerging technology with high initial technical difficulty, low long-term deployment cost, and global impact. These are the factors that hopefully combine to create both significant competitive barriers and large potential markets.
- The market and business model discontinuity — an accumulation of quantitative shifts in the technology usage or product capabilities that add up to a qualitative opportunity to restructure the whole experience, workflow or system structure, often by changing the users, the cost structure or the network effect of adoption. Entrepreneurs can often trigger the discontinuity with a business model innovation — the rethink of the economics aligned to the new system structure, usually by changing not just the numbers in the typical transaction, but also the units. Popular examples include switching from dollars per software seat to pennies per minute, or from private to shared asset ownership. Such disruption is often associated with the creation of a new group of customers — people who had never been direct consumers of a technology (think desktop publishing or video blogging).
- The customer pain — change is often catalyzed by a quietly growing collective of users frustrated with inefficiencies and missing features. This latent pain is often enough to nudge users into trying something different when they glimpse the potential for a new experience, new workflow, new system structure or new business model to sweep away the old pain.
I have only done three startups in my career, with short stints at big companies in between: two years at Intel when I was fresh out of undergrad, four years at Silicon Graphics after the acquisition of MIPS, one year at Synopsys, and three years at Cadence, after the acquisition of Tensilica. These periods at big companies have been useful intervals — chances to learn new management skills and examine business and channel strategies on a grander scale. They were not relaxing times; in fact, I sometimes worked harder than ever as I attempted to affect the natural conservatism and strategic inertia of these large, successful companies.
I will consider starting something new only when the planets really and truly align. For me, the creation of BabbleLabs reflects exactly that kind of celestial providence:
- The revolution in deep neural networks, triggered by the confluence of a robust new model of statistical computation, the proliferation of high-performance parallel computation, and the explosion of digital data online.
- The decisions by a small group of profoundly talented engineers and signal processing architects to leave the comfort of big company technical leadership roles to strike out and do something new and exciting.
- A simple observation that the nature of user interfaces could and should undergo essential change. In particular, people are ready to stop training themselves to adapt to the computers’ user interfaces — typing, mousing, swiping, and tapping. Instead, the technology and the users are ready to finally train the computers to adapt to humans ;-). Furthermore, the move towards mobility has created at least as many noise problems for communication as improved electronics have solved. We used to find a quiet place to make a phone call — remember the phone booth? But now we expect to take calls anywhere, regardless of the noise. So we find rapidly increasing demand for speech systems that can deal with real-world noise, through speech enhancement, speech recognition, speaker identification, and speech dialog capabilities.
BabbleLabs finds itself at a unique juncture in the evolution of speech technology. Users are getting a glimpse of what is just now becoming possible in people-to-people communication and in human-machine interfaces. And they are increasingly impatient to get speech technology that serves them wherever they are.
I realize that I only do a startup every 15 or 20 years. I suppose that makes me a dedicated but quite patient serial entrepreneur. And BabbleLabs might just prove to be the most satisfying chapter in that story.