Voice to Text Transcription with Lee Dorfman
Lee Dorfman is the founder and CEO of Quicktate.com, a San Francisco-based company focused on high-quality voice-to-text conversion and its integration with third-party applications. We spoke to him about his company and his plans for the future.
Lee, thanks for doing an interview with us. Can you please give us a little background on Quicktate: what it is and how your company envisioned its adoption by users?
First off, thank you for the opportunity to share my thoughts and observations!
Quicktate was inspired by my several years in the transcription industry at my other company, iDictate. Although iDictate was founded in 1999, it wasn’t until 2004 that I became convinced that there was a need and a market for transcribed voicemail messages. There were a few machine speech recognition solutions on the market, but nobody at that time was using the most accurate speech processor, the human brain, to transcribe audio with perfect accuracy and rapid turnaround times, nor was anyone leveraging web technology to its fullest. So in 2007 we combined our years of experience with the latest tech, and we are now able to offer accurate human transcription with turnaround times measured in seconds rather than hours.
Your service also transcribes voicemail? Give us some examples of how this is put to use.
Most people find checking their voicemail to be an incredibly arduous task. Our philosophy is that voicemail shouldn’t have to be checked; it should come to you. Quicktate transcribes incoming voicemails and delivers them as text messages (or email) – removing the need to fumble with those irritating dial-in voicemail systems.
Interestingly, Quicktate launched with this as our sole feature, which ultimately became a simple testing ground for us to tweak our back-end’s stability. Once we were confident that our transcription quality and turnaround times were the best in the industry, we opened up our system completely to other developers. Our flexible API allows any developer to use our transcription service in their applications by just adding a few lines of code. It’s really exciting to see how many innovative new ways of using our service developers keep coming up with. We actually provide white-label voicemail transcription and call auditing services to many well-known brands, and the end-user typically doesn’t even know that Quicktate is really the company transcribing their voice.
Is reading actually faster than listening?
Unquestionably – text has an inherent efficiency which speech simply cannot offer. Even though the human brain can technically comprehend speech at a much faster rate than text, transcribed speech offers many conveniences. Text is asynchronous: it’s incredibly simple to scan by eye, filter, search, and categorize. When listening to speech, you’re forced to sit there and listen for several seconds before you even know whether or not the message will be useful to you. With text, your eye allows you to skip forward and backward to rapidly discern a message’s summary and importance, a process impossible to replicate with audio even with frantic fast forwarding and rewinding.
I predicted in 2008 that within two years most wireless carriers would be offering voicemail transcriptions and that carriers wouldn’t be content with the quality of machine-based speech recognition. Word-for-word accuracy is paramount for human-to-human messaging. We’ve found that the unpredictable background noise and caller voice variations inherent to voicemail messages make it very difficult for a machine to correctly transcribe speech. The human brain is still the most accurate speech recognition engine in the world, and we feel that in the constant quest for better machine transcription quality, we often overlook our own heads. Machine transcription with human proofreading offers the best balance between quick turnaround times and perfect human-typed accuracy – and that’s exactly the quality we provide.
How can, say, a hyperlocal blogger or journalist utilize this new technology?
People can speak much faster than they can write, so transcribed dictation is incredibly useful to anyone needing to capture quick notes, especially when a pen and paper aren’t nearby. We provide a broad array of tools to efficiently capture speech as text, both directly as Quicktate products, and through the innovations created by our large developer network. Everything from conference calls, iPhone voice recordings, phone interviews, medical and legal dictations, to grocery lists can be transcribed through our system, and people are discovering imaginative new uses for transcription every day. While we are always encouraging our developer community to use our API in innovative new ways, we also have a lot of fun with in-house projects such as TweetCall (http://tweetcall.com), a service which allows you to speak and transcribe tweets directly into your Twitter account.
Quicktate is essentially a self-publishing platform. Can you say a little bit about how you see the world changing and how your company is moving forward to create this future?
As the world continues to become more and more connected with the explosive growth of mobile data and the social web, we see the keyboard and numpad as being incredible bottlenecks to the evolution of human communication. Speech recognition is the solution, and we intend to remain at the forefront of offering premium human-assisted transcription services. We’re thrilled to be a part of such a vibrant community of innovators who are helping revolutionize human communication, and intend to keep arming this revolution for years to come.