I had not realized just how much I missed having global voice recognition until I switched to Elana’s iPhone 4S while waiting for my iPhone 5S. Some background: I sold my iPhone 5 and had been using Judie’s spare iPhone 4 for the past few weeks. It worked fine, but the 4 doesn’t have Siri or voice recognition. Well, I’m baaack.
I used Dragon Dictation during the month I was on the 4, but moving to the 4S and using Siri and global voice recognition since yesterday has reminded me how much I missed them. It is amazing how quickly we become reliant on certain technologies.
What I also realized is just how accurate Apple’s voice recognition has become. If you speak in a clear and precise manner, the recognition is about as accurate as you can get. Sure, it won’t recognize foreign words, and it may mistake “apples” for “apple’s”, but those are minor mistakes that are easily corrected before you use a dictated document.

What is interesting to me is that Elana said that at first she thought the voice recognition on her new iPhone 5C was significantly improved over the iPhone 4S running iOS 6 but, after using it for a few hours, she didn’t actually think that was the case. I think her initial assessment was correct – Siri is out of beta and it shows. The thing is, just as Siri needed time to mature, those of us who use mobile voice recognition need to train ourselves to speak in a way that optimizes the accuracy of the recognition. It occurred to me that there is a relatively simple way to increase accuracy, one that leverages an old trick that has been around since the beginning of commercial voice recognition applications.
When you first set up any of Nuance’s desktop or laptop voice recognition software, you need to “train” the program. This is done by reading aloud a piece of text that the program supplies. The idea is that as you read, the application adjusts to the way you pronounce specific words; it learns where you tend to emphasize words and where you say things in an unusual or slightly regional manner. At the same time, since the training process only lets you advance to the next sentence or phrase once the program has accurately interpreted what you said, it trains you to speak clearly almost as much as it trains the program. It is a feedback loop that improves potential accuracy on both sides.
While global voice recognition such as Siri does not learn your distinct voice patterns the way a desktop program does (which is why it is usually not quite as accurate as desktop voice recognition software), there is still some benefit to using the same approach. By reading a piece of text into the iPhone or iPad using voice recognition, you can see where the transcription is accurate and where it makes mistakes. That tells you which words you need to pronounce more carefully or in a different manner, and it forces you to speak clearly and articulate each word. (People have observed that when I dictate, I sound a bit robotic. I do, but it results in greater accuracy.)
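If you want to make the checking step a bit more systematic, a few lines of Python can do the comparison for you. This is just an illustrative sketch, not anything Apple or Nuance provides: paste in the passage you read and the text your iPhone produced, and it will list each word that was misheard, dropped, or added.

```python
import difflib

def dictation_report(reference: str, dictated: str) -> None:
    """Compare the passage you read aloud with what the phone transcribed,
    and print each spot where the two differ."""
    ref_words = reference.lower().split()
    dic_words = dictated.lower().split()
    matcher = difflib.SequenceMatcher(None, ref_words, dic_words)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # this stretch was transcribed correctly
        expected = " ".join(ref_words[i1:i2]) or "(nothing)"
        heard = " ".join(dic_words[j1:j2]) or "(nothing)"
        print(f"you said: {expected!r}  it heard: {heard!r}")
    # ratio() is a rough similarity score, not a formal accuracy metric
    print(f"overall similarity: {matcher.ratio():.0%}")

if __name__ == "__main__":
    # Hypothetical example: one misrecognized word ("the" heard as "a")
    reference = "the quick brown fox jumps over the lazy dog"
    dictated = "the quick brown fox jumps over a lazy dog"
    dictation_report(reference, dictated)
```

Run it after each pass and the words it flags are the ones to slow down on the next time you read the passage.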
Taking some time to read, check the results, and then read again will push you toward greater and greater accuracy. It’s a bit of a time investment now that can pay off in a huge way going forward.