Have you ever had a YouTube video or audio segment you wanted to transcribe but didn’t have the patience to sit and manually type out what the speaker or speakers said? I ran into just such a situation yesterday and, instead of spending hours listening, pausing, and typing over and over, I was able to transcribe the two YouTube videos I needed in under ten minutes. Here’s how…
I wanted to use snippets from two pieces of testimony I heard in the New Jersey Legislature this past Thursday. The video clips were on YouTube and the audio quality was excellent. Unfortunately, I was unable to find a transcript of either one, and I needed to quote them for a sermon I was writing. The only solution was to create my own.
At first it would appear that I could use voice recognition software like Dragon NaturallySpeaking or Dragon Dictate for Mac to accomplish this, but that approach won’t work. You see, such programs are “user-specific” and can’t be “shared”. They are great at transcribing your speech but ONLY your speech. That’s because, in order to use them properly, you first need to take some time to “train” the software so it will understand your voice. During the training process (the software prompts you to read some specific text for a period of time) the computer adjusts to your specific vocal patterns. Once that is done, transcriptions can be close to perfect when YOU speak, but since the software is locked to your speech patterns it will not be anywhere near as accurate if someone else tries to use it. For them to get the same degree of accuracy, THEY have to train the software as well.
Unlike the aforementioned voice recognition applications, Dragon Dictation and Siri WILL work. The reason for this is simple: Dragon Dictation and Siri are not user-specific when it comes to the recognition process. Unlike the previously mentioned applications, they don’t process the audio on the device itself but instead send the raw data to Nuance’s or Apple’s servers. Once there, it is processed and returned as text. Because, I suspect, those servers have access to much more powerful processing technology, they can take ANY clear speech and transcribe it almost perfectly.
So, instead of having to type out the two pieces of testimony, I held my iPad up to my computer, started Dragon Dictation, played the first 45 seconds of the first video, and let it transcribe. It was almost perfect. I repeated the process, 45 seconds at a time, until both videos had been transcribed. I went back, spent a few minutes cleaning up the text, and was good to go. All thanks to Dragon Dictation not being user-dependent.
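If you would like to plan your pause points before you start, the play-and-transcribe routine above can be sketched in a few lines of Python. This is only a rough illustration: the 45-second chunk comes from what I did, but the function names (`playback_segments`, `as_timestamp`) and the 200-second example length are made up for the sketch, and nothing here talks to Dragon Dictation itself.

```python
def playback_segments(total_seconds, chunk_seconds=45):
    """Return (start, end) pairs, in seconds, that cover the whole recording.

    chunk_seconds defaults to 45, the amount played per dictation pass
    in the post; the function name itself is hypothetical.
    """
    if total_seconds < 0 or chunk_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and chunk_seconds > 0")
    segments = []
    start = 0
    while start < total_seconds:
        # The last chunk may be shorter than chunk_seconds.
        end = min(start + chunk_seconds, total_seconds)
        segments.append((start, end))
        start = end
    return segments


def as_timestamp(seconds):
    """Format a whole number of seconds as M:SS, like a video player shows."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


if __name__ == "__main__":
    # Assumed example: a clip that runs 3 minutes 20 seconds (200 seconds).
    for start, end in playback_segments(200):
        print(f"Play {as_timestamp(start)} to {as_timestamp(end)}, then pause and save the text")
```

Running it prints a short checklist of where to start and stop the video for each dictation pass, which makes it easier to pick up where you left off.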