So far, I’ve talked about dictionaries and dictionary sites, and other language-learning resources on the Internet. There is, however, one Internet service that many of us will come across, even if we aren’t in the process of learning a language – online translation tools. Arguably, the biggest sites today for these services are translate.google.com and babelfish.yahoo.com. This week, I want to talk a little bit about how these sites differ in their approach to language translation, and about some things you can do to make your translations more effective and accurate.
To start with, there are basically two big online methods used to translate between languages. There is traditional rules-based translation, and there is statistical machine translation. These take very different approaches to translating your text.
Babelfish, for example, owned by Yahoo, is a traditional, rules-based translator. It takes the text you want to translate, compares it to a set of rules, and produces a translation based on those rules. The general result is a mixed bag. Sometimes it’s good, sometimes not. More often, it gets the words right, but sometimes has trouble with context and meaning.
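To make the rules-based idea concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions: the tiny word list, the single word-order rule, and the function name translate_rules are all made up, and a real rules-based translator is far more elaborate than this.

```python
# Toy rules-based translator: a hypothetical word list plus one ordering rule.
# This is an illustration only, not how Babelfish is actually implemented.

LEXICON = {
    "the": "el",
    "red": "rojo",
    "bat": "bate",   # the word list stores one sense only: the baseball bat
    "flew": "voló",
}

ADJECTIVES = {"red"}


def translate_rules(sentence):
    words = sentence.lower().split()
    # Rule 1: in Spanish the adjective usually follows the noun, so swap
    # each adjective with the word that comes after it.
    i = 0
    while i < len(words) - 1:
        if words[i] in ADJECTIVES:
            words[i], words[i + 1] = words[i + 1], words[i]
            i += 2
        else:
            i += 1
    # Rule 2: replace each word with its entry in the word list;
    # words that are not in the list pass through untranslated.
    return " ".join(LEXICON.get(w, w) for w in words)


print(translate_rules("The red bat flew"))
```

Every word is translated and the word order is right, but the sentence is now about a baseball bat that flew, because the rules have no idea which kind of bat was meant.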
Google uses statistical machine translation. The military uses this technique quite a bit. What happens is that someone inputs a bunch of human-translated documents, and then when you request your translation, Google comes back with the statistically most likely translation. The problem is that this is only as good as the quality of the translations fed into it in the first place. So to improve the process, Google also includes a mechanism for you to indicate if you think there is a better translation and what that translation is.
Because Google uses raw human translations as its base, it is often better at colloquialisms (slang) and the like, but it will regularly make some bizarre translations.
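The statistical idea can be sketched in a few lines of Python. This is a minimal sketch under heavy assumptions: the "human translations" below are invented for illustration, real systems work on whole phrases and enormous document collections, and the names build_table, most_likely and add_suggestion are hypothetical, not anything Google publishes.

```python
from collections import Counter, defaultdict

# Hypothetical, made-up "human translations": each pair is an English word
# and the Spanish word a human translator chose for it in some document.
HUMAN_TRANSLATIONS = [
    ("bat", "murciélago"),
    ("bat", "murciélago"),
    ("bat", "bate"),
    ("score", "partitura"),
    ("score", "puntuación"),
    ("score", "puntuación"),
]


def build_table(pairs):
    """Count how often each target word was chosen for each source word."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def most_likely(table, word):
    """Return the most frequently seen translation, or the word itself."""
    if word not in table:
        return word
    return table[word].most_common(1)[0][0]


def add_suggestion(table, source, target, votes=1):
    """Record a user's suggested translation as extra evidence (assumed scheme)."""
    table[source][target] += votes


table = build_table(HUMAN_TRANSLATIONS)
print(most_likely(table, "score"))   # the majority choice in the sample data

add_suggestion(table, "score", "partitura")
add_suggestion(table, "score", "partitura")
print(most_likely(table, "score"))   # the suggestions have tipped the count
```

The translator never knows what "score" means. It only knows which answer showed up most often, which is why the quality of what goes in matters so much.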
Additionally, there’s another problem. What happens when you use a word that has multiple meanings? You can try to write it in a context that will make it clear which one you mean, but it can still translate badly. Take the word “bat”, for example. It is a long stick used to play the game of baseball, but it is also a small flying mammal. If I say “I found a bat today”, which one do I mean? It isn’t completely clear from the context.
Here’s an interesting example we can use to see the difficulties of online translation.
“The conductor took the score to the podium”. Of course we are talking about a musical conductor and he is carrying the orchestral score to the podium – probably to start a performance. But online translators don’t understand all that context. So what happens?
Well, Babelfish says: “El conductor llevó la cuenta el podium.” and Google says: “El conductor tuvo la puntuación a la tribuna.”
First – both used “El conductor” which is incorrect – it should be “El Director de orquesta” – they pulled the wrong version of “conductor”. No points awarded.
Second – Babelfish used the verb “llevar” (llevó is the third person singular form) which means “to carry” – which makes sense, and Google chose “tener” (again, tuvo is the 3rd person singular) which is “to have” or “to get” or “to hold”. Close, but Babelfish got closer, I think.
Third – Babelfish chose “la cuenta” for “score”, but “la cuenta” is an “account” – maybe they meant score like in “to settle a score”. Google chose “la puntuación”. Again interesting – it means score as in a “mark” or “punctuation” – but not correct. A musical score would be “la partitura” or maybe even “la música”. Oooo – again – no points awarded.
Last – Babelfish went with “podium”; it didn’t translate it at all. Google went with “la tribuna”, which is a “rostrum” or “platform”. That would work. Better would have been “el podio”, but “la tribuna” works too, I guess. Point to Google.
So, as you can see, both can have significant issues in translating a relatively simple English sentence.
So what can you do? Well, first, I have a little technique I call reverse feeding. I feed in what I want to translate and I get a result. Then I open another window, take the translation, and feed it back in, reversing the direction of the translation. You would be amazed at some of the results! The reality is that many translations are NOT giving us the desired result, but we don’t know enough about the language to detect the error. This technique will reveal some of the bad translations that are out there. If I run into a bad translation, I usually try to find a different way to say what I want and see if that translates more accurately. This doesn’t catch 100% of possible problems, but it sure does eliminate some of them!
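If you like to tinker, the reverse feeding check can be sketched in Python. This is only a sketch under stated assumptions: the two "translators" below are stand-ins backed by canned, made-up data so the example runs offline, and round_trip is a hypothetical helper name. In practice you would paste the text into the real sites by hand, or swap in whatever tool you actually use.

```python
from difflib import SequenceMatcher


def round_trip(text, forward, backward):
    """Translate text forward, then feed the result back the other way."""
    translated = forward(text)
    returned = backward(translated)
    # A similarity of 1.0 means the text came back unchanged;
    # lower values mean something shifted along the way.
    similarity = SequenceMatcher(None, text.lower(), returned.lower()).ratio()
    return translated, returned, similarity


# Stand-in translators (hypothetical, canned data, NOT real site output).
FAKE_EN_ES = {"I found a bat today": "Encontré un bate hoy"}
FAKE_ES_EN = {"Encontré un bate hoy": "I found a baseball bat today"}


def fake_en_to_es(text):
    return FAKE_EN_ES[text]


def fake_es_to_en(text):
    return FAKE_ES_EN[text]


translated, returned, similarity = round_trip(
    "I found a bat today", fake_en_to_es, fake_es_to_en
)
print(translated)
print(returned)
print(round(similarity, 2))
```

The similarity number is just a rough flag that something changed. The real value is in reading the returned sentence yourself: here it would show that the translator picked the baseball bat, which you might never notice from the Spanish alone.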
Another thing that helps is to take the item or phrase you are translating and run it through multiple translators, then compare the results – like I did here. Try taking the result and feeding it backward through a different translator. Again, you will be shocked at some of the odd translations that appear, but it will help you gain better results overall.
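Comparing translators side by side can also be sketched in Python. The two Spanish sentences below are the Babelfish and Google results quoted earlier in this post; the helper names words and compare are my own, and splitting a sentence into a bag of words is a deliberately crude assumption.

```python
import string


def words(sentence):
    """Lowercase the sentence and return its words without punctuation."""
    return {w.strip(string.punctuation).lower() for w in sentence.split()}


def compare(outputs):
    """outputs maps a translator name to its translation of the same text."""
    word_sets = {name: words(text) for name, text in outputs.items()}
    # Words every translator agreed on.
    shared = set.intersection(*word_sets.values())
    # Words that at least one other translator did not use.
    disputed = {name: sorted(ws - shared) for name, ws in word_sets.items()}
    return sorted(shared), disputed


outputs = {
    "Babelfish": "El conductor llevó la cuenta el podium.",
    "Google": "El conductor tuvo la puntuación a la tribuna.",
}

shared, disputed = compare(outputs)
print("Agreed on:", shared)
for name, extra in disputed.items():
    print(name, "only:", extra)
```

The disputed words are the ones to look up in the dictionary first. Keep in mind that agreement is not proof: both translators agreed on “conductor” here, and both were wrong.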
Most important – just like I said a few weeks ago – keep that bilingual dictionary on hand! For the purposes of this exercise I used the Vox Spanish/English dictionary, and I double-checked using spanishdict.com! Whenever you are unsure, you should look it up in the dictionary!
This is also a good time to mention that there are a number of good sites specifically dedicated to translating Spanish idioms and slang. Studyspanish.com has lessons discussing things like the idiomatic expressions that use the verb “tener”. Languagerealm.com has a listing of Spanish idioms. Proz.com also has a list of idioms and their appropriate translations. There are MANY different sites out there that list idioms and their translations in both language directions.
So who is best? Well, it depends on a few factors. First, it depends on the language – Google simply supports more languages than Babelfish. Next, it depends on the type of text being translated – Google may or may not have any uploaded medical text translations in your chosen language, for example, which would mean you would get terrible results. On the other hand, because Google relies on massive amounts of hand-translated documents, it may be better with a lot of idioms or slang or things like song lyrics, which often rely on metaphorical rather than literal translation.
And let’s not forget – there are also a lot of smaller translation sites out there. We’ve only scratched the surface here. But be aware – many sites really are just front ends for the Google or Yahoo translators (look at the copyright in the corner to see where the translation is coming from).
Overall, I’m usually getting better, more reliable results with Google, but if something doesn’t seem right, I also use Babelfish. We should also not forget that Bing (Microsoft) is offering translation these days too, and while their current number of languages is low, it is growing. And how did Bing do with our example? Bing came up with “El Director de orquesta tomó la puntuación al podio.” It only messed up by using the word “la puntuación”. It got the rest right. Hmm. Now if only they get close to Google’s 50 or so languages (they only have 20 right now)…
Again, as always, there are a lot of resources out there for you to use, especially for common language pairs like English-Spanish, English-French, English-German, etc. In a future installment I’ll be talking about less common language pairs and some of the resources out there.
Next week, though, I’ll be examining a few of the choices you have to assist you in learning a language if you want to spend some money, but not as much as you would for a high-level application like Rosetta Stone.