has_many :codes

Using Google Translate from the terminal


(Update Jan 14, 2011: If you have already used this tip and are back to this post because it’s no longer working, read on… Google have updated the Translate API v2, so I have made some changes to the end of the post accordingly.)

In a previous post, I showed a nice shortcut I use quite often to search for definitions, with Google’s define search feature, from the command line rather than from within a browser.

As I always have a few terminal windows open, I often look for ways to use popular web services from the command line, which can be fun and save time too. Nowadays many web services that offer a browser UI also expose APIs that let developers integrate them into their own applications or web mashups.

Google, in particular, offer APIs for just about every one of their web services, including the very popular Google Translate, which I use a lot, mostly to translate between English, Finnish, and Italian. So, how can we use Google Translate from the command line?

In the previous example, we saw how easy it is to fetch any kind of web page from the command line with utilities such as wget, and how we can manipulate and format the returned content, for example to adapt it for display in a terminal. We could do something similar with Google Translate, but there is a quicker and better way to achieve the same result: the Google Translate API. This API is consumed with ordinary HTTP requests but returns a JSON response rather than a normal HTML page, as it is designed to be integrated by developers into other applications and services.

We could manipulate this JSON response once again with filters based on regular expressions, as in the previous example, but there is an easier way to parse JSON directly from the command line: a utility called jsawk, which works much like awk but specifically on JSON-formatted text.
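For comparison, the regular-expression route might look like the minimal sketch below. It assumes the response contains a translatedText property, as in the sample response shown later in this post; extract_translation is a name made up for this example, and a real JSON parser such as jsawk remains the more robust choice.

```shell
# Hypothetical helper: print the value of the first "translatedText"
# property found on standard input. This is not a real JSON parser; it
# breaks if the translated text itself contains an escaped double quote.
extract_translation() {
  sed -n 's/.*"translatedText": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Sample response, so the example runs without calling the API.
sample='{ "data": { "translations": [ { "translatedText": "bonjour" } ] } }'
printf '%s\n' "$sample" | extract_translation
```

With the sample above, the function prints just the word bonjour.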

The first step, naturally, is to make a request to the Google Translate API. The documentation shows that the API can be consumed in two ways: either with some JavaScript code, or by making a REST request directly to the service and getting the result back in JSON format, almost ready to use. The second is what we are going to do.

First of all, you’ll need a Google Account (if you don’t have one already, create one), and you’ll also need to request an API key to be able to consume the service. You can request a key for free here. The documentation tells us that we should issue requests with URLs in the format

https://www.googleapis.com/language/translate/v2?key=YOUR KEY&q=TEXT TO TRANSLATE&source=SOURCE LANGUAGE CODE&target=TARGET LANGUAGE CODE
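As a rough sketch, the same URL can be assembled by a small shell helper so that the four placeholders become explicit arguments. The function name translate_url and the argument order are assumptions made for this example, not part of the API.

```shell
# Hypothetical helper: assemble a request URL in the format given above.
# Arguments: API key, source language code, target language code, text.
# The text is inserted as-is, so it must already be safe to use in a URL.
translate_url() {
  printf 'https://www.googleapis.com/language/translate/v2?key=%s&q=%s&source=%s&target=%s\n' \
    "$1" "$4" "$2" "$3"
}

translate_url "YOUR_KEY" en fr hello
```

The printed URL can then be handed to wget in double quotes, so that the shell does not interpret the & characters.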

So for example, to translate the word “hello” from English to French, with wget we can issue a request like

wget -qO- --user-agent firefox "https://www.googleapis.com/language/translate/v2?key=YOUR KEY&q=hello&source=en&target=fr"

As in the previous example, we need to specify a user agent; otherwise Google will return an empty response. You should see this result:

{
  "data": {
    "translations": [
      {
        "translatedText": "bonjour"
      }
    ]
  }
}

which is a JSON response, as expected. The next step is to parse this response and get the value of the translatedText property in the nested object data->translations.

However, jsawk seems to expect a JSON array of objects, therefore we’ll need to first manipulate this response and wrap it into [ ] brackets to obtain an array with a single item.

echo "[`wget -qO- --user-agent firefox \"https://www.googleapis.com/language/translate/v2?key=YOUR KEY&q=hello&source=en&target=fr\"`]"

the JSON response becomes:

[ {
  "data": {
    "translations": [
      {
        "translatedText": "bonjour"
      }
    ]
  }
} ]

and is now ready to be parsed by jsawk. We need to get the value of the property data.translations[0].translatedText of the first item of the array (as you can see, data.translations is also an array):

echo "[`wget -qO- --user-agent firefox \"https://www.googleapis.com/language/translate/v2?key=YOUR KEY&q=hello&source=en&target=fr\"`]" \
| jsawk -a "return this[0].data.translations[0].translatedText"

You should see just the word “bonjour” rather than the JSON response. One last step, which we have already seen in the previous example with the define search feature, is to make sure that any HTML entities in the translated text are displayed correctly in the terminal:

echo "[`wget -qO- --user-agent firefox \"https://www.googleapis.com/language/translate/v2?key=YOUR KEY&q=hello&source=en&target=fr\"`]" \
| jsawk -a "return this[0].data.translations[0].translatedText" \
| perl -MHTML::Entities -pe 'decode_entities($_)'
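The HTML::Entities module may not be installed on every machine. As a fallback, a few of the most common entities can be decoded with sed alone. This is only a sketch: decode_basic_entities is a made-up name, and the list of entities it handles is deliberately short.

```shell
# Hypothetical fallback: decode a handful of common HTML entities with sed.
# &amp; is handled last so that "&amp;lt;" becomes "&lt;" rather than "<".
decode_basic_entities() {
  sed -e "s/&#39;/'/g" \
      -e 's/&quot;/"/g' \
      -e 's/&lt;/</g' \
      -e 's/&gt;/>/g' \
      -e 's/&amp;/\&/g'
}

printf '%s\n' 'l&#39;eau &amp; le vin' | decode_basic_entities
```

It would take the place of the perl stage in the pipeline above, and only where the module is unavailable, since perl decodes the full set of entities.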

At this point, you can automate this command by wrapping it within a shell function that accepts as arguments the source language ($1), the target language ($2) and the text you want to translate ($3):

translate() {
  echo "[`wget -qO- --user-agent firefox \"https://www.googleapis.com/language/translate/v2?key=YOUR KEY&q=$3&source=$1&target=$2\"`]" \
  | jsawk -a "return this[0].data.translations[0].translatedText" \
  | perl -MHTML::Entities -pe 'decode_entities($_)'
}

so you can use this function like this:

translate en fr "hello"
=> bonjour
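The translate function depends on three external tools: wget, jsawk and perl. A small guard like the one below can report which of them are missing before you try a translation. It is a sketch with a made-up name (require_cmds), and it uses plain global variables so that it also runs in shells without the local keyword.

```shell
# Hypothetical helper: succeed only if every named command is on the PATH;
# report each missing one on standard error.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

require_cmds wget jsawk perl || echo "install the missing tools before using translate" >&2
```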

You may also set up some aliases for the language pairs you translate between the most:

alias en2fr='translate en fr'

So the example for English->French translation simply becomes:

en2fr "hello"
=> bonjour

As seen in this last example, remember to wrap the text to translate within double quotes if you are translating a phrase rather than a single word.
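If quoting every phrase gets tedious, small wrapper functions can be used in place of the aliases. The sketch below assumes the translate function defined earlier in this post; the wrapper names simply mirror the en2fr alias. "$*" joins all the arguments into a single string, so the whole phrase reaches translate as its third argument whether or not it was quoted.

```shell
# Assumes the translate function defined earlier in this post.
# "$*" joins every argument with spaces into one string.
en2fr() { translate en fr "$*"; }
fr2en() { translate fr en "$*"; }
```

Quotes are still needed when the text contains characters the shell treats specially, such as ?, * or an apostrophe.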

Now, to test that all works, guess what “Hyvää Joulua kaikille” means, in Finnish. :)
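A phrase like this one contains spaces and non-ASCII letters, and other phrases may contain characters such as &, which would be read as a separator in the query string. One way to sidestep that is to percent-encode the text before it goes into the URL. The helper below is a sketch with a made-up name (urlencode_all); it takes the blunt approach of encoding every byte.

```shell
# Hypothetical helper: percent-encode every byte of its first argument.
# Encoding plain letters too is verbose but still a valid URL encoding,
# and it keeps the function free of per-character special cases.
urlencode_all() {
  { printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'; echo; } | sed 's/../%&/g'
}

urlencode_all 'good morning'
```

The result could then be passed as the text argument, for example translate en fr "$(urlencode_all 'good morning')".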

Update Jan 14, 2011: I noticed today that the trick described in this post no longer worked as written; it looks like Google have updated version 2 of the Translate API, which is the version used in the commands above. I hadn’t noticed that this version was still a “Labs” version and not yet a release, as highlighted in the documentation:

Important: This version of the Google Translate API is in Labs, and its features might change unexpectedly until it graduates.


Funnily enough, the documentation itself doesn’t yet reflect some changes they’ve already made to the API. In particular, a request made to the same URL,

wget -qO- --user-agent firefox "https://www.googleapis.com/language/translate/v2?key=YOUR KEY&q=hello&source=en&target=fr"

now yields a JSON response in a slightly different format:

{
  "data": {
    "translations": [
      {
        "translated_text": "bonjour"
      }
    ]
  }
}

As you can see, the array’s gone and they’ve renamed the translatedText property to translated_text. So the new version of the translate function, still using the Google Translate API v2, would be:

translate() {
  wget -qO- --user-agent firefox "https://www.googleapis.com/language/translate/v2?key=YOUR KEY&q=$3&source=$1&target=$2" \
  | jsawk "return this.data.translations[0].translated_text" \
  | perl -MHTML::Entities -pe 'decode_entities($_)'
}

which is also a little bit easier. However, since they’ve made it clear that the API may still change while in the labs, it’s perhaps more convenient to stick to the Google Translate API v1 in the meantime – the result, in the end, should be the same. So the translate function, using v1 instead according to its documentation, becomes:

translate() {
  wget -qO- --user-agent firefox "https://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=$3&langpair=$1|$2" \
  | jsawk "return this.responseData.translatedText" \
  | perl -MHTML::Entities -pe 'decode_entities($_)'
}

It doesn’t look to me like this version requires a key, as it seems to be working just fine without one. Quick test:

en2fr 'Thanks, Google!'
=> Merci, Google!
© Vito Botta