Translate Hinglish to English with Google Translate API
Google Translate is a powerful tool that can help you quickly and easily translate text from one language to another. It is one of the most popular and widely used translation services globally, and it is available for free both on the web and as an API.
If you want to use this API, you'll need to know the source language and target language.
Sample code and setup instructions are available : Quickstart: Translate text with Cloud Translation Advanced | Google Cloud
# Translate text from English to French
# Detail on supported types can be found here:
# https://cloud.google.com/translate/docs/supported-formats
response = client.translate_text(
request={
"parent": parent,
"contents": [text],
"mime_type": "text/plain", # mime types: text/plain, text/html
"source_language_code": "en-US",
"target_language_code": "fr",
}
)
In this example, the source language code is en-US
, and the target language code is fr
.
However, it's not always possible for you to know the source language code since users can write in various languages.
To solve this problem, Google has another API to detect the source language.
Detecting languages (Basic) | Cloud Translation | Google Cloud
def detect_language(text):
"""Detects the text's language."""
from google.cloud import translate_v2 as translate
translate_client = translate.Client()
# Text can also be a sequence of strings, in which case this method
# will return a sequence of results for each text.
result = translate_client.detect_language(text)
print("Text: {}".format(text))
print("Confidence: {}".format(result["confidence"]))
print("Language: {}".format(result["language"]))
Now, you can combine both of these APIs to translate any language into any target language.
Throughout this blog, we will use the following text as an example for all translations.
hello dost tum kaise ho kya haal chal
This piece of text in written in Hinglish (Hindi written in Latin script (instead of the traditional Devanagari), often also mixed with English words or phrases),
Google translate website is able to translate this text as expected.
So let's try with the API.
First, let's try to detect the language
python3 detect_lamguage.py
Text: hello dost tum kaise ho
Confidence: 1
Language: hi-Latn
It's working perfectly fine! The Detect Language API is able to detect that the text is in Hindi and written in Latin script.
Now, let's try to run the Translate Text API with the source language set as hi-Latn
and the target language as en-US
.
"source_language_code": "hi-latn",
"target_language_code": "en-US",
But for the same text, code throws an exception
details = "Source language is unsupported."
If we change the source language to hi, the Translate API will work, but the results may be very poor.
Result from API
Input text: hello dost tum kaise ho kya haal chal
Translated text: hello friend how are you
But the google translate website is able to translate text written in Hinglish properly, looks like we are missing something.
I tried to find out how Google translate website works, and which API they are using, I attached Charles's web proxy to see the traffic, and all I could see was this URL getting called
https://translate.google.co.in/_/TranslateWebserverUi/data/batchexecute?
Nothing useful was found here.
I also found out, that if we write proper Hindi, then the translate API returns proper results. Let's try with text written in Hindi(Devanagari)
Input text: हेलो दोस्त तुम कैसे हो क्या हाल चाल
Translated text: hello friend how are you doing
So we need to convert Hinglish to Hindi written in Devanagari, and then use the translate API.
How to convert Hinglish to Hindi
This step is called Transliterate
There is an API from google for this
But this is a deprecated API https://developers.google.com/transliterate/v1/getting_started
The Google Transliterate API has been officially deprecated as of May 26, 2011. It will continue to work as per our deprecation policy.
Deprecation doesn't means that it will be stopped, and you can't use it.
So let's plug in everything, These are the steps to use a working google translate API, that works.
Detect language
Perform transliteration, if required
Call the google translate API.
python3 translate_tansliterate.py
Please enter something: hello dost tum kaise ho kya haal chal
Text: hello dost tum kaise ho kya haal chal
Confidence: 1
Language: hi-Latn
['हेलो दोस्त तुम कैसे हो क्या हाल चाल', 'हेल्लो दोस्त तुम कैसे हो क्या हाल चाल', 'हेलो दोस्त तुम कैसे हो क्या हाल चल', 'हेल्लो दोस्त तुम कैसे हो क्या हाल चल', 'हेलो दोस्त तुम कैसी हो क्या हाल चाल', 'हेलो दोस्त तुम कइसे हो क्या हाल चाल']
Translated text: hello friend how are you doing
Finally, we have a working script that can translate Hinglish into English.
If you enjoyed reading this blog, you can follow me on Twitter to learn something new with me.
Subscribe to my newsletter
Read articles from Nitin Kalra directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Nitin Kalra
Nitin Kalra
Working as a Principal Software Engineer. I have experience working on Java, Spring Boot, Go, Android Framework, OS, Shell scripting, and AWS. Experienced in creating scalable and highly available systems.