Google Translate Is Supporting More Than 110 New Languages



Google is adding support for 110 new languages to Google Translate, the company announced on Thursday. Previously, Google Translate supported 133 languages, so this expansion, which the company says is its biggest ever, marks a significant jump, The Verge reported.

Google’s PaLM 2 AI language model helped Translate learn these new languages. It was especially good at learning ones that were related to one another, such as languages “close to Hindi, like Awadhi and Marwadi, and French creoles like Seychellois Creole and Mauritian Creole,” Google’s Isaac Caswell says in a blog post.

Google posted the following on their Keyword Blog:

Google Translate breaks down language barriers to help people connect and better understand the world around them. We’re always applying the latest technologies so more people can access this tool: in 2022, we added 24 new languages using Zero-Shot Machine Translation, where a machine learning model learns to translate into another language without ever seeing an example. And we announced the 1,000 Languages Initiative, a commitment to build AI models that will support the 1,000 most spoken languages around the world.

Here are some of the newly supported languages in Google Translate:

Afar is a tonal language spoken in Djibouti, Eritrea and Ethiopia. Of all the languages in this launch, Afar had the most volunteer community contributions.

Cantonese has long been one of the most requested languages for Google Translate. Because Cantonese often overlaps with Mandarin in writing, it’s tricky to find data and train models.

Manx is the Celtic language of the Isle of Man. It almost went extinct with the death of its last native speaker in 1974. But thanks to an island-wide revival movement, there are now thousands of speakers.

NKo is a standardized form of the West African Manding languages that unifies many dialects into a common language. Its unique alphabet was invented in 1949, and it has an active research community that develops resources and technology for it today.

Punjabi (Shahmukhi) is the variety of Punjabi written in Perso-Arabic script (Shahmukhi), and is the most spoken language in Pakistan.

Tamazight (Amazigh) is a Berber language spoken across North Africa. Although there are many dialects, the written form is generally mutually understandable. It’s written in Latin script and Tifinagh script, both of which Google Translate supports.

Tok Pisin is an English-based creole and the lingua franca of Papua New Guinea. If you speak English, try translating to Tok Pisin — and you might be able to make out the meaning!

Ars Technica reported that, in a blog post, Google Senior Software Engineer Isaac Caswell said the newly added languages are spoken by more than 614 million people, or about 8 percent of the global population.

In my opinion, Google has created an easy way for people to learn a second language. This reminds me of the Rosetta Stone, an archaeological find inscribed with the same decree in three scripts: Egyptian hieroglyphs, Demotic, and Ancient Greek.