Translated's Research Center

Indigenous Language Revitalization: A Journey of Localization

Eugenia Urrere, founder of Indigenius, provides an overview of the current state of indigenous languages globally, with a focus on Latin America. She examines community-led efforts to revitalize endangered languages, the role technology plays in them, and the successes, challenges, and lessons learned along the way.

As key players in localization, we wield significant influence: we make content accessible to millions of people and transform experiences on a global scale. The rapid advancement of Machine Translation has expanded the volume of content accessible worldwide. However, there are still areas where access remains limited.
There are over 7,000 languages globally, yet only 20 account for half of the world’s speakers. In Latin America the most widespread languages are Spanish and Portuguese, but there are many other languages of equal richness and value. According to the World Bank, there are 560 indigenous languages in Latin America. In Brazil alone, there are 160 indigenous languages and dialects. Many of these languages are underrepresented on the internet but not outside of it: together they have millions of speakers across Latin America.
Community members see themselves as bicultural, blending postmodern Western culture with their ancestral traditions. Thanks to the Internet and new technologies, they affirm their presence and promote their languages online. Indigenous communities around the world are leading a resurgence of linguistic and cultural diversity.

For the first time in recent history, indigenous communities are showing signs of vitality. New technologies make it easy to create and share one’s own content, so native speakers can promote their languages themselves without needing many resources. In the words of Mario Fernández, a Qom speaker from Chaco, Argentina: “For centuries we had to remain silent to survive, now we have to speak so as not to become extinct.”
Government support is essential to preserving linguistic diversity, through initiatives such as the research and development of a linguistic corpus for each indigenous language, as the Bolivian government has done.

Challenges and Lessons Learned

Regardless of where communities live or which languages they speak, the challenges that indigenous languages face are universal.

Recurring Challenges

  1. Linguistic: Language variability, lack of standardization.
  2. Unreliable or contradictory data: Ethnolinguistic data found online may be ambiguous, incomplete, or contradictory.
  3. Resources: Few professional translators, and many have limited English proficiency.
  4. Technical: Familiarity with translation software tools is uncommon.
  5. Infrastructure: Unstable electrical supply and Internet access.
  6. Social and political instability: Can unexpectedly disrupt work.

1 & 2.
The lack of standardization arises when governments do not grant official recognition to the various languages spoken by indigenous peoples within their respective territories. This gap in government recognition prevents the development of policies and programs aimed at preserving and promoting these languages, which in turn hinders their digitalization and limits access to educational and technological resources. Another consequence of the lack of official recognition is unreliable and even contradictory data.

3 & 4.
Latin America is a region where the predominant languages are Spanish and Portuguese. However, the source language of the vast majority of texts is English, which means finding people who speak at least three languages.
Many native speakers working in the localization industry are professionals in their native language: university professors, professional writers, or speakers with an admirable academic track record, but with only a basic level of English. Most of them are professionals, but not localization professionals, which means they have little or no knowledge of localization technologies.

5.
Regarding internet access and technology, there are still regions where the internet is available only through mobile devices, because Wi-Fi is not common there.

6.
Social and political instability can unpredictably disrupt workflows, causing delays and making it hard to maintain productivity. These disruptions can be due to economic crises, civil unrest, government transitions, or regional conflicts, all of which can significantly affect the stability and functioning of organizations.

Google, Meta, Motorola and Lenovo for Indigenous Language Digitalization

The International Decade of Indigenous Languages (IDIL 2022-2032), proclaimed by the United Nations General Assembly and led by UNESCO, is an initiative aimed at promoting global efforts to support and revitalize indigenous languages. This decade-long effort seeks to raise awareness, foster collaboration, and implement concrete actions to preserve and promote the linguistic diversity of indigenous communities around the world.

Google Initiative

Google reports that millions of new internet users come online every week globally, leading to a significant rise in the number of individuals browsing, consuming content, making purchases, and seeking entertainment, especially via mobile devices. Against this backdrop, Google has expanded its online translation tool, Google Translate, to include Aymara, Guaraní, and Quechua, benefiting a total of 20 million users across Latin America.

Meta Projects

Meta AI is a significant player in advancing language and machine translation tools on a global scale. With the “No Language Left Behind” initiative, it is taking a giant leap towards facilitating high-quality translations in numerous languages, overcoming challenges related to limited data and intricate modeling. Meta’s “Universal Speech Translator” project is another significant milestone, aiming at real-time speech-to-speech translation that bridges communication gaps for languages lacking standardized writing systems.

Motorola and Lenovo

Motorola and Lenovo’s initiative for indigenous languages involves a comprehensive approach, including translation and training in CAT tools for native speakers to achieve consistency and respect in digitalization.
I want to highlight that the resources created through this initiative are open source, enabling other technology companies to localize their devices and empower underrepresented populations with access to technology. Moreover, open source products allow communities to manage revitalization efforts themselves.

Final words

Language revitalization requires personalized approaches for each community, respecting its unique culture and needs. Localization, likewise, requires absolute contextualism; it cannot be generalized or replicated.
Engaging local communities in all projects involves not only providing them with training and mentoring to improve their localization skills, but also establishing an ongoing collaborative dialogue to understand their specific needs and priorities. This approach seeks to close communication gaps by ensuring that local voices and perspectives are taken into account at all stages of the project.
Even for more complex project topics and tasks, such as AI gaps, this community inclusion can significantly enrich the process by providing unique contextual insights and facilitating more effective and ethical implementation of technology solutions.

Initiatives such as the creation of a multilingual translation evaluation dataset contribute significantly to advancing multilingual translation with limited resources. Collaborative efforts with the AI research community and global commitment to continuous responsible development are integral to ensuring that translation technologies meet real-world demands while respecting the linguistic and cultural diversity of users.
Additionally, we must be creative and develop sustainable models that respect the data and information that communities provide. Data is a common good and, being a product of communities, it should be owned by the community, or at least the community should receive a proportional reward in return, as a way to counter feelings of neocolonialism or extractivism.

The choice of whether or not to localize a product or service is strongly influenced by the objectives and vision of the company. It depends on the message that the company wants to convey to its clients or users. If you truly seek to establish an emotional connection, remember that even though indigenous communities are bicultural, their cultural identity remains rooted in the native language, the language in which their dreams are expressed.
To finish, let me share the words of Elías Caurey, Guaraní speaker and UNESCO representative for Latin America: “Each language serves as a key that opens a door to a different cultural realm in our diverse world. Mastery of several languages improves our understanding and connection with others.”

References

Moseley, Christopher (editor), and Nicolas, Alexandre (cartographer). Atlas of the world’s languages in danger.
Google. Next Billion Users, Official Google Blog.
Motorola and Lenovo Foundation. Motorola and Lenovo Foundation announce the next phase of the initiative to revitalize endangered indigenous languages.
Edunov, Sergey, Guzman, Paco, Pino, Juan, and Fan, Angela. Teaching AI to translate 100s of spoken and written languages in real time.

Eugenia Urrere

Founder at Indigenius | Localization Project and Vendor Manager | Degree in Social Communications

Eugenia Urrere is a social communicator and a self-taught, tech-loving entrepreneur. Since 2018, she has been dedicated to the revitalization and digitization of indigenous languages in Latin America through her work at an NGO. In 2020, Eugenia founded Indigenius, a company committed to fostering fair employment opportunities for communities, digitizing indigenous languages, and raising awareness about underrepresented linguistic diversity.