AI is typically seen as something to be feared. As Elon Musk said, “With artificial intelligence, we are summoning the demon.” Yet AI is also being used to preserve endangered languages through documentation and translation.
Although endangered languages face serious risks, artificial intelligence is providing powerful tools to help preserve them.
“When [linguists] collect data from a speaker of an endangered language, they might have hours and hours and hours of recorded speech,” Andrew Hedding, a linguistics professor at the University of Washington, said. “I know of some people that have used automatic speech recognition tools to try and take a sample of a language and then use that to train a model that could recognize the speech of additional data that you’re giving it.”
Beyond automatic speech recognition tools, large language models (LLMs) are trained to translate among many languages. In fact, Google launched a program to build AI models that help younger generations explore their linguistic heritage by learning indigenous languages.
Languages become endangered because of shifting demographics and a shrinking number of speakers, particularly children.
“The main thing we look for to determine whether a language is endangered is how many children are learning the language,” Hedding said. “Children are learning them in the home and speaking them with their families and then they’re also speaking them out in the community.”
According to the language database Ethnologue, 3,170 languages are endangered worldwide. Languages go extinct in part because speakers shift to English and other dominant languages, a result of colonialism and other factors.
“Latin itself is no longer spoken due to the governing structure that propagated Latin,” Latin teacher Kyle McGimsey said. “With no more central government structures in Rome, you have all of these places where Latin was spoken, starting to evolve independently.”
Languages usually pass from generation to generation, but when that communication is limited, a language can be put at risk of extinction. Certain methods, however, allow these languages to be preserved.
“In endangered languages that transmission is starting to be broken causing the language to have endangerment in future,” Hedding said. “For some languages that are severely endangered, linguists will make recordings of stories and then preserve those.”
Preserving endangered languages plays a significant role in promoting linguistic diversity worldwide. Languages are important markers of identity, and keeping them alive is crucial for preserving that identity.
“[Latin] can be a unifier of that people as a way to hold onto a history or a culture that helps them feel that sense of identity,” McGimsey said. “The study of that dead language does have the potential to stamp out other languages because it’s being seen as a shared inheritance of Western Europe.”
In spite of their tremendous potential, these AI models raise privacy concerns, can produce inaccurate conclusions and face a lack of trust among users.
“I think there’s just a question of who has access to that data and are the AI companies able to use that data to train other models,” Hedding said. “Another issue that might come up … is AI making untrue statements. Also … some communities might be skeptical about a non-community member machine generating data and that being taken as a source of data.”
While tools like AI can make endangered and extinct languages easier to understand, the languages themselves remain challenging to learn, especially for younger generations.
“All these languages are heavily inflected, meaning that they have cases and endings and are much less dependent on the order of words in a sentence than they are on what those words look like,” McGimsey said. “Another challenge can sometimes be that … you’re not always learning the words that feel like [they’re] relevant to your life.”
Still, linguists have recognized AI’s great potential for preserving endangered languages.
“Documenting endangered languages is a bit of a race against the clock,” Hedding said. “If these tools are able to increase the speed at which linguists and community members are able to document languages … it might allow us to improve our analysis of languages.”
Without intervention, it may be only a matter of time before many of these endangered languages go extinct. The development of advanced AI tools could help ensure that they are not lost entirely.