Presenting Language Research Projects funded by Imminent
Watch the Imminent Grants Ceremony to discover some of the fantastic projects we funded. It is an occasion to better understand the business opportunities brought about by innovations in language technology.
Research always needs to align closely with its applications. This is why Imminent aims to support the studies of researchers in areas related to its field. Each year, Imminent allocates €100,000 to fund five original research projects with grants of €20,000 each to explore the most advanced frontiers in the world of language services.
Luca De Biase
Imminent Research Director
Director of AI at Translated
Economic Complexity research group by Luciano Pietronero, Andrea Zaccaria and Giordano De Marzo
Understanding which countries and languages dominate online sales is a key question for any company wishing to translate its website. The goal of this research project is to complement the T-Index by developing new tools capable of identifying emerging markets and opportunities, thus predicting which languages will become more relevant in the future for a specific product in a specific country. As a first step, we will rely on the Economic Fitness and Complexity algorithm to determine which countries will undergo major economic expansion in the next few years. We will then leverage network science and machine learning techniques to predict the products and services that growing economies will start to import.
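To give a flavour of the first step, here is a minimal sketch of a fitness-complexity style iteration on a toy country-by-product export matrix. It follows the commonly published form of the algorithm (a country's fitness is the sum of the complexities of its products, while a product's complexity is dragged down by low-fitness exporters). The function name, the toy matrix and the number of iterations are illustrative assumptions, not the research group's actual code or data.

```python
def fitness_complexity(M, iterations=20):
    """Toy fitness-complexity iteration (illustrative sketch, not the group's code).

    M[c][p] is 1 if country c competitively exports product p, else 0.
    Assumes every country exports at least one product and every product
    has at least one exporter.
    """
    n_countries, n_products = len(M), len(M[0])
    fitness = [1.0] * n_countries
    complexity = [1.0] * n_products

    for _ in range(iterations):
        # A country's fitness is the summed complexity of what it exports.
        new_fitness = [
            sum(complexity[p] for p in range(n_products) if M[c][p])
            for c in range(n_countries)
        ]
        # A product's complexity is bounded by its least fit exporters:
        # being exported by a low-fitness country pulls it down sharply.
        new_complexity = [
            1.0 / sum(1.0 / fitness[c] for c in range(n_countries) if M[c][p])
            for p in range(n_products)
        ]
        # Normalise both vectors to mean 1 so the values stay comparable.
        mean_f = sum(new_fitness) / n_countries
        mean_q = sum(new_complexity) / n_products
        fitness = [f / mean_f for f in new_fitness]
        complexity = [q / mean_q for q in new_complexity]

    return fitness, complexity


# Hypothetical toy data: country 0 exports everything, country 2 only product 0.
toy_matrix = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
]
toy_fitness, toy_complexity = fitness_complexity(toy_matrix)
```

On this toy matrix the most diversified country ends up with the highest fitness, and the product exported only by that country ends up with the highest complexity. Forecasting growth from such scores is a further step that this sketch does not attempt.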
NEUROSCIENCE OF LANGUAGE
Martina Ardizzi and Valentina Cuccio
The neuroscience of translation. Novel and dead metaphor processing in native and second-language speakers.
The NET project aims to investigate the embodied nature of a second language, focusing on a specific linguistic element that merges abstract and concrete conceptual domains: metaphors. The idea behind the project fits within the embodied simulation approach to language, which has rarely been applied in the field of translation despite being widely confirmed in the study of native languages. Specifically, during the project the brain activity of native Italian speakers and second-language Italian speakers will be recorded while they read dead or novel Italian metaphors. The researchers expect to show that the sensorimotor cortices of the two groups are involved differently in response to the different types of metaphors. The results of NET may provide new insights on how to improve disembodied AI translations.
Kọ́lá Túbọ̀sún team
Crowdsourced collection of 50 hours of speech data for the Yorùbá language.
Yoruba is one of the most widely spoken languages in Africa, with 46 million first- and second-language speakers. Yet there is hardly any language technology available in Yoruba to help them, especially illiterate or visually impaired people, who would benefit most. Translated’s vision is to build a world where everyone can understand and be understood. In this project, the team will work on the “everyone”, developing speech technology in Yorùbá. The team is headed by Kọ́lá Túbọ̀sún, the founder of YorubaNames, and includes 4 computer scientists and language enthusiasts with an excellent scientific track record, with publications at Interspeech, ACL, EMNLP, LREC and ICLR. As a first action, aligned voice and text resources will be recorded professionally, at a quality suitable for building text-to-speech systems. After donating this data under a Creative Commons licence to the Mozilla Common Voice repository, the team will collect further speech data from volunteers online. To increase the quality of the text, the team has already developed a diacritic restoration engine.
MACHINE LEARNING ALGORITHMS FOR TRANSLATION
Incremental Parallel Inference for Machine Translation
Machine translation works with a de facto standard neural network called Transformer, published in 2017 by a team at Google Brain. The traditional way of producing new sentences from the Transformer is one word at a time, left to right; this is hard to speed up and parallelize. Andrea Santilli and his PhD supervisor Emanuele Rodolà, at the Sapienza University of Rome, are specialists in neural network architectures. They spotted that a similar problem is solved in image generation by using “incremental parallel processing”, a technique which refines an image progressively rather than generating it pixel by pixel, yielding speedups of 2-24×.
They are proposing to port this method to Transformers, using clever linear algebra tricks to make it happen. At Translated, we hope that this technique and other similar ones can make machine translation less expensive, and therefore accessible to a larger number of use cases, and ultimately people.
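To make the contrast concrete, here is a toy sketch of the two decoding styles. The text above does not spell out the exact method, so the parallel version below is an assumption: it uses a simple fixed-point (Jacobi-style) iteration, in which a full draft of the sentence is refined repeatedly and every position is recomputed from the previous draft. The function toy_next_token is a hypothetical stand-in for a Transformer's greedy next-token choice, not a real model.

```python
def toy_next_token(prefix):
    """Hypothetical stand-in for a model's greedy next-token choice."""
    return (sum(prefix) * 31 + len(prefix) * 7 + 3) % 50


def greedy_decode(next_token, length):
    """Traditional decoding: one token at a time, left to right."""
    output = []
    for _ in range(length):
        output.append(next_token(output))
    return output


def jacobi_decode(next_token, length, pad=0):
    """Refine a whole draft at once until it stops changing.

    Returns the final draft and the number of refinement steps used.
    """
    draft = [pad] * length
    for step in range(1, length + 1):
        # Each position reads only the previous draft, so in a real model
        # these calls could be batched into a single parallel forward pass.
        refined = [next_token(draft[:i]) for i in range(length)]
        if refined == draft:
            # Fixed point reached: no position changed in this step.
            return draft, step
        draft = refined
    # After `length` steps every position has settled to its final value.
    return draft, length


sequential = greedy_decode(toy_next_token, 8)
parallel, steps_used = jacobi_decode(toy_next_token, 8)
```

Both functions return the same tokens, because after k refinement steps the first k positions of the draft already match the left-to-right result. Any speedup comes from drafts that settle in fewer steps than there are tokens, which depends on the model and the sentence; this toy function is not meant to show one.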
Investigating the potential of speech technologies – synthesis and recognition – to improve the quality of professional and trainee translators’ work.
Translators carry out a cognitively demanding, repetitive task which requires continuous high concentration. When they post-edit neural draft translations, a known source of errors is the so-called “NMT fluency trap”, where the target sentence sounds very fluent and error-free but may hide infidelities or alterations with respect to the source. Prof Dragos Ciobanu and his colleagues at the Center for Translation Studies at the University of Vienna have some promising experimental results showing that this situation can be helped by reading the source side out loud, using speech synthesis. In this project, they will evaluate the practicality and cognitive impact of this new modality and verify that it does not slow down the overall translation process. To do this, they will track translators’ gaze, focus and cognitive load while they work. Our hope is that this idea could make the translator’s work easier and reduce errors.