M&E Journal: Don’t Neglect Glossaries When Automating Localization

September 12, 2023

What is the importance of glossaries in the localization process? Iyuno's Chief Technology Officer, Allan Dembry, and Iyuno's Director, Research & Innovation, Bartosz Adamczewski, share their insights in a special edition of the M&E Journal.

Speech synthesis has been at the forefront of many recent discussions in the media and entertainment industry, drawing strong media coverage and a wave of funding.

It is already a large part of our lives in the form of Siri, Alexa, and other digital assistants, found in call centers, embedded in modern automobiles, or as assistive technology for the blind. Automating glossary creation and management offers efficiency and consistency benefits, reduces linguistic ambiguity, enables information extraction, and enhances the quality and accuracy of automated translations in fields with specialized terminology.

A glossary is a document that lists terms and phrases with their definitions, usually specific to a particular brand, show, or field. It serves as a reference guide for key terms and phrases, so that everyone involved in a project understands the terminology being used. In automation, glossaries are essential for streamlining processes, improving accuracy, minimizing errors, and enabling suggestions in subtitling and translation systems. They improve the quality of every downstream process that works with text and dialogue. One of the primary benefits of a glossary is that it helps eliminate ambiguity: if different people use different terminology, or define the same term differently, the result can be confusion and errors in the translation.

By using a glossary, everyone involved in the project can refer to the same set of terms and definitions, which eliminates confusion and makes the glossary essential to consistency in the localization process. Another key benefit is that a glossary can improve the accuracy of script translation, and this is true of both traditional translation and machine translation. When working with scripts, it’s easy to make spelling and form mistakes, and clients often require specific terms to be used in translation or editing. If everyone involved in the project uses the same set of terms and definitions, the risk of errors and rejections is reduced.

Enforcing a glossary and providing validations and suggestions can, in turn, make automated translation more accurate and more effective. Historically, machine translation engines had a lot of trouble translating terminology correctly; a neural machine translation (NMT) engine, however, can use a glossary so that the translation uses the correct spelling and translation of each term. Tools that integrate the glossary into the script review and translation process can automatically highlight all inflected terms in the script by using a set of algorithms and machine learning models. Other types of suggestions then become possible: the system can inform the user that a better or more suitable term or phrase exists and should be used instead, or that a term should have been applied in a specific piece of dialogue but was not. This not only accelerates the process but delivers greater accuracy at the same time.
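The kind of check described above can be illustrated with a minimal Python sketch. It is not Iyuno's implementation: the GlossaryEntry structure, the function names, and the sample term are all hypothetical, and a crude suffix-tolerant regular expression stands in for the inflection-aware algorithms and machine learning models the article mentions.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryEntry:
    """Hypothetical glossary record; field names are assumptions."""
    source: str      # term as written in the source language
    target: str      # client-approved translation
    note: str = ""   # optional context shown to the translator


def _term_pattern(term: str) -> re.Pattern:
    # Crude stand-in for real inflection handling: the term followed by
    # any trailing word characters, so "Starforge" also matches "Starforges".
    return re.compile(r"\b" + re.escape(term) + r"\w*", re.IGNORECASE)


def find_terms(text: str, glossary: list[GlossaryEntry]):
    """Return (start, end, entry) spans for glossary terms found in text."""
    hits = []
    for entry in glossary:
        for match in _term_pattern(entry.source).finditer(text):
            hits.append((match.start(), match.end(), entry))
    return sorted(hits, key=lambda hit: hit[0])


def check_translation(source_line: str, target_line: str,
                      glossary: list[GlossaryEntry]) -> list[GlossaryEntry]:
    """Return entries whose source term occurs in source_line but whose
    approved translation does not occur in target_line."""
    missing = []
    for entry in glossary:
        in_source = _term_pattern(entry.source).search(source_line)
        in_target = _term_pattern(entry.target).search(target_line)
        if in_source and not in_target:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    # Invented example term; not taken from any real title.
    glossary = [GlossaryEntry("Starforge", "Forja Estelar", "Ship reactor")]
    for entry in check_translation("Enter the Starforge.",
                                   "Entra en la Fragua.", glossary):
        print(f"Expected '{entry.target}' for '{entry.source}' ({entry.note})")
```

A production system would replace the regular expression with proper morphological analysis per language, but the shape of the check stays the same: locate terms in the source, then confirm the approved rendering in the target.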

Glossaries can also carry metadata. If the user is unsure whether to use a specific phrase or term in the text, the system can provide rich metadata and context-specific descriptions of the term or phrase for research purposes, as illustrated in Figure 2. Glossaries are also an integral part of the script approval process: when clients want specific terms and phrases to be used, glossaries help highlight franchise-, brand-, and title-specific terms, alert the user to pre-approved phrases and translations, and flag terms and phrases that are missing or used incorrectly. Keeping glossaries up to date can be a complicated, time-consuming, and error-prone process. Glossary management systems can help with this task, but it is still a challenge to keep all the terms, phrases, translations, and metadata filled in and current; fortunately, automation can help here too.

Historically, people created and maintained glossaries in Excel spreadsheets and other file types. This makes it hard to ensure that the correct glossary and the correct metadata are used in the editing or translation systems. Glossary management systems help with this task, but even then, data must be imported into the system. This is where machine learning models can be leveraged to automate glossary imports and to pick the correct glossary for a given task. Machine learning models can also be used for automatic glossary creation from a set of scripts and documents, both for generic glossaries and for client- and brand-specific ones; this reduces the time it takes to create a glossary and helps with consistency and quality. Similarly, relationship extraction models can be applied to extract term metadata and relationships to other terms to enhance existing glossaries, something that is far more challenging to achieve manually in a timely manner. This automatically generated information can then be used for research when editing or translating.
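To make the idea of automatic glossary creation concrete, here is a small Python sketch that proposes candidate terms from a set of scripts. It deliberately swaps the machine learning models described above for a simple heuristic (repeated runs of capitalized words that do not open a sentence), so it should be read as an illustration of the workflow and not as the authors' method. The function name, the min_count parameter, and the sample lines are assumptions.

```python
import re
from collections import Counter

# A run of capitalized words, e.g. "Captain Vale" or "Starforge".
_CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def candidate_terms(scripts, min_count=2):
    """Return (phrase, count) pairs for capitalized phrases that recur.

    Phrases at the very start of a sentence are skipped, because
    capitalization there says nothing about whether the word is a name.
    """
    counts = Counter()
    for script in scripts:
        # Split after sentence-ending punctuation or on line breaks.
        for sentence in re.split(r"(?<=[.!?])\s+|\n+", script):
            sentence = sentence.strip()
            for match in _CANDIDATE.finditer(sentence):
                if match.start() == 0:
                    continue
                counts[match.group(0)] += 1
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    # Invented dialogue lines used purely for illustration.
    scripts = [
        "Captain Vale boards the ship. The crew salutes Captain Vale.",
        "Where is Captain Vale? Nobody has seen the Starforge.",
        "Light the Starforge now.",
    ]
    for term, n in candidate_terms(scripts):
        print(term, n)
```

Candidates produced this way would still need human review and translation before entering a glossary; a trained model would mainly improve which phrases are proposed and how they are grouped.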

In summary, the most beneficial aspect of having glossaries is that nearly every piece of automation starts by having an initial glossary, then subsequently building machine learning models and other automation tools based on these data sets. Glossary automation becomes a self-referencing feedback loop of operational and technical improvements and brings even more efficiencies to what can be a long, complex process.

By Bartosz Adamczewski, Director, Research & Innovation, and Allan Dembry, Chief Technology Officer, Iyuno

ABOUT IYUNO 

Iyuno (www.iyuno.com) is the premier provider of comprehensive localization solutions for the media and entertainment industry. With an exceptional 75-year collective legacy, Iyuno has earned the trust of renowned entertainment studios, streaming platforms, and creators worldwide. Its diverse range of services includes industry-leading dubbing, precise subtitling, closed captioning, audio description, media management, quality control, and secure distribution. Operating through a global network of 67 offices across 35 countries, Iyuno combines top-tier talent, innovative technologies, and deep industry expertise to ensure seamless multilingual content delivery that resonates with global audiences. Committed to connecting content and connecting people, Iyuno is delivering unparalleled services to its valued clients. For more information, please follow @IyunoHQ and #WeAreIyuno on social media platforms.