Advanced machine translation (MT), housed securely and maintained by professional language service providers, is changing the face of translation. However, machine translation isn’t a one-size-fits-all solution, nor should it be seen as a cheap alternative to human translation. Professional, mature machine translation providers should be prioritising the development of their translation engines and processes. Here are 5 questions you should ask your MT provider:
First and foremost, it’s important to understand that not all content is appropriate for machine translation. MT is a tool that can be used independently to produce a fully automated translation, or combined with professional post-editing to produce ‘human’ quality translations. Beware of translation providers offering machine translation for creative content, as the output won’t sound as natural in the target language. MT is, however, appropriate for a variety of content types, including structured or technical content that contains repeated phrases, such as manuals and user guides. It can also be used for ‘gist’ purposes, especially in the eDiscovery process in the legal sector.
Evaluating translation quality is all about understanding the end use or purpose of the material, and setting expectations from the beginning. If the intended purpose of the translation is just to get a general understanding of a text, then it should be evaluated in a different way to a translation that needs to be error-free and fluent. For example, if 8 out of 10 segments (or sentences) were considered adequate (accurate), this would be a success for a gist translation, but a failure for a translation that needed to be 100% accurate (such as marketing material).
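The 8-out-of-10 example can be sketched as a simple pass rate checked against a purpose-specific threshold. This is a minimal illustration only: the `THRESHOLDS` values and the function names are assumptions made for the example, not figures or tooling from any particular evaluation framework.

```python
# Hypothetical acceptance thresholds per intended use. The article only says
# that 8/10 adequate segments is acceptable for gist and not for content that
# must be 100% accurate; the exact gist cut-off is an assumption.
THRESHOLDS = {"gist": 0.8, "publication": 1.0}


def adequacy_rate(judgements):
    """Share of segments a reviewer judged adequate (True)."""
    if not judgements:
        raise ValueError("no segments to evaluate")
    return sum(judgements) / len(judgements)


def meets_expectation(judgements, purpose):
    """Compare the adequacy rate with the threshold for the given purpose."""
    return adequacy_rate(judgements) >= THRESHOLDS[purpose]


# 8 adequate segments out of 10, as in the example above.
sample = [True] * 8 + [False] * 2
print(meets_expectation(sample, "gist"))         # True
print(meets_expectation(sample, "publication"))  # False
```

The same raw result passes or fails depending only on the agreed purpose, which is why expectations need to be set before evaluation starts.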
At Capita TI we make use of the TAUS DQF (Dynamic Quality Framework) tools for MT evaluation, which allow us to quickly compare and rank various MT engines by evaluating the quality of the ‘raw’ translation output based on different QA models, including adequacy, fluency and error typology scoring, as well as tracking post-editing productivity. Using a combination of these tools and techniques, tweaked to each MT requirement, the performance of MT engines can be evaluated in a much more focused way, putting end use and quality expectations at the forefront.
A key question arising from the growth of machine translation concerns the role of human linguists. One of the key messages Capita TI supports is “MT and translators, not MT vs. translators”.
For many MT projects, post-editing is a key stage, involving input from a human translator, to ensure accuracy and a high-quality output. This level of input can vary, from ‘light’ to ‘full’. We recruit and train linguists specialised in post-editing, in order to guarantee a high-quality translation, whilst keeping costs down for the customer.
Here’s an example of how we evaluate and benchmark our translation engines:
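| Domain | Language pair | Productivity rating | Capita Domain Engine: corpora size (million words) | Capita Domain Engine: BLEU score | Capita Domain Engine: PE rates (WPH) | Capita Domain Engine: productivity rating | Capita Customised Engine: corpora size (million words) | Capita Customised Engine: BLEU score | Capita Customised Engine: PE rates (WPH) | Capita Customised Engine: productivity rating |
|---|---|---|---|---|---|---|---|---|---|---|
| Retail | EN > DE | 66 | 8 | 55 | 835 | 70 | 2.5 | 65 | 1247 | 80 |
| Retail | EN > FR | 72 | 6.4 | 60 | 900 | 75 | 3 | 65 | 1350 | 82 |
| IT | EN > IT | 74 | 8.5 | 75 | 950 | 80 | 4.5 | 75 | 990 | 80 |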
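One way to read the benchmark is to compare the two PE rates (WPH) columns row by row. The sketch below assumes the first PE rates column belongs to the Capita Domain Engine and the second to the Capita Customised Engine; the `uplift_percent` helper is hypothetical and simply expresses the change as a percentage.

```python
# PE rates (words per hour) copied from the table above, assuming the first
# value is the domain engine and the second the customised engine.
rows = [
    ("Retail", "EN > DE", 835, 1247),
    ("Retail", "EN > FR", 900, 1350),
    ("IT", "EN > IT", 950, 990),
]


def uplift_percent(before, after):
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


for domain, pair, domain_wph, custom_wph in rows:
    change = uplift_percent(domain_wph, custom_wph)
    print(f"{domain} {pair}: {change:.1f}% change in post-editing speed")
```

On these figures the two Retail pairs gain roughly half again in post-editing speed, while the IT pair changes only slightly, in line with its identical BLEU score of 75 for both engines.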
Your documents and content will pass through several hands, from your account manager to file engineers to post-editing linguists, as they are reviewed, checked and proofread at several stages. Experienced, mature MT providers will ensure that the entire process remains secure and confidential by adopting a centralised and protected environment, where users log in to perform tasks, rather than sending unencrypted emails to the various individuals.
Neither your organisation nor your MT provider should be using a free or open-source machine translation engine to which documents can be uploaded without a secure login, as the confidentiality of your content then becomes uncertain. Ensure that you’re using a secure machine translation environment, which is only available to you and the MT provider.
In order to guarantee machine translation that is not only accurate, but also reflects your brand and tone of voice, your language assets, including glossaries and translation memory, need to be continually updated with the latest data. Typically, translation memories take the form of bilingual files created as a result of the post-editing process. The frequency of these updates is up to you as the customer, but your MT provider should suggest timescales based on the volume of work.
Glossaries are kept separate from the machine translation engine and can be maintained and updated independently. Maintaining a glossary ensures that your brand-specific terminology is reproduced in other languages by the machine translation service.
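Because the glossary lives outside the engine, it can also be used to check output independently. The following is a minimal sketch of such a check, not a description of any provider’s tooling: the function name, the glossary entry and the example sentences are all hypothetical, and the plain substring matching ignores inflected forms.

```python
def glossary_violations(source, target, glossary):
    """Return (source term, approved translation) pairs where the source
    term occurs in `source` but the approved translation is missing
    from `target`. Matching is case-insensitive substring matching."""
    src, tgt = source.casefold(), target.casefold()
    return [
        (term, approved)
        for term, approved in glossary.items()
        if term.casefold() in src and approved.casefold() not in tgt
    ]


# Hypothetical EN > DE glossary entry and MT outputs.
glossary = {"user guide": "Benutzerhandbuch"}
source = "Read the user guide first."

print(glossary_violations(source, "Lesen Sie zuerst die Anleitung.", glossary))
# [('user guide', 'Benutzerhandbuch')]
print(glossary_violations(source, "Lesen Sie zuerst das Benutzerhandbuch.", glossary))
# []
```

Segments flagged in this way could be routed to a post-editor, and the glossary itself can be updated without retraining the engine.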
As with most technologies, machine translation is evolving all the time. The more advanced MT providers have embraced this change and have started testing and deploying neural machine translation engines across a range of language pairs and subject matters. The advantage of this technology is that the output is more fluent and easier to read. That said, it isn’t a case of ‘neural is better than statistical’, nor vice versa: the choice of technology should be driven by the results gained, and a blended approach could deliver the best results.
At Capita TI, our Language Solutions team is made up of language technology experts who constantly develop and improve our machine translation engines, processes and systems. To find out more or to request a quote, leave your details in the form below.