Customization

LexWorks offers two different customization packages:

    1. Customization as part of a full localization process, in which we deliver ready-to-use translations from our MT process at the quality you require (post-edited or not, depending on your needs).

    2. Stand-alone customization services where we train and maintain engines that you employ on your premises.

An example of the first service package, customization as part of a full localization process, is an engineering software publisher who simply wishes to have the benefits of MT without changing their traditional outsourced localization process.

An example of the second service package, stand-alone customization services, is a global bank that has MT installed behind their firewall, for the use of their staff around the world, and simply asks us to customize and maintain the engines on their behalf.

Both packages are popular, for different reasons. The right choice depends on whether a company wants an outsourced solution or has the internal resources to manage the full MT process.

The Importance of Customization

MT customization is the single most important activity for high-performance MT. Whether rules-based or statistical, the engine must be tuned to your domain. (“Magazine,” for example, translates quite differently in a military context than in a media context.) Customizing the engine for your domain, and for your specific content and terminology, improves quality significantly. This in turn improves productivity, because post-editors can work more quickly to bring the results up to a fully human level of quality. Higher post-editing speed means that your localized content can reach your international markets faster than ever before, and at lower cost.

In 2010, LexWorks conducted a study comparing the impact of a trained engine with an untrained engine. Published in Multilingual Magazine, our study demonstrated for the first time that a customized engine doubles post-editing productivity over a generic one.

Depending on your content type, and on other factors such as whether you already have extensive translation memories or not, we can:

  • customize rules-based hybrid engines, or
  • build custom statistical engines from scratch

Customizing an RBMT System

Customizing a rules-based machine translation engine involves creating domain-specific dictionaries and encoding each entry with linguistic information, including parts of speech and inflections. Dictionary building is an ongoing cycle of encoding, testing, adjusting and testing again. Additionally, dictionaries for sub-domains may be arranged in separate profiles and ranked differently according to the content to be translated.
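
To make the idea of ranked dictionary profiles concrete, here is a minimal Python sketch. It is an illustration only, not the format used by any particular RBMT product: the `Entry` and `Profile` classes, the `lookup` function, the rank values and the English-to-French sample entries are all hypothetical. It shows entries that carry a part of speech and inflections, and a lookup that consults profiles in rank order so that the military sense of “magazine” wins when the military profile is ranked first.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One dictionary entry (hypothetical structure)."""
    source: str
    target: str
    pos: str                                          # part of speech
    inflections: dict = field(default_factory=dict)   # e.g. {"plural": "..."}


@dataclass
class Profile:
    """A sub-domain dictionary; a lower rank means higher priority."""
    name: str
    rank: int
    entries: dict = field(default_factory=dict)

    def add(self, entry):
        # Key on lower-cased source term plus part of speech.
        self.entries[(entry.source.lower(), entry.pos)] = entry


def lookup(profiles, source, pos):
    """Return (profile name, entry) from the best-ranked profile that has the term."""
    for profile in sorted(profiles, key=lambda p: p.rank):
        entry = profile.entries.get((source.lower(), pos))
        if entry is not None:
            return profile.name, entry
    return None


# Illustrative English-to-French entries for the "magazine" example.
military = Profile("military", rank=1)
military.add(Entry("magazine", "chargeur", "noun", {"plural": "chargeurs"}))

media = Profile("media", rank=2)
media.add(Entry("magazine", "magazine", "noun", {"plural": "magazines"}))

match = lookup([media, military], "Magazine", "noun")
print(match[0], match[1].target, match[1].inflections["plural"])
```

Swapping the two rank values would make the media profile take precedence, which is the kind of adjustment the ranking of profiles allows for different content.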

Customizing an SMT System

While training a rules-based engine requires considerable linguistic expertise, training a statistical engine calls mostly for data (the more the better, as long as it is “clean”) and processing resources. For example, the data required may run from 2 million to 5 million words or more, both in bilingual format (e.g. translation memories) and in source language monolingual text. The processing resources required are on the order of one server per language.
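
As a rough illustration of what checking for “clean” and sufficient data might look like, the Python sketch below filters bilingual segment pairs and counts the remaining source words. The cleaning rules shown (dropping empty segments, exact duplicates and pairs with an extreme length ratio) and the `max_ratio` value are assumptions made for the example, not a description of an actual training pipeline. The 2 million word floor is taken from the lower end of the range mentioned above.

```python
def clean_pairs(pairs, max_ratio=3.0):
    """Keep segment pairs that pass a few simple, assumed cleanliness checks."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target:
            continue  # drop pairs with an empty side
        src_len, tgt_len = len(source.split()), len(target.split())
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_ratio:
            continue  # drop pairs whose lengths are wildly mismatched
        if (source, target) in seen:
            continue  # drop exact duplicates
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


def source_word_count(pairs):
    """Count whitespace-separated words on the source side."""
    return sum(len(source.split()) for source, _ in pairs)


def enough_data(pairs, minimum_words=2_000_000):
    """Check the cleaned corpus against a minimum source word count."""
    return source_word_count(pairs) >= minimum_words


# Tiny illustrative sample; a real translation memory would be far larger.
sample = [
    ("Open the file.", "Ouvrez le fichier."),
    ("Open the file.", "Ouvrez le fichier."),             # duplicate
    ("", "vide"),                                         # empty source
    ("OK", "Ceci est une phrase beaucoup trop longue"),   # length mismatch
]
cleaned = clean_pairs(sample)
print(len(cleaned), source_word_count(cleaned), enough_data(cleaned))
```

Counting words only after cleaning matters because duplicates and misaligned segments can inflate the apparent size of a translation memory.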

Customizing a Hybrid System

When customizing a hybrid system, SMT capabilities are used to automatically build a language and domain model, while RBMT techniques ensure that the terminology used has all the correct inflections.