Post-Editing Best Practices Identified by ACCEPT


The EU-funded ACCEPT Project brings together five leaders in machine translation: the University of Edinburgh (inventors of the Moses engine), the University of Geneva, Symantec, Acrolinx and Lexcelera, the LexWorks parent company. A recent ACCEPT report, prepared by Symantec, identifies three sources of post-editing best practices. The resulting guidelines for post-editors are summarized below.

The Challenge of Post-Editing

Post-editing machine-translated content is rapidly becoming an industrial process, and as MT usage grows in the translation industry, so does the need for post-editing services. However, the process is not well understood, the tools are limited, and relatively little work has been published outside the use cases where post-editing is carried out by professional translators.

Post-Editing Best Practices

The ACCEPT (Automated Community Content Editing PorTal) project has identified three sources of post-editing best-practice guidelines.

I. Post Editing Guidelines for GALE Machine Translation Evaluation (National Institute of Standards and Technology)

Goal: Make the MT output have the correct meaning, using understandable English, in as few edits as possible.

1. Make the MT output have the same meaning as the reference human translation. No more and no less.

2. Make the MT output be as understandable as the reference.

3. Capture the meaning in as few edits as possible using understandable English. If words/phrases/punctuation in the MT output are completely acceptable, use them (unmodified) rather than substituting something new and different.

4. Punctuation must be understandable, and sentence-like units must have sentence ending punctuation and proper capitalization. Do not insert, delete, or change punctuation merely to follow traditional optional rules about what is “proper.”
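The "as few edits as possible" goal can be made measurable. The sketch below is not part of the GALE guidelines; it assumes that one edit is a single word-level insertion, deletion, or substitution, and it ignores phrase shifts, which some edit-rate metrics also count. The function names `word_edit_distance` and `edit_rate` are hypothetical.

```python
def word_edit_distance(mt_output: str, post_edited: str) -> int:
    """Count word-level insertions, deletions and substitutions
    needed to turn the raw MT output into the post-edited text.

    Assumption: whitespace tokenization, no shift operation.
    """
    a = mt_output.split()
    b = post_edited.split()
    # prev[j] = edits needed to turn the first i-1 words of a into the first j words of b
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(
                prev[j] + 1,         # delete word_a
                curr[j - 1] + 1,     # insert word_b
                prev[j - 1] + cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


def edit_rate(mt_output: str, post_edited: str) -> float:
    """Edits per word of the post-edited text (0.0 means nothing was changed)."""
    length = len(post_edited.split())
    if length == 0:
        return 0.0
    return word_edit_distance(mt_output, post_edited) / length
```

For instance, changing "the server are down" to "the server is down" counts as one edit over four words, an edit rate of 0.25. A low rate suggests the post-editor reused acceptable MT output rather than rewriting it.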

II. Machine Translation Post-Editing Guidelines (TAUS)

1. Aim for semantically correct translation.

2. Ensure that no information has been accidentally added or omitted.

3. Edit any offensive, inappropriate or culturally unacceptable content.

4. Use as much of the raw MT output as possible.

5. Basic rules regarding spelling apply.

6. No need to implement corrections that are of a stylistic nature only.

7. No need to restructure sentences solely to improve the natural flow of the text.

III. Post-Editing Machine Translated Text in a Commercial Setting (Midori Tatsumi)

Goal: The post-edited text needs to be easily understandable by the readers. In order to achieve that goal, the text needs to convey the correct meaning of the source text, and conform to Japanese grammar.

However, speed is another important requirement for post-editing processes. Therefore, it is not necessary to spend time aesthetically refining the text; please avoid editing for stylistic sophistication.

A. What needs to be fixed:

1. Non-translatable items, such as command and variable names, that have been translated. Please put them back to English.

2. Inappropriately translated general IT terms.
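The first item above lends itself to an automated check before a human post-editor starts work. The following is a minimal sketch, not a tool described in the ACCEPT report: it assumes a hand-maintained list of protected terms (command and variable names) and simply reports those that occur in the English source but are missing from the translation. The function name `missing_protected_terms` and the sample strings are hypothetical.

```python
import re
from typing import Iterable, List


def missing_protected_terms(source: str, target: str,
                            protected: Iterable[str]) -> List[str]:
    """Return protected terms that occur more often in the source
    than in the target, i.e. terms the MT engine probably translated."""
    missing = []
    for term in protected:
        # ASCII-only boundaries, so a term directly followed by
        # Japanese characters (no space) is still recognized.
        pattern = re.compile(
            r"(?<![A-Za-z0-9_])" + re.escape(term) + r"(?![A-Za-z0-9_])"
        )
        if len(pattern.findall(target)) < len(pattern.findall(source)):
            missing.append(term)
    return missing


# Hypothetical example: the command name was transliterated by the engine.
source = "Run ipconfig /all and check the %PATH% variable."
target = "イプコンフィグ /all を実行し、%PATH% 変数を確認します。"
flagged = missing_protected_terms(source, target, ["ipconfig", "%PATH%"])
```

Here `flagged` contains only `ipconfig`, which tells the post-editor to restore the English command name. The check is only as good as the term list, so it complements the guideline rather than replacing the post-editor's judgment.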

Guidelines for both Monolingual and Bilingual Post-Editing

From the guidelines above, Symantec selected the following minimum set of best-practice advice for its forum users.

Guidelines for Monolingual Post-Editing

  • Try to edit the text to make it more fluent and clearer, based on how you interpret its meaning.
  • For example, try to correct word order and spelling when they are so inappropriate that the text has become impossible or difficult to comprehend.
  • If words, phrases, or punctuation in the text are completely acceptable, try to use them (unmodified) rather than substituting them with something new and different.

Guidelines for Bilingual Post-Editing

  • Aim for semantically correct translation.
  • Ensure that no information has been accidentally added or omitted.
  • If words, phrases, or punctuation in the text are completely acceptable, try to use them (unmodified) rather than substituting them with something new and different.