Feature
Evaluation of Machine Translation
by Mike Unwalla
To promote a new service, I wanted to show potential customers that machine translation can give good results. Because I did not find a suitable evaluation on the Internet, I designed a small evaluation. The evaluation confirms that if text is optimised for machine translation, machine translation gives good results.
Many guidelines explain how to write for machine translation. The evaluation covered the following guidelines:
- Use a word with its primary meaning. See Table 1, sentence pairs 1, 2, 3, and 6.
- Use syntactic cues. See Table 1, sentence pair 4.
- Use short sentences. See Table 1, sentence pair 5.
- Keep the parts of a phrasal verb together. See Table 1, sentence pair 6.
There are six pairs of texts. The first text in a pair is standard English. The second text in a pair is equivalent to the standard English text, but it is optimised for machine translation. I used Google Translate to translate the texts into Bulgarian and into Spanish.
Professional translators evaluated the translations. First, each translator evaluated the translations for fluency. Next, each translator evaluated the translations for accuracy.
Fluency
Each translator evaluated each translation for fluency. The translators did not look at the source text. The translators chose one of the following levels of fluency:
- Perfect fluency
- Acceptable fluency
- Low-quality fluency
- Incomprehensible
A text can be both fluent and without meaning. For example, the sentence "sleeping green clouds dream furiously" is fluent, because it obeys all the rules of English grammar. However, the sentence has no meaning in the real world. In the evaluation of fluency, the translators were interested only in fluency.
Figure 1 summarises the results of the evaluation for fluency.

Figure 1. Fluency of Machine Translation for Standard English and for Optimised English
The increase in fluency for machine translation of optimised English is small. For one translation in Bulgarian, there is a decrease in fluency.
Accuracy
Each translator compared each translation with the English source text, and evaluated the translation for accuracy. The translators chose one of the following levels of accuracy:
- Correct meaning
- Ambiguous meaning. The translation has more than one meaning. One meaning is correct. Other meanings are not correct.
- Incorrect meaning. The translation has one meaning, and the meaning is not correct.
- Nonsense
Figure 2 summarises the results of the evaluation for accuracy.

Figure 2. Accuracy of Machine Translation for Standard English and for Optimised English
Table 1 shows the source text and the translators' evaluations.
Table 1. Evaluation of Machine Translation for Accuracy
| Pair | English Source Text | Bulgarian Accuracy | Spanish Accuracy |
|---|---|---|---|
| 1a | The request came out of the blue. | Nonsense | Correct |
| 1b | The request came unexpectedly. | Correct | Correct |
| 2a | If the bid falls through, we will lose much money. | Nonsense | Incorrect |
| 2b | If the bid is not successful, we will lose much money. | Correct | Correct |
| 3a | If the material is hard, do not study it. | Nonsense | Correct |
| 3b | If the material is difficult, do not study it. | Nonsense | Correct |
| 4a | The machine on the left is broken. | Correct | Ambiguous |
| 4b | The machine that is on the left is broken. | Correct | Correct |
| 5a | Our education zone gives you an insight into how various people within the organisation work, from the people known as the uniformed officers who check your bags at ports and airports, to the ‘The VAT Man’. | Nonsense | Nonsense |
| 5b | Our education zone tells you about the work of people in the organisation, such as the uniformed officers who check your bags at ports and airports, and the ‘The VAT Man’. | Correct | Nonsense |
| 6a | They brought the matter of expenses up. | Nonsense | Incorrect |
| 6b | They brought up the matter of expenses. | Correct | Correct |
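The ratings in Table 1 can be counted with a short script. The following Python sketch is not part of the original evaluation. The names `RATINGS`, `LANGUAGES`, and `tally` are assumptions that are used only for illustration. The data is copied from Table 1.

```python
from collections import Counter

# Accuracy ratings copied from Table 1: pair -> (Bulgarian, Spanish).
# An "a" text is standard English. A "b" text is optimised English.
RATINGS = {
    "1a": ("Nonsense", "Correct"),
    "1b": ("Correct", "Correct"),
    "2a": ("Nonsense", "Incorrect"),
    "2b": ("Correct", "Correct"),
    "3a": ("Nonsense", "Correct"),
    "3b": ("Nonsense", "Correct"),
    "4a": ("Correct", "Ambiguous"),
    "4b": ("Correct", "Correct"),
    "5a": ("Nonsense", "Nonsense"),
    "5b": ("Correct", "Nonsense"),
    "6a": ("Nonsense", "Incorrect"),
    "6b": ("Correct", "Correct"),
}
LANGUAGES = ("Bulgarian", "Spanish")


def tally(ratings):
    """Count the ratings for each (text type, language) combination."""
    counts = {}
    for pair, values in ratings.items():
        text_type = "standard" if pair.endswith("a") else "optimised"
        for language, rating in zip(LANGUAGES, values):
            counts.setdefault((text_type, language), Counter())[rating] += 1
    return counts


counts = tally(RATINGS)
for (text_type, language), counter in sorted(counts.items()):
    print(text_type, language, dict(counter))
```

In this sample, the number of correct translations increases from 1 to 5 for Bulgarian, and from 2 to 5 for Spanish.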
Comments
The sample size is too small for the evaluation to have statistical validity. To increase the quality of the evaluation, we must do the following things:
- Increase the sample size.
- Evaluate many languages.
- For each language, use many translators to evaluate the translations.
- Compare different machine translation tools.
For full details of the evaluation, see http://www.international-english.co.uk/mt-evaluation-details.html.
This article is optimised for machine translation. If you use a machine translation tool, you will probably get a sufficiently good translation.
References and resources
Note: All electronic documents were retrieved on 2nd March 2009.
Bernth, A. & Gdaniec, C. (2000). MTranslatability AMTA-2000 Tutorial. Information Sciences Institute http://www.isi.edu/natural-language/organizations/amta/sig-mtranslatability-tutorial.htm.
Kohl, J. R. (2008). The Global English style guide: writing clear, translatable documentation for a global market. Cary, NC: SAS Institute Inc. (For a review of the book, see http://www.techscribe.co.uk/ta/global-english-style-guide.htm.)
Machine Translation Archive. Repository and bibliography of articles, books and papers on topics in machine translation and computer-based translation tools http://www.mt-archive.info.
Muegge, U. Rules for machine translation. http://www.muegge.cc/controlled-language.htm.
O’Connell, T. (2001). Preparing your Web site for machine translation. IBM developerWorks® http://www.ibm.com/developerworks/library/us-mt/.
Wells Akis, J. & Sisson, W. R. (2002). Improving Translatability: A Case Study at Sun Microsystems, Inc. The Globalisation Insider http://www.lisa.org/globalizationinsider/2002/12/improving_trans.html.
Mike Unwalla of TechScribe writes user documentation for computer software. He uses controlled language in the documentation that he writes. To learn about the services that TechScribe offers, see http://www.techscribe.co.uk. To learn more about writing for machine translation, see http://www.international-english.co.uk (a TechScribe website).
