Description
Feasibility of DeepL, Google and Microsoft MT systems implementation into the translation process
Przekładając Nieprzekładalne IX
Author: Maciej Kur
Wydawnictwo Uniwersytetu Gdańskiego
This book presents the methodology and results of a study designed to determine whether some of the most popular, widely available and relatively cheap machine translation systems can feasibly be implemented in translation processes carried out in the English-to-Polish language pair. Outputs provided by the DeepL Translator, Google Neural Machine Translation System and Microsoft Translator engines were collected and processed together with translated and post-edited segments produced by a group of Polish translators. The data were then analyzed using various MT evaluation metrics to determine whether the quality of the outputs and the amount of post-editing effort required were adequate to enable efficient implementation of the analyzed engines in a professional environment.
Table of contents
Acknowledgements
Introduction
Chapter 1
The status of research
1.1. Basic terms and definitions
1.2. Fundamental models
1.2.1. Direct model
1.2.2. Indirect models
1.2.3. Statistical Machine Translation (SMT)
1.2.3.1. Word-based SMT
1.2.3.2. N-gram-based SMT
1.2.3.3. Phrase-based SMT
1.2.3.4. Context-based SMT
1.2.4. Neural machine translation
1.2.4.1. Discourse in NMT
1.3. Methods of evaluation
1.3.1. Human evaluation
1.3.1.1. Early methods
1.3.1.2. Accuracy, comprehension, fluency
1.3.1.3. Segment ranking metrics
1.3.1.4. HTER
1.3.2. Automatic evaluation metrics
1.3.2.1. Word-matching metrics
1.3.2.2. BLEU
1.3.2.3. NIST and METEOR
1.4. Post-editing
1.4.1. Definition and PE-related tasks
1.4.2. Post-editing effort
1.4.3. Automatic post-editing
Chapter 2
Description of the study
2.1. Preparatory stage
2.1.1. Data preparation
2.1.2. Workstation preparation
2.2. Experiment
2.2.1. Participants
2.2.2. Task 1 – translation
2.2.3. Task 2 – post-editing
2.3. Data analysis
2.3.1. Edit time analysis
2.3.2. HTER analysis
2.3.3. Error analysis
2.3.4. Quality rankings
Chapter 3
Results
3.1. Post-editing effort measurement
3.1.1. Edit time analysis
3.1.2. HTER scores
3.2. Quality evaluation
3.2.1. Error analysis
3.2.1.1. “Missing Words” category errors
3.2.1.2. “Word Order” category errors
3.2.1.3. “Incorrect Words” category errors
3.2.1.3.1. “Sense” subcategory errors
3.2.1.3.2. “Incorrect Form” subcategory
3.2.1.3.3. “Style” subcategory errors
3.2.1.3.4. “Extra Words” and “Idioms” subcategories
3.2.1.4. “Unknown Words” category errors
3.2.1.5. “Punctuation” category errors
3.2.1.6. “Spelling” category errors
3.2.2. Quality rankings
3.2.2.1. Ranking A (traditional translation)
3.2.2.2. Ranking B (post-editing)
3.2.3. Duplicated errors
Conclusions
References