Designing Controlled Chinese Rules for MT Pre-Editing of Product Description Text

Ying Zheng, Chang Peng, Yuanyuan Mu
DOI: 10.4018/IJTIAL.313919

Abstract

This study investigates how pre-editing based on controlled Chinese rules can serve as an effective approach to improving Chinese-to-English machine translation (MT) output. Based on an analysis of comparable texts, and taking into account the rules of modern business writing and the differences in sentence structure between Chinese and English, four controlled Chinese rules for product description text are proposed: (1) every sentence should have an explicit subject; (2) complimentary expressions should not be repeated; (3) sentences should be short; and (4) Chinese sentence structure should be complete, with clear logical relationships within and between sentences. In accordance with these four rules, five corresponding pre-editing methods are introduced. Then, taking the product description of the Xiaomi Air2 SE earphones as the experimental text, the study examines how pre-editing in accordance with the controlled Chinese rules influences MT output quality. The results show that such pre-editing can significantly improve MT output in terms of adequacy, fluency, and style.
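As a rough illustration of how two of the rules above (short sentences and no repeated complimentary expressions) might be checked automatically before pre-editing, here is a minimal Python sketch. It is not the authors' method: the length threshold `MAX_CHARS`, the word list `COMPLIMENTS`, and all function names are hypothetical, chosen only for illustration, since the abstract gives no numeric limit or word list.

```python
import re

# Hypothetical threshold: the abstract says only that sentences "should be short".
MAX_CHARS = 30

# Hypothetical sample of complimentary words; not taken from the study.
COMPLIMENTS = ["优质", "卓越", "出色"]


def split_sentences(text):
    """Split Chinese text into sentences, keeping the final 。！or ？ with each."""
    return [s.strip() for s in re.findall(r"[^。！？]+[。！？]?", text) if s.strip()]


def flag_long_sentences(text, max_chars=MAX_CHARS):
    """Return the sentences whose character count exceeds max_chars (rule 3)."""
    return [s for s in split_sentences(text) if len(s) > max_chars]


def count_compliments(sentence, words=COMPLIMENTS):
    """Count occurrences of the listed complimentary words in a sentence (rule 2)."""
    return sum(sentence.count(w) for w in words)


def flag_stacked_compliments(text, words=COMPLIMENTS):
    """Return the sentences containing more than one complimentary expression."""
    return [s for s in split_sentences(text) if count_compliments(s, words) > 1]
```

A checker like this can only point a human pre-editor to candidate sentences. Rules 1 and 4 (explicit subjects and clear logical relationships) depend on syntactic and semantic analysis that simple string matching does not provide.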

Pre-Editing Based On Controlled Language

Controlled language (CL) refers to a “subset of natural languages whose grammars and dictionaries have been restricted in order to reduce or eliminate both ambiguity and complexity” (ISO, 2017). Since the 1970s, controlled language, most commonly controlled English, has been applied mainly to technical documents, including technical guides and maintenance manuals, in technology industries such as aviation, with the aim of improving the readability and translatability of technical texts. Since 2003, research on CL has turned to controlled translation, i.e., the combination of CL and MT. CL can effectively eliminate ambiguity in the source-language content, which is the biggest challenge for MT, and can thereby greatly improve MT accuracy and reduce the post-translation editing workload (Yuan, 2003). Studies on CL have focused on translation between European languages (such as English, German, and French), and the results show a generally positive impact of CL rules on MT output (Bernth & Gdaniec, 2001; Reuther, 2003; Marzouk, 2021). Marzouk’s recent study (2021) indicates that, for technical texts, neural MT offers a promising solution that no longer requires CL rules to improve MT output. This raises the question of whether the idea of CL can be introduced to the MT of non-technical texts, which this paper aims to answer.
