Development of a model and software solution for the problem of determining unknown words in post-editing machine translation

Authors

  • D.R. Rakhimova, al-Farabi Kazakh National University
  • N.M. Pazylkhan, al-Farabi Kazakh National University
  • A.A. Kulzhanova, al-Farabi Kazakh National University
  • Zh.G. Alen, al-Farabi Kazakh National University

DOI:

https://doi.org/10.51301/vest.su.2021.v143.i1.08

Keywords:

machine translation, NLTK, morphological analysis, unknown words, machine translation post-editing.

Abstract

Machine translation is the automatic translation of text from one language to another by a computer program. Its output always contains certain shortcomings, which can be corrected by post-editing, that is, human processing of the text after machine translation. Today, many language service providers are actively developing this field, including post-editing methods and methods for training editors. This paper considers the problem of determining unknown words in post-editing machine translation for the Kazakh language. Existing methods for finding unknown words in post-editing machine translation are reviewed and analyzed. A model for determining unknown words in post-editing machine translation for the English-Kazakh and Russian-Kazakh language pairs is presented, together with practical results and a software implementation.
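The abstract does not describe the model itself, so the following is only a minimal illustrative sketch of the general idea of flagging unknown words in machine translation output, not the authors' method. It assumes a simple vocabulary lookup; the function names `tokenize` and `find_unknown_words`, the toy vocabulary, and the sample sentence are all hypothetical.

```python
import re


def tokenize(text):
    # Lowercased word tokens; in Python 3, \w is Unicode-aware for str
    # patterns, so Kazakh Cyrillic letters are matched as word characters.
    return re.findall(r"\w+", text.lower())


def find_unknown_words(mt_output, vocabulary):
    """Return tokens of mt_output that are absent from vocabulary.

    Tokens are reported once each, in order of first appearance.
    This is a plain lookup (an assumption for illustration); it does no
    morphological analysis, so inflected forms missing from the
    vocabulary would also be flagged.
    """
    seen = set()
    unknown = []
    for token in tokenize(mt_output):
        if token not in vocabulary and token not in seen:
            seen.add(token)
            unknown.append(token)
    return unknown


# Hypothetical target-language vocabulary and MT output in which one
# source word was left untranslated.
vocabulary = {"мен", "кітап", "оқыдым"}
mt_output = "Мен blockchain кітап оқыдым"
print(find_unknown_words(mt_output, vocabulary))  # ['blockchain']
```

In such a sketch, the flagged tokens would be the candidates passed on to a post-editor or to a later correction step. For an agglutinative language such as Kazakh, a plain lookup would over-report, which is one reason a morphological analysis stage, as listed in the keywords, is relevant.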

Published

2021-02-28

How to Cite

Rakhimova, D. R., Pazylkhan, N. M., Kulzhanova, A. A., & Alen, Zh. G. (2021). Development of a model and software solution for the problem of determining unknown words in post-editing machine translation. Engineering Journal of Satbayev University, 143(1), 46–53. https://doi.org/10.51301/vest.su.2021.v143.i1.08

Section

Physics and Mathematics