Development of an automated marker corpus of the Kazakh language
DOI:
https://doi.org/10.51301/vest.su.2021.v143.i1.06
Keywords:
corpus, labeled corpus, linguistics, corpus linguistics, corpus technology, tokenization, lemmatization.
Abstract
This article concerns the convergence of the Kazakh language with technology, since in the future the whole world around us will be closely connected to technology. New words entering everyday life and new positions taking shape are heralds of this transformation. Information technologies and the development of the Internet strengthen communication links between members of society. This, in turn, has led to the consolidation and accumulation of highly developed digital information. Information exchange is not only a technological connection but also a complex linguistic phenomenon. Problems such as how people use linguistic means, how phrases are used, and how the structure of the data environment is understood have become a significant field of linguistic knowledge; from the combination of linguistics and computer science arose the subject area of computational linguistics.
License
Copyright (c) 2021 VESTNIK KAZNRTU
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.