CORPUS ANALYSIS IN THE EXPRESSION OF STATUS VERBS

Karimov Nodirbek Nosirjon o‘g’li

Authors

Karimov Nodirbek Nosirjon o‘g’li, student at Namangan State University

Keywords:

EXPRESSION, CORPUS ANALYSIS

Abstract

A corpus is a language resource consisting of a large, systematized set of texts. In corpus linguistics, corpora are used to perform statistical analyses and to test hypotheses, linguistic phenomena, or theoretical rules within a specific language or a specific part of a language. A corpus can consist of textual data in one language or in several languages. A corpus usually means a textual corpus, but nowadays corpora are no longer limited to texts. Therefore, instead of the word "corpus", we use the more specific concept of a text corpus. Corpora are annotated to make language research more efficient. One type of corpus annotation is part-of-speech tagging (POS tagging), in which each word is tagged with its word class and the grammatical categories of that class. For example, the word "kutdim" carries the following information: verb, singular, tense, person-number. This information is attached to the word through tags. Another form of annotation is lemmatization, which indicates the base form of a word. For example, the words "kutdim", "kutgandim", and "kutganimga" share the same base: the verb "to wait". The concepts of root and base should not be confused here. For example, the word "bostirma" is formed as "bostir+ma", but "bostir" cannot be taken as its lemma, because "bostirma" is a single word. If we need to lemmatize the words "bostirmada", "bostirmaga", and "bostirmaning", the correct lemma is "bostirma". In simple terms, a lemma is the part of a word that remains once form-forming suffixes are omitted.
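
The two annotation types described in the abstract can be illustrated with a minimal Python sketch. It is not the author's method and not a real Uzbek morphological analyzer: the suffix list, the small lexicon, the tag values, and the function name lemmatize are all assumptions chosen only to cover the example words above. The sketch strips form-forming suffixes one at a time and stops as soon as the remaining form is a known lemma, which is why "bostirma" is kept whole rather than reduced to "bostir".

```python
# Toy illustration only. SUFFIXES, LEXICON and POS_TAGS are hypothetical,
# hand-written data covering just the example words from the abstract.

# Suffixes that may be stripped from the end of a word, one per step.
SUFFIXES = ["gan", "dim", "im", "ga", "da", "ning", "ma"]

# Known lemmas. "bostirma" is listed as a word in its own right,
# so it is never reduced to "bostir" even though "ma" is strippable.
LEXICON = {"kut", "bostir", "bostirma"}

# Example of POS-style annotation attached to a single word form.
POS_TAGS = {
    "kutdim": {"pos": "VERB", "tense": "past", "person": 1, "number": "singular"},
}


def lemmatize(word: str) -> str:
    """Return a known lemma for `word`, or `word` itself if none is found."""
    # A form that is already a lemma is returned unchanged.
    if word in LEXICON:
        return word
    # Otherwise try removing one suffix and lemmatizing what is left.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            candidate = lemmatize(word[: -len(suffix)])
            if candidate in LEXICON:
                return candidate
    # No analysis found: leave the word as it is.
    return word


if __name__ == "__main__":
    for form in ["kutdim", "kutgandim", "kutganimga",
                 "bostirmada", "bostirmaga", "bostirmaning"]:
        print(form, "->", lemmatize(form), POS_TAGS.get(form, {}))
```

Checking the lexicon before stripping any suffix is the design choice that separates a lemma from a root here. A production lemmatizer would rely on a full dictionary and morphological rules rather than this short hand-written list.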
