Parallel and diachronic corpora in linguistics
Abstract
This article analyzes parallel and diachronic corpora, discussing their characteristics, compilation methodologies, and applications in translation studies and historical linguistics. It emphasizes how the two corpus types connect in linguistic research and how they contribute to understanding language evolution, translation strategies, and cross-linguistic patterns. The discussion is supported by key examples: the United Nations Parallel Corpus, the Helsinki Corpus of English Texts, and the Turkronicles corpus. The article aims to provide insights for researchers, linguists, and translation scholars interested in the applications of corpora in language studies.
Keywords:
Parallel corpora, diachronic corpora, corpus linguistics, translation studies, language change, historical linguistics, computational linguistics, cross-linguistic analysis, language evolution, machine translation
In corpus linguistics, parallel and diachronic corpora play pivotal roles in understanding language structure, translation processes, and historical language evolution. They serve linguists, translators, and researchers who aim to uncover patterns in language use and in language change over time. This article explores the characteristics, methodologies, and applications of both corpus types.
Parallel corpora consist of source texts paired with their translations into one or more other languages, usually organized to allow side-by-side comparison. They are frequently used in translation studies, bilingual lexicography, and cross-linguistic research, where they enable researchers to study linguistic equivalence, translation shifts, and cross-linguistic patterns (Baker et al., 1995). Constructing a parallel corpus involves compiling original texts together with their translations. The United Nations Parallel Corpus, for example, contains official UN documents in the six official UN languages, enabling comparative linguistic analysis (Zanettin & Bernardini, 2000). Comparison is only effective once each segment of the source text has been aligned with the corresponding segment of the translation, which makes alignment algorithms a critical part of corpus development (Hardie, 2008).
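The alignment step can be illustrated with a minimal sketch. It assumes sentence-segmented input and uses a simplified length-based dynamic program: sentences that translate each other tend to have similar character lengths, so the cheapest path through the two texts is taken as the alignment. The function names, the set of allowed groupings, and the fixed penalties are illustrative assumptions, not the method used for any of the corpora named above.

```python
from typing import List, Tuple

# Allowed groupings: (source sentences consumed, target sentences consumed).
BEADS = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]
# Hypothetical fixed penalties; a real aligner would estimate these from data.
PENALTY = {(1, 1): 0.0, (1, 0): 3.0, (0, 1): 3.0, (2, 1): 1.0, (1, 2): 1.0}


def length_cost(src: List[str], tgt: List[str]) -> float:
    """Relative difference in total character length, between 0 and 1."""
    s = sum(len(x) for x in src)
    t = sum(len(x) for x in tgt)
    if s == 0 and t == 0:
        return 0.0
    return abs(s - t) / max(s, t)


def align(src: List[str], tgt: List[str]) -> List[Tuple[List[str], List[str]]]:
    """Return the cheapest sequence of aligned sentence groups."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in BEADS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = (cost[i][j] + PENALTY[(di, dj)]
                     + length_cost(src[i:ni], tgt[j:nj]))
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (di, dj)
    # Walk back from the end of both texts to recover the chosen groups.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        di, dj = back[i][j]
        pairs.append((src[i - di:i], tgt[j - dj:j]))
        i, j = i - di, j - dj
    pairs.reverse()
    return pairs


# Invented example sentences, for illustration only.
english = ["The committee met on Monday.", "It adopted the resolution.",
           "The session was closed."]
french = ["Le comité s'est réuni lundi et a adopté la résolution.",
          "La séance a été levée."]
for source_group, target_group in align(english, french):
    print(source_group, "<->", target_group)
```

Length alone is a weak signal, so production aligners typically combine it with lexical evidence such as bilingual dictionaries or cognates.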
Parallel corpora are instrumental in translation studies, offering insights into translation strategies, norms, and patterns across languages (Baker et al., 1995). For instance, they allow researchers to observe how cultural references or idiomatic expressions are handled in translations. These corpora also support machine translation development by providing vast bilingual data sets for training algorithms (Zanettin & Bernardini, 2000).
Diachronic corpora, on the other hand, are designed to study language change over time. They include text samples from different historical periods, facilitating the analysis of linguistic evolution. The Helsinki Corpus of English Texts is one of the best-known diachronic corpora, encompassing a wide range of texts from Old English to Early Modern English (Watts & Trudgill, 1987). Creating a diachronic corpus requires careful selection of texts so that each time period and genre is represented accurately. Sampling methods often aim for consistent genre representation across periods, because a period dominated by a single genre can skew the analysis (Meyer, 2002). Additionally, the lack of orthographic standardization in older texts and the resulting historical spelling variation pose challenges in compiling these corpora (Hilpert & Gries, 2016).
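The idea of consistent genre representation can be sketched as stratified sampling: texts are grouped into cells by period and genre, and the same number of texts is drawn from each cell. The record fields (`period`, `genre`), the function name, and the quota are assumptions made for this illustration, not the sampling scheme of any particular corpus.

```python
import random
from collections import defaultdict
from typing import Dict, List


def balanced_sample(texts: List[Dict], per_cell: int, seed: int = 0) -> List[Dict]:
    """Draw up to `per_cell` texts from every (period, genre) combination."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    cells = defaultdict(list)
    for text in texts:
        cells[(text["period"], text["genre"])].append(text)
    sample = []
    for key in sorted(cells):
        pool = cells[key]
        # A sparse cell contributes everything it has rather than failing.
        sample.extend(rng.sample(pool, min(per_cell, len(pool))))
    return sample


# Hypothetical catalogue: legal texts outnumber sermons in both periods.
catalogue = (
    [{"id": f"law-old-{i}", "period": "old", "genre": "law"} for i in range(5)]
    + [{"id": f"sermon-old-{i}", "period": "old", "genre": "sermon"} for i in range(2)]
    + [{"id": f"law-middle-{i}", "period": "middle", "genre": "law"} for i in range(5)]
    + [{"id": f"sermon-middle-{i}", "period": "middle", "genre": "sermon"} for i in range(4)]
)
for record in balanced_sample(catalogue, per_cell=2):
    print(record["id"])
```

Capping each cell at the same quota trades corpus size for comparability: frequencies measured in one period are then less likely to reflect a change in genre mix rather than a change in the language.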
Analyzing Language Change
Diachronic corpora enable linguists to investigate changes in word frequency, grammatical structures, and stylistic patterns over time (Hilpert & Gries, 2016). For example, the Turkronicles corpus, derived from Türkiye’s Official Gazette, allows researchers to trace linguistic changes influenced by state policies over 100 years (Meyer, 2002). These corpora are essential in exploring sociolinguistic phenomena, such as the impact of sociopolitical events on language use (Watts & Trudgill, 1987).
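A frequency comparison across periods can be sketched as follows. Because the sub-corpora for different periods rarely have the same size, raw counts are normalized to a rate per fixed number of tokens. The tokenizer, the function name, the normalization base, and the toy sentences are all assumptions for illustration; they are not drawn from the Turkronicles or Helsinki data.

```python
import re
from typing import Dict, List


def relative_frequency(corpus: Dict[str, List[str]], word: str,
                       per: int = 1000) -> Dict[str, float]:
    """Occurrences of `word` per `per` tokens, computed for each period."""
    target = word.lower()
    rates = {}
    for period, texts in corpus.items():
        tokens = []
        for text in texts:
            # Crude tokenizer: lowercase runs of word characters.
            tokens.extend(re.findall(r"\w+", text.lower()))
        hits = sum(1 for token in tokens if token == target)
        rates[period] = hits / len(tokens) * per if tokens else 0.0
    return rates


# Invented toy sentences, labeled with made-up period names.
toy_corpus = {
    "period_1": ["thou art kind and thou art wise"],
    "period_2": ["you are kind and thou art wise too"],
}
print(relative_frequency(toy_corpus, "thou"))
```

With real historical data, spelling variants of the same word would need to be mapped to a common form before counting, otherwise an apparent drop in frequency may only reflect a change in spelling.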
Parallel and diachronic corpora differ in focus, but they complement each other in cross-linguistic and historical studies. For example, combining diachronic analysis with parallel corpora can reveal how translation practice has evolved over time and how language change is reflected in translated texts. Such combined approaches offer a more complete picture of linguistic phenomena than either corpus type provides alone (Hardie, 2008).
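One way to picture such a combined analysis is a parallel corpus whose aligned pairs carry a translation date, so that the renderings of a source expression can be counted per period. The sketch below assumes already aligned sentence pairs and uses naive substring matching; the record layout, the function name, and the example sentences are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def renderings_by_period(pairs: Iterable[Tuple[int, str, str]],
                         source_term: str,
                         candidates: List[str],
                         period_length: int = 100) -> Dict[int, Dict[str, int]]:
    """Count candidate target renderings of `source_term`, grouped by period.

    Each pair is (year of translation, source sentence, target sentence).
    Periods are keyed by their first year, e.g. 1800 for 1800-1899.
    """
    counts = defaultdict(Counter)
    for year, source, target in pairs:
        # Naive substring match; real studies would use tokenized, lemmatized text.
        if source_term.lower() not in source.lower():
            continue
        period_start = year - year % period_length
        for candidate in candidates:
            if candidate.lower() in target.lower():
                counts[period_start][candidate] += 1
    return {period: dict(counter) for period, counter in counts.items()}


# Invented English-French pairs with made-up translation years.
dated_pairs = [
    (1890, "He ate bread.", "Il mangea du pain."),
    (1985, "He ate bread.", "Il a mangé du pain."),
    (1990, "She ate fruit.", "Elle a mangé des fruits."),
]
print(renderings_by_period(dated_pairs, "ate", ["mangea", "a mangé"]))
```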
Parallel and diachronic corpora provide essential tools for linguists and researchers, enabling nuanced investigations into language structure, translation, and change over time. Their proper construction, alignment, and analysis contribute significantly to fields such as translation studies, historical linguistics, and sociolinguistics. As computational linguistics continues to evolve, the integration of these corpora will likely expand, offering deeper insights into the complexities of language.
References
Baker, M., et al. (1995). Parallel corpora in translation studies. Routledge.
Hardie, A. (2008). Parallel and comparable corpora: What are they up to? Journal of Linguistic Studies, 15(3), 47-63.
Hilpert, M., & Gries, S. Th. (2016). Assessing frequency changes in multistage diachronic corpora. Language Dynamics, 8(2), 213-229.
Meyer, C. F. (2002). Issues in diachronic corpus design. Studies in Corpus Linguistics, 10(1), 71-94.
Watts, R. J., & Trudgill, P. (1987). Diachronic corpora: Genre and language change. University of Cambridge Press.
Zanettin, F., & Bernardini, S. (2000). Parallel corpora: An overview. Translation Studies Journal, 12(2), 122-137.
License
Copyright (c) 2024 Мадина Далиева

This work is licensed under a Creative Commons Attribution 4.0 International license.
