A Corpus-Based Investigation of the Distributional Patterns of English and Chinese Pronouns
Hengbin Yan, Yinghui Li
Abstract
In this study, we adopt a corpus-based approach to analyze the distributional patterns of the major types of pronouns across genres in two comparable balanced corpora, one English and one Chinese. Using the output of state-of-the-art grammatical parsers, we find considerable variation in the distribution of pronouns across genres. Although English consistently employs more pronouns than Chinese in every genre, the distributions of pronouns across genres in the two languages are highly patterned and significantly correlated with one another, suggesting that pronouns can play similar functional roles in varying contextual situations in the two languages. Of the pronoun subtypes in the two languages, five are found to be directly comparable. Personal pronouns have the most similar (most strongly correlated) genre distribution in the two languages, while demonstrative pronouns have the least similar. The distributional patterns for each pronoun type are investigated, and their underlying functional and cultural implications are discussed. Our study suggests that the identification of these classes of pronouns can in large part be automated with the help of part-of-speech taggers and dependency parsers. The results can inform future research and applications involving pronouns, with implications ranging from cross-linguistic studies of grammatical features to second language acquisition.
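The comparison described above (a pronoun rate per genre in each language, then a correlation between the two genre profiles) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the normalization per 1,000 tokens, the use of Pearson's r, and the tag sets (Penn Treebank `PRP`/`PRP$` for English, Chinese Treebank `PN` for Chinese) are all assumptions made for illustration, since the abstract does not specify them.

```python
from math import sqrt

# Assumed tag sets (not stated in the abstract): Penn Treebank pronoun tags
# for English and the Chinese Treebank pronoun tag for Chinese.
EN_PRONOUN_TAGS = {"PRP", "PRP$"}
ZH_PRONOUN_TAGS = {"PN"}


def pronoun_rate(tagged_tokens, pronoun_tags):
    """Return pronouns per 1,000 tokens for one genre sample.

    tagged_tokens is a list of (token, pos_tag) pairs, as produced by
    any part-of-speech tagger.
    """
    if not tagged_tokens:
        return 0.0
    hits = sum(1 for _, tag in tagged_tokens if tag in pronoun_tags)
    return 1000.0 * hits / len(tagged_tokens)


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of genre rates."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def genre_profile(corpus_by_genre, pronoun_tags, genres):
    """Pronoun rates in a fixed genre order, so two corpora can be aligned."""
    return [pronoun_rate(corpus_by_genre[g], pronoun_tags) for g in genres]
```

Given two dictionaries that map the same genre labels to tagged tokens, `pearson(genre_profile(en, EN_PRONOUN_TAGS, genres), genre_profile(zh, ZH_PRONOUN_TAGS, genres))` would yield one correlation for pronouns overall. Restricting the tag sets to a single subtype would give a per-subtype comparison.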