Article ID: 1110972
Journal: Procedia - Social and Behavioral Sciences
Published Year: 2015
Pages: 5
File Type: PDF
Abstract

Corpus linguists have long treated recurring sequences of words as markers of genre and register. Previous research has focused mainly on lexical n-grams, while n-grams of other linguistic features, such as part-of-speech tags, have received less attention. This study examines whether n-grams of part-of-speech tags extracted from a large corpus can discriminate between genres. The results show a strong correlation between part-of-speech n-gram information and the genre of a text.
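
To make the idea of part-of-speech n-grams as a genre signal concrete, the following is a minimal sketch, not the study's actual procedure. The abstract does not specify the tagset, corpus, or statistical measure, so everything here is an assumption: the tag labels, the two hand-tagged toy "genres", the helper names (`pos_ngrams`, `ngram_profile`, `cosine_similarity`), and the use of cosine similarity to compare frequency profiles are all illustrative.

```python
from collections import Counter
from math import sqrt


def pos_ngrams(tags, n):
    """Return all contiguous n-grams (as tuples) from one sequence of POS tags."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def ngram_profile(tagged_sentences, n):
    """Map each POS n-gram to its relative frequency across the given sentences."""
    counts = Counter()
    for tags in tagged_sentences:
        counts.update(pos_ngrams(tags, n))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles (dicts)."""
    dot = sum(value * q.get(gram, 0.0) for gram, value in p.items())
    norm_p = sqrt(sum(value * value for value in p.values()))
    norm_q = sqrt(sum(value * value for value in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical, hand-tagged toy data; a real study would tag a large corpus.
academic = [
    ["DET", "NOUN", "ADP", "DET", "NOUN", "VERB", "ADJ"],
    ["DET", "ADJ", "NOUN", "ADP", "NOUN", "VERB", "DET", "NOUN"],
]
conversation = [
    ["PRON", "VERB", "PRON", "ADV"],
    ["INTJ", "PRON", "VERB", "ADV", "ADJ"],
]

academic_profile = ngram_profile(academic, 2)
conversation_profile = ngram_profile(conversation, 2)

# A low similarity between profiles suggests the genres use different tag sequences.
similarity = cosine_similarity(academic_profile, conversation_profile)
print(f"POS bigram profile similarity: {similarity:.3f}")
```

With a real corpus, profiles like these could serve as feature vectors for a classifier or a correlation analysis; which of those the study used is not stated in the abstract.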
