Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
1128360 | 954883 | 2013 | 25-page PDF | Free download |
• Sub-corpus topic modeling (STM) allows discovery of passages from the “great unread”.
• STM increases the ability to discuss aspects of influence and intellectual movements.
• We detail the STM dashboard—an intuitive interface for navigating sub-corpus topics.
• We illustrate benefits of STM via three experiments applicable to the Humanities.
• We discuss the future of STM, proposing it as a tool for the Humanities in the “big data” era.
Given a small, well-understood corpus that is of interest to a Humanities scholar, we propose sub-corpus topic modeling (STM) as a tool for discovering meaningful passages in a larger collection of less well-understood texts. STM allows Humanities scholars to discover unknown passages from the vast sea of works that Moretti calls the “great unread” and to significantly increase the researcher's ability to discuss aspects of influence and the development of intellectual movements across a broader swath of the literary landscape. In this article, we test three typical Humanities research problems: in the first, a researcher wants to find text passages that exhibit similarities to a collection of influential non-literary texts from a single author (here, Darwin); in the second, a researcher wants to discover literary passages related to a well-understood corpus of literary texts (here, emblematic texts from the Modern Breakthrough); and in the third, a researcher hopes to understand the influence that a particular domain (here, folklore) has had on the realm of literature over a series of decades. We explore these research challenges with three experiments.
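The general idea of the abstract, using topics from a small corpus to surface passages in a larger one, can be illustrated with a minimal sketch. The abstract does not specify the topic model or the scoring rule, so everything here is an assumption: the `TOPICS` dictionary stands in for the output of a topic model trained on the sub-corpus, its word weights are invented, and the names `tokenize`, `topic_score`, and `rank_passages` are hypothetical.

```python
import re

# Hypothetical topics standing in for the output of a topic model trained on
# the small, well-understood sub-corpus. The word weights are invented.
TOPICS = {
    "natural_selection": {
        "species": 0.30, "selection": 0.25, "variation": 0.20,
        "struggle": 0.15, "descent": 0.10,
    },
    "folklore": {
        "troll": 0.30, "farm": 0.20, "legend": 0.20,
        "witch": 0.15, "night": 0.15,
    },
}

# Toy passages standing in for the larger, less well-understood collection.
PASSAGES = [
    "The struggle between species shaped every variation he observed.",
    "She walked to the market and bought bread.",
    "At night the troll came down to the farm.",
]


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (a crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def topic_score(tokens, topic):
    """Average topic weight per token; words outside the topic count as 0."""
    if not tokens:
        return 0.0
    return sum(topic.get(token, 0.0) for token in tokens) / len(tokens)


def rank_passages(passages, topic, top_n=3):
    """Return the top_n (score, passage) pairs, highest score first."""
    scored = [(topic_score(tokenize(p), topic), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    for name, topic in TOPICS.items():
        score, passage = rank_passages(PASSAGES, topic, top_n=1)[0]
        print(f"{name}: {score:.3f} {passage}")
```

A real pipeline would learn the topics from the sub-corpus rather than hard-code them, and would likely use a probabilistic inference step instead of this simple weight average; the sketch only shows the direction of the workflow, from a small known corpus to candidate passages in a large unknown one.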
Journal: Poetics - Volume 41, Issue 6, December 2013, Pages 725–749