Abstract
This chapter presents a multimethod, multidisciplinary analysis of genre in a large dataset of 9,800 English novels, with the aim of deepening our understanding of fiction genres and subgenres. We focus on well-established, interpretable methods so that scholars from a variety of disciplines can benefit from them. The objects of our analysis are written texts and their linguistic features. We approach the analysis from two directions: a data-driven one, using topic modeling of content words, and a theory-driven one, using the features Douglas Biber selected for his research on register (for example, in his 1988 book Variation across Speech and Writing), as well as simple readability metrics. We illustrate these methods by applying them to a corpus of fiction (novels). The texts in our corpora are in English, but the methods are intended to be applicable to corpora in other languages as well. The research questions we address are whether different kinds of novels (“subgenres”) can be distinguished from each other by their use of linguistic features, and what the results of the computational methods can reveal to researchers, either to support a renewed qualitative analysis of the texts or to help formulate new hypotheses for further research. Code and data are available at https://github.com/andreasvc/fictiongenres/
Original language | English |
---|---|
Title | Multidisciplinary Views on Discourse Genre |
Subtitle | A Research Agenda |
Editors | Ninke Stukker, John A. Bateman, Danielle McNamara, Wilbert Spooren |
Publisher | Routledge |
Chapter | 6 |
Pages | 135-167 |
Number of pages | 33 |
ISBN (electronic) | 9781003335603 |
ISBN (print) | 9781032371610 |
Status | Published - 30 Sep 2024 |