A dataset of Dutch Novels 1800-2000

Research output: Academic

Abstract

Why is one novel still read, while another is forgotten? Literary scholars answer that the first is part of the canon, while the other is not. But what determines “canonicity”? Canonicity is a contentious topic. Literary scholars make lists of the most important authors of all time, but are such lists completely subjective? Do they systematically exclude important books by authors who do not fit a preconceived pattern, such as that of the white male author? Could there be objective textual features that partly explain the value judgments leading to this demarcation? In this project I created a dataset, along with code and a tool, for exploring such questions. In this post I describe the corpus composition, metadata, and textual features that are part of the dataset. In the next post we will use the dataset to determine to what extent canonicity can be classified using textual features.
Original language: English
Type: Research blog post
Output: medialab.kb.nl
Publisher: Koninklijke Bibliotheek
Status: Published - 24 Jan 2022