Exploring Machine Learning to Study the Long-Term Transformation of News: Digital newspaper archives, journalism history, and algorithmic transparency

Marcel Broersma*, Frank Harbers

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic › peer-review

    Abstract

    The labour-intensive nature of manual content analysis and the limited accessibility of source material mean that quantitative analyses of news content are still scarce in journalism history. However, the digitization of newspaper archives now allows for innovative digital methods for systematic longitudinal research beyond the scope of incidental case studies. We argue that supervised machine learning offers promising approaches to analyse abundant source material, ground analyses in big data, and map the structural transformation of journalistic discourse longitudinally. By automatically analysing form and style conventions that reflect underlying professional norms and practices, the structure of news coverage can be studied more closely. However, automatically classifying latent and period-specific coding categories is highly complex. The structure of digital newspaper archives (e.g. segmentation, OCR) complicates this even more, while machine learning algorithms are often a black box. This paper shows how making classification processes transparent enables journalism scholars to employ these computational methods in a reliable and valid way. We illustrate this by focusing on the issues we encountered when automatically classifying news genres, an illuminating but particularly complex coding category. Ultimately, such an approach could foster a revision of journalism history, particularly of the often hypothesized but understudied shift from opinion-based to fact-centred reporting.
    Original language: English
    Title of host publication: Journalism History and Digital Archives
    Editors: Henrik Bødker
    Place of Publication: London
    Publisher: Taylor & Francis Group
    Chapter: 3
    Number of pages: 15
    ISBN (Electronic): 9781003098843
    ISBN (Print): 9780367566616
    Publication status: Published - 2021
