Exploring Machine Learning to Study the Long-Term Transformation of News: Digital newspaper archives, journalism history, and algorithmic transparency

    Research output: Contribution to journal / Article / Academic / peer-review

    9 Citations (Scopus)
    396 Downloads (Pure)

    Abstract

    The labour-intensive nature of manual content analysis and the problematic accessibility of source material make quantitative analyses of news content still scarce in journalism history. However, the digitization of newspaper archives now allows for innovative digital methods for systematic longitudinal research beyond the scope of incidental case studies. We argue that supervised machine learning offers promising approaches to analyse abundant source material, ground analyses in big data, and map the structural transformation of journalistic discourse longitudinally. By automatically analysing form and style conventions that reflect underlying professional norms and practices, the structure of news coverage can be studied more closely. However, automatically classifying latent and period-specific coding categories is highly complex. The structure of digital newspaper archives (e.g. segmentation, OCR) complicates this even more, while machine learning algorithms are often a black box. This paper shows how making classification processes transparent enables journalism scholars to employ these computational methods in a reliable and valid way. We illustrate this by focusing on the issues we encountered with automatically classifying news genres, an illuminating but particularly complex coding category. Ultimately, such an approach could foster a revision of journalism history, particularly the often hypothesized but understudied shift from opinion-based to fact-centred reporting.
    Original language: English
    Pages (from-to): 1150-1164
    Number of pages: 15
    Journal: Digital Journalism
    Volume: 6
    Issue number: 9
    Early online date: 11-Oct-2018
    DOIs
    Publication status: Published - 2018

    Keywords

    • Journalism history
    • Machine learning
    • (automatic) content analysis
    • digital newspaper archives
    • digitization
    • news genres
    • algorithmic transparency
    • BIG DATA
    • TEXT
