Abstract
The hype and hope surrounding big data have prompted articles warning the scientific community about their pitfalls: problems surrounding subjects’ privacy, difficult access that reduces reproducibility, non-representative sampling, lack of theory, and the acceptance of accidental results as substantively significant. Still, these articles remain undecided in their conclusions about what big data actually mean for sociology. Could they change the process of doing research, tempting scientists to depart from the currently advocated practices of their field? This paper seeks to answer this question by describing the current main scientific paradigm in sociology and using the issues of access, privacy, sampling, theory and multiple testing to observe the extent of this departure, defined as scholarly negligence. We analyse sociological papers applying big data published from 2008 until March 2017, identified through a systematic literature review. We observe a growing popularity of big data within sociology and most of its subfields (52 articles, 0.7% of all articles in sociology, used big data as of 2016), together with an association between researchers’ greater experience with big data and lower levels of scholarly negligence for all the issues except multiple testing. We also find an association between scholarly negligence and articles being published more recently.
Original language | English |
---|---|
Publisher | OSF Preprints |
Publication status | Submitted - 18-Dec-2022 |
Externally published | Yes |