TY - JOUR
T1 - A white paper on good research practices in benchmarking
T2 - The case of cluster analysis
AU - Van Mechelen, Iven
AU - Boulesteix, Anne-Laure
AU - Dangl, Rainer
AU - Dean, Nema
AU - Hennig, Christian
AU - Leisch, Friedrich
AU - Steinley, Douglas
AU - Warrens, Matthijs J.
N1 - Funding Information:
The work on this paper has been supported in part by the Research Foundation – Flanders (grants G080219N and K802822N to Iven Van Mechelen), by the Research Fund of KU Leuven (grant C14/19/054, Co-PI: Iven Van Mechelen), by the German Research Foundation (grants BO3139/7-1 and /9-1 to Anne-Laure Boulesteix), by the Bundesministerium für Bildung und Forschung (grant 01IS18036A, Co-PI: Anne-Laure Boulesteix) and by the Engineering and Physical Sciences Research Council (grant EP/K033972/1 to Christian Hennig).
Publisher Copyright:
© 2023 The Authors. WIREs Data Mining and Knowledge Discovery published by Wiley Periodicals LLC.
PY - 2023/11
Y1 - 2023/11
N2 - To achieve scientific progress in terms of building a cumulative body of knowledge, careful attention to benchmarking is of the utmost importance, requiring that proposals of new methods are extensively and carefully compared with their best predecessors, and existing methods subjected to neutral comparison studies. Answers to benchmarking questions should be evidence-based, with the relevant evidence being collected through well-thought-out procedures, in reproducible and replicable ways. In the present paper, we review good research practices in benchmarking from the perspective of the area of cluster analysis. Discussion is given to the theoretical, conceptual underpinnings of benchmarking based on simulated and empirical data in this context. Subsequently, the practicalities of how to address benchmarking questions in clustering are dealt with, and foundational recommendations are made based on existing literature. This article is categorized under: Fundamental Concepts of Data and Knowledge > Data Concepts; Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Technologies > Structure Discovery and Clustering.
KW - conceptual underpinnings
KW - foundational recommendations
KW - method comparison
UR - https://www.scopus.com/pages/publications/85165932317
U2 - 10.1002/widm.1511
DO - 10.1002/widm.1511
M3 - Article
AN - SCOPUS:85165932317
SN - 1942-4787
VL - 13
JO - Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
JF - Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
IS - 6
M1 - e1511
ER -