Head-to-head comparison of 14 prediction models for postoperative delirium in elderly non-ICU patients: an external validation study

Chung Kwan Wong, Barbara C van Munster, Athanasios Hatseras, Else Huis In 't Veld, Barbara L van Leeuwen, Sophia E de Rooij, Rick G Pleijhuis*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review



OBJECTIVES: Delirium is associated with increased morbidity, mortality, prolonged hospitalisation and increased healthcare costs. The number of clinical prediction models (CPMs) to predict postoperative delirium has increased exponentially. Our goal is to perform a head-to-head comparison of CPMs predicting postoperative delirium in elderly non-intensive care unit (non-ICU) patients to identify the best-performing models.

SETTING: Single-site university hospital.

DESIGN: Secondary analysis of prospective cohort study.

PARTICIPANTS AND INCLUSION: CPMs published within the timeframe of 1 January 1990 to 1 May 2020 were checked for eligibility (Preferred Reporting Items for Systematic Reviews and Meta-Analyses). For the time period of 1 January 1990 to 1 January 2017, included CPMs were identified in systematic reviews based on prespecified inclusion and exclusion criteria. An extended literature search for original studies was performed independently by two authors, including CPMs published between 1 January 2017 and 1 May 2020. External validation was performed using a surgical cohort consisting of 292 elderly non-ICU patients.

PRIMARY OUTCOME MEASURES: Discrimination, calibration and clinical usefulness.

RESULTS: 14 CPMs were eligible for analysis out of 366 full texts reviewed. External validation was previously published for 8/14 (57%) CPMs. C-indices ranged from 0.52 to 0.74, intercepts from -0.02 to 0.34, slopes from -0.74 to 1.96 and scaled Brier from -1.29 to 0.088. Based on predefined criteria, the two best performing models were those of Dai et al (c-index: 0.739 (95% CI: 0.664 to 0.813); intercept: -0.018; slope: 1.96; scaled Brier: 0.049) and Litaker et al (c-index: 0.706 (95% CI: 0.590 to 0.823); intercept: -0.015; slope: 0.995; scaled Brier: 0.088). For the remaining CPMs, model discrimination was considered poor with corresponding c-indices <0.70.
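
For readers unfamiliar with the reported measures, the following is a minimal sketch of how a c-index, a calibration intercept and slope, and a scaled Brier score can be computed for a binary outcome. It assumes commonly used definitions (c-index as the proportion of concordant case/non-case pairs, calibration via logistic regression of the outcome on the logit of the predicted risk, scaled Brier as 1 minus Brier divided by the Brier score of a prevalence-only model); the abstract does not state the exact estimation procedure used in the study. The function names and the toy data are hypothetical and are not taken from the study cohort.

```python
import math

def c_index(y, p):
    """Proportion of (case, non-case) pairs in which the case has the higher
    predicted risk; ties count as half. Equals the AUC for a binary outcome."""
    cases = [pi for yi, pi in zip(y, p) if yi == 1]
    controls = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for pc in cases:
        for pn in controls:
            if pc > pn:
                score += 1.0
            elif pc == pn:
                score += 0.5
    return score / (len(cases) * len(controls))

def scaled_brier(y, p):
    """1 - Brier / Brier_max, where Brier_max is the Brier score of a model
    that predicts the observed prevalence for everyone. Can be negative."""
    n = len(y)
    brier = sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / n
    prevalence = sum(y) / n
    return 1.0 - brier / (prevalence * (1.0 - prevalence))

def calibration_intercept_slope(y, p, iterations=25):
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton-Raphson.
    Assumption: intercept and slope are estimated jointly; other
    definitions (e.g. intercept with the slope fixed at 1) also exist."""
    x = [math.log(pi / (1.0 - pi)) for pi in p]
    a, b = 0.0, 1.0  # start at perfect calibration
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, x):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu
            g1 += (yi - mu) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical toy data: observed delirium (1/0) and predicted risks.
y_toy = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
p_toy = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

if __name__ == "__main__":
    print("c-index:", c_index(y_toy, p_toy))
    print("scaled Brier:", scaled_brier(y_toy, p_toy))
    print("intercept, slope:", calibration_intercept_slope(y_toy, p_toy))
```

Under these definitions, a c-index of 0.5 corresponds to chance-level discrimination, an intercept of 0 with a slope of 1 indicates ideal calibration, and a scaled Brier score below 0 means the model performs worse than predicting the observed prevalence for every patient.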

CONCLUSION: Our head-to-head analysis identified 2 out of 14 CPMs as best-performing models with fair discrimination and acceptable calibration. Based on our findings, these models might assist physicians in postoperative delirium risk estimation and patient selection for preventive measures.

Original language: English
Article number: e054023
Number of pages: 11
Journal: BMJ Open
Issue number: 4
Publication status: Published - 8-Apr-2022


  • Aged
  • Delirium/diagnosis
  • Humans
  • Prospective Studies
