Analytical validation of a standardised scoring protocol for Ki67 immunohistochemistry on breast cancer excision whole sections: an international multicentre collaboration

BIG-NABCG, Samuel C. Y. Leung*, Torsten O. Nielsen, Lila A. Zabaglo, Indu Arun, Sunil S. Badve, Anita L. Bane, John M. S. Bartlett, Signe Borgquist, Martin C. Chang, Andrew Dodson, Anna Ehinger, Susan Fineberg, Cornelia M. Focke, Dongxia Gao, Allen M. Gown, Carolina Gutierrez, Judith C. Hugh, Zuzana Kos, Anne-Vibeke Laenkholm, Mauro G. Mastropasqua, Takuya Moriya, Sharon Nofech-Mozes, C. Kent Osborne, Frederique M. Penault-Llorca, Tammy Piper, Takashi Sakatani, Roberto Salgado, Jane Starczynski, Tomoharu Sugie, Bert van der Vegt, Giuseppe Viale, Daniel F. Hayes, Lisa M. McShane, Mitch Dowsett

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

38 Citations (Scopus)


Aims: The nuclear proliferation marker Ki67 assayed by immunohistochemistry has multiple potential uses in breast cancer, but an unacceptable level of interlaboratory variability has hampered its clinical utility. The International Ki67 in Breast Cancer Working Group has undertaken a systematic programme to determine whether Ki67 measurement can be analytically validated and standardised among laboratories. This study addresses whether acceptable scoring reproducibility can be achieved on excision whole sections.

Methods and results: Adjacent sections from 30 primary ER+ breast cancers were centrally stained for Ki67 and circulated among 23 pathologists in 12 countries. All pathologists scored Ki67 by two methods: (i) global: four fields of 100 tumour cells each were selected to reflect observed heterogeneity in nuclear staining; (ii) hot-spot: the field with the highest apparent Ki67 index was selected and up to 500 cells scored. The intraclass correlation coefficient (ICC) for the global method [0.87; 95% confidence interval (CI) = 0.799-0.93] marginally met the prespecified success criterion (lower 95% CI >= 0.8), while the ICC for the hot-spot method (0.83; 95% CI = 0.74-0.90) did not. Visually, interobserver concordance in the location of selected hot-spots varied between cases. The median times for scoring were 9 and 6 min for the global and hot-spot methods, respectively.

Conclusions: The global scoring method demonstrates adequate reproducibility to warrant next steps towards evaluation for technical and clinical validity in appropriate cohorts of cases. The time taken for scoring by either method is practical using counting software we are making publicly available. Establishment of external quality assessment schemes is likely to improve the reproducibility between laboratories further.
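For readers unfamiliar with the reproducibility statistic, the following is a minimal sketch of how an intraclass correlation coefficient can be computed from a cases-by-pathologists table of Ki67 scores. The abstract does not state which ICC form, data transformation, or confidence-interval method the study used, so the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)) is an assumption here, and the function name `icc_2_1` is hypothetical. The sketch gives a point estimate only and does not compute the confidence interval on which the prespecified success criterion was based.

```python
import numpy as np


def icc_2_1(scores):
    """Point estimate of ICC(2,1) for a complete cases x raters matrix.

    Assumption: two-way random effects, absolute agreement, single rater.
    This is one common ICC form, not necessarily the one used in the study.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape  # n cases (rows), k raters (columns)

    grand_mean = x.mean()
    case_means = x.mean(axis=1)
    rater_means = x.mean(axis=0)

    # Sums of squares from a two-way ANOVA without replication.
    ss_cases = k * ((case_means - grand_mean) ** 2).sum()
    ss_raters = n * ((rater_means - grand_mean) ** 2).sum()
    ss_total = ((x - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_cases - ss_raters

    ms_cases = ss_cases / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_cases - ms_error) / (
        ms_cases + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


if __name__ == "__main__":
    # Illustrative, made-up Ki67 indices (%) for 4 cases scored by 3 raters.
    example = [
        [5.0, 6.0, 4.0],
        [12.0, 15.0, 11.0],
        [30.0, 28.0, 33.0],
        [48.0, 52.0, 45.0],
    ]
    print(f"ICC(2,1) = {icc_2_1(example):.3f}")
```

An ICC near 1 means that most of the variation in scores comes from real differences between cases rather than from disagreement among pathologists.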

Original language: English
Pages (from-to): 225-235
Number of pages: 11
Issue number: 2
Publication status: Published - Aug-2019


  • analytical validity
  • immunohistochemistry
  • interobserver reproducibility
  • interobserver variability
  • Ki67
  • pathology
  • scoring protocol
